Qiu Kexin, Lee JoongHo, Kim HanByeol, Yoon Seokhyun, Kang Keunsoo
Department of Computer Science, Dankook University, Yongin 16890, Korea.
Department of Electronics and Electrical Engineering, Dankook University, Yongin 16890, Korea.
Genomics Inform. 2021 Mar;19(1):e10. doi: 10.5808/gi.20076. Epub 2021 Mar 26.
Although many models have been proposed in recent years to accurately predict drug response in cell lines, understanding the genomic features related to drug response is also key to realizing precision medicine in oncology. In this paper, using cancer cell line gene expression and drug response data, we established a reliable and accurate drug response prediction model and identified predictor genes for some drugs of interest. To this end, we first pre-selected genes based on the Pearson correlation coefficient and then used an ElasticNet regression model for drug response prediction and fine gene selection. To find a more reliable set of predictor genes, we performed the regression twice for each drug, once with IC50 and once with the area under the curve (AUC, or activity area) as the response. For the 12 drugs we tested, the predictive performance in terms of the Pearson correlation coefficient exceeded 0.6; the highest was for 17-AAG, with a Pearson correlation coefficient of 0.811 for IC50 and 0.81 for AUC. We identified predictor genes common to IC50 and AUC; with these, performance was similar to that obtained with genes found separately for IC50 and AUC, but with a much smaller number of predictor genes. Using only the common predictor genes, the highest performance was for AZD6244 (0.8016 for IC50, 0.7945 for AUC) with 321 predictor genes.
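The two-stage procedure described in the abstract (correlation-based pre-selection, ElasticNet fitting for IC50 and AUC separately, then intersecting the selected genes) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`preselect_genes`, `select_predictors`, `heldout_pearson`), the values of `top_k`, `alpha`, and `l1_ratio`, the train/test split, and the synthetic data are all assumptions made for the example, since the abstract does not state them. It uses NumPy and scikit-learn's `ElasticNet`.

```python
import numpy as np
from sklearn.linear_model import ElasticNet


def preselect_genes(X, y, top_k):
    """Return indices of the top_k genes ranked by |Pearson r| with response y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant genes get r = 0 instead of a division-by-zero error.
    r = (Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    return np.argsort(-np.abs(r))[:top_k]


def select_predictors(X, y, top_k=50, alpha=0.1, l1_ratio=0.5):
    """Pre-select by correlation, then keep genes with non-zero ElasticNet weights."""
    candidates = preselect_genes(X, y, top_k)
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10_000)
    model.fit(X[:, candidates], y)
    return set(candidates[model.coef_ != 0].tolist())


def heldout_pearson(X_tr, y_tr, X_te, y_te, genes, alpha=0.1, l1_ratio=0.5):
    """Refit on the given genes and return Pearson r on held-out cell lines."""
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10_000)
    model.fit(X_tr[:, genes], y_tr)
    return float(np.corrcoef(model.predict(X_te[:, genes]), y_te)[0, 1])


# Synthetic stand-in for expression (cell lines x genes) and two response
# measures; genes 0-2 drive the response (toy assumption, not real data).
rng = np.random.default_rng(0)
n_lines, n_genes = 200, 500
X = rng.normal(size=(n_lines, n_genes))
signal = X[:, :3] @ np.array([2.0, -2.0, 2.0])
ic50 = signal + rng.normal(scale=0.5, size=n_lines)
auc = -signal + rng.normal(scale=0.5, size=n_lines)  # opposite sign: toy choice

# Select genes on the training cell lines only, then evaluate on the rest.
train, test = slice(0, 150), slice(150, None)
genes_ic50 = select_predictors(X[train], ic50[train])
genes_auc = select_predictors(X[train], auc[train])
common_genes = sorted(genes_ic50 & genes_auc)

r_ic50 = heldout_pearson(X[train], ic50[train], X[test], ic50[test], common_genes)
r_auc = heldout_pearson(X[train], auc[train], X[test], auc[test], common_genes)
print(len(genes_ic50), len(genes_auc), len(common_genes), r_ic50, r_auc)
```

Intersecting the two selected sets mirrors the idea of keeping only genes that are informative for both response measures. In practice the regularization strength would presumably be tuned (for example by cross-validation) rather than fixed as it is here.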