Phan John H, Young Andrew N, Wang May D
IEEE Trans Biomed Eng. 2013 Dec;60(12):3364-7. doi: 10.1109/TBME.2012.2212438. Epub 2012 Aug 8.
We have developed omniBiomarker, a web-based application that uses knowledge from the NCI Cancer Gene Index to guide the selection of biologically relevant feature selection algorithms for identifying biomarkers. Biomarker identification from high-throughput genomic expression data is difficult for two reasons: the properties of the data (few samples relative to the number of features) and the large number of available feature selection algorithms. As a result, it is unclear which algorithm should be used for a particular dataset. These factors make biomarker identification unstable and reduce the reproducibility of results. We introduce a method for computing the biological relevance of feature selection algorithms using an externally validated knowledge base of manually curated cancer biomarkers. Results suggest that knowledge-driven biomarker identification can improve microarray-based clinical prediction performance. omniBiomarker can be accessed at http://omnibiomarker.bme.gatech.edu/.
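The idea of scoring feature selection algorithms against a curated knowledge base can be illustrated with a minimal sketch. This is not the omniBiomarker implementation: the abstract does not specify the relevance metric, so the sketch assumes a simple one (the fraction of an algorithm's top-k genes that appear in a set of known biomarkers). The gene names, expression values, the two ranking functions, and the helper names are all hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy data: gene -> expression values, one per sample.
EXPRESSION: Dict[str, List[float]] = {
    "GENE_A": [5.0, 5.1, 4.9, 4.0, 4.1, 3.9],
    "GENE_B": [9.0, 3.0, 6.0, 2.0, 7.0, 3.0],
    "GENE_C": [2.0, 2.2, 1.8, 1.5, 1.7, 1.3],
    "GENE_D": [8.0, 1.0, 4.5, 3.0, 2.5, 3.5],
}
LABELS = [1, 1, 1, 0, 0, 0]          # 1 = tumor, 0 = normal (assumed labels)
KNOWN_BIOMARKERS = {"GENE_A", "GENE_C"}  # stand-in for a curated knowledge base

Scorer = Callable[[List[float], List[int]], float]


def split_by_class(values: List[float], labels: List[int]) -> Tuple[List[float], List[float]]:
    """Separate one gene's expression values into the two sample classes."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return pos, neg


def mean_difference(values: List[float], labels: List[int]) -> float:
    """Absolute difference of class means (a fold-change-like score)."""
    pos, neg = split_by_class(values, labels)
    return abs(mean(pos) - mean(neg))


def t_statistic(values: List[float], labels: List[int]) -> float:
    """Absolute Welch t-statistic between the two classes."""
    pos, neg = split_by_class(values, labels)
    se = (stdev(pos) ** 2 / len(pos) + stdev(neg) ** 2 / len(neg)) ** 0.5
    return abs(mean(pos) - mean(neg)) / se if se > 0 else 0.0


def rank_genes(expr: Dict[str, List[float]], labels: List[int], scorer: Scorer) -> List[str]:
    """Order genes from most to least discriminative under one scorer."""
    return sorted(expr, key=lambda g: scorer(expr[g], labels), reverse=True)


def relevance(ranking: List[str], known: Set[str], k: int) -> float:
    """Assumed relevance metric: share of the top-k genes that are known biomarkers."""
    return sum(g in known for g in ranking[:k]) / k


def pick_algorithm(expr, labels, algorithms: Dict[str, Scorer], known: Set[str], k: int):
    """Score every algorithm against the knowledge base and return the best one."""
    scores = {name: relevance(rank_genes(expr, labels, fn), known, k)
              for name, fn in algorithms.items()}
    return max(scores, key=scores.get), scores


ALGORITHMS = {"mean_difference": mean_difference, "t_statistic": t_statistic}
best, scores = pick_algorithm(EXPRESSION, LABELS, ALGORITHMS, KNOWN_BIOMARKERS, k=2)
print(best, scores)
```

On this toy data the two scorers disagree about the top two genes, which is the instability the abstract describes; the knowledge base breaks the tie by favoring the ranking that recovers more curated biomarkers.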