Poole William, Gibbs David L, Shmulevich Ilya, Bernard Brady, Knijnenburg Theo A
Institute for Systems Biology, Seattle, WA 98109-5263, USA.
Bioinformatics. 2016 Sep 1;32(17):i430-i436. doi: 10.1093/bioinformatics/btw438.
Combining P-values from multiple statistical tests is a common exercise in bioinformatics. However, this procedure is non-trivial for dependent P-values. Here, we discuss an empirical adaptation of Brown's method (an extension of Fisher's method) for combining dependent P-values which is appropriate for the large and correlated datasets found in high-throughput biology.
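As a rough illustration of the idea, the sketch below contrasts Fisher's method with a Brown-style correction whose covariance term is estimated from the data. It is a minimal sketch written for this summary, not the authors' released code: the function names (`fisher_combine`, `transform_to_w`, `empirical_brown_combine`) are hypothetical, the use of NumPy and SciPy is an assumption, and the empirical-CDF transform is one plausible way to mimic the -2 ln(P) scale. Brown's approximation treats the Fisher statistic as a scaled chi-squared variable, c·χ²(df), with c = Var/(2E) and df = 2E²/Var, where E = 2k for k tests.

```python
import numpy as np
from scipy.stats import chi2


def fisher_combine(p_values):
    """Fisher's method: valid when the P-values are independent."""
    p = np.asarray(p_values, dtype=float)
    stat = -2.0 * np.sum(np.log(p))
    return chi2.sf(stat, df=2 * p.size)


def transform_to_w(x):
    """Map one variable's samples onto the -2*ln(P) scale (assumed transform)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    # Empirical CDF at each sample: fraction of samples <= that sample.
    ecdf = np.searchsorted(np.sort(z), z, side="right") / z.size
    return -2.0 * np.log(ecdf)


def empirical_brown_combine(data, p_values):
    """Combine k dependent P-values.

    data: array of shape (k variables, n samples) behind the k tests.
    p_values: the k P-values to combine.
    """
    data = np.asarray(data, dtype=float)
    p = np.asarray(p_values, dtype=float)
    k = p.size

    # Covariance of the transformed variables, estimated from the data.
    w = np.vstack([transform_to_w(row) for row in data])
    cov = np.atleast_2d(np.cov(w))

    expected = 2.0 * k
    variance = 4.0 * k + 2.0 * np.sum(np.triu(cov, k=1))

    if variance < 4.0 * k:
        # Never be less conservative than Fisher's method.
        c, df = 1.0, 2.0 * k
    else:
        c = variance / (2.0 * expected)
        df = 2.0 * expected**2 / variance

    stat = -2.0 * np.sum(np.log(p))
    return chi2.sf(stat / c, df=df)
```

With strongly correlated variables the estimated variance exceeds 4k, so the scaled statistic has fewer effective degrees of freedom and the combined P-value is larger (more conservative) than the one Fisher's method reports for the same inputs.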
Using both noisy simulated data and gene expression data from The Cancer Genome Atlas, we show that the Empirical Brown's method (EBM) outperforms Fisher's method as well as alternative approaches for combining dependent P-values.
The Empirical Brown's method is available in Python, R, and MATLAB and can be obtained from https://github.com/IlyaLab/CombiningDependentPvaluesUsingEBM. The R code is also available as a Bioconductor package from https://www.bioconductor.org/packages/devel/bioc/html/EmpiricalBrownsMethod.html
Theo.Knijnenburg@systemsbiology.org
Supplementary data are available at Bioinformatics online.