False discovery rates: a new deal.

Author information

Stephens Matthew

Publication information

Biostatistics. 2017 Apr 1;18(2):275-294. doi: 10.1093/biostatistics/kxw041.

Abstract

We introduce a new Empirical Bayes approach for large-scale hypothesis testing, including estimating false discovery rates (FDRs) and effect sizes. This approach has two key differences from existing approaches to FDR analysis. First, it assumes that the distribution of the actual (unobserved) effects is unimodal, with a mode at 0. This "unimodal assumption" (UA), although natural in many contexts, is not usually incorporated into standard FDR analysis, and we demonstrate how incorporating it brings many benefits. Specifically, the UA facilitates efficient and robust computation (estimating the unimodal distribution involves solving a simple convex optimization problem) and enables more accurate inferences provided that it holds. Second, the method takes as its input two numbers for each test (an effect size estimate and corresponding standard error), rather than the one number usually used ($p$ value or $z$ score). When available, using two numbers instead of one helps account for variation in measurement precision across tests. It also facilitates estimation of effects, and unlike standard FDR methods, our approach provides interval estimates (credible regions) for each effect in addition to measures of significance. To provide a bridge between interval estimates and significance measures, we introduce the term "local false sign rate" to refer to the probability of getting the sign of an effect wrong, and argue that it is a better measure of significance than the local FDR because it is both more generally applicable and can be more robustly estimated. Our methods are implemented in an R package ashr available from http://github.com/stephens999/ashr.
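
To make the distinction between the local FDR and the local false sign rate concrete, the following is a minimal Python sketch under strong simplifying assumptions; it is not the ashr implementation. It assumes a fixed two-component prior (a point mass at 0 with weight `pi0`, plus a single normal centered at 0 with standard deviation `prior_sd`), which is one simple unimodal prior with mode at 0, and it assumes a normal likelihood for each effect estimate. The function name `posterior_summary` and the parameters `pi0` and `prior_sd` are hypothetical; in the approach described above the prior is estimated from the data rather than fixed.

```python
import math


def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_summary(betahat, se, pi0=0.5, prior_sd=1.0):
    """Return (lfdr, lfsr) for one test from its estimate and standard error.

    Assumed (hypothetical) prior: pi0 * delta_0 + (1 - pi0) * N(0, prior_sd^2).
    Assumed likelihood: betahat ~ N(beta, se^2).
    """
    # Marginal likelihood of betahat under each prior component.
    null_lik = normal_pdf(betahat, se)
    alt_lik = normal_pdf(betahat, math.sqrt(se ** 2 + prior_sd ** 2))

    # Local FDR: posterior probability that the effect is exactly 0.
    w0 = pi0 * null_lik
    w1 = (1.0 - pi0) * alt_lik
    lfdr = w0 / (w0 + w1)

    # Posterior of beta under the non-null component is normal.
    shrink = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    post_mean = shrink * betahat
    post_sd = math.sqrt(shrink) * se

    # Posterior probabilities that the effect is strictly negative or positive.
    p_neg = (1.0 - lfdr) * normal_cdf(-post_mean / post_sd)
    p_pos = (1.0 - lfdr) - p_neg

    # Local false sign rate: probability of error when declaring the more
    # probable sign; mass at exactly 0 counts as an error for either sign.
    lfsr = 1.0 - max(p_neg, p_pos)
    return lfdr, lfsr


if __name__ == "__main__":
    # Same effect estimate, two different measurement precisions.
    for se in (0.5, 2.0):
        lfdr, lfsr = posterior_summary(betahat=1.0, se=se)
        print(f"se={se}: lfdr={lfdr:.3f}, lfsr={lfsr:.3f}")
```

In this sketch the local false sign rate is never smaller than the local FDR, and two tests with the same effect estimate but different standard errors receive different significance measures, which is the role the second input plays in the abstract.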
