Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.
Department of Economics, Management and Statistics, University of Milano-Bicocca, Milan, Italy.
Biom J. 2024 Jul;66(5):e202300075. doi: 10.1002/bimj.202300075.
Closed testing has recently been shown to be optimal for simultaneous true discovery proportion control. It is, however, challenging to construct true discovery guarantee procedures that focus power on feature sets chosen by users based on their specific interest or expertise. We propose a procedure that allows users to target power on prespecified feature sets, that is, "focus sets." The method still allows inference for feature sets chosen post hoc, that is, "nonfocus sets," for which we deduce a lower confidence bound on the true discoveries by interpolation. Our procedure is built from partial true discovery guarantee procedures combined with Holm's procedure and is a conservative shortcut to the closed testing procedure. A simulation study confirms that the statistical power of our method is relatively high for focus sets, at the cost of power for nonfocus sets, as desired. In addition, we investigate its power properties for sets with specific structures, for example, trees and directed acyclic graphs. We also compare our method with AdaFilter in the context of replicability analysis. The application of our method is illustrated with a gene ontology analysis of gene expression data.
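To make one building block concrete, the following is a minimal sketch of Holm's step-down procedure together with the simple true discovery lower bound it implies for any feature set. It is not the authors' focus-set procedure or their interpolation step. It only illustrates the familywise error argument: with probability at least 1 - alpha all Holm rejections are true discoveries, so the number of rejections inside a set is a simultaneous lower confidence bound for that set. The function names and the example p-values are hypothetical.

```python
def holm_rejections(pvalues, alpha=0.05):
    """Return the indices rejected by Holm's step-down procedure."""
    m = len(pvalues)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = set()
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared
        # with alpha / (m - k + 1).
        if pvalues[i] <= alpha / (m - rank):
            rejected.add(i)
        else:
            break  # step-down: stop at the first non-rejection
    return rejected


def true_discovery_lower_bound(feature_set, rejected):
    """Count the Holm rejections inside a feature set.

    Under familywise error control this count is a lower confidence
    bound on the number of true discoveries in the set, valid
    simultaneously for all sets, including those chosen post hoc.
    """
    return len(set(feature_set) & rejected)


# Hypothetical p-values for five features (illustration only).
pvalues = [0.001, 0.2, 0.012, 0.04, 0.0005]
rejected = holm_rejections(pvalues, alpha=0.05)
bound_a = true_discovery_lower_bound([0, 1, 2], rejected)
bound_b = true_discovery_lower_bound([1, 3], rejected)
```

This bound is typically conservative for large sets, which is part of the motivation for closed testing based procedures such as the one described in the abstract.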