Liu Bing, de la Fuente Alberto, Hoeschele Ina
Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061-0477, USA.
Genetics. 2008 Mar;178(3):1763-76. doi: 10.1534/genetics.107.080069. Epub 2008 Feb 3.
Our goal is gene network inference in genetical genomics or systems genetics experiments. For species where sequence information is available, we first perform expression quantitative trait locus (eQTL) mapping by jointly utilizing cis-, cis-trans-, and trans-regulation. After using local structural models to identify regulator-target pairs for each eQTL, we construct an encompassing directed network (EDN) by assembling all retained regulator-target relationships. The EDN has nodes corresponding to expressed genes and eQTL and directed edges from eQTL to cis-regulated target genes, from cis-regulated genes to cis-trans-regulated target genes, from trans-regulator genes to target genes, and from trans-eQTL to target genes. For network inference within the strongly constrained search space defined by the EDN, we propose structural equation modeling (SEM), because it can model cyclic networks and the EDN indeed contains feedback relationships. On the basis of a factorization of the likelihood and the constrained search space, our SEM algorithm infers networks involving several hundred genes and eQTL. Structure inference is based on a penalized likelihood ratio and an adaptation of Occam's window model selection. The SEM algorithm was evaluated using data simulated with nonlinear ordinary differential equations and known cyclic network topologies and was applied to a real yeast data set.
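As a rough illustration of why a structural equation model can accommodate the feedback relationships mentioned above, the following sketch simulates a small cyclic linear SEM and evaluates its Gaussian log-likelihood. Everything in it is an assumption made for illustration and is not taken from the abstract: the notation y = By + Fx + e (a common way of writing a linear SEM), the three-gene/two-eQTL topology, the coefficient values, the independent-error model, and the function names `simulate_sem` and `sem_loglik`. The boolean mask `allowed` stands in for the search-space constraint that an EDN would impose. Model fitting, the penalized likelihood ratio, and the Occam's window selection step are not shown.

```python
import numpy as np

def simulate_sem(B, F, X, sigma, rng):
    """Draw expression data from y = B y + F x + e, i.e. y = (I - B)^{-1} (F x + e)."""
    n, p = X.shape[0], B.shape[0]
    E = rng.normal(scale=sigma, size=(n, p))   # independent Gaussian errors per gene
    A = np.eye(p) - B
    # Rows of Y are samples, so solve A Y^T = (X F^T + E)^T
    return np.linalg.solve(A, (X @ F.T + E).T).T

def sem_loglik(B, F, Y, X, sigma):
    """Gaussian log-likelihood of a linear SEM with independent errors."""
    n, p = Y.shape
    A = np.eye(p) - B
    resid = Y @ A.T - X @ F.T                  # e_i = (I - B) y_i - F x_i
    _, logabsdet = np.linalg.slogdet(A)        # Jacobian term; nonzero when cycles exist
    quad = np.sum(resid**2 / sigma**2)
    return (n * logabsdet
            - n * np.sum(np.log(sigma))
            - 0.5 * n * p * np.log(2 * np.pi)
            - 0.5 * quad)

# Hypothetical constraint set: (regulator gene, target gene) pairs.
# Genes 1 and 2 regulate each other, which forms a feedback loop.
edges = [(0, 1), (1, 2), (2, 1)]
allowed = np.zeros((3, 3), dtype=bool)
for regulator, target in edges:
    allowed[target, regulator] = True          # B[i, j] = effect of gene j on gene i

# Made-up coefficients that respect the constraint set.
B = np.array([[0.0, 0.0,  0.0],
              [0.8, 0.0, -0.5],
              [0.0, 0.6,  0.0]])
# F[i, k] = effect of eQTL k on gene i (eQTL 0 -> gene 0, eQTL 1 -> gene 2).
F = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 0.7]])
sigma = np.full(3, 0.5)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 2)).astype(float)   # binary eQTL genotypes
Y = simulate_sem(B, F, X, sigma, rng)

# Compare the generating structure with one whose feedback edge is removed.
B_no_feedback = B.copy()
B_no_feedback[1, 2] = 0.0
ll_true = sem_loglik(B, F, Y, X, sigma)
ll_alt = sem_loglik(B_no_feedback, F, Y, X, sigma)
print("log-likelihood, generating structure:", ll_true)
print("log-likelihood, feedback edge removed:", ll_alt)
```

The `logabsdet` term is what separates a cyclic SEM from fitting each gene's equation as an independent regression: when B contains a cycle, the determinant of I - B generally differs from 1, so this term contributes to the likelihood. In an actual structure search the coefficients of each candidate would be re-estimated before comparison; here they are held fixed only to keep the sketch short.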