Pirayre Aurélie, Couprie Camille, Bidard Frédérique, Duval Laurent, Pesquet Jean-Christophe
IFP Energies Nouvelles, 1-4 avenue de Bois-Préau, Rueil-Malmaison, 92852, France.
Université Paris-Est, Laboratoire d'Informatique Gaspard-Monge, 5 boulevard Descartes - Champs-sur-Marne, Marne-la-Vallée, 77454, France.
BMC Bioinformatics. 2015 Nov 4;16:368. doi: 10.1186/s12859-015-0754-2.
Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions.
Weighting all possible pairwise gene relationships by a probability of edge presence, we formulate the regulatory network inference as a discrete variational problem on graphs. We enforce biologically plausible coupling between groups and types of genes by minimizing an edge-labeling functional that encodes a priori structures. The optimization is carried out with graph cuts, an approach popular in image processing and computer vision. We compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge.
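To make the edge-labeling idea concrete, here is a minimal Python sketch of a deliberately simplified case. It is not the authors' functional: it assumes an energy with only a data term and a per-edge penalty, sum over edges of w_e(1 - x_e) + lambda_e x_e, where x_e is 1 if edge e is kept. The names `threshold_edges`, `is_tf`, `lam_tf` and `lam_other`, and the rule that edges touching a transcription factor get a smaller penalty, are hypothetical choices made for illustration. Under these assumptions each edge can be decided independently, which reduces the problem to a weight-dependent threshold.

```python
def threshold_edges(weights, is_tf, lam_tf=0.3, lam_other=0.6):
    """Minimize sum_e [w_e * (1 - x_e) + lam_e * x_e] over binary labels x_e.

    weights: dict mapping an edge (gene_i, gene_j) to a score in [0, 1].
    is_tf:   dict mapping a gene name to True if it is a transcription factor.
    lam_tf / lam_other: hypothetical penalties (assumption, not from the paper).
    """
    selected = set()
    for (i, j), w in weights.items():
        # Assumed prior: an edge involving a transcription factor is cheaper to keep.
        lam = lam_tf if (is_tf[i] or is_tf[j]) else lam_other
        # Keeping the edge costs lam, dropping it costs w, so keep it iff w > lam.
        if w > lam:
            selected.add((i, j))
    return selected


weights = {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.2}
is_tf = {"a": True, "b": False, "c": False}
kept = threshold_edges(weights, is_tf)
print(kept)
```

In this toy input the two edges with weight 0.5 are treated differently because only one of them involves a transcription factor. Once a coupling term between edges is added to the energy, the labels can no longer be chosen one edge at a time, and a joint solver such as graph cuts is what the abstract refers to for that optimization.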
Our BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6% to 11%). On a real Escherichia coli compendium, it improves the Area Under the Precision-Recall curve by 11.8% compared to CLR and by 3% compared to GENIE3. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster. The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html.
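For readers unfamiliar with the evaluation metric, the following Python sketch shows one common way to approximate the area under a precision-recall curve from a ranked list of predicted edges and a set of verified interactions. It uses the average-precision step approximation, which is an assumption made here; the abstract does not state how the area was computed. The function name `average_precision` and the toy edge names are hypothetical.

```python
def average_precision(scores, gold):
    """Step approximation of the area under the precision-recall curve.

    scores: dict mapping an edge identifier to its predicted score.
    gold:   set of edge identifiers that are verified interactions.
    """
    if not gold:
        raise ValueError("gold standard must contain at least one edge")
    # Rank edges from the most to the least confident prediction.
    ranked = sorted(scores, key=scores.get, reverse=True)
    true_positives = 0
    area = 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in gold:
            true_positives += 1
            precision = true_positives / rank
            # Each verified edge found raises recall by 1 / len(gold).
            area += precision / len(gold)
    return area


scores = {"e1": 0.9, "e2": 0.8, "e3": 0.7, "e4": 0.1}
gold = {"e1", "e3"}
ap = average_precision(scores, gold)
print(round(ap, 4))
```

A relative improvement such as the percentages quoted above would then be obtained by comparing this quantity between two methods evaluated against the same gold standard.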
BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. Thanks to its computational efficiency, it is applicable as a generic post-processing step for network inference.