He Dan, Furlotte Nicholas A, Hormozdiari Farhad, Joo Jong Wha J, Wadia Akshay, Ostrovsky Rafail, Sahai Amit, Eskin Eleazar
Department of Computer Science, University of California, Los Angeles, Los Angeles, California 90095, USA
Genome Res. 2014 Apr;24(4):664-72. doi: 10.1101/gr.153346.112. Epub 2014 Mar 10.
The development of high-throughput genomic technologies has impacted many areas of genetic research. While many applications of these technologies focus on the discovery of genes involved in disease from population samples, applications of genomic technologies to an individual's genome or personal genomics have recently gained much interest. One such application is the identification of relatives from genetic data. In this application, genetic information from a set of individuals is collected in a database, and each pair of individuals is compared in order to identify genetic relatives. An inherent issue that arises in the identification of relatives is privacy. In this article, we propose a method for identifying genetic relatives without compromising privacy by taking advantage of novel cryptographic techniques customized for secure and private comparison of genetic information. We demonstrate the utility of these techniques by allowing a pair of individuals to discover whether or not they are related without compromising their genetic information or revealing it to a third party. The idea is that individuals only share enough special-purpose cryptographically protected information with each other to identify whether or not they are relatives, but not enough to expose any information about their genomes. We show in HapMap and 1000 Genomes data that our method can recover first- and second-order genetic relationships and, through simulations, show that our method can identify relationships as distant as third cousins while preserving privacy.
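To make the idea of comparing protected information rather than raw genomes concrete, here is a toy sketch in Python. It is not the authors' protocol, which the abstract does not specify. It assumes that each individual holds a phased haplotype encoded as a string of `0`/`1` alleles, that the two parties share a secret key, and that relatedness shows up as identical windows. The function names `protect_segments` and `shared_fraction`, the window size, and the sample strings are all hypothetical.

```python
import hashlib
import hmac


def protect_segments(haplotype: str, key: bytes, window: int = 8) -> list[bytes]:
    """Return one keyed digest per non-overlapping window of the haplotype.

    Assumption: alleles are encoded as '0'/'1' characters. The window start
    position is mixed into the digest so that identical allele patterns at
    different loci do not collide.
    """
    digests = []
    for start in range(0, len(haplotype) - window + 1, window):
        segment = haplotype[start:start + window]
        message = f"{start}:{segment}".encode()
        digests.append(hmac.new(key, message, hashlib.sha256).digest())
    return digests


def shared_fraction(a: list[bytes], b: list[bytes]) -> float:
    """Fraction of windows whose digests match between two individuals."""
    if not a or len(a) != len(b):
        raise ValueError("digest lists must be non-empty and of equal length")
    matches = sum(hmac.compare_digest(x, y) for x, y in zip(a, b))
    return matches / len(a)


if __name__ == "__main__":
    key = b"shared-secret"            # hypothetical key agreed on by both parties
    alice = "0110100111010010"        # made-up haplotypes, 16 sites each
    bob = "0110100100101101"          # shares only the first window with alice
    score = shared_fraction(protect_segments(alice, key),
                            protect_segments(bob, key))
    print(f"fraction of matching windows: {score:.2f}")
```

Only the digests are exchanged, so neither party sees the other's alleles directly. The sketch is for illustration only: exact-match digests tolerate no genotyping errors, and short windows have so few possible values that anyone holding the key could recover them by exhaustive guessing.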