Herzig Anthony F, Rubinacci Simone, Marenne Gaëlle, Perdry Hervé, Deleuze Jean-François, Dina Christian, Barc Julien, Redon Richard, Delaneau Olivier, Génin Emmanuelle
Inserm, Université de Bretagne-Occidentale, EFS, UMR 1078, GGB, Brest F-29200, France.
Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00290, Finland.
G3 (Bethesda). 2025 Apr 17;15(4). doi: 10.1093/g3journal/jkae287.
Genotype-phenotype association tests are typically adjusted for population stratification using principal components estimated genome-wide. This approach lacks resolution when analyzing populations with fine structure and/or individuals with fine levels of admixture. The loss of resolution can affect power and precision, and is a particularly relevant consideration when control individuals are recruited using geographic selection criteria. Such is the case in France, where we have recently created reference panels of individuals anchored to different geographic regions. To make correct comparisons against case groups, which would likely be gathered from large urban areas, new methods are needed. We present SURFBAT (a surrogate family-based association test), which performs an approximation of the transmission-disequilibrium test. Our method hinges on the application of genotype imputation algorithms to match similar haplotypes between the case and control groups. This matching permits us to approximate local-ancestry-informed posterior probabilities of the un-transmitted parental alleles of each case individual, under the assumption that haplotypes from the imputation panel are well matched for ancestry with the case individuals. When the first haplotype of an individual from the imputation panel matches that of a case individual, the second haplotype of the same reference individual is assumed to be usable as a locally ancestry-matched control haplotype, and hence to approximately impute un-transmitted parental alleles. SURFBAT provides an association test that is inherently robust to fine-scale population stratification and opens up the possibility of efficiently using large imputation reference panels as control groups for association testing. In contrast to other association testing methods that incorporate local-ancestry inference, SURFBAT requires neither a predefined set of ancestry groups nor explicit estimation of local ancestry. We demonstrate the utility of our tool on simulated datasets, as well as on a real-data example for a group of case individuals affected by Brugada syndrome.
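To make the idea of a surrogate transmission-disequilibrium test more concrete, the following Python sketch illustrates one way the quantities described above could be combined at a single variant. It is not the SURFBAT implementation: the function names, the use of normalized haplotype-matching weights, and the fractional McNemar-style counts are all assumptions made for illustration. The first function averages the alleles carried on the second haplotype of each matched reference individual, and the second compares those surrogate un-transmitted alleles with the alleles observed on the case haplotypes.

```python
import math


def surrogate_untransmitted_prob(match_weights, partner_alleles):
    """Probability that the surrogate un-transmitted allele is the alternate allele.

    match_weights   : hypothetical weights with which each reference haplotype
                      matches the case haplotype at this locus (assumed input).
    partner_alleles : allele (0 = reference, 1 = alternate) on the OTHER haplotype
                      of the same reference individual.
    """
    total = sum(match_weights)
    if total <= 0:
        raise ValueError("match weights must sum to a positive value")
    return sum(w * a for w, a in zip(match_weights, partner_alleles)) / total


def approx_tdt(transmitted, untransmitted_prob):
    """McNemar-style statistic with fractional counts (illustrative assumption).

    transmitted        : observed allele (0/1) on each case haplotype.
    untransmitted_prob : surrogate probability that the un-transmitted allele is 1.
    Returns (statistic, p_value) using a chi-square distribution with 1 df.
    """
    # b: expected count of pairs with alternate transmitted, reference un-transmitted
    b = sum(t * (1.0 - u) for t, u in zip(transmitted, untransmitted_prob))
    # c: expected count of pairs with reference transmitted, alternate un-transmitted
    c = sum((1 - t) * u for t, u in zip(transmitted, untransmitted_prob))
    if b + c == 0:
        return 0.0, 1.0  # no informative pairs
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Toy inputs, invented purely for illustration.
    u = surrogate_untransmitted_prob([0.5, 0.3, 0.2], [1, 0, 1])
    stat, p = approx_tdt([1, 1, 1, 0], [0.0, 0.0, 0.0, 1.0])
    print(u, stat, p)
```

Because the control allele for each case haplotype is drawn from reference individuals who match that haplotype locally, the comparison is made within pairs rather than across groups, which is the property that gives a transmission-disequilibrium-style test its robustness to stratification.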