Clifford Robert J, Edmonson Michael N, Nguyen Cu, Buetow Kenneth H
Laboratory of Population Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
Bioinformatics. 2004 May 1;20(7):1006-14. doi: 10.1093/bioinformatics/bth029. Epub 2004 Jan 29.
Single nucleotide polymorphisms (SNPs) are the most common form of genetic variant in humans. SNPs causing amino acid substitutions are of particular interest as candidates for loci affecting susceptibility to complex diseases, such as diabetes and hypertension. To efficiently screen SNPs for disease association, it is important to distinguish neutral variants from deleterious ones.
We describe the use of Pfam protein motif models and the HMMER program to predict whether amino acid changes in conserved domains are likely to affect protein function. We find that the magnitude of the change in the HMMER E-value caused by an amino acid substitution is a good predictor of whether the substitution is deleterious. We provide Internet-accessible display tools for a genome-wide collection of SNPs, including 7391 distinct non-synonymous coding region SNPs in 2683 genes.
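The scoring idea can be illustrated with a minimal Python sketch. It assumes the E-values for the reference and variant protein sequences have already been obtained by running HMMER against the relevant Pfam model; running HMMER itself is not shown. The function names, and the choice of a log10 ratio as the measure of "magnitude of change", are illustrative assumptions and are not taken from the paper.

```python
import math


def apply_substitution(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the residue at 1-based position pos changed from ref to alt.

    Hypothetical helper for building the variant sequence to be scored.
    """
    if not 1 <= pos <= len(seq):
        raise ValueError("position is outside the sequence")
    if seq[pos - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return seq[:pos - 1] + alt + seq[pos:]


def log_e_value_change(e_ref: float, e_var: float) -> float:
    """Return log10(E_variant / E_reference).

    Assumed metric: a positive value means the variant fits the domain
    model less well than the reference sequence does.
    """
    if e_ref <= 0 or e_var <= 0:
        raise ValueError("E-values must be positive")
    return math.log10(e_var) - math.log10(e_ref)


# Build a variant sequence (toy peptide, not real data).
variant = apply_substitution("MKTAY", 3, "T", "A")

# Compare two made-up E-values for the reference and variant sequences.
change = log_e_value_change(1e-30, 1e-25)
```

Under this assumed metric, a larger positive change would mark a substitution as more likely to disrupt the domain; any cutoff for calling a variant deleterious would have to be calibrated against known variants.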