School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China.
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China.
Comb Chem High Throughput Screen. 2022;25(1):38-52. doi: 10.2174/1386207323666201204140438.
Missense mutations (MMs) may lead to various human diseases by disabling proteins. Accurately predicting whether an MM is pathogenic is both important and challenging for protein function annotation and drug design. Although several computational methods have achieved acceptable success rates, there is still room to improve MM prediction performance.
In the present study, we designed a new feature extraction method that accounts for how strongly the residues in the microenvironment around the mutation site affect that site. Stringent cross-validation and independent tests on benchmark datasets were performed to evaluate the efficacy of the proposed feature extraction method. Furthermore, three heterogeneous prediction models were trained and then combined into an ensemble for the final prediction. By combining the feature representation method with the classifier ensemble technique, we developed a novel MM predictor, called TargetMM, for distinguishing pathogenic mutations from neutral ones.
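The two ideas in the paragraph above can be illustrated with a minimal Python sketch. It is not the TargetMM implementation: the abstract does not specify the weighting scheme, the window size, the three classifiers, or the ensemble rule. The exponential `decay` weighting, the function names, and the use of probability averaging (soft voting) are all assumptions made here for illustration only.

```python
from typing import List, Sequence


def microenvironment_weights(window: int, decay: float = 0.5) -> List[float]:
    """Weights for offsets -window..+window around the mutation site.

    Assumption: influence decays exponentially with sequence distance,
    so the mutation site itself (offset 0) gets weight 1.0.
    """
    return [decay ** abs(offset) for offset in range(-window, window + 1)]


def weighted_window_feature(scores: Sequence[float], site: int,
                            window: int, decay: float = 0.5) -> float:
    """Weighted average of hypothetical per-residue scores near `site`.

    Positions that fall outside the sequence are skipped, and the
    weights are renormalized over the positions that remain.
    """
    weights = microenvironment_weights(window, decay)
    total = 0.0
    norm = 0.0
    for weight, offset in zip(weights, range(-window, window + 1)):
        pos = site + offset
        if 0 <= pos < len(scores):
            total += weight * scores[pos]
            norm += weight
    return total / norm


def soft_vote(probabilities: Sequence[float], threshold: float = 0.5) -> bool:
    """Average the pathogenic-class probabilities from several models.

    Returns True (pathogenic) when the mean reaches `threshold`.
    Assumption: each model outputs a probability in [0, 1].
    """
    mean = sum(probabilities) / len(probabilities)
    return mean >= threshold
```

For example, `soft_vote([p1, p2, p3])` would combine the outputs of three separately trained models into one pathogenic/neutral call; the actual combination rule used by TargetMM should be taken from the paper or its repository.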
Statistical comparisons demonstrate that TargetMM outperforms previous advanced methods on the independent test data. The source code and benchmark datasets of TargetMM are freely available at https://github.com/sera616/TargetMM.git for academic use.