SAMP: Identifying antimicrobial peptides by an ensemble learning model based on proportionalized split amino acid composition.

Author information

Feng Junxi, Sun Mengtao, Liu Cong, Zhang Weiwei, Xu Changmou, Wang Jieqiong, Wang Guangshun, Wan Shibiao

Affiliations

Department of Biostatistics, School of Public Health, Harvard University, Boston, MA 02115, United States.

Department of Genetics, Cell Biology and Anatomy, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, United States.

Publication information

Brief Funct Genomics. 2024 Dec 6;23(6):879-890. doi: 10.1093/bfgp/elae046.

Abstract

It is projected that 10 million deaths could be attributed to drug-resistant bacterial infections in 2050. Identifying new-generation antibiotics is an effective way to address this concern. Antimicrobial peptides (AMPs), a class of innate immune effectors, have received significant attention for their capacity to eliminate drug-resistant pathogens, including viruses, bacteria, and fungi. Recent years have witnessed widespread application of computational methods, especially machine learning (ML) and deep learning (DL), to the discovery of AMPs. However, existing methods rely only on features such as the compositional, physicochemical, and structural properties of peptides, which cannot fully capture the sequence information in AMPs. Here, we present SAMP, an ensemble random projection (RP) based computational model that leverages a new type of feature called proportionalized split amino acid composition (PSAAC), in addition to conventional sequence-based features, for AMP prediction. With this new feature set, SAMP captures residue patterns, such as sorting signals, at both the N-terminal and the C-terminal, while also retaining the sequence order information from the middle peptide fragments. Benchmarking tests on different balanced and imbalanced datasets demonstrate that SAMP consistently outperforms existing state-of-the-art methods, such as iAMPpred and AMPScanner V2, in terms of accuracy, Matthews correlation coefficient (MCC), G-measure, and F1-score. In addition, by leveraging an ensemble RP architecture, SAMP scales to large-scale AMP identification and further improves performance compared to models without RP. To facilitate the use of SAMP, we have developed a Python package that is freely available at https://github.com/wan-mlab/SAMP.
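The abstract does not spell out how PSAAC is computed, so the following is only a minimal sketch of the general idea of a split amino acid composition: the peptide is cut into an N-terminal, a middle, and a C-terminal segment whose lengths are proportional to the peptide length, and the 20-residue composition of each segment is concatenated. The function names (`composition`, `split_composition`) and the 25%/50%/25% split fractions are assumptions made for illustration; they are not taken from the paper or from the SAMP package.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the fraction of each standard residue in a segment (20 values)."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # Empty segment (or only non-standard residues): all-zero composition.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def split_composition(sequence, n_frac=0.25, c_frac=0.25):
    """Concatenate compositions of the N-terminal, middle, and C-terminal parts.

    n_frac and c_frac are hypothetical split proportions (assumption, not
    the published values). Segment lengths scale with the peptide length,
    with at least one residue assigned to each terminus.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n_len = max(1, int(len(seq) * n_frac))
    c_len = max(1, int(len(seq) * c_frac))
    n_term = seq[:n_len]
    c_term = seq[len(seq) - c_len:]
    middle = seq[n_len:len(seq) - c_len]  # may be empty for very short peptides
    return composition(n_term) + composition(middle) + composition(c_term)


# Toy peptide: N-terminal "AA", middle "KKKK", C-terminal "GG".
features = split_composition("AAKKKKGG")
```

Each peptide maps to a 60-dimensional vector (3 segments × 20 residues) regardless of its length, which is what lets terminal residue patterns be compared across peptides of different sizes. In a full pipeline such a vector would be combined with other sequence features before any projection or classification step.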
