Franzo Giovanni, Fusaro Alice, Snoeck Chantal J, Dodovski Aleksandar, Van Borm Steven, Steensels Mieke, Christodoulou Vasiliki, Onita Iuliana, Burlacu Raluca, Sánchez Azucena Sánchez, Chvala Ilya A, Torchetti Mia Kim, Shittu Ismaila, Olabode Mayowa, Pastori Ambra, Schivo Alessia, Salomoni Angela, Maniero Silvia, Zambon Ilaria, Bonfante Francesco, Monne Isabella, Cecchinato Mattia, Bortolami Alessio
Department of Animal Medicine, Production and Health (MAPS), Padua University, 35020 Legnaro, Italy.
Division of Comparative Biomedical Sciences (DSBIO), Istituto Zooprofilattico Sperimentale delle Venezie, Viale dell'Università 10, 35020 Legnaro, Italy.
Viruses. 2025 Apr 14;17(4):567. doi: 10.3390/v17040567.
Newcastle disease virus (NDV) continues to present a significant challenge for vaccination due to its rapid evolution and the emergence of new variants. Although molecular and sequence data can now be produced quickly and inexpensively, genetic distance rarely serves as a good proxy for cross-protection, and experimental studies to assess antigenic differences are time-consuming and resource-intensive. In response to these challenges, this study explores and compares several machine learning (ML) methods to predict the antigenic distance between NDV strains as determined by hemagglutination-inhibition (HI) assays. Using F and HN gene sequences together with the corresponding amino acid features, we developed predictive models to estimate antigenic distances. Among the models evaluated, the random forest (RF) approach outperformed traditional linear models, achieving an R value of 0.723 compared with only 0.051 for linear models based on genetic distance alone. This marked improvement demonstrates the usefulness of flexible ML approaches as a rapid and reliable tool for vaccine selection, minimizing the need for labor-intensive experimental trials. Moreover, the flexibility of this ML framework holds promise for application to other infectious diseases in both animals and humans, particularly in scenarios where the need for rapid response and ethical constraints limit conventional experimental approaches.
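To make the baseline comparison concrete, the following is a minimal sketch of a linear model based on genetic distance alone, scored by Pearson's R. It is not the authors' pipeline. The function names, the p-distance measure, the log2 fold-drop definition of antigenic distance, and all sequences and titers are assumptions made up for illustration.

```python
from math import log2, sqrt


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def antigenic_distance(homologous_titer, heterologous_titer):
    """Assumed definition: log2 fold drop in HI titer against a heterologous strain."""
    return log2(homologous_titer / heterologous_titer)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical toy data: (reference fragment, test fragment,
# homologous HI titer, heterologous HI titer). Not real NDV data.
pairs = [
    ("MGSRPSTKNP", "MGSRPSTKNP", 256, 256),
    ("MGSRPSTKNP", "MGSKPSTRNP", 256, 64),
    ("MGSRPSTKNP", "MGPKSSTRIP", 256, 128),
    ("MGSRPSTKNP", "MDPKSFTRIA", 256, 16),
]

genetic = [p_distance(ref, test) for ref, test, _, _ in pairs]
antigenic = [antigenic_distance(hom, het) for _, _, hom, het in pairs]

slope, intercept = fit_line(genetic, antigenic)
r = pearson_r(genetic, antigenic)
print(f"slope={slope:.2f} intercept={intercept:.2f} R={r:.3f}")
```

A non-linear learner such as a random forest would take per-site or amino acid property features in place of the single `genetic` column. The same `pearson_r` score could then be used to compare it against this baseline.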