Institute of Information Science, Academia Sinica, Taipei, Taiwan.
PLoS One. 2009 Dec 7;4(12):e8116. doi: 10.1371/journal.pone.0008116.
Selecting an appropriate substitution model and deriving a tree topology for a given sequence set are essential in phylogenetic analysis. However, these time-consuming, computationally intensive tasks require knowledge of substitution model theory and the expertise to run through all possible combinations of several separate programs. To ensure a thorough and efficient analysis and to avoid tedious manipulation of multiple programs, this work presents an intuitive framework, phylogenetic reconstruction with automatic likelihood model selectors (PALM), which combines up-to-date algorithms with a best-fit model selection mechanism for seamless phylogenetic analysis.
PALM integrates ClustalW, PhyML, MODELTEST, ProtTest, and several in-house programs into a single framework. It evaluates the fit of 56 substitution models for nucleotide sequences and 112 substitution models for protein sequences, scoring each model under several criteria. The input can be either sequences in FASTA format or a sequence alignment file in PHYLIP format. To accelerate maximum likelihood computation and bootstrapping, this work integrates MPICH2/PhyML, PalmMonitor, and the Palm job controller across several multiprocessor machines and adopts a task-parallel approach. In addition, an intuitive, interactive web component, PalmTree, displays the output tree and lets users manipulate it: rooting the tree, swapping branches, viewing branch lengths and bootstrap scores, and removing nodes to restart the analysis iteratively.
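The model-scoring step can be illustrated with a minimal Python sketch. It assumes that AIC and BIC are among the criteria used, which is typical of MODELTEST and ProtTest but is not stated above. The function names, the model names, and the log-likelihood values are hypothetical and chosen only for illustration; the parameter counts cover the substitution model only and ignore branch lengths, which add the same constant to every model fitted on the same tree.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_sites) - 2 * log_likelihood

def rank_models(fits, n_sites, criterion="aic"):
    """Return (model, score) pairs sorted from best to worst.

    fits maps a model name to (log_likelihood, n_free_params).
    """
    scores = {}
    for name, (lnl, k) in fits.items():
        if criterion == "aic":
            scores[name] = aic(lnl, k)
        elif criterion == "bic":
            scores[name] = bic(lnl, k, n_sites)
        else:
            raise ValueError(f"unknown criterion: {criterion}")
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical fits for four nucleotide models on a 1000-site alignment.
EXAMPLE_FITS = {
    "JC69": (-5230.0, 0),
    "HKY85": (-5105.0, 4),
    "GTR": (-5101.5, 8),
    "GTR+G": (-5050.0, 9),
}

for criterion in ("aic", "bic"):
    best, score = rank_models(EXAMPLE_FITS, n_sites=1000, criterion=criterion)[0]
    print(f"{criterion.upper()}: best model {best} (score {score:.2f})")
```

With these made-up numbers, GTR improves the likelihood only slightly over HKY85, so the penalty for its extra parameters ranks it below HKY85 under both criteria. This is the kind of trade-off an automatic selector resolves.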
The workflow of PALM is straightforward and coherent. Through a succinct, user-friendly interface, researchers unfamiliar with phylogenetic analysis can easily use the server to submit sequences, retrieve the output, and re-submit a job based on a previous result when sequences are to be deleted or added for phylogenetic reconstruction. PALM infers phylogenetic relationships not only by overcoming the computational difficulty of maximum likelihood (ML) methods but also by providing statistical methods for model selection and bootstrapping. The proposed approach can reduce calculation time, which is particularly relevant when querying a large data set. PALM can be accessed online at http://palm.iis.sinica.edu.tw.
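Bootstrapping lends itself to task parallelism because every replicate is independent of the others. The Python sketch below shows the idea under stated assumptions: it resamples alignment columns with replacement, which is the standard nonparametric bootstrap in phylogenetics, and it uses a thread pool as a stand-in for the MPICH2-based job distribution, which is not reproduced here. The function names and the toy alignment are hypothetical, and the tree inference that would follow each replicate is omitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def bootstrap_replicate(alignment, seed):
    """Resample alignment columns with replacement (one replicate).

    alignment maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    rng = random.Random(seed)  # per-replicate seed keeps runs reproducible
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }

def bootstrap_replicates(alignment, n_replicates, workers=4):
    """Build replicates as independent tasks; results keep seed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda seed: bootstrap_replicate(alignment, seed),
                     range(n_replicates))
        )

# Toy alignment, for illustration only.
TOY_ALIGNMENT = {
    "taxon_a": "ACGTACGTAC",
    "taxon_b": "ACGTTCGTAC",
    "taxon_c": "ACGAACGTTC",
}

replicates = bootstrap_replicates(TOY_ALIGNMENT, n_replicates=5)
print(len(replicates), "replicates of", len(replicates[0]["taxon_a"]), "sites")
```

In a real run, each replicate would be passed to a tree-building program and the resulting trees summarized as bootstrap support values on the branches.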