Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
Genome Biol. 2010;11(8):128. doi: 10.1186/gb-2010-11-8-128. Epub 2010 Aug 25.
The Galaxy package empowers regular users to perform rich DNA sequence analysis through a much-needed and user-friendly graphical web interface. See research article http://genomebiology.com/2010/11/8/R86

RESEARCH HIGHLIGHT: With the advent of affordable and high-throughput DNA sequencing, sequencing is becoming an essential component in nearly every genetics lab. These data are being generated to probe sequence variations, to understand transcribed, regulated or methylated DNA elements, and to explore a host of other biological features across the tree of life and across a range of environments and conditions. Given this deluge of data, novices and experts alike are facing the daunting challenge of trying to analyze the raw sequence data computationally. With so many tools available and so many assays to analyze, how can one be expected to stay current with the state of the art? How can one be expected to learn to use each tool and construct robust end-to-end analysis pipelines, all while ensuring that input formats, command-line options, sequence databases and program libraries are set correctly? Finally, once the analysis is complete, how does one ensure the results are reproducible and transparent for others to scrutinize and study?

In an article published in Genome Biology, Jeremy Goecks, Anton Nekrutenko, James Taylor and the rest of the Galaxy Team (Goecks et al. [1]) make a great advance towards resolving these critical questions with the latest update to their Galaxy Project. The ambitious goal of Galaxy is to empower regular users to carry out their own computational analysis without having to be an expert in computational biology or computer science. Galaxy adds a desperately needed graphical user interface to genomics research, making data analysis universally accessible in a web browser, and freeing users from the minutiae of archaic command-line parameters, data formats and scripting languages. Data inputs and computational steps are selected from dynamic graphical menus, and the results are displayed in intuitive plots and summaries that encourage interactive workflows and the exploration of hypotheses. The underlying data analysis tools can be almost any piece of software, written in any language, but all their complexity is neatly hidden inside of Galaxy, allowing users to focus on scientific rather than technical questions.