The UC Santa Cruz Genomics Institute is collaborating with Amazon Web Services (AWS) to enable researchers to run bioinformatics pipelines quickly and efficiently on AWS’s global cloud infrastructure. AWS and UCSC are committed to accelerating genomics research by integrating Dockstore, a premier repository for science and biomedical workflows created in part by researchers at the Genomics Institute, with the recently released Amazon Genomics Command Line Interface (CLI).
Researchers at the Genomics Institute seek to understand the mechanisms of human disease, and this work typically requires huge amounts of data and processing capacity. Cloud-based platforms that store and run workflows enable analysis of this data and are equally available to all researchers, regardless of wealth and location.
Dockstore, a joint development between the UC Santa Cruz Genomics Institute and the Ontario Institute for Cancer Research, acts as an app store for bioinformatics analysis tools and is used by scientists around the world. It provides a global cloud library of analytical workflows, so researchers can easily find and use existing analytical tools, facilitating large-scale biomedical research collaborations. Dockstore follows the principles of Findability, Accessibility, Interoperability, and Reusability (FAIR) to support the reproducibility of complex bioinformatics analyses.
Integration with AWS’s new open source tool for genomics and life sciences customers enables rapid deployment and execution of Dockstore-based workflows on Amazon Genomics CLI with minimal installation and configuration. In addition to all Dockstore-based workflows, the Amazon Genomics CLI natively supports Cromwell, miniWDL, Nextflow, and Snakemake.
“Dockstore’s ability to share bioinformatics workflows has already proven essential in federally funded projects such as NHLBI BioData Catalyst and NHGRI AnVIL that enable secure, cloud-based genomic analyses,” said Benedict Paten, associate director of the Genomics Institute and professor in the Department of Biomolecular Engineering at the UCSC Baskin School of Engineering. “We are excited about this new collaboration, as it opens up a whole new category of users who can quickly use the workflows available in the cloud to accelerate their research.”
Dockstore’s new integration with Amazon Genomics CLI aligns with the technical standards established by the Global Alliance for Genomics and Health (GA4GH), a global organization that promotes common standards across genomics research projects to enable portable genomic analysis and to bring life-saving, data-driven therapies to patients faster. These standards include application programming interfaces (APIs) that allow different computing platforms to interoperate, overcoming barriers that limit productivity.
Specifically, the new integration allows Dockstore to execute workflows using the GA4GH Workflow Execution Service (WES) API. The Amazon Genomics CLI provides the WES endpoint used by Dockstore, allowing researchers to efficiently launch analyses on AWS cloud resources with little coding or intervention.
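For readers unfamiliar with WES, the following minimal Python sketch shows the kind of request a client assembles to start a run. The field names and run states follow the GA4GH WES specification; the endpoint URL, the workflow URL, and the helper function names are hypothetical, and the exact base path exposed by a given Amazon Genomics CLI deployment may differ. Nothing is sent over the network here.

```python
import json

# Base path defined by the GA4GH WES specification. A given deployment
# (including one created by Amazon Genomics CLI) may expose it under a
# different prefix; treat this as an assumption.
WES_BASE_PATH = "ga4gh/wes/v1"

# Run states after which a WES run will not change any further.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def wes_runs_url(endpoint: str) -> str:
    """Return the URL of the WES 'runs' collection for a base endpoint."""
    return endpoint.rstrip("/") + "/" + WES_BASE_PATH + "/runs"


def build_run_request(workflow_url: str, workflow_type: str,
                      workflow_type_version: str, params: dict) -> dict:
    """Assemble the form fields for a WES RunWorkflow call (POST /runs)."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,                  # e.g. "WDL", "CWL"
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(params),           # JSON-encoded inputs
    }


def is_finished(state: str) -> bool:
    """True once a run has reached a terminal WES state."""
    return state in TERMINAL_STATES


if __name__ == "__main__":
    # Hypothetical endpoint and workflow, for illustration only.
    print(wes_runs_url("https://wes.example.org/prod/"))
    print(build_run_request(
        "https://example.org/workflows/hello.wdl", "WDL", "1.0",
        {"hello.name": "world"},
    ))
```

A client would POST these fields as multipart form data to the runs URL and then poll the run's status until `is_finished` returns true.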
“Amazon Genomics CLI promises to simplify genomic analysis in the cloud,” said Dr. Taha Kass-Hout, director of machine learning at Amazon Web Services. “Our new collaboration with UCSC enables rapid use of existing bioinformatics workflows through the Dockstore repository, and will further enhance opportunities for computational biologists to rapidly accelerate new directions of research while utilizing the proven global infrastructure of AWS.”
For more information on setting up Dockstore with WES servers, visit this Dockstore blog post.