Centre for Cellular and Molecular Biology Innovates on AWS to Advance Genomics Research in India

Pallavi Pathak
Assistant Manager Content
New Delhi, Updated on Sep 25, 2023 14:53 IST

AWS helps reduce the time taken for genomics analysis by up to 98%, accelerating research efforts in the study of genetics and human diseases.

Amazon Web Services (AWS) India Private Limited today announced that the Centre for Cellular and Molecular Biology (CCMB), a premier research organisation focused on modern molecular biology and population-scale genomics, has chosen AWS as its preferred cloud provider to accelerate its genomics research projects. 

CCMB operates under the direction of the Council of Scientific and Industrial Research (CSIR). One of its focus areas is the study of genetic material, how it varies among populations, and how that variation leads to disparities in human health and disease.

Life sciences and genomics research organisations need to access, store, and analyse large amounts of data generated by next-generation high-throughput sequencers. Traditionally, these organisations have relied on on-premises servers to meet their storage and computing needs. The data-intensive nature of genomics research meant that CCMB frequently had to procure more on-premises storage to manage petabyte-scale datasets, including the raw data and the output files generated from secondary and tertiary analysis. CCMB also relied on on-premises high-performance computing (HPC) clusters to perform this analysis. These clusters were prone to downtime, which affected research timelines and output. Because on-premises servers limited scalability and performance, CCMB turned to cloud computing to scale its data storage and analysis capacity.

CCMB moved 83 terabytes of genomics data from on-premises servers to AWS using AWS Snowball, an offline data transport service that uses secure devices to transfer large amounts of data into and out of the AWS Cloud without traversing the internet. It then migrated its genomic analysis toolkit and bioinformatics data pipelines for secondary analysis to Amazon Genomics CLI, an open-source tool that enables genomics organisations to process raw genomics and biological data. CCMB also accessed multiple genomics databases from the Registry of Open Data on AWS (RODA) without having to download them locally for processing, saving months of data download time and benefiting from access to documented sources of truth.

Running on AWS, CCMB performed short tandem repeat (STR) genotyping — an analysis to determine a person’s DNA profile — on 3,200 samples from the 1000 Genomes Project, an international research effort to establish a detailed catalogue of human genetic variation. Using services such as Amazon Aurora, Amazon Elastic Compute Cloud (Amazon EC2), EC2 Auto Scaling, Amazon Simple Storage Service (Amazon S3), and AWS Batch, CCMB was able to reduce the time taken for research analysis by up to 98%, from 550 days to just nine days on average.
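
As a quick check on the figures above, the following minimal Python sketch recomputes the reduction from the two reported averages (550 days and nine days). The function names are illustrative only and do not come from CCMB or AWS.

```python
def percent_reduction(before_days: float, after_days: float) -> float:
    """Return the percentage by which elapsed time was reduced."""
    return (before_days - after_days) / before_days * 100


def speedup(before_days: float, after_days: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_days / after_days


# Averages reported in the article: 550 days before, nine days after.
before, after = 550, 9
print(f"{percent_reduction(before, after):.1f}% reduction")  # 98.4% reduction
print(f"about {speedup(before, after):.0f}x faster")         # about 61x faster
```

The result, roughly 98.4%, is consistent with the "up to 98%" figure quoted in the announcement.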

In another project, CCMB has started analysing breast cancer samples to identify molecular signatures of triple-negative breast cancers among the Indian population. Using CPU- and GPU-accelerated computing on the AWS Cloud, CCMB reduced the analysis time per sample by 50 to 70%.

CCMB also used AWS graphics processing unit (GPU) instances to train and test machine learning (ML) neural network models on long-read data sequenced using Oxford Nanopore sequencers to detect DNA modifications associated with various diseases, including cancer, neurodegenerative disorders, and cardiovascular diseases. It achieved an accuracy of more than 91%, and reduced the time taken to train these models from several days on its on-premises servers to approximately three to four hours per dataset on AWS.

CCMB joins a list of premier genomics research initiatives around the world running their genomics research on AWS, including organisations such as AstraZeneca, CSIRO, GRAIL, Illumina, Melbourne Genomics Health Alliance, National Institutes of Health, Regeneron, and Stanford University.

