Publication
Big Data 2015
Conference paper

Big Data: Cloud computing in genomics applications

Abstract

Healthcare applications typically require both big data management and intensive computation. This is especially true of recently developed next-generation sequencing technology, which has increased interest in processing huge amounts of information in a timely fashion. In this paper, we test whether healthcare applications scale well on commercial big data platforms that implement the MapReduce framework. From the genome analysis domain, we selected short-read sequence alignment and assembly workloads, and chose Bowtie, Blast, and Contrail-bio, which are publicly available applications designed to run on the Hadoop MapReduce framework. To speed up processing, we compressed the intermediate data using various compression schemes and compared them. The test results are very promising and indicate that a wide range of genomic analysis workflows can be optimized on MapReduce frameworks with great computational efficiency and scalability.
