ADAM standard - processing of genomic big data with Spark

Hosted By
Wolfram W. and 2 others
Details

Event Agenda

18:30 - 19:00

Arrival and socializing

19:00 - 20:00

Michal Okoniewski, Scientific IT Services, ETH Zürich

"ADAM standard and other ways of scalable cloud-processing of genomic big data with Spark"

Next generation sequencing (NGS) technology has posed a serious computational challenge since its commercial introduction in 2008. Thousands of machines worldwide now produce billions of sequenced nucleotide base pairs every day. As sequencing technologies continue to become faster and more economical, processing the large amounts of data produced by high-throughput sequencing has become the main challenge in bioinformatics. This challenge can be addressed by a new generation of software tools based on the paradigms and principles developed within the Hadoop ecosystem. This talk presents an overall perspective on data analysis software for genomics and the prospects for emerging applications, with a particular emphasis on the ADAM (http://bdgenomics.org/projects/adam/) standard.

The presentation will include an introduction to analysing genomic big data at scale with ADAM. It will also include examples of using Spark, SparkR and Parquet that apply more generally in data science.

Zürich Apache Spark Meetup
ETH Zurich / Zentrum / HG E 33.1
Rämistrasse 101 · Zürich