
Details

Host-associated microbiomes (e.g., the microorganisms in the human gut) profoundly impact host health and development. Microbiologists have long studied host-associated microbial life, but advances in DNA sequencing have transformed modern microbiome science into a data-rich discipline. Many tools ubiquitous in the data science community (pandas, Tidyverse packages, ggplot2, Matplotlib, etc.) are commonly used by microbiome scientists. As such, data scientists from non-biology backgrounds would be valuable additions to any data-rich microbiome science operation. In this talk, I will discuss how studying microbiomes went from an exercise done largely in the laboratory to one done at the keyboard. We will learn how microbiomes are measured via DNA sequencing and how genomics data scientists use workflow DSLs and cloud computing to author data pipelines. By the end, I hope you will see that microbiome data presents both a great opportunity and a great challenge for the data science and engineering community.

Dr. Chuck Pepe-Ranney works as a genomics data scientist for AgBiome in RTP, where he led the development of AgBiome's Genomic Data Platform, which manages genomic data from a massive culture collection of soil microorganisms. Prior to AgBiome, Chuck studied carbon cycling in soil microbial communities at Cornell University and hot-spring cyanobacteria at the Colorado School of Mines. Chuck also teaches the "Data Science Basics" course at the UNC-Chapel Hill Gillings School of Public Health. You can reach Chuck on Twitter (@chuck_pr) or LinkedIn.
