A guide to Python frameworks for Hadoop

Details
Distributed computing frameworks like Hadoop have revolutionized our ability to process large amounts of data. Using these tools typically requires writing complex programs in lower-level languages like Java; however, data scientists and analysts prefer to work in higher-level languages such as Python. To address this gap, multiple open-source Python frameworks have been built to provide simple, user-friendly access to Hadoop’s underlying systems. This talk will review the available frameworks and compare their performance, ease of use and installation, implementation differences, and other features.
Bio
Uri Laserson is a data scientist at Cloudera. Previously, he earned his PhD at MIT, where he developed applications of high-throughput DNA sequencing to immunology. During that time, he co-founded Good Start Genetics, a next-generation diagnostics company focused on genetic carrier screening. In 2012, he was selected for Forbes's 30 Under 30 list.
