Ibis: operating the Python data ecosystem at Hadoop scale


Details
We are excited to have Wes McKinney give a demo and discuss the roadmap of Ibis, a new data analytics framework.
Presentation: While Python is a de facto standard language for modern data engineering and data science, Python development has been confined to local data processing, thereby limiting its users to smaller data sets. Historically, to address bigger data workloads, Python developers have had to extract samples or aggregates, forcing compromises in data fidelity, adding ETL costs, and ultimately leading to a loss of productivity and addressable use cases.
Ibis, a new open source data analytics framework for Python developers, has the goal of enabling the Python data ecosystem (NumPy, pandas, etc.) to operate efficiently at Hadoop scale. To enable high-performance Python at scale without the age-old JVM interoperability problems, Ibis takes advantage of unique synergies between Python and Impala, the leading open source MPP analytical query engine. In this talk, Ibis creator Wes McKinney, who also created pandas, will demo the current capabilities of Ibis and explain its roadmap.
Schedule
6:30 - 7:00pm Social (snacks + drinks served)
7:00 - 8:00pm Talk
8:00 - 8:45pm Social
Bio: Wes McKinney is a software engineer at Cloudera and the lead developer and co-creator of Ibis. Prior to that, Wes was co-founder of DataPad, and CTO and co-founder of Lambda Foundry, Inc. From 2010 to 2012, he served as a Python consultant to hedge funds and banks while developing pandas, a widely used Python data analysis library. From 2007 to 2010, he researched global macro and credit trading strategies at AQR Capital Management. He graduated from MIT with an S.B. in Mathematics. Wes is the author of the O'Reilly book Python for Data Analysis.
Sponsors: Many thanks to LinkedIn for providing the venue and supplying snacks and drinks.
NDA requirement: Please note that our hosts require signing an NDA in order to access the venue.
Presentation recording: We are working on making arrangements to have the presentation recorded, but not streamed. We will post an update when plans are finalized.
