Spark DataFrames: Data Science and Engineering at Scale for Python


Details
We've had a super exciting last-minute opportunity come up!
Reynold Xin, co-founder of Databricks and co-creator of Apache Spark, is in town for ApacheCon and graciously offered to speak to the local Austin Python and Austin Data communities!
The nice folks at WeWork stepped up and offered to host, and are providing beer and soda. Continuum Analytics will provide the pizza. Note that we have a limited number of spots available, so please RSVP, and if you can't make it, please update your response so others can take your spot.
Detailed information about Reynold's talk:
Spark DataFrames: Data Science and Engineering at Scale for Python
Inspired by R and Pandas, DataFrame in Spark provides concise, powerful programmatic interfaces designed for structured data manipulation. In particular, it features:
• Ability to scale from kilobytes of data on a single laptop to petabytes on a large cluster
• Support for a wide array of data formats and storage systems
• State-of-the-art optimization and code generation through the Spark SQL Catalyst optimizer
• Seamless integration with all big data tooling and infrastructure via Spark
• APIs for Python, Java, Scala, and R (in development via SparkR)
One notable thing about the new DataFrame API is that we have taken a Python-first approach: many of the APIs were designed with Python as a first-class citizen, even before we considered the JVM variants. We think this will be an important addition to the PyData ecosystem.
In this talk, we will introduce this new abstraction in Spark, dive into the internal implementation of the DataFrame abstraction, and end with how we implemented the Python API.
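For anyone wondering what "optimization" means for a DataFrame before the talk, here is a rough, toy-sized sketch of the general idea: operations are recorded as a plan instead of running right away, and the plan can be rewritten before execution. This is not Spark code and does not use the Spark API. The `LazyFrame` class, its methods, and the sample rows are all hypothetical, and the single "merge adjacent filters" rule only stands in for the kind of rewrite a real optimizer such as Catalyst might perform.

```python
# Toy illustration only: NOT Spark and NOT the Spark API.
# LazyFrame and everything in it are hypothetical names made up for this sketch.
from dataclasses import dataclass
from typing import Dict, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class LazyFrame:
    """Rows plus a recorded list of pending steps (the 'plan')."""
    rows: Tuple[Row, ...]
    plan: Tuple[Tuple[str, object], ...] = ()

    def filter(self, predicate):
        # Record the step; nothing is computed yet.
        return LazyFrame(self.rows, self.plan + (("filter", predicate),))

    def select(self, *columns):
        # Record the step; nothing is computed yet.
        return LazyFrame(self.rows, self.plan + (("select", columns),))

    def optimized_plan(self):
        """Merge adjacent filter steps so the rows are scanned once for both."""
        merged = []
        for op, arg in self.plan:
            if op == "filter" and merged and merged[-1][0] == "filter":
                previous = merged.pop()[1]
                merged.append(
                    ("filter", lambda r, p=previous, q=arg: p(r) and q(r))
                )
            else:
                merged.append((op, arg))
        return merged

    def collect(self):
        """Run the optimized plan and return the resulting rows."""
        rows = list(self.rows)
        for op, arg in self.optimized_plan():
            if op == "filter":
                rows = [r for r in rows if arg(r)]
            else:  # "select": keep only the requested columns
                rows = [{c: r[c] for c in arg} for r in rows]
        return rows


# Made-up sample data for the sketch.
people = LazyFrame((
    {"name": "Ann", "age": 34, "city": "Austin"},
    {"name": "Bob", "age": 19, "city": "Austin"},
    {"name": "Cy", "age": 41, "city": "Dallas"},
))

query = (
    people
    .filter(lambda r: r["age"] > 21)
    .filter(lambda r: r["city"] == "Austin")
    .select("name")
)
result = query.collect()
print(result)
```

The recorded plan has three steps, while the rewritten plan has two, and the answer is the same either way. How Spark actually represents and optimizes plans is what the talk itself covers.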
WeWork provides small businesses, startups, and freelancers with beautiful workspace, inspiring community, and meaningful services. With weekly events, personalized support, flexibility, and access to thousands of like-minded entrepreneurs around the world - WeWork is the perfect place to grow your business in 2015.
The WeWork Congress location sits in the heart of downtown Austin at 6th St. and Congress Ave. To learn more about joining the community, email joinus@wework.com or call 855.593.9675.
