Agile Data Science 2.0: Full-Stack Analytics app dev using Spark and Kafka

Hosted by Metis: San Francisco Data Science

NOTE: There will be a raffle for three free digital copies of Agile Data Science 2.0!

Join us on Thursday, April 13th, to hear from Russell Jurney, author of Agile Data Science 2.0 (O'Reilly).

Food and drinks will be served!

Event Schedule:
6:00 - 6:30 -- Guests Arrive, Enjoy Food & Drinks
6:30 - 7:00 -- Russell Presents
7:00 - 7:30 -- Q&A + Networking

Abstract: Agile Data Science 2.0 (O'Reilly, 2017) defines a methodology and a software stack with which to apply it. *The methodology* seeks to deliver data products in short sprints by going meta and putting the focus on the applied research process itself. *The stack* is just one example that meets the requirements of being utterly scalable and utterly efficient to use for application developers as well as data engineers. It includes everything needed to build a full-blown predictive system: Apache Spark, Apache Kafka, Apache Airflow (incubating), MongoDB, Elasticsearch, Apache Parquet, Python/Flask, and jQuery. This talk covers the full lifecycle of big data application development and shows how to use lessons from agile software engineering to apply data science with this full stack and build better analytics applications. The system starts with plumbing, then moves on to data tables, charts, and search, through interactive reports, and builds toward predictions in both batch and real time (defining the role of each). The talk closes with the deployment of predictive systems and how to iteratively improve predictions that prove valuable by building an experimental setup.

Bio: Russell Jurney is principal consultant at Data Syndrome, a product analytics consultancy dedicated to advancing the adoption of the development methodology Agile Data Science, as outlined in the book Agile Data Science 2.0 (O'Reilly, 2017). He has worked as a data scientist building data products for over a decade, starting in interactive web visualization and then moving towards full-stack data products, machine learning, and artificial intelligence at companies such as Ning, LinkedIn, Hortonworks, and Relato. He is a self-taught visualization software engineer, data engineer, data scientist, and writer, and he has most recently begun teaching. In addition to helping companies build analytics products, Data Syndrome offers live and video training courses.

Connect with us on Twitter before the event: @thisismetis