Agile Data Science 2.0: Agile and Iterative Machine Learning

Hosted By
Luke H.

Details

Hi Everyone,

Our next Cognitive Computing Meetup will be a free pre-event for the 2017 DataWorks Summit (https://dataworkssummit.com/san-jose-2017/agenda/#20170612). Our speaker, Russell Jurney, will be presenting on Agile and Iterative Machine Learning from a data science perspective.

We hope that you will come see his presentation and join in the discussion.

A number of other meetups and pre-event activities will be taking place in the convention center, so this should be a good networking opportunity.

Abstract

Agile Data Science 2.0 (O'Reilly, 2017) defines a methodology and a software stack with which to apply it. The methodology seeks to deliver data products in short sprints by going meta and putting the focus on the applied research process itself. The stack is only one example of a stack that meets the requirements: it must be utterly scalable and utterly efficient to use for application developers as well as data engineers. It includes everything needed to build a full-blown predictive system: Apache Spark, Apache Kafka, Apache Airflow (incubating), MongoDB, Elasticsearch, Apache Parquet, Python/Flask, and jQuery. This talk will cover the full lifecycle of big data application development and will show how to use lessons from agile software engineering to apply data science with this full stack to build better analytics applications. The lifecycle starts with plumbing, moves on to data tables, charts and search, continues through interactive reports, and builds towards predictions in both batch and realtime (defining the role for each). It ends with the deployment of predictive systems and with how to iteratively improve predictions that prove valuable by building an experimental setup.

Speaker Bio

Russell Jurney is principal consultant at Data Syndrome, a product analytics consultancy dedicated to advancing the adoption of the development methodology Agile Data Science, as outlined in the book Agile Data Science 2.0 (O'Reilly, 2017). He has worked as a data scientist building data products for over a decade, starting in interactive web visualization and then moving towards full-stack data products, machine learning and artificial intelligence at companies such as Ning, LinkedIn, Hortonworks and Relato. He is a self-taught visualization software engineer, data engineer, data scientist and writer, and most recently he has been becoming a teacher. In addition to helping companies build analytics products, Data Syndrome offers live and video training courses.

Cognitive Computing Enthusiasts
San Jose Convention Center
150 West San Carlos Street · San Jose, CA