Data Science is a young and effervescent field that holds much promise for solving the 21st century's biggest problems. While more people are starting to recognize the power of data-driven analyses, much of the insight into how to build scalable, consistent, and repeatable systems remains siloed in large organizations. In this whirlwind tour de force of data science, Jonathan will highlight data science best practices and the gotchas to avoid by walking you through building a robust machine learning pipeline. He will cover how (and why) to measure everything, how to choose the right ML algorithm for the question at hand (and the tradeoffs of each), how to scale algorithms with MapReduce to run on all your data, and how to productionize your model to get predictions in real time. After all... there is no need to invent the future if you can predict it.
About the Speaker:
Jonathan is the co-founder of Zipfian Academy (http://zipfianacademy.com/), an immersive 12-week training program aimed at creating the next generation of data scientists through a hands-on, project-based curriculum. He first discovered his love of all things data while studying Computer Science and Physics at UC Berkeley. In a former life, he worked for Alpine Data Labs (http://www.alpinedatalabs.com/), developing distributed machine learning algorithms for predictive analytics on Hadoop.
Jonathan has always had a passion for sharing the things he has learned in the most creative ways he can. He has been a mentor at Dev Bootcamp (http://devbootcamp.com/), taught classes at General Assembly (https://generalassemb.ly/), and was an instructor and worked on the curriculum at Hack Reactor (http://hackreactor.com/), where he combined his two favorite things: humans and code.