Large-scale, non-linear learning on a single CPU-Andreas Mueller

Large-scale, non-linear learning on a single CPU - Andreas Mueller
In the days of the "big data" buzz, many people build data-driven applications on clusters from the start. However, distributed computing is not only pricey; it also requires a large engineering effort and removes interactivity from the data exploration process. In this talk I will demonstrate how to learn powerful nonlinear models on a single machine, even with large data sets. This can be achieved using the partial_fit interface provided by scikit-learn, which implements stochastic updates. Combined with stateless transformations of the data, such as hashing, kernel approximation and random projections, this allows a model to be built incrementally without the need to store all the data in memory, or even on disk.
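As a rough illustration of the idea in the abstract (not taken from the talk itself), the sketch below combines a kernel approximation with stochastic updates through partial_fit. It assumes scikit-learn's RBFSampler and SGDClassifier; the synthetic batch_stream generator, the circle-shaped toy problem and all parameter values are made up for this example and stand in for chunks read from disk or a stream.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier


def batch_stream(n_batches, batch_size, seed=0):
    """Yield (X, y) mini-batches of a synthetic, non-linear problem.

    Hypothetical stand-in for reading chunks from disk or a network stream.
    """
    rng = np.random.RandomState(seed)
    for _ in range(n_batches):
        X = rng.uniform(-1.0, 1.0, size=(batch_size, 2))
        # Label is 1 inside a circle: not linearly separable in the raw inputs.
        y = (np.sum(X ** 2, axis=1) < 0.5).astype(int)
        yield X, y


# With a fixed gamma and random_state, fit() only draws random weights for the
# given input dimensionality; it does not learn from the data values, so the
# same transform can be applied to every batch independently.
rbf = RBFSampler(gamma=2.0, n_components=300, random_state=0)
rbf.fit(np.zeros((1, 2)))

clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # partial_fit must be told all classes up front

# Only one batch is held in memory at a time.
for X_batch, y_batch in batch_stream(n_batches=200, batch_size=100):
    clf.partial_fit(rbf.transform(X_batch), y_batch, classes=classes)

# Evaluate on a fresh batch drawn with a different seed.
X_test, y_test = next(batch_stream(n_batches=1, batch_size=2000, seed=42))
accuracy = clf.score(rbf.transform(X_test), y_test)
print("held-out accuracy: %.3f" % accuracy)
```

The linear model never sees the raw inputs, only their randomized feature map, which is what lets a linear learner with stochastic updates fit a nonlinear decision boundary. For text data, a hashing transformer such as scikit-learn's HashingVectorizer could play the same stateless role.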
Andreas Mueller is a Research Engineer at the NYU Center for Data Science, building open source software for data science. Previously he worked as a Machine Learning Scientist at Amazon, developing solutions for computer vision and forecasting problems. He is one of the core developers of the scikit-learn machine learning library, and has co-maintained it for several years.
His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science and democratize access to high-quality machine learning algorithms.
Compiled, Auto-Parallel Python with Pyfora (Tom Peters)
Writing performant Python code is hard. Pyfora makes it easier with a JIT compiler and adaptive parallelism. We'll give an introduction to the Pyfora open source project, and some of its applications in data science.
Thomas Peters is a software engineer at Ufora (http://www.ufora.com/), where he works on the Python compiler and on expanding the platform's machine learning capabilities. He has a PhD in mathematics from Columbia University, where he specialized in low-dimensional topology, using Heegaard Floer homology to compute invariants of manifolds, and has a BA in mathematics from Rutgers University.
