Mar 7, 2013 · 6:30 PM
Possibly one of the best open source machine learning projects around.
scikit-learn (http://scikit-learn.org/stable/) is a well-established open source machine learning library, written mostly in Python with the help of NumPy, SciPy and Cython.
scikit-learn is designed to be user-friendly, with a simple and consistent API and extensive documentation. Strictly enforced coding standards and high test coverage help ensure high-quality implementations. Behind scikit-learn is a very active community that is steadily improving the library.
Come and meet Olivier & Andreas, two smart hackers-by-night who are among the main contributors to the project.
7pm talks start
"Learning in Python with scikit-learn" by Andreas Mueller
This talk will give an overview of the library and introduce general machine learning concepts such as supervised and unsupervised learning, feature extraction, cross-validation for model evaluation and hyperparameter selection. We will also touch on some more advanced yet practically useful concepts such as feature hashing and ensemble learning.
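For readers new to the idea, here is a minimal plain-Python sketch of cross-validation used for hyperparameter selection. It is not taken from the talk: the toy data, the k-nearest-neighbour model and the helper names (`k_fold_indices`, `knn_predict`, `cv_error`) are all assumptions made up for illustration. scikit-learn ships its own, far more complete tools for this, such as `KFold` and `GridSearchCV`.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; every sample is held out exactly once."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_error(xs, ys, k, n_folds=5):
    """Mean squared error of k-NN, measured only on held-out samples."""
    total, count = 0.0, 0
    for train, test in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            total += (knn_predict(train_x, train_y, xs[i], k) - ys[i]) ** 2
            count += 1
    return total / count


# Toy data (made up): y = 2x plus a small alternating perturbation.
xs = [i / 10 for i in range(30)]
ys = [2 * x + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]

# Hyperparameter selection: keep the candidate k with the lowest CV error.
candidates = [1, 2, 4, 8]
errors = {k: cv_error(xs, ys, k) for k in candidates}
best_k = min(errors, key=errors.get)
print("best k:", best_k)
```

The point of the sketch is that each candidate value is scored only on data the model did not see during fitting, which is what makes the comparison between candidates meaningful.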
Andreas is a PhD student in machine learning and computer vision at Bonn University (Germany). He is one of the core developers and the maintainer of scikit-learn and the author of the blog peekaboo-vision. His interests include principles and applications of machine learning and open science.
Beer break, Networking and Community Update
"Parallel and large scale learning with scikit-learn" by Olivier Grisel
This talk will introduce practical tools and concepts to better leverage multicore machines and small clusters for interactive yet scalable predictive modeling with scikit-learn and IPython.parallel. In particular, we will cover:
- A short introduction to the parallel features of IPython from the notebook interface
- How to perform scalable text feature extraction with the Hashing Trick
- How to parallelize or distribute model evaluation (cross-validation) and hyperparameter tuning
- How to optimize memory usage with memory mapping
- How to approximate kernel Support Vector Machines for large scale datasets
- A short introduction to Ensembles with model averaging and Random Forests
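As a rough illustration of the Hashing Trick from the list above, the plain-Python sketch below maps tokens straight to column indices of a fixed-size vector, with no vocabulary to build or store. It is not the talk's material: the function name `hash_features`, the vector size and the use of MD5 are assumptions chosen to keep the example dependency-free. scikit-learn's own `HashingVectorizer` and `FeatureHasher` use MurmurHash3 and sparse matrices instead.

```python
import hashlib


def hash_features(tokens, n_features=16):
    """Map tokens to a fixed-length count vector without a vocabulary."""
    vec = [0] * n_features
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # First 8 bytes choose the column, so the same token always
        # lands in the same place, on any machine and in any process.
        index = int.from_bytes(digest[:8], "big") % n_features
        # One further bit chooses a sign, so that colliding tokens
        # tend to cancel out rather than always pile up.
        sign = 1 if digest[8] & 1 == 0 else -1
        vec[index] += sign
    return vec


doc_a = "the quick brown fox jumps over the lazy dog".split()
doc_b = "the lazy dog sleeps".split()

vec_a = hash_features(doc_a)
vec_b = hash_features(doc_b)
print(vec_a)
print(vec_b)
```

Because the mapping is stateless, separate workers can vectorize separate chunks of a corpus independently and still produce compatible feature columns, which is why the technique suits parallel and out-of-core settings.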
Olivier is an R&D Software Engineer working in Java by day and a Python machine learning hacker by night. He is interested in applications to Natural Language Processing, Computer Vision and predictive modelling in general.
More beer + networking
9.30pm-ish meetup ends