Mark your calendar for the next session of the PyData Paris Meetup on June 19th 2018.
The speakers for this session are Tim Head and Tom Dupré la Tour.
7:00pm - 7:15pm: Community announcements
7:15pm - 8:00pm: Binder - one click sharing of your data science, by Tim Head
When other people want to run the code of the cool data project you did last week, you usually think: “Great, someone cares!” and then “Oh no, now I need to play support desk till they get it running.”
The Binder project lets anyone run the contents of a git repository by clicking a link. For example, try out the latest JupyterLab demo by clicking this link. Binder lets you describe the dependencies of your repository in a way that lets us automatically create a Docker container from it, removing the need for you to spend a lot of time helping others who are trying to get your code to run.
Some example uses:
- Reproduce and explore "Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations" by Ross et al. (https://mybinder.org/v2/gh/dtak/rrr/master?urlpath=lab).
- Learn about "Foundations of numerical computing" (https://mybinder.org/v2/gh/ssanderson/foundations-of-numerical-computing/master?filepath=notebooks) with Scott Sanderson.
- Dive into Julia Evans’ "Pandas cookbook" (https://mybinder.org/v2/gh/jvns/pandas-cookbook/master).
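The three links above share one pattern: the mybinder.org host, a GitHub owner and repository, a branch, and an optional `urlpath` or `filepath` query parameter. As a minimal sketch, a small helper can assemble such links. The function name `binder_url` and its arguments are hypothetical and only mirror the structure of the URLs listed here; this is not an official Binder client.

```python
from urllib.parse import quote, urlencode

# Base of the launch links shown in the examples above.
BINDER_BASE = "https://mybinder.org/v2/gh"


def binder_url(owner, repo, ref="master", urlpath=None, filepath=None):
    """Build a launch link shaped like the example links above.

    `binder_url` is a hypothetical helper for illustration only.
    """
    # Path part: /<owner>/<repo>/<branch or tag>
    url = f"{BINDER_BASE}/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"

    # Optional query parameters, as seen in the examples (?urlpath=..., ?filepath=...).
    params = {}
    if urlpath is not None:
        params["urlpath"] = urlpath
    if filepath is not None:
        params["filepath"] = filepath
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(binder_url("dtak", "rrr", urlpath="lab"))
    print(binder_url("jvns", "pandas-cookbook"))
```

Pasting the printed link into a README gives readers the one-click launch described above.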
I will tell you about the Binder project, how to use it to share work, what the tools behind it are, and how you can join the team working on Binder.
8:00pm - 8:45pm: Nearest neighbors in scikit-learn estimators, API challenges, by Tom Dupré la Tour
Scikit-learn is a very popular machine learning library in Python.
It is well known for its simple and elegant API, which has been reused in multiple other Python libraries.
However, some parts of the library could still benefit from a better API.
In particular, several scikit-learn estimators rely internally on some nearest neighbors computations.
Yet they use different APIs, cannot use custom neighbors estimators, and recompute the nearest neighbors graph for each hyper-parameter setting during a grid search.
We will present ongoing work on improving their API, discussing implementation and deprecation challenges.
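To illustrate the grid-search cost mentioned above, here is a minimal pure-Python sketch of the underlying idea: compute each point's sorted neighbor list once, for the largest `k` in the grid, then reuse slices of it for every smaller `k`. It deliberately does not use scikit-learn and does not represent the API under discussion in the talk; the helpers `sorted_neighbors` and `loo_accuracy` and the toy data are assumptions made for illustration.

```python
import math


def sorted_neighbors(X, k_max):
    """For each point, list the indices of its k_max nearest other points, closest first."""
    neighbors = []
    for i, xi in enumerate(X):
        # Brute-force distances to every other point (fine for a toy example).
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbors.append([j for _, j in dists[:k_max]])
    return neighbors


def loo_accuracy(neighbors, y, k):
    """Leave-one-out accuracy of a k-NN majority vote, using precomputed neighbors."""
    correct = 0
    for i, nbrs in enumerate(neighbors):
        votes = [y[j] for j in nbrs[:k]]  # reuse the first k precomputed neighbors
        prediction = max(sorted(set(votes)), key=votes.count)
        correct += prediction == y[i]
    return correct / len(y)


# Toy data: two small, well-separated clusters.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
y = [0, 0, 0, 1, 1, 1]

grid = (1, 3, 5)
neighbors = sorted_neighbors(X, k_max=max(grid))  # the expensive step, done once
scores = {k: loo_accuracy(neighbors, y, k) for k in grid}

if __name__ == "__main__":
    print(scores)
```

The neighbor search runs a single time regardless of how many values of `k` the grid contains; only the cheap voting step is repeated.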
Tim Head builds data-driven products for clients all around the world, from startups to UN organisations. His company www.wildtreetech.com specialises in digital products that leverage machine learning and in deploying custom JupyterHub setups.
Tim contributes to the Binder project and helped create scikit-optimize. When he isn’t travelling he trains for triathlons.
Tom Dupré la Tour is a third-year PhD student at Télécom ParisTech, interested in signal processing, machine learning and neural oscillations.
He joined the core developer team of scikit-learn in 2015.