Mark your calendar for the next session of the PyData Paris Meetup on October 8th 2018. This Meetup will be hosted at CFM, rue de l'Université.
The speakers for this session are Jessica Hamrick and Nicolas Thiéry, with an introduction by Laurent Laloux, Chief Product Officer at CFM.
7:00pm - 7:15pm: Community announcements
7:15pm - 8:00pm: Nicolas Thiéry
Modeling mathematics in Python & SageMath: some fun challenges
8:00pm - 8:45pm: Jessica Hamrick
Nbgrader: a tool for creating and grading assignments in the Jupyter notebook
Jessica Hamrick is a Research Scientist at DeepMind in London, having recently completed her Ph.D. in Psychology at the University of California, Berkeley working with Tom Griffiths. Previously, she received her M.Eng. in Computer Science from MIT working with Josh Tenenbaum. Jessica's research focuses on model-based reasoning and planning, situated at the intersection of cognitive science, machine learning, and AI. In addition to research, Jessica is involved in several open source projects including Project Jupyter. She is both a member of the Project Jupyter steering committee and the lead maintainer of nbgrader, a tool for grading Jupyter notebook assignments.
Nicolas M. Thiéry is Professor at the Laboratoire de Recherche en
Informatique of Université Paris Sud. His teaching ranges from
introductory programming (with C++, in Jupyter) to computational
methods in algebra (with SageMath, in Jupyter). His research lies at the
borderline between Math and Computer Science, studying algebraic
combinatorics with the help of computer exploration. He has been
promoting software sharing for algebraic Combinatorics since 2000 and
contributing to SageMath since 2008. To help fund the computational
math software and Jupyter ecosystems, he leads the OpenDreamKit
European project [masked]).
Mark your calendar for the next session of the PyData Paris Meetup on June 19th 2018.
The speakers for this session are Tim Head and Tom Dupré la Tour.
7:00pm - 7:15pm: Community announcements
7:15pm - 8:00pm: Binder - one click sharing of your data science, by Tim Head
When other people want to run the code of the cool data project you did last week, you usually think: “Great, someone cares!” and then “Oh no, now I need to play support desk till they get it running.”
The Binder project lets anyone run the contents of a git repository by clicking a link. For example, try out the latest JupyterLab demo by clicking this link. Binder lets you describe the dependencies of your repository in such a way that a Docker container can be created from it automatically, removing the need for you to spend a lot of time helping others who are trying to get your code to run.
Some example uses:
- Reproduce and explore "Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations" by Ross et al (https://mybinder.org/v2/gh/dtak/rrr/master?urlpath=lab).
- Learn about "Foundations of numerical computing" (https://mybinder.org/v2/gh/ssanderson/foundations-of-numerical-computing/master?filepath=notebooks) with Scott Sanderson.
- Dive into Julia Evans’ "Pandas cookbook" (https://mybinder.org/v2/gh/jvns/pandas-cookbook/master).
I will tell you about the Binder project, how to use it to share work, what the tools behind it are, and how you can join the team working on Binder.
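The example links above all follow the same pattern: a GitHub owner, a repository name, a git ref, and an optional `filepath` or `urlpath` query parameter. As a minimal sketch, assuming only the URL shape visible in those examples, such a link could be assembled as follows. The helper name `binder_url` is hypothetical and is not part of any Binder tooling.

```python
from urllib.parse import quote, urlencode

# Base of the launch links shown in the examples above.
BINDER_BASE = "https://mybinder.org/v2/gh"


def binder_url(owner, repo, ref="master", filepath=None, urlpath=None):
    """Build a mybinder.org launch link for a GitHub repository.

    Hypothetical helper: it only mirrors the URL shape of the example links.
    """
    url = f"{BINDER_BASE}/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    query = {}
    if filepath is not None:
        query["filepath"] = filepath  # open a given file or folder
    if urlpath is not None:
        query["urlpath"] = urlpath    # open a given interface, e.g. "lab"
    if query:
        url += "?" + urlencode(query)
    return url


print(binder_url("jvns", "pandas-cookbook"))
print(binder_url("dtak", "rrr", urlpath="lab"))
```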
8:00pm - 8:45pm: Nearest neighbors in scikit-learn estimators, API challenges, by Tom Dupré la Tour
Scikit-learn is a very popular machine learning library in Python.
It is well known for its simple and elegant API, which has been reused in multiple other Python libraries.
However, some parts of the library could still benefit from a better API.
In particular, several scikit-learn estimators rely internally on some nearest neighbors computations.
Yet, they use different APIs, they can't use custom neighbors estimators, and during a grid-search they recompute the nearest neighbors graph for each hyper-parameter.
We will present ongoing work on improving their API, discussing implementation and deprecation challenges.
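To illustrate the recomputation issue, here is a minimal pure-Python sketch. It is not scikit-learn's API, and the function names are hypothetical. It assumes brute-force Euclidean neighbors and shows that the neighbor lists for the largest `k` can be computed once and then sliced for every smaller `k` tried during a search.

```python
import math


def k_nearest(points, k):
    """Return, for each point, the indices of its k nearest other points.

    Brute force with Euclidean distance; ties are broken by index order.
    """
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def neighbors_for_each_k(points, ks):
    """Compute the neighbor lists once for max(ks), then slice for each k."""
    full = k_nearest(points, max(ks))
    return {k: [row[:k] for row in full] for k in ks}


points = [(0, 0), (1, 0), (3, 0), (7, 0)]
cached = neighbors_for_each_k(points, [1, 2])
print(cached[1])
print(cached[2])
```

The expensive distance computation happens once, however many values of `k` are searched.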
Tim Head builds data-driven products for clients all around the world, from startups to UN organisations. His company www.wildtreetech.com specialises in digital products that leverage machine learning and in deploying custom JupyterHub setups.
Tim contributes to the Binder project and helped create scikit-optimize. When he isn’t travelling he trains for triathlons.
Tom Dupré la Tour is a third-year PhD student at Télécom ParisTech, interested in signal processing, machine learning and neural oscillations.
He joined the core developer team of scikit-learn in 2015.
On March 10th, the Pandas community is organizing a worldwide documentation sprint! https://python-sprints.github.io/pandas/
The Paris event will be held at the CFM (Capital Fund Management) headquarters.
Please sign up here for the Paris sprint. Seats are limited! Only 14 attendees will be selected to participate in the event (knowing Python, Pandas and Git is required to be able to contribute).
• What we'll do
Contributors throughout the world are going to improve Pandas' documentation. Each contributed hour has the potential to transform countless collective hours of difficulties into as many hours of productive work. This is a great opportunity to learn from fellow programmers, to learn more about Pandas and to have a significant impact on data science.
• What to bring
A laptop. Ideally with Python (2.7, 3.5 or 3.6) installed, along with the Pandas library, and Git.
Capital Fund Management will provide breakfast (served upon arrival), coffee and tea, pizzas for lunch, and a wifi connection.
Pair programming will be encouraged.
Mark your calendar for the next installment of the PyData Paris Meetup on January 31st.
The guest speaker for this session is Jason Grout.
7:00pm - 7:30pm: Community announcements and lightning talks
7:30pm - 8:30pm: JupyterLab: Building blocks for interactive computing
Project Jupyter provides building blocks for interactive and exploratory computing, which make science and data science reproducible across over 40 programming languages (Python, Julia, R, etc.). Central to the project is the Jupyter Notebook, a web-based interactive computing platform that allows users to author “computational narratives” that combine live code, equations, narrative text, visualizations, interactive dashboards, and other media. We will give an overview of JupyterLab, the next generation of the Jupyter Notebook, demonstrate how to use third-party plugins to extend and customize many aspects of JupyterLab, and explain how it fits within the overall vision of Project Jupyter.
JupyterLab goes beyond the classic Jupyter Notebook by providing a flexible and extensible web application with a set of reusable components. Users can arrange multiple notebooks, text editors, terminals, output areas, and custom components using tabs and collapsible sidebars. These components are carefully designed to enable the user to use them together or separately (for example, a user can send code from a file to a console with a keystroke, or can pop out an output from a notebook to work with it alone).
JupyterLab is based on a flexible application plugin system provided by PhosphorJS that makes it easy to customize existing components or extend it with new components. For example, users can install or write third-party plugins to view custom file formats, such as GeoJSON, interact with external services, such as Dask or Apache Spark, or display their data in effective and useful ways, such as interactive maps, tables, or plots.
Jason Grout is a Jupyter developer at Bloomberg, working primarily on JupyterLab and the interactive Jupyter widgets library. He has also been a major contributor to the open source Sage mathematical software system. Previously, Jason was an assistant professor of mathematics at Drake University in Des Moines, Iowa. He holds a PhD in mathematics from Brigham Young University.
We invite you to the next installment of our community meetup, once more at La Défense.
We will be hosting two talks, one focused on the scikit-image project, and the other one about the Jupyter/IPython project.
Image Processing with Scikit-Image and the Scientific Python Ecosystem - Emmanuelle Gouillart
Affiliation: Joint Unit CNRS/Saint-Gobain (Paris)
Emmanuelle is a researcher in materials science, working at the intersection between industrial and academic research.
She has been a core contributor of scikit-image for several years, and her interest in image processing was triggered by her frequent use of in-situ tomographic imaging of materials, especially glass at high temperature.
In software development, besides image processing, she is interested in documentation and teaching scientific Python. She has been a co-organizer of the EuroSciPy conference for several years.
Jupyter and the IPython kernel, Hidden and Upcoming Features.
One of the foundations of Jupyter is a protocol that explicitly defines all the actions and results that comprise the workflow of interactive computing, across a wide range of programming languages. This abstracts and decouples the process of code execution (performed by a kernel) from code input and results rendering.
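As a rough illustration of what such a protocol message looks like, the sketch below builds an `execute_request` as a plain Python dictionary. The field names follow the Jupyter messaging specification as far as we know it, but the helper function, the default username, and the protocol version string are assumptions made for the example. On the wire, a real message is additionally serialized, signed, and sent over ZeroMQ, which is not shown here.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code, session, username="user"):
    """Build the dictionary form of an execute_request message.

    Hypothetical helper for illustration; it does not talk to a kernel.
    """
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",  # assumed protocol version
        },
        "parent_header": {},  # empty: this message is not a reply
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


msg = execute_request("1 + 1", session=uuid.uuid4().hex)
print(json.dumps(msg, indent=2))
```

The frontend only needs to know how to build and read such messages, while the kernel alone knows how to run `code`, which is what decouples input and rendering from execution.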
Although the protocol is relatively simple, most kernels, frontends, and libraries do not make full use of the available features. Even the IPython kernel (the reference implementation) has a number of little-known features and opportunities to make better use of the Jupyter protocol.
We'll scratch the surface to see what can be done with the display protocol, with magics and completions, and peek at what could be possible if we restrict ourselves to Python 3.6+ and poke at async/await (beyond asyncio) and how we could make use of await in the REPL.
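One small example of the display protocol: an object can offer several representations of itself, keyed by MIME type, and let the frontend pick the richest one it can render. The sketch below uses the `_repr_mimebundle_` hook recognized by IPython. The `Temperature` class is a hypothetical example, and the code runs without IPython installed since the hook is an ordinary method.

```python
class Temperature:
    """Hypothetical value type with plain-text and HTML representations."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        return f"Temperature({self.celsius!r})"

    def _repr_mimebundle_(self, include=None, exclude=None):
        # Map MIME types to data; a frontend chooses what it can display.
        return {
            "text/plain": repr(self),
            "text/html": f"<b>{self.celsius:.1f}</b> &deg;C",
        }


t = Temperature(21.5)
print(t._repr_mimebundle_())
```

In a notebook, leaving `t` as the last expression of a cell would be expected to show the HTML form, while a terminal would fall back to the plain-text one.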
Matthias Bussonnier is a Postdoctoral Scholar at UC Berkeley BIDS and has been a core developer of the Jupyter and IPython projects for more than 5 years.
He is one of the current maintainers of IPython and the IPython kernel. Matthias is also a strong advocate for Python 3.
The PyData Paris Meetup is back!
Please join us on the 12th of September at the Allianz One Tower. We've got a dynamic lineup of speakers in store for you!
This installment of the PyData Paris meetup will feature Serge Guelton and Maarten Breddels.
7:00pm - 7:45pm Serge Guelton. Pythran: OpenMP and SIMD for Python Numerical Kernels
Pythran is an ahead-of-time compiler for numeric Python kernels: it turns plain Python modules, with a few restrictions, into native ones that run Python-free code.
Pythran is focused on numerical kernels, based on Numpy, and aims at turning *high-level* Python code into efficient, parallel, vectorized native code. This goal is not an easy one, but many milestones have already been reached. I'm going to present them, as well as a few forthcoming ones!
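As a small sketch of what such a kernel can look like: Pythran reads type information from a `#pythran export` comment, and OpenMP directives can be written as `#omp` comments. Because both are comments, the module below stays valid plain Python and runs unchanged without Pythran. The function itself is a hypothetical example, and the exact annotation details should be checked against the Pythran documentation.

```python
#pythran export dprod(float list, float list)
def dprod(xs, ys):
    """Dot product of two equally long lists of floats."""
    total = 0.0
    # The directive below is ignored by the regular Python interpreter.
    #omp parallel for reduction(+:total)
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


print(dprod([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Assuming the file is saved as `dprod.py`, running `pythran dprod.py` should produce a native module that can be imported under the same name.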
7:45pm - 8:30pm Maarten Breddels. A Billion Stars in the Jupyter Notebook
With large astronomical catalogues containing more than a billion stars becoming common, we are preparing methods to visualize and explore these large datasets. Data volumes of this size require different visualization techniques, since scatter plots become too slow and meaningless due to overplotting. The vaex Python library solves these problems by calculating statistics on N-dimensional grids, processing over a billion rows per second. Visualization of these grids allows exploration of these large catalogues. The 3d visualization is done using ipyvolume, a new library for the Jupyter notebook.
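The idea of computing a statistic on a grid instead of drawing every point can be sketched in a few lines of plain Python. This is not the vaex API, and the function name is hypothetical. It counts points per cell of a 2D grid, so the amount of data to plot depends on the grid shape rather than on the number of rows.

```python
def count_on_grid(xs, ys, x_range, y_range, shape):
    """Count the points falling into each cell of a regular 2D grid.

    Points outside the half-open ranges [min, max) are ignored.
    """
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    nx, ny = shape
    grid = [[0] * ny for _ in range(nx)]
    for x, y in zip(xs, ys):
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Scale each coordinate to a cell index, guarding against rounding.
        i = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        j = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        grid[i][j] += 1
    return grid


xs = [0.1, 0.2, 0.9, 1.5]
ys = [0.1, 0.3, 0.9, 0.5]
print(count_on_grid(xs, ys, (0.0, 1.0), (0.0, 1.0), (2, 2)))
```

The resulting grid can then be rendered as an image or a density map, which stays readable where a scatter plot would suffer from overplotting.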
PyData - Paris is proud to host its first meetup!
The first installment will be held at Université Paris VI (Pierre et Marie Curie) - we will provide more information about the meetup location to people who register for this event.
The meetup will feature Travis Oliphant, CEO of Continuum Analytics, best known as the primary developer of the NumPy package and a founding contributor of the SciPy package, as well as Gaël Varoquaux, an INRIA faculty researcher working on data science for brain imaging. Gaël is a core developer of scikit-learn, joblib, Mayavi and nilearn, and a nominated member of the PSF.
7:00pm - 7:45pm Travis Oliphant. Building an Open Source Company
There are several business models that have been used to build a company around open source. Travis will provide an overview of these and describe the fundamentals of both open source and company building that lead to the opportunities and challenges of mixing open source with commercial activity. Views on what it means to be an "open-source company" versus a company that uses open source will also be discussed, along with some of the opportunities currently available to today's entrepreneurs. The long-term success of open source relies on company participation and support, which is why it is important to have many thriving companies whose business models rely on open-source success. Along the way, Travis will provide a high-level overview of the technology Continuum is creating, emphasizing the purpose of Anaconda and Continuum Analytics: to empower people to solve the world's greatest challenges and sustainably grow the open source ecosystem.
7:45pm - 8:30pm Gaël Varoquaux. Enabling open science and data science via software: scikit-learn
"Data science", with sophisticated data processing, is having a transformational impact on many facets of science and society. It is driven by a technological revolution based on statistical models, and software implementing them. Outreach, bridging the technical gap outside of the ivory tower of research lab and high-margin tech ventures, is
crucial to see data-science applications off the beaten track.
Scikit-learn is machine-learning software that strives to reach many users and applications. Via the rich Python data ecosystem, it can be embedded in any domain or workflow. It has hundreds of thousands of users in a variety of fields, in industry and in academia. I will discuss how we built scikit-learn to be easy-to-use and didactic; how we grew a community of open-source developers with a focus on collaboration; how we ensure quality in a statistical-learning codebase; how we try to distill the most important progress from the rapid pace of academic publishing; and how we are struggling to make the development sustainable.