PyData Amsterdam: a new correlation coefficient Phi_k and a Dash demo

First meetup of 2019

## Phi_k correlation

Calculating correlations between paired variables is a standard tool for every data analyst. This presentation introduces a new and practical correlation coefficient, phi_k, which works consistently across categorical, ordinal and interval variables. It is based on several refinements to Pearson's hypothesis test of independence of two variables and captures non-linear dependency.

Emphasis is placed on the proper evaluation of the statistical significance of correlations and on the interpretation of variable relationships in a contingency table, in particular for low-statistics samples. Two practical applications are discussed. The presented algorithms are easy to use and available through a public Python library.


    pip install phik
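
For readers who want a feel for the starting point mentioned in the abstract, here is a minimal sketch of the classic Pearson chi-square statistic computed from a contingency table of two categorical variables. It is not phi_k itself and does not include any of the refinements covered in the talk. The function name `chi_square_statistic` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Return (chi2, dof) for two paired categorical samples.

    Illustrative helper only: this is the plain Pearson statistic,
    not the phi_k coefficient and not part of the phik library.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    # Contingency table stored as {(x, y): observed count}.
    observed = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            # Expected count in this cell if the variables were independent.
            expected = row_total * col_total / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical toy data: two categorical variables observed in pairs.
colour = ["red", "red", "blue", "blue", "red", "blue", "red", "blue"]
size = ["S", "S", "L", "L", "S", "L", "L", "S"]

chi2, dof = chi_square_statistic(colour, size)
print(f"chi2 = {chi2:.3f}, dof = {dof}")
```

A larger statistic relative to the degrees of freedom indicates stronger evidence against independence. With samples as small as this toy one the usual large-sample approximations are unreliable, which is the low-statistics situation the abstract refers to.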


Rose Koopman is a data scientist at KPMG, working on data driven solutions for clients in different domains. Before joining KPMG she did a PhD in high energy physics at CERN.

## Dash

Dash is a Python framework for creating interactive web applications for performing and presenting analyses. It is a powerful tool that lets users create anything from a single interactive plot to a full-blown dashboard.
This talk aims to show you how to get started using Dash in your daily work by building a template data exploration tool that can help with exploring new data or showing results to clients.
In addition, we will demo our Dash app that uses phi_k!

Susanne is a data scientist at KPMG working across different sectors, and in recent months has developed an interest in NLP, deep learning and visualizations. Before joining KPMG she worked in IT and completed a master's in Medical Physics.

## Third talk

*Postponing work in an optimal way*

Performance evaluation and optimal control of processes are crucial challenges for any business. One common approach is to use Machine Learning (ML) on past data to uncover important patterns and only afterwards model these within an Operations Research (OR) framework. In this talk, I will present a novel method to combine ML and OR. Although the technique is highly generic, in this session we will concentrate on only one process: “postponing work in an optimal way”. After discussing the pros and cons of both traditional OR and ML, we will see how one can benefit from their synergy.

### Bio:

Asparuh Hristov, data scientist


