BDM#44 - Spark Meetup and LinkedIn's Pinot


Details
Big Data Montreal would like to invite you to its 44th meeting!
Join us on Tuesday, January 5th, 2016 at 18:00 to attend the presentations and to network with other Big Data enthusiasts from Montreal!
The meeting will take place at the Cloud.ca Center (formerly the RPM Startup Centre, http://centre.cloud.ca/), located at 420 Guy Street.
All are welcome, whether you already have experience with Big Data technologies or are simply curious to learn more.
We have 3 presentations scheduled:
• PatchWork by Thomas Triplet, Researcher – Data Scientist at the CRIM
While hundreds of clustering algorithms have been proposed, many are complex and do not scale well as more data become available, making them inadequate for analyzing very large datasets: many clustering algorithms are sequential and thus inherently difficult to parallelize. We propose PatchWork, a novel clustering algorithm to address those issues. PatchWork is a distributed density clustering algorithm with linear computational complexity and linear horizontal scalability. It relies on the map/reduce paradigm to parallelize computations and was implemented using Apache Spark. In our experiments using commodity hardware, we could cluster a billion points in only a few minutes, a 40x improvement over the k-means implementation in Spark MLlib.
• Towards a time series library for Apache Spark by Simon Ouellette, CEO of Nabla Analytics, Inc.
spark-timeseries (https://github.com/cloudera/spark-timeseries) is a financial and time series library for Apache Spark that is currently in development. We will go over the current design and functionality with examples, and discuss the challenges and expected future developments.
• LinkedIn's Pinot by Jean-François Im, Data analytics infrastructure engineer at LinkedIn
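For readers new to density clustering, here is a minimal single-machine sketch of how a density-based method can be phrased as a map step followed by a reduce step. It is not the PatchWork algorithm: the grid cells, the function names (cell_of, cluster_by_density) and the parameters (cell_size, min_points) are assumptions made purely for illustration, and the talk itself is the reference for the actual method.

```python
from collections import Counter, deque
from itertools import product


def cell_of(point, cell_size):
    """Map step (illustrative): assign a point to the grid cell containing it."""
    return tuple(int(c // cell_size) for c in point)


def cluster_by_density(points, cell_size=1.0, min_points=2):
    """Label each point with a cluster id, or -1 if it lies in a sparse cell.

    Hypothetical sketch of grid-based density clustering, not PatchWork itself.
    """
    # Map every point to its cell, then reduce by key: count points per cell.
    counts = Counter(cell_of(p, cell_size) for p in points)

    # Keep only the cells that hold at least min_points points.
    dense = {cell for cell, n in counts.items() if n >= min_points}

    # Merge adjacent dense cells into clusters with a breadth-first search.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in dense and neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    # Points that fall in sparse cells are treated as noise.
    return [labels.get(cell_of(p, cell_size), -1) for p in points]
```

Only the per-cell counts need to be exchanged between workers in a scheme like this, which is why the counting step maps naturally onto map/reduce; the cell-merging step here runs on a single machine for simplicity.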
Finally, you are welcome to stay for some casual networking in the same room after the presentations, followed by a beer at Brasseurs de Montreal.
Please tell your friends and colleagues :) !
