
Data Science Warsaw #25

Hosted by Dominik Batorski and B. Twardowski

Details

On Tuesday, April 11th, we have the following two talks at Data Science Warsaw Meetup:

• 18:00-19:00 Marcin Kosinski: sparklyr: R interface to Apache Spark machine learning algorithms with dplyr back-end

• 19:00-20:00 Isaac Reyes: The Art of Data Storytelling

We look forward to seeing you there!

Details:

  1. sparklyr: R interface to Apache Spark machine learning algorithms with dplyr back-end (Marcin Kosinski)

sparklyr: R interface to Apache Spark, a fast and general engine for big data processing (http://spark.apache.org/). This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

During my talk I will present how R integrates with Spark through the R sparkapi package, on which the sparklyr package is based. I'll briefly explain the dplyr data analysis methodology that is widely used in sparklyr. I'll also summarize the machine learning functionality in Spark that is available via the R sparklyr interface. If time permits, I'll finish by describing a sparklyr use case applied to articles that I web-scraped from Polish news portals.

About the Speaker: Marcin Kosinski, R Data Scientist (http://r-addict.com/). Marcin has a master's degree in the Mathematical Statistics and Data Analysis specialty, and for the last 30 months he worked in the Research and Development Department at the biggest Polish news web portal, wp.pl (http://wp.pl/) (Virtual Poland Group). A challenge seeker and big R package enthusiast, he is currently keen on the field of large-scale online learning and various approaches to personalized news article recommendation. He is a co-organizer of the 1300+ member R Enthusiasts meetups in Warsaw and the main organizer of the Polish R Users Conference 2017, called 'Why R? 2017' (http://whyr.pl/). He is interested in R package development and survival analysis models. He worked as a subject matter expert for the 3000+ member Data Crunchers Online R Course at The Warsaw School of Data Analysis. In January 2017, Marcin started his own R+stats freelancing company.

  2. The Art of Data Storytelling (Isaac Reyes)

Forbes Magazine calls data storytelling 'the essential data science skill everyone needs'. And with good reason: well-told data stories are change drivers within the modern organization. But how do we find the most important insights in our business data and communicate them in a compelling way? How do we connect the data that we have to the key underlying business issue? The session will cover:

  • The essential elements of a good data story
  • Chart design and why it matters
  • Common chart design errors
  • The Gestalt principles of visual perception and how they can be used to tell better stories with data
  • Some before and after data story 'makeovers'

About the Speaker: Isaac Reyes is a Data Scientist, TEDx speaker and lead trainer at DataSeer. He spends his time delivering training courses in machine learning and data science to companies in Silicon Valley, Singapore, Australia, New Zealand and the Philippines. Prior to DataSeer, Isaac lectured in analytics and statistical theory at the Australian National University and worked as a data scientist at Quantium.

Data Science Warsaw
Biblioteka Uniwersytecka w Warszawie
ul. Dobra 56/66 · Warsaw