• #39 WhyR pre-meeting: Segmentation in Surveys using NMF (talk+workshop)

    This pre-meeting promotes the Why R? 2019 Conference, whyr.pl/2019 (regular registration ends August 31st!).

    Working with high-dimensional data? Often facing the need to group observations? This presentation is for you. A segmentation should be balanced and distinctive, the over- and under-indexed features discovered within segments should tell a meaningful story, and, ideally, the number of differentiating factors that drive the segmentation should be small. The last requirement often becomes a bottleneck in surveys where respondents are asked an enormous number of questions. One solution is non-negative matrix factorization (NMF), which segments respondents and their features in a single step! The concept of the NMF decomposition and its applications in R will be presented, along with an explanation of the diagnostic plots.

    The 30-minute presentation will be followed by a 60-minute workshop, so bring your laptops!

    As usual, short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you there!

    About the Speaker: Marcin Kosiński has a master's degree with a specialty in Mathematical Statistics and Data Analysis. He is a challenge seeker and devoted R language enthusiast. In the past, he was keen on large-scale online learning and various approaches to personalized news article recommendation. He hosts community events as an organizer of the Why R? conferences (whyr.pl) and is interested in R package development and survival analysis models. He currently explores and improves methods for quantitative marketing analyses and global surveys at Gradient Metrics.
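
    To make the idea of segmenting respondents and features in one step more concrete, here is a minimal base-R sketch. It is not the speaker's material: the toy survey matrix, the helper name `nmf_mu`, and the choice of the classic multiplicative-update rule are all assumptions made for illustration. A real analysis would more likely rely on a dedicated package.

```r
# Toy survey (made up): 6 respondents (rows) x 4 questions (columns),
# all answers non-negative, as NMF requires.
set.seed(1)
V <- rbind(
  c(5, 4, 0, 1),
  c(4, 5, 1, 0),
  c(5, 5, 0, 0),
  c(0, 1, 5, 4),
  c(1, 0, 4, 5),
  c(0, 0, 5, 5)
)
rownames(V) <- paste0("resp", 1:6)
colnames(V) <- paste0("q", 1:4)

# Hypothetical helper: approximates V by W %*% H with non-negative W and H,
# using multiplicative updates for the squared (Frobenius) error.
nmf_mu <- function(V, k, n_iter = 500, eps = 1e-9) {
  n <- nrow(V)
  m <- ncol(V)
  W <- matrix(runif(n * k), n, k)  # respondents x segments
  H <- matrix(runif(k * m), k, m)  # segments x questions
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  # Rescale so every row of H sums to 1; this makes the columns of W
  # comparable without changing the product W %*% H.
  s <- rowSums(H)
  W <- sweep(W, 2, s, `*`)
  H <- H / s
  dimnames(W) <- list(rownames(V), paste0("segment", seq_len(k)))
  dimnames(H) <- list(paste0("segment", seq_len(k)), colnames(V))
  list(W = W, H = H)
}

fit <- nmf_mu(V, k = 2)

# One factorization yields both groupings: each respondent and each
# question is assigned to the segment with the largest loading.
respondent_segment <- apply(fit$W, 1, which.max)
question_segment <- apply(fit$H, 2, which.max)
print(respondent_segment)
print(question_segment)
```

    The rows of `W` describe how strongly each respondent belongs to each segment, and the rows of `H` show which questions characterize a segment, so a single fit gives both views of the data.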

  • #38 Explainable Machine Learning with R

    EUROPACE

    Join our next meetup!

    Boyan Angelov | Explainable Machine Learning with R

    Modern machine learning algorithms tend to behave as black boxes. We have traded understanding of the inner workings of a model for increases in accuracy and performance. In the era of GDPR and global concerns about data privacy and algorithmic bias, understanding how a machine learning model makes a decision has become increasingly important. Historically there have been numerous efforts in the field, and it is still developing rapidly. This talk is a tour of which recent xAI (explainable AI) methods and associated R tools are available, how they work, and some practical advice on integrating them into your stack. We will cover the packages mlr, lime, iml, dalex, shap, and sdmexplain (one of the speaker's own packages).

    As usual, short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you there!

    ----

    Boyan Angelov is a Senior Data Scientist and startup advisor, currently working in management consulting (DAIN Studios). He has an academic background in bioinformatics and experience in fields such as e-commerce, clinical trials, HRTech and open source. His favourite topics in machine learning are natural language processing and spatial models.

  • #37 Kirill Müller | {dm} facilitates working with multiple tables

    On rather short notice, next Tuesday (16th July) we will have Kirill Müller presenting {dm}, a new package that facilitates working with multiple tables. He will explain the motivation for using multiple tables in the first place, outline the features of the package, and discuss future development. The presentation will be interactive, with live coding and a script that attendees can run during or after the presentation. Check out https://krlmlr.github.io/dm

    As usual, short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us. See you next Tuesday!

    ----

    Kirill Müller is a very active R developer and a collaborator on more than 30 packages on CRAN... https://www.rdocumentation.org/collaborators/name/Kirill%20M%C3%BCller

  • satRday Berlin

    Hertie School

    satRday Berlin is an R-focused conference that will be held on Saturday, June 15th, 2019. Our goal is to help and grow the local R user community. We want to make this event as accessible as possible - by not requiring time off work, by not costing more than a day's wages, and by being supportive of new and under-represented community members. More info: https://berlin2019.satrdays.org/

  • #36 R as a workhorse for presenting financial data to Bitcoin enthusiasts

    At our next meetup, Philipp Giese will talk about using R in finance. More specifically, he will talk about running R "in the cloud" to gather data from financial market APIs in the cryptocurrency sphere and to generate articles automatically.

    Abstract of the talk: For media professionals in this field, daily reporting on price developments in the finance sector is an arduous task. Companies like Stockpulse prove that this task can be automated. However, to increase trustlessness and to decrease costs, a self-developed solution would be desirable. Our approach is therefore as follows: data from an API is evaluated according to its sentiment and used for an article that is posted automatically on a WordPress page. The R code for this is executed within a Heroku app. This talk will present the interaction between Heroku as a platform, the APIs as information sources, and R as an evaluation tool.

    Bio of the speaker: Dr. Philipp Giese works as chief analyst for BTC-ECHO and specializes in research, fundamental analysis and chart analysis. As part of his work at BTC-ECHO, he has implemented automatic price updates and set up several actively managed sample portfolios. After some experience with Mathematica, Python and Igor Pro, R has proven to be his analysis tool of choice. Before joining BTC-ECHO, the physicist gained many years of professional experience as a project manager, product developer and technology consultant.

    During the break, or after the talk: open microphone session. Announcements of any kind:
    - other interesting events
    - interesting Open Source projects
    - call for new talks at upcoming R User Group events

    Our generous host Europace AG will prepare drinks and snacks for us!

  • #35 Beyond the tidyverse & `mlr` Machine Learning Framework

    In our next meetup Mikhail Balyasin will finish his series "Touring the tidyverse" and Prof. Gero Szepannek will present a framework for Machine Learning with the `mlr` package. Check the talk descriptions below.

    Please note that this time we will start at 6:30pm!

    As usual, short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you there!

    ----

    Mikhail Balyasin | Beyond the tidyverse

    In the last talk of the series I'd like to round things off by looking back at what we've covered, but also show you a couple of other projects that have embraced the "tidy" philosophy. They are not officially part of the "tidyverse", but they are interesting to look at because they show that tidy principles can be applied across a very broad spectrum of applications and still make sense. It also makes your life easier, since you learn one set of principles and then apply it in multiple other areas. Specifically, I'll go over a couple of projects from different areas to show how they applied the tidy way to their problems.

    ----

    Prof. Gero Szepannek | `mlr` – A Framework for Machine Learning (and Automatic Hyperparameter Tuning)

    The `mlr` package offers a flexible interface to more than 170 machine learning algorithms for classification, regression, clustering and survival analysis. It has been one of the pioneering frameworks for automatic hyperparameter tuning, allowing computer-based optimization of ML models for a specific task. It further offers the possibility of integrating related modelling steps such as imputation of missing values, variable selection or class imbalance correction directly into the modelling process. The talk will give an introduction to the features of the package, accompanied by a script to be run by the audience during the talk.

    Bio:
    · Since 2016: Professor for Statistics, Business Mathematics and ML @ Stralsund University of Applied Sciences
    · 2009 - 2016: Head of Scoring & Rating Models @ Santander Consumer Bank / Santander Consumer Group
    · 2008: PhD @ Dortmund University of Technology (on Automatic Speech Recognition in collaboration with Fraunhofer IDMT, Ilmenau)
    · 2004: Diploma Statistics @ Dortmund University of Technology

  • #34 INLA-package: Geostatistical analysis of health insurance claims

    Dr. Boris Kauhl | A practical overview of how R and the INLA package facilitate the geostatistical analysis of large, spatially referenced health insurance claims

    Health insurance claims provide a rich set of information at fine spatial scales. This is both a great opportunity and a challenge, as such large, complex spatial data require particular methods to deal with location information. The main focus of this presentation is the application of spatial Bayesian modelling within the INLA package. Two applications will be explored: visualizing spatial patterns of chronic diseases and examining risk factors at the individual and aggregated level. Of particular interest is explicitly spatial modelling, as it enables us to see which people are at high risk for chronic diseases, and where. We will have an in-depth discussion of current methodological and computational challenges and of solutions for dealing with large spatial data sets.

    Dr. Boris Kauhl works for the AOK Nordost health insurance, where he focuses on spatial analyses to enhance the planning and allocation of healthcare. He completed his PhD on Geographic Information Systems (GIS) in public health at Maastricht University, the Netherlands.

    ---

    Short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you!

  • TOPICS

    Needs a location

    This is not an event per se, but a place to share topic wishes and offers. Please send us a message ([masked]) if you want to present.

    Here are the results from the spring 2017 survey (n=34) as an inspiration, with the number of people selecting each topic at the start of the line:

    20 Time series analysis
    19 Advanced usage of ggplot
    18 Out of memory handling of large datasets
    17 Computationally efficient programming, code optimization
    17 Interactive graphics, dashboarding (shiny)
    16 Deep learning / Machine Learning
    15 Decision trees, random forests
    15 Web scraping with R
    14 Linear Models (Regression, Anova)
    14 Mixed Linear Models
    14 R for Data Science (general intro)
    14 R project workflow (directories, intermediate data/graphs)
    14 Usage of data.table
    13 Data visualization, Vis literacy
    13 Spatial statistics with R (maps, shapefiles, kriging ...)
    12 R and Docker (container)
    11 Natural language processing
    11 Simulation and Bayesian optimization, Bayesian stats with rstan (or rstanarm)
    10 Interactive maps (leaflet)
    10 Intro to MCMC
    10 knitr, rmarkdown, bookdown, how to build reports, R notebooks
    9 Code parallelization
    9 Jupyter Notebook, feather exchange format
    9 R internals - How the interpreter works, C-API
    8 Intro to the tidyverse
    8 Optimization, parameter estimation
    8 Rcpp
    8 Unit testing (for package development)
    7 Collaborative coding (GitHub)
    7 Installation of RStudio Server
    7 Regular expressions (character string management)
    7 Tutorial: write an R package with RStudio and GitHub
    5 (choice based) conjoint analysis
    4 Intro to ggplot

  • #33 Touring the tidyverse: tidy models

    EUROPACE

    On rather short notice, next week we'll have...

    Mikhail Balyasin | Touring the tidyverse: tidy models

    At the penultimate stop of the "Touring the tidyverse" series of talks we are going to talk about `tidymodels`. It's a collection of packages built using the tidy approach to make model fitting in R more predictable and extensible. The main force behind the `tidymodels` suite is Max Kuhn, whom you might know as the author and maintainer of the `caret` package. The `tidymodels` collection of packages is by far the least developed part of the `tidyverse` and is under very active development at the moment. Nevertheless, even in its current state the suite is already usable and useful for certain use cases. I will demonstrate this in the talk using materials from the 2-day "Applied Machine Learning" workshop that took place at rstudio::conf 2019 and was led by Max Kuhn himself.

    ---

    Short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you!

  • #32 Code-centric websites with Blogdown & Touring the tidyverse: tidyeval

    In our next meetup Ilja Sperling will bring the best of both worlds: code-centric websites with Blogdown & R. Next in the lineup, Mikhail Balyasin will continue his series of talks "Touring the tidyverse", and this time he'll talk about tidyeval. Check the descriptions of the talks below.

    Short contributions and announcements from the audience are always welcome. We want to be a diverse and inclusive group, so please come and bring your friends. We're hoping for a nice crowd enjoying interesting talks and chats. Our generous host Europace AG will prepare drinks and snacks for us! See you!

    --

    Ilja Sperling | The best of both worlds: code-centric websites with Blogdown & R

    This talk will introduce you to the convenience of creating and running your own code-centric website with RStudio and R Markdown. No HTML needed. Whether as your personal homepage, a blog with all your R tricks, or just to build your data-science portfolio - Yihui Xie’s Blogdown package makes running a full-fledged website from within RStudio as easy as using Medium or WordPress. All without ever interrupting your regular R workflow. We’ll set up a new website, write our first post, publish it on the web, and look into some Blogdown tricks and ways to customise our website’s theme.

    Check out Ilja's blog https://ellocke.github.io and his Twitter https://twitter.com/fubits

    --

    Mikhail Balyasin | Touring the tidyverse: tidyeval

    It's time we talk about tidyeval. Tidy evaluation is probably one of the most interesting and equally confusing parts of the tidyverse. In the fourth installment of "Touring the tidyverse" we will talk about what it is and why you might want to use it in your projects. The main part of the talk will be a step-by-step introduction to the concepts in tidyeval (e.g., `enquo`, `enexpr`, `!!`, `!!!` and so on) with a worked example provided by Lionel Henry, the main developer of tidyeval. This will hopefully provide enough motivation for and clarity on the concept.
