• Bayesian Multilevel Modeling with {brms}

    Online event

    The {brms} package provides an interface to fit Bayesian generalized (non-)linear multivariate multilevel models using Stan, a C++ package for performing full Bayesian inference (see https://mc-stan.org/). The formula syntax is very similar to that of the {lme4} package to provide a familiar and simple interface for performing regression analyses. A wide range of response distributions are supported, allowing users to fit – among others – linear, robust linear, count data, survival, response times, ordinal, zero-inflated, and even self-defined mixture models all in a multilevel context. Further modeling options include non-linear and smooth terms, auto-correlation structures, censored data, missing value imputation, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Multivariate models, i.e., models with multiple response variables, can be fit as well. Prior specifications are flexible and explicitly encourage users to apply prior distributions that reflect their beliefs. Model fits can easily be assessed and compared with posterior predictive checks, cross-validation, and Bayes factors.

    [Bio]
    Paul is a statistician currently working as an Independent Junior Research Group Leader at the Cluster of Excellence SimTech at the University of Stuttgart (Germany). He is the author of the R package {brms} and a member of the Stan Development Team. Previously, he studied Psychology and Mathematics at the Universities of Münster and Hagen (Germany) and did his PhD in Münster on optimal design and Bayesian data analysis. He has also worked as a postdoctoral researcher at the Department of Computer Science at Aalto University (Finland).

    [About]
    This event is organized by the Oslo useR! Group and will be co-hosted by the Skåne useR! Group, the Copenhagen useR! Group, and the Stockholm useR! Group.

  • celebRation 2020

    Maersk Tower

    CelebRation 2020
    R conference in Copenhagen, Denmark, February [masked], celebrating the 20th anniversary of R version 1.0.0

    The year 2020 marks the 20th anniversary of the release of R version 1.0.0!

    To celebrate this, we are inviting the community of R users and developers to a two-day celebRation on 28-29 February 2020. We kick off on 28 February with hands-on workshops on two hot topics, namely data visualization using the ggplot2 package and making fast extensions of R using the Rcpp package. The day of the anniversary, 29 February, presents a line-up of speakers who cover the past, the present, and the future of the R programming language.

    For more information and to register, go to http://www.celebration2020.org/

    YOU CANNOT SIGN UP THROUGH MEETUP!

  • Data R for Kids

    University of Copenhagen, CSS, room 1.1.18

    Exciting talk and co-creation event by Sine Zambach:

    Data R for Kids
    ==============

    We know from the tobacco industry that we should get 'em while they are young. But if we want to get young people using R, what would it require? To start, [someone] should develop a new set of exercises that can engage kids and young people in exploring data science and (R) programming. Not sports results or airplane timetables. Exercises should be simple and fun and relate to the everyday lives of young students.
    This event is a co-creation, and I will only present briefly. After that, we will all take center stage in designing, and perhaps taking the first steps toward, a library of R exercises for kids.

  • Smooth curves in R and Xaringan shenanigans

    University of Copenhagen, CSS, room 1.1.18

    Two exciting upcoming talks:

    Smooth curves in R by Niels Lundtorp Olsen
    ===========================================

    R is great and flexible software, but it lacks an easy and fast package for dealing with curves such as those in functional data and vector graphics.
    This talk will not be about all the cool tools for making plots; instead, we will take one step back and look at how smooth curves are represented as data in a computer, and how to implement this in R.
    Doing things fast in R often means doing them in C++, which will also take us through object-oriented programming. In the end, we get a package that can easily be applied to functional data and elsewhere. This is work in progress and feedback will be appreciated.

    Xaringan shenanigans by Claus Thorn Ekstrøm
    =============================================
    R provides several possibilities for using Rmarkdown to quickly create
    excellent presentations combining R code, R output, mathematics,
    animations, and interactive widgets.

    In this code-along I will show how to use the package xaringan to
    build simple, elegant, Rmarkdown-driven presentations based on the
    remark.js presentation framework. We will cover formatting, widgets,
    caveats, show some tips and tricks, and how to customise your own
    presentation template.

  • Bayesian Statistics in R / Full-stack data science in R

    University of Copenhagen, CSS, room 1.1.18

    Two exciting talks:

    Bayesian Statistics in R
    ==========================================

    by Jonas Lindeløv, Assistant Professor in Cognitive Neuroscience and Neuropsychology, Aalborg University

    This workshop will give a conceptual and practical introduction to Bayesian statistics in R. Bayesian statistics has long been known to provide greater flexibility than other approaches, but it is only in recent years that it has become easy to apply this flexibility in practice. In this talk I will discuss Bayes factors for model comparison (as an alternative to p-values) and utility theory as an approach to decision making. The presentation will be based on the notebooks referenced below but does not require that these have been studied before the talk.

    https://lindeloev.github.io/utility-theory/
    https://rpubs.com/lindeloev/bayes_factors

    Full-stack data science in R
    ============================

    by Michał Burdukiewicz: bioinformatician affiliated with Warsaw University of Technology, founder of the Why R? Foundation and Wrocław R Users Group (STWUR), CEO of .prot.

    Data science requires more than just sufficient statistical knowledge to create a model. Data, often obtained from different sources, must be cleaned, combined and unified, dry analysis results must be visualized, and the model itself must be made available in a form accessible to the client. The R environment provides tools to support every stage of this process: from data collection through model development to the development of web applications. During my talk, I will present the packages needed for full-stack, large-scale data science projects in R: drake, mlr and shinyproxy.

  • Easy peasy massive parallel computing / R at scale on the Google Cloud Platform

    University of Copenhagen, CSS, room 1.1.18

    Two exciting talks:

    Easy peasy massive parallel computing in R
    ==========================================

    by Mikkel Krogsholm

    Wouldn’t it be nice to be able to write simple R code that scales easily to massively parallel computing?

    The future and furrr packages in R provide a framework that makes it possible for you to write code that works seamlessly on your laptop or on a supercomputer. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines.

    There is no need to modify any code in order to switch from sequential processing on the local machine to distributed processing on a remote compute cluster. Global variables and functions are also automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures.

    This R-talk shows you how. We will run through a concrete example that we first execute on a local machine and then on a much more powerful server.

    R at scale on the Google Cloud Platform
    ======================================

    by Mark Edmondson

    This talk covers my current thinking on what I consider the optimal way to work with R on the Google Cloud Platform (GCP). It seems this has developed into my niche, and I get questions about it so would like to be able to point to a URL.

    Both R and the GCP evolve rapidly, so I guess this will have to be updated at some point in the future, but even as things stand now you can do some wonderful things with R, and can multiply those out to potentially billions of users with GCP. The only limit is ambition.

    The common scenarios I want to cover are scaling a standalone R script, and scaling Shiny apps and R APIs.

  • Introduction to Artificial Neural Networks in R using Keras/TensorFlow

    University of Copenhagen, CSS, room 1.1.18

    Introduction
    This hands-on workshop will be given by Leon Eyrich Jessen. Leon is an assistant professor of bioinformatics in the immunoinformatics and machine learning group, at the section of bioinformatics at DTU Health Tech.
    At the workshop, you will be introduced to some of the underlying theory of artificial neural networks (ANNs), how an ANN model is created and we will discuss some of the pitfalls. Finally, you will get to implement, train, tune and apply an ANN based predictor.

    Background
    In order for a technology to have a societal impact, it needs to be accessible. The Deep Learning hype we are currently experiencing is partially due to the release of open source computational frameworks like TensorFlow (TF). Deep Learning has numerous applications for complex pattern recognition within virtually all branches of industry, e.g. customer churn, self-driving cars, cancer diagnostics, market price forecasting, molecular interactions, etc. The aim of this workshop is to empower you to use the TF technology.
    The backbone of the Deep Learning revolution is artificial neural networks (ANNs). Historically, ANNs were available to those with the skills to implement ANN algorithms or to compile existing code. TensorFlow, a general computational framework by Google, was only available in Python. However, in 2018 at rstudio::conf in San Diego, RStudio CEO JJ Allaire announced that henceforth Keras and TF will be fully supported in R. JJ Allaire’s presentation included examples of invited blog posts from the official RStudio website, one of which was written by Leon.

    Level: Beginner

    Prerequisites: Workshop participants should bring their own laptop, with the latest versions of R, RStudio and the R-packages ‘tidyverse’ and ‘keras’ installed. Alternatively, a free cloud account is available at https://rstudio.cloud.

    Note that the event is a 3-hour code-along feast!

  • Cleaning up the data cleaning process + predicting Danish election outcomes

    University of Copenhagen, CSS, room 1.1.18

    Here's a little Christmas present: We're rebooting meetings in the CopenhagenR useRs group in the new year. We'll start with two very nice talks:

    First talk by *Anne Helby Petersen*:
    **Cleaning up the data cleaning process with the dataMaid package**

    Data cleaning and data validation are the first steps in practically any data analysis, as the validity of the conclusions from the analysis hinges on the quality of the input data. Mistakes in the data can arise for any number of reasons, including erroneous codings, malfunctioning measurement equipment, and inconsistent data generation manuals. However, data cleaning is in itself often a messy endeavor with little structure, direction or documentation – and worst of all: it is both tedious and time consuming. I will present an R package, dataMaid, that may not make the process less dull, but hopefully a lot quicker. We wrote the dataMaid package in order to 1) spend more time on data analysis (fun) and less time on data validation (boring) by automating some of the validation steps that come up most often; 2) help document the data at all the different stages of the cleaning process; 3) make it easy to produce a document that non-R-savvy collaborators can read, understand and use to decide “do these data look right?”. The dataMaid package includes both very user-friendly one-liner commands that auto-generate data overview reports and a highly customizable suite of data validation and documentation tools that can be molded to fit most data validation needs. And, perhaps most importantly, it was specifically built to make sure that documentation and validation go hand in hand, so we can clean up the mess that is an unstructured data cleaning process. Isn’t that neat?

    Second talk by *Mikkel Krogsholm*:
    **And the winner of the next Danish election is …**

    2019 is around the corner and that means that it is election season in Denmark. In this talk I will play around with Danish polling data and show you how to predict who will be Denmark's next prime minister.

    I will discuss some methods used to create a poll of polls in order to make more robust forecasts, and different approaches to estimating uncertainty in polls.

    ---

    We currently have no sponsors for food and drink so if you know of anyone to sponsor a bunch of pizzas and drinks then let us know.

  • Multi-state Churn Analysis + Unf*ck your code

    ITU, IT University, Auditorium 3

    • What we'll do
    Title: Multi-state churn analysis with a subscription product

    Subscriptions are no longer just for newspapers. The consumer product landscape, particularly among e-commerce firms, includes a bevy of subscription-based business models. Internet and mobile phone subscriptions are now commonplace and joining the ranks are dietary supplements, meals, clothing, cosmetics and personal grooming products.

    Standard metrics to diagnose a healthy consumer-brand relationship typically include customer purchase frequency and ultimately, retention of the customer demonstrated by regular purchases. If a brand notices that a customer isn’t purchasing, it may consider targeting the customer with discount offers or deploying a tailored messaging campaign in the hope that the customer will return and not “churn”.

    The churn diagnosis, however, becomes more complicated for subscription-based products, many of which offer multiple delivery frequencies and the ability to pause a subscription. Brands with subscription-based products need to have some reliable measure of churn propensity so they can further isolate the factors that lead to churn and preemptively identify at-risk customers.

    During the presentation I’ll show how to analyze churn propensity for products with multiple states, such as different subscription cadences or a paused subscription. If time allows, I’ll also present useful plots that provide deep insights during such modeling and that we have developed at Gradient Metrics, a quantitative marketing agency (http://gradientmetrics.com/).

    Bio:
    Marcin Kosiński has a master's degree with a specialty in Mathematical Statistics and Data Analysis. Community events host: co-organizer of the 1600+ member R Enthusiasts meetups in Warsaw and the main organizer of the Polish R Users Conference 2017 (‘Why R? 2017’ whyr.pl). Interested in R package development and survival analysis models. Currently explores and improves methods for quantitative marketing analyses and global surveys at Gradient Metrics.

    Title: Unf*ck your code

    This talk is about how to unf*ck your code. And by unf*cking I mean making sure that it works every time, under every condition and is written in a way that makes sense to you and to others. Because if it doesn’t, then your code is f*cked.

    If you are a researcher then it means doing reproducible research. If you work in business it means writing production ready code. And if you are just writing code alone in the dark it means writing code your future self will understand.

    This talk is about coding styles, comments, documentation, packaging, tests and Docker. This talk aims at making good programmers out of good data scientists.

    Bio:
    if(Mikkel && R){
    message("Amazing!")
    }

    News:
    - R-Ladies are in Copenhagen now! Check out their group: https://www.meetup.com/rladies-copenhagen
