• Making Extra Great Slides with xaringan, xaringanthemer and xaringanExtra

    This month we look at the xaringan presentation ecosystem.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    Conversations during the meetup are encouraged in the monthly-meetup-chat channel in the nyhackr slack.

    About the Talk:
    The xaringan package by Yihui Xie lets R users and R Markdown authors easily blend data, text, plots and htmlwidgets into beautiful HTML presentations that look great on the web, in print, and on screens. In addition to demonstrating how to go from R Markdown to web-based slides with xaringan, in this talk I'll show you how to completely customize the appearance of your slides with xaringanthemer, a package that lets you quickly create a complete slide theme from only a few color choices. Then we'll go beyond appearances with a variety of addins and extensions from the xaringanExtra package, including: a tiled slide overview, editable slides, embedded webcam videos, tabbed panels, extra styles, shareable and embeddable slides, animations, and real time slide broadcasting.

    About Garrick:
    Garrick Aden-Buie is a Data Science Educator at RStudio and lives in sunny St. Petersburg, Florida. His passion is to combine creative coding with programming education, using code to build tools that teach coding to new and advanced R users alike. One example is tidyexplain, a project that used ggplot2 and gganimate to reimagine database operations as colorful flying boxes instead of the typical Venn diagrams. Garrick has developed a number of open source addins and packages for RStudio, such as regexplain, shrtcts and rsthemes, and is always easily distracted by projects that combine R Markdown and online learning or teaching.

    The talk will begin at 7 PM EST and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Ballista: Distributed Compute with Rust and Apache Arrow

    This month we have a talk about Arrow and Rust for distributed computing.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    Conversations during the meetup are encouraged in the Zoom chat and on the nyhackr slack.

    About the Talk:
    Andy will give an overview of Apache Arrow and the DataFusion query engine and explain how he is using these building blocks to implement Ballista, which is a distributed compute platform.

    About Andy:
    Andy Grove is a PMC member of Apache Arrow, where he donated the initial Rust implementation, and later donated DataFusion, an in-memory SQL/DataFrame query engine.

    The talk will begin at 7 PM EST and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Torch for R

    Online event

    The week after rstudio::global(2021) we have a talk about the latest deep learning framework in R, {torch}.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    Conversations during the meetup are encouraged in the Zoom chat and on the nyhackr slack.

    About the Talk:
    In this talk we are going to discuss the implementation of torch, introduce its main components and show some exciting new features that we plan to implement in 2021. We will also discuss the ecosystem that we want to build around the torch project.

    About Daniel:
    Daniel is a software engineer at RStudio and co-author of the torch package. He is also the maintainer of the TensorFlow for R project.

    The talk will begin at 7 PM EST and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Reproducible Computation at Scale in R with Targets

    Online event

    Following up our earlier meetup about drake, we have its creator talking about its successor, targets.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    This meetup is the kickoff to the second 2020 R week with The R Conference for Government & Public Sector on December 2-4. Much like our recent NYR, this is virtual, so anyone around the world can attend. Visit https://rstats.ai/gov/ to learn more and use code nyhackr for a 20% discount on tickets, including workshops.

    About the Talk:
    Ambitious workflows in R, such as machine learning analyses, can be difficult to manage. A single round of computation can take several hours to complete, and routine updates to the code and data tend to invalidate hard-earned results. You can enhance the maintainability, hygiene, speed, scale, and reproducibility of such projects with the targets R package. targets resolves the dependency structure of your analysis pipeline, skips tasks that are already up to date, executes the rest with optional distributed computing, and manages data storage for you. It surpasses the permanent limitations of its predecessor, drake, and provides increased efficiency and a smoother user experience. This talk demonstrates how to create and maintain a Bayesian model validation project using targets-powered automation.

    About Will:
    Will is a statistician and software developer. He likes to solve scientific problems and engage with data analysis technologies.

    The talk will begin at 7 PM EST and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Future - Simple, Friendly Parallel Processing for R

    Online event

    This month's speaker comes to us from the R Foundation and R Consortium to talk about parallelism in R.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    The Government & Public Sector R Conference is coming December 2-4, 2020. Much like our recent NYR, this is virtual, so anyone around the world can attend. Visit https://rstats.ai/gov/ to learn more and use code nyhackr for a 20% discount on tickets.

    About the Talk:
    The ‘future’ package provides a minimal and unifying framework for asynchronous, parallel, and distributed computing in R. It is being used to run R code in parallel on the local computer, on remote machines, in the cloud, and on high-performance computing. Popular packages such as ‘shiny’, ‘plumber’, and ‘drake’ use futures internally. I will explain what futures are, discuss common feature requests, recent progress, and what is on the roadmap.

    About Henrik:
    Henrik Bengtsson is the author of over 30 R packages on CRAN and Bioconductor, e.g. 'future', 'matrixStats', 'progressr', and 'startup'. His research is on statistics and bioinformatics with an emphasis on high quality, reproducible method development, sustainable implementations, and large-scale processing. He is an Associate Professor in the Department of Epidemiology & Biostatistics at the University of California, San Francisco (UCSF), affiliated with the UCSF Helen Diller Family Comprehensive Cancer Center, and a member of the R Foundation and the R Consortium Infrastructure Steering Committee.

    The talk will begin at 7 PM EST and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Data Validation in R: From Principles to Tools and Packages

    Continuing to bring speakers from around the world, we have Caterina Constantinescu from Scotland talking about data validation.

    Please note the 6 PM start time.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    We just launched our government-focused R conference and much like our recent NYR, this is virtual, so anyone around the world can attend. Visit https://rstats.ai/gov/ to learn more and use code nyhackr for a 20% discount on tickets.

    About the Talk:
    Although data cleaning is a frequent topic of conversation (and commiseration) in the world of data science, data validation, somewhat surprisingly, is discussed less often. So in this talk, data validation will take centre stage, as we take a look at what it is (and is not), as well as some guiding principles, best practices and overall criteria to assess and ensure data validity. The talk will also cover several R packages aimed at this precise topic, for instance {validate}, {assertr} and {ensurer}, as well as other related packages or functions, with examples provided as we go along. By the end of this talk, the aim is to have provided an overview of the principles and tools in this area, while highlighting the importance of the topic itself.

    About Caterina:
    Dr. Caterina Constantinescu is a data scientist working at Tesco Bank, whose past work ranges across areas such as research methods, national health data, occupational therapy, transport and data for good. Her academic background prior to this involved researching if various emotion-generating stimuli used in lab settings could approximate emotional states occurring in daily life. For several years she was also the organiser of the R meetup in Edinburgh (EdinbR), followed by organising the DataTech conference in 2019. Currently, her work focuses on writing Shiny apps that support data-driven decision-making across the bank.

    The talk will begin at 6 and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Creating a Command Line Focused Development Environment

    Continuing our virtual meetups we have Nick Janetakis coming to us from the far end of Long Island.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    About the Talk:
    Configuring your terminal and learning how to use the command line from scratch can be a daunting task with a pretty high initial time investment. In this talk we're going to cover what makes the command line useful, practical workflows (live demos), picking a terminal and shell, customizing your prompt, using and configuring tools like tmux and Vim, the concept of dotfiles and, if time permits, using various Unix tools together to solve real-world problems. By the end of the talk you'll be well on your way to having a tricked-out, command-line-focused development environment.

    About Nick:
    Nick Janetakis is a developer and sysadmin. He is an independent freelancer and course creator, and he hosts the Running in Production podcast. Nick also has hundreds of blog posts and YouTube videos focused on building and deploying web apps along with development environment tips and tricks. His website is at https://nickjanetakis.com/ and he can be found on Twitter at @nickjanetakis.

    The talk will begin at 7 and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Virtual Hanging Out During COVID

    Online event

    While the focus for the meetup is the fantastic talks, another key part is the socializing and community building. So many great conversations took place at Rye House, Central Bar and other locales. While we wait for the days when we can go to the bar after a talk we can at least hang out through cameras and microphones. This worked really well after the conference so let's try it again.

    Bring your favorite beverage (alcoholic or not) and your best pizza (or other food) and join us on Zoom. We'll have a main session and can create breakout rooms for smaller conversations.

    Thank you to EcoHealth Alliance for providing the Zoom link. The password will be emailed ahead of the meetup to people who RSVP yes.

  • That Feeling of Workflowing

    Online event

    We are staying virtual to kick off R Week, with Miles McBain coming to us from Australia, following in the footsteps of Rob Hyndman, Earo Wang and Di Cook.

    Thank you to EcoHealth Alliance for providing the Zoom link.

    This is the first event of the week, with our Sixth Annual R Conference and Workshops taking place that Wednesday through Saturday, 12th-15th. The workshops cover machine learning in both R and Python, web scraping, shiny, GIS, git and the tidyverse. Speakers include Andrew Gelman, Emily Dodwell, Jon Krohn, Emily Robinson, Wes McKinney, Ludmila Janda, David Robinson and too many more to list here. Visit https://rstats.ai/nyr/ for more information and use code nyhackr for a 20% discount.

    About the Talk:
    When would you use an R notebook versus a project dependency graph solver? Should everything be an R package? What can your R project workflow optimise for, and which properties of your work can this affect?

    Despite the importance of questions like these to having a happy and productive time with R, there is little community consensus on the principles for establishing R project workflows.

    We will begin the session by examining these questions and mapping out workflow styles served by existing R tooling. In the second half I will demonstrate practical productivity boosters my team and I have implemented along the way to a workflow centered on {drake}.

    About Miles:
    I'm a Data Scientist at Queensland Fire and Emergency Services where I use R to identify and analyse investment options in our emergency services network. Since I started programming in R about 5 years ago I've become an active #rstats tweeter, blogger, and RWeekly editor. I was an organiser of a local meetup group, three rOpensci OzUnconferences, and useR! 2018 Brisbane. I release R packages that do extremely niche things. Everyone's favourite is {datapasta} but {fnmate} is probably the one I get the most personal joy out of. I am into workflows because of their capacity to change the way work feels.

    The talk will begin at 7 and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.

  • Everything Not Tested will Eventually Fail—Testing Shiny Apps Before Production

    Continuing our virtual meetups we have another speaker from far away. This month we have Colin Fay from France.

    Due to COVID-19 the meetup will take place entirely online. Thank you to EcoHealth Alliance for providing the Zoom link.

    Our annual R Conference in New York is also virtual-only. We have an excellent lineup including Max Kuhn, Erin LeDell, Rob Hyndman, Jacqueline Nolis and Andrew Gelman. There are also two days of workshops covering Machine Learning with tidymodels, git, Shiny, web scraping, GIS & Mapping, EDA with the tidyverse and even scikit-learn. Visit https://rstats.ai/nyr/ for more information and use code nyhackr for a 20% discount.

    Since we still can't be together for pizza, Scott's Pizza Tours is sponsoring again and offering everyone 25% off virtual tours booked by July 31. Use code NerdsLovePizza at https://www.scottspizzatours.com/ for the discount.

    About the Talk:
    You've probably already heard that "Everything not tested will eventually fail", or read that "If you're not writing unit tests, your users are the unit tests"... As extreme as they might sound, these quotes reveal one of the most important things when it comes to production software: testing should be done as extensively as possible. Most R developers are already familiar with standard package testing, but things can get a little bit different when it comes to interactive web applications: what do we test, why, and how? In this talk, Colin will start by laying out good practices for testing Shiny apps, and then present some of the tools you can use to test your own applications.

    About Colin:
    Colin Fay works at ThinkR, a French agency focused on everything R-related. During the day, he helps companies take full advantage of the power of R by providing training, tools and infrastructure. His main areas of expertise are data & software engineering, back-end & infrastructure, and R in production. During the night, Colin is also a hyperactive open source developer and an open data advocate. You can find a lot of his work on his GitHub account (https://github.com/ColinFay) and on ThinkR's account (https://github.com/thinkr-open). He is also active in the Data Science community in France, and an international speaker.

    The talk will begin at 6 and we will start admitting people to the event shortly before. Since this is completely remote there will be no pizza but everyone is encouraged to have pizza individually.
