Details

The job of a data engineer is to build, manage, and optimize systems for transforming data into forms that facilitate analysis. Despite the broad adoption of R as a language for data science, it has taken a back seat to Python and other languages in the area of data engineering. But this is beginning to change. Data engineering tasks that were previously infeasible in R are becoming straightforward thanks to recent developments in the Apache Arrow project and the R package `arrow`. Arrow provides tools for working with tabular data that emphasize performance, efficiency, standardization, and interoperability with other languages and systems in the broader data ecosystem. Using the R package `arrow`, it is now possible to implement many data engineering and ETL tasks entirely in R, avoiding the overhead of switching to another language like Python or using a framework like Spark.

All skill levels are welcome.

Agenda:
6:30pm - 6:40pm Introductions
6:40pm - 7:20pm Topic Presentation
7:20pm - 7:30pm Closing Remarks

(Topic presentations sometimes run longer than 40 minutes)

This meetup will be 100% virtual! Check the "Location" section of the web page for the Zoom Meeting link.

Support graciously provided by the R Consortium (https://www.r-consortium.com) and Onebridge (https://www.onebridge.tech/)

Sponsors

R Consortium

R Consortium is sponsoring the IndyUseR group.
