Apache Arrow: Enabling Data Engineering in R - Ian Cook

Name: Apache Arrow: Enabling Data Engineering in R - Ian Cook
Start: 2021-05-18T18:30:00-04:00
End: 2021-05-18T20:30:00-04:00

Hosted by Anonymous_163479982

IndyUseR Group

Details

The job of a data engineer is to build, manage, and optimize systems for transforming data into forms that facilitate analysis. Despite the broad adoption of R as a language for data science, it has taken a back seat to Python and other languages in the area of data engineering. But this is beginning to change. Data engineering tasks that were previously infeasible in R are becoming straightforward thanks to recent developments in the Apache Arrow project and the R package `arrow`. Arrow provides tools for working with tabular data that emphasize performance, efficiency, standardization, and interoperability with other languages and systems in the broader data ecosystem. Using the R package `arrow`, it is now possible to implement many data engineering and ETL tasks entirely in R, avoiding the overhead of switching to another language like Python or using a framework like Spark.

All skill levels are welcome.

Agenda:
6:30pm - 6:40pm Introductions
6:40pm - 7:20pm Topic Presentation
7:20pm - 7:30pm Closing Remarks

(Topic presentations sometimes run longer than 40 minutes)

This meetup will be 100% virtual! Check the "Location" section of the web page for the Zoom Meeting link.

Support graciously provided by the R Consortium (https://www.r-consortium.com) and Onebridge (https://www.onebridge.tech/)
