Efficiently Engineering Bigger Data with Arrow

Hosted by Jared Lander

Details

We are back to virtual-only this month so that Nic Crane can join us from overseas to talk about Arrow.

Questions are encouraged in the monthly-meetup-chat channel in the nyhackr Slack. Likewise, please post job listings in the job-postings channel.

After the talk, we will give away free tickets to the R in Government Conference, taking place October 18-20. Those who don't win can get a 20% discount with code nyhackr.

Thank you EcoHealth Alliance for providing the Zoom link.

About the Talk:
Data analysis pipelines with larger-than-memory data are becoming more and more commonplace. The lines between data science and data engineering are often blurred, and knowing a bit of both is a sure-fire way to make your life easier when working with big datasets. In this talk, I will give an overview of the arrow R package and best practices for working with bigger datasets. I'll demo the dplyr interface to arrow and share tips and tricks for getting the most out of arrow's functionality, as well as for applying data engineering principles to speed things up even more.

About Nic:
Nic Crane is a software engineer with a background in data science. Nic is passionate about open source and about learning and teaching all things R, and is the maintainer of the arrow R package.

The talk will begin at 7 PM America/New_York, and we will start admitting people to the event shortly before. Since this is completely remote, there will be no pizza, but everyone is encouraged to have pizza individually.

New York Open Statistical Programming Meetup