Big data with AWS Glue and Athena for understanding NYC taxi data

Hosted by daghan acay

Details

In this workshop, we will learn how to catalog datasets using AWS Glue crawlers. We will interactively author ETL scripts in a SageMaker notebook from our local machine while connected to an AWS Glue development endpoint. We will then deploy the ETL scripts into production by turning them into managed AWS Glue jobs and adding appropriate AWS Glue scheduling and triggering conditions. Finally, we will query the new datasets from Amazon Athena using the AWS Glue Data Catalog.
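
As a rough preview of the first and last steps, here is a minimal sketch using boto3, assuming AWS credentials are already configured. The crawler name, IAM role ARN, database name, S3 paths and SQL are hypothetical placeholders, not the workshop's actual resources. The request dictionaries are built separately so they can be inspected before any AWS call is made.

```python
import time


def crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for glue.create_crawler (all values are placeholders)."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def athena_request(sql, database, output_location):
    """Build keyword arguments for athena.start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def catalog_then_query(crawler_kwargs, athena_kwargs, poll_seconds=15):
    """Create and run a crawler, wait for it to finish, then start an Athena query."""
    import boto3  # imported here so the builders above work without boto3 installed

    glue = boto3.client("glue")
    athena = boto3.client("athena")

    glue.create_crawler(**crawler_kwargs)
    glue.start_crawler(Name=crawler_kwargs["Name"])

    # The crawler runs asynchronously; it returns to the READY state when done.
    while glue.get_crawler(Name=crawler_kwargs["Name"])["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)

    response = athena.start_query_execution(**athena_kwargs)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    crawler = crawler_request(
        "nyc-taxi-crawler",
        "arn:aws:iam::123456789012:role/ExampleGlueRole",
        "nyctaxi",
        "s3://example-bucket/nyc-taxi/",
    )
    query = athena_request(
        "SELECT COUNT(*) FROM trips",
        "nyctaxi",
        "s3://example-bucket/athena-results/",
    )
    print(catalog_then_query(crawler, query))
```

The ETL authoring and job scheduling steps in between are covered in the workshop itself and are not shown here.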

Estimated Athena cost (https://aws.amazon.com/athena/pricing/): about 6 cents for 10 queries
Estimated AWS Glue cost (https://aws.amazon.com/glue/pricing/): about 50 cents for one hour of execution
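
To see roughly where figures like these come from, here is a small sketch of the arithmetic. The default rates (5 USD per TB scanned for Athena, 0.44 USD per DPU-hour for Glue) are assumptions that vary by region and change over time, so check the pricing pages above. The scanned volume and DPU count in the example are illustrative values, not measurements from the workshop.

```python
def athena_cost(tb_scanned, price_per_tb=5.0):
    """Athena bills by data scanned; price_per_tb is an assumed rate in USD."""
    return tb_scanned * price_per_tb


def glue_cost(dpus, hours, price_per_dpu_hour=0.44):
    """Glue bills by DPU-hours; price_per_dpu_hour is an assumed rate in USD."""
    return dpus * hours * price_per_dpu_hour


if __name__ == "__main__":
    # Illustrative inputs: 10 queries scanning 0.0012 TB each, and 1 DPU for 1 hour.
    print(f"Athena: ${athena_cost(10 * 0.0012):.2f}")
    print(f"Glue:   ${glue_cost(dpus=1, hours=1.0):.2f}")
```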

# Prerequisites

We will use AWS Cloud9. Alternatively, you can set up your local machine. This is a longer process and is explained at https://blog.programming-tools-meetup.cloud/dev-machine-setup/

We would like to thank Vanguard for the venue and the catering.

This workshop is a modified version of the AWS re:Invent 2018 talk https://www.slideshare.net/AmazonWebServices/serverless-data-prep-with-aws-glue-ant313-aws-reinvent-2018

Melbourne AWS Programming and Tools Meetup
Vanguard Australia
Freshwater Place, 2 Southbank Blvd · Southbank, al