Building an ETL pipeline from scratch in 30 mins


Details
Agenda:
6:00 - 6:30 pm: Mingle, food, announcements
6:30 - 8:00 pm: Talks, hands-on demo, Q&A
To expedite check-in at Galvanize, register here (https://www.eventbrite.com/e/sf-data-science-building-from-scratch-an-etl-pipeline-in-30-mins-tickets-23007683601).
Talk 1: Building an ETL pipeline from scratch in 30 mins
In this talk we will build a simple batch pipeline that ingests data from text files into a BigQuery table. The talk will be a live coding demo, with concepts introduced as we go along. No previous big data background is needed, although people familiar with Spark or Flink will easily recognize the similar concepts. As always with big data frameworks, the same code can scale from GB files to TB files without missing a beat and with essentially zero knobs to tune.
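For readers who want a feel for the shape of such a pipeline before the demo, here is a minimal sketch in plain Python. It is not the code from the talk and does not use Dataflow or BigQuery: the function names (extract, transform, load, run_pipeline), the two-column "name,count" input format, and the in-memory list standing in for a BigQuery table are all assumptions made for illustration. In an Apache Beam pipeline the read and write stages would typically be handled by transforms such as ReadFromText and WriteToBigQuery rather than hand-written functions.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def extract(lines: Iterable[str]) -> Iterator[List[str]]:
    """Extract: parse raw text lines into lists of comma-separated fields."""
    yield from csv.reader(lines)


def transform(records: Iterable[List[str]]) -> Iterator[Dict[str, object]]:
    """Transform: drop malformed records and shape the rest into table rows."""
    for record in records:
        if len(record) != 2:
            continue  # wrong number of fields (hypothetical 2-column format)
        name, count = record
        try:
            yield {"name": name.strip(), "count": int(count)}
        except ValueError:
            continue  # count is not an integer


def load(rows: Iterable[Dict[str, object]], table: List[Dict[str, object]]) -> int:
    """Load: append rows to the sink (a list standing in for a BigQuery table)."""
    written = 0
    for row in rows:
        table.append(row)
        written += 1
    return written


def run_pipeline(lines: Iterable[str], table: List[Dict[str, object]]) -> int:
    """Chain the three stages lazily, one record at a time."""
    return load(transform(extract(lines)), table)


if __name__ == "__main__":
    sample = io.StringIO("alice,3\nbob,5\nnot a valid line\ncarol,x\n")
    table: List[Dict[str, object]] = []
    written = run_pipeline(sample, table)
    print(written, table)
```

Because each stage is a generator, records stream through one at a time instead of being held in memory all at once, which is a small-scale analogue of how batch frameworks process files of very different sizes with the same code.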
What to Bring:
Live coding demo. Feel free to bring your laptop.
Meet the Speaker:
Silviu Calinoiu is the technical lead for Cloud Dataflow for Python at Google. His team works hard to catch up with the more advanced Java version, which is now an Apache incubation project called Apache Beam. Before this project he worked for two years in the Google App Engine group, and before that he spent an unmentionable number of years in the Microsoft Windows kernel group developing various tools to detect and analyze memory corruptions, deadlocks, and other exotic bugs.
Talk 2: TBD
Second speaker confirmation and details coming soon.
Register here (https://www.eventbrite.com/e/sf-data-science-building-from-scratch-an-etl-pipeline-in-30-mins-tickets-23007683601) to expedite your check-in at Galvanize.
Thanks to Friends / Sponsors:
This event is in collaboration with our friends from the SF Data Engineering Meetup (https://www.meetup.com/SF-Data-Engineering/).
Special thanks to Galvanize (http://www.galvanize.com/courses/data-science/) for hosting this Meetup. Galvanize offers an immersive data science bootcamp (http://www.galvanize.com/courses/data-science/) that trains analysts and programmers to be job-ready data scientists in 12 weeks. Learn about Galvanize's data science training here (http://www.galvanize.com/courses/data-science/).
