Princeton Tech Meetup #27 w/ Gilt Groupe (Big Data)

Hosted By
Venu M.
Details

Topic: Big Data - How Gilt Manages Real-time Data Capturing with Kafka, Avro and Hadoop/Hive

Speaker:

Michael Hansen - Principal Data Engineer - Gilt Groupe

Agenda:

7:00 - Arrival - Snacks, Pizza and Networking.

7:30 - Introduction / Announcements by the organizers.

7:35 - 7:50 - Demos / Pitches

Demo 1 - SuperDealyo - Sach Kangovi

Demo 2 - Gruberie - Sven Hermann

Demo 3 - Outdoor Exchange (OX) - Dariusz Jamiolkowski

7:50 - How Gilt Manages Real-time Data Capturing with Kafka, Avro and Hadoop/Hive - Michael Hansen - Principal Data Engineer - Gilt Groupe

8:45 - Open-mic to quickly promote your business or broadcast a need that someone in the group might be able to fill.

8:55 - Wrap-Up, discussion of Meetup, feedback and opportunities for improvement or future topics.

8:59 - End of formal part of meeting.

9:00 - Exit Venue and head to After Hours Party - Location: TBA

More about this Event:

SuperDealyo: Presentation and demo by the SuperDealyo team of a unique location-based, shopping-list-driven platform bringing the lowest prices to YOUR fingertips.

Gruberie: Your one-stop gateway to great food deals.

Outdoor Exchange (OX): A trusted, community-based platform where supply and demand for outdoor gear rentals meet.

Michael Hansen - Principal Data Engineer - Gilt Groupe

Large-scale, real-time (or near-real-time) data capture of clickstream and messaging events has become much more practical with the combination of Kafka and Hadoop. However, without a backward-compatible data structure for these events, a lot of unnecessary transformation and formatting work is left to the data consumers. This is where a data serialization system such as Protocol Buffers or Apache Avro, or a framework such as Apache Thrift, can come to the rescue. This talk will focus on how Gilt uses the trio of Kafka, Avro, and Hadoop/Hive to manage and control the data structure of real-time events passed into HDFS/Hive and/or consumed by other web services.
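To give a feel for what "backward compatible" means here, the following is a minimal sketch of Avro-style schema resolution: a reader using a newer schema can still consume an event written with an older one, as long as every added field carries a default. It is an illustration only. The PageView event, its fields, and the resolve helper are hypothetical, plain JSON stands in for Avro's binary encoding, and the real Avro library is not used. It does not represent Gilt's actual schemas or pipeline.

```python
import json

# Hypothetical "v1" writer schema for a clickstream event (Avro-style JSON).
WRITER_SCHEMA_V1 = {
    "type": "record",
    "name": "PageView",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "url", "type": "string"},
    ],
}

# Hypothetical "v2" reader schema: it adds a field WITH a default,
# which is what keeps the change backward compatible.
READER_SCHEMA_V2 = {
    "type": "record",
    "name": "PageView",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "url", "type": "string"},
        {"name": "referrer", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record, writer_schema, reader_schema):
    """Project a record written with writer_schema onto reader_schema.

    Simplified version of the Avro resolution rule for records:
    - fields present in both schemas are copied,
    - fields only in the reader schema take their default,
    - fields only in the writer schema are ignored.
    """
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(
                f"reader field {name!r} has no default; schemas are incompatible"
            )
    return resolved


# An event captured before the schema change (JSON stands in for Avro binary).
old_event = json.loads('{"user_id": "u123", "url": "/sale/42"}')
new_view = resolve(old_event, WRITER_SCHEMA_V1, READER_SCHEMA_V2)
print(new_view)
```

With a rule like this enforced at the schema level, downstream consumers such as Hive tables or other services would not need ad hoc reformatting code each time an event gains a field.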
