Why Is All of My Data Stored in Parquet? (A Deep Dive into the Data Lake)

Hosted By
Aaron S.

Details

This is a great opportunity to build community and relationships and to learn about new technology!

SCHEDULE

5:30pm - Arrival/Registration + Food/Drinks + Networking

6:00pm - Speaker/Presentation

7:15pm - Food/Drinks + Networking

8:00pm - End

FOOD/DRINKS

Food and drinks will be provided! Please contact us if you have any dietary restrictions/food allergies.

DIRECTIONS AND PARKING INSTRUCTIONS

The meeting will be held in the offices of SmartDraw Software, 1780 Hughes Landing Blvd #1100, on the 11th floor.

Enter through the main doors of the building and take the elevator to the 11th floor; the door to SmartDraw's suite is in the elevator lobby on the 11th floor.

If you arrive at 1780 Hughes Landing after 6pm, a member of the SmartDraw team will help you enter the building and then travel up to the 11th floor.

Parking:

Park in the garage directly opposite the building. You'll need to take a ticket to enter, but there is no fee to leave; parking is free.

Speaker & Talk

Chris Bremer

Chris is the software simulation manager at NOV in Conroe, TX, where he manages a team of engineers and data scientists.

Chris will be discussing the following:

Why Is All of My Data Stored in Parquet? (A Deep Dive into the Data Lake)

Apache Parquet is a data format for efficient storage and retrieval of massive quantities of data that has been popularized by widespread adoption of the "data lake" architecture. The title of this talk is a question I asked of a coworker (a data engineer) in the process of consolidating our reporting and inference models onto a new analytics platform. The short answer, "because Databricks", didn't satisfy me. I was accustomed to traditional SQL, executing ad hoc queries from any language or platform I chose.

The long answer comes from a months-long investigation into how to read and write Parquet files from my favorite languages (C# and F#) and a growing appreciation of the format's flexibility. I hope to convey not just the Parquet specification, but also the "how" of parsing and writing Parquet and the "why" behind this data format's popularity. I want to spark a discussion on the value of open-source data formats and how to avoid vendor lock-in for data access.

P.S. We're looking for speakers for 2025!

Do you have insights, projects, or expertise you'd like to share with the NHDNUG community? Whether you're an experienced speaker or presenting for the first time, we'd love to feature you in one of our upcoming meetups!

If you're interested in speaking at one of our 2025 meetups (held the 3rd Thursday of each month), email Shelby at shelby.franklin@petabridge.com to discuss your topic ideas. Thank you!

North Houston .Net Users Group
1780 Hughes Landing Blvd #1100 · The Woodlands, TX