Intro to Data Base Columnation with Apache Parquet

Hosted By
Scott C. and 4 others
Details

Fellow Experimenters:

I'm very happy to announce that Ryan Blue (https://www.linkedin.com/in/rdblue) of Cloudera (http://www.cloudera.com/content/cloudera/en/home.html) has agreed to explain to us Columnar Data Stores and Apache Parquet (https://parquet.apache.org/). Let's give him and Cloudera a big thank you.

In Ryan's words: "Apache Parquet! In this talk, I'll give an overview of the file format and columnar storage in general. Then, we'll cover some practical use cases and what you need to change in your applications to avoid problems and get good performance out of the format."

You can follow him on Twitter (https://twitter.com/6d352b5d3028e4b) and see his lovely source code on GitHub: http://github.com/rdblue

Make sure to say thank you when the opportunity presents itself.

As usual, we will be in the fabulous facilities of Improving Enterprises (http://www.improvingenterprises.com/) to enjoy pizza, soda, and beer. Thank you again, IE. Make sure you tell them thank you, too.

Don't forget that we are still looking for logos. So far, Don has been burning a streak with submissions.

Final note: I will most likely be livestreaming this event on Periscope from my Twitter (https://twitter.com/scottccote) account (I haven't yet figured out how to have two Periscope accounts). I will reference @dfwdatascience so the stream is easy to find.

Bring your thinking caps and your chatterboxes - you will need both :)

Sincerely,

Scott

DFW Data Science
Improving
5445 Legacy Dr, Plano, TX, Suite 100 · Frisco, TX