Advanced Spark Meetup

Hosted by Brian Husted

Details

Overview

Please join us for a Spark meetup you won't soon forget! This free (as always) "meetup-turned-mini-conference" covers many aspects of Apache Spark and is presented by Chris Fregly (https://www.linkedin.com/in/cfregly).

Chris is a Principal Data Solutions Engineer at the newly formed IBM Spark Technology Center, an Apache Spark contributor, and a Netflix Open Source committer. Chris is also the founder of the global Advanced Apache Spark Meetup and author of the upcoming book Advanced Spark (http://advancedspark.com/). Previously, Chris was a Data Solutions Engineer at Databricks and a Streaming Data Engineer at Netflix.

Please note: this is a privately funded event, and recruiting is NOT allowed.

Meetup Agenda:

5:00 – 6:00 – Live DJ, Networking, Happy Hour, Pizza

6:00 – 8:30 – Interactive Spark lecture and exercises

Presented by Chris Fregly

Title:

Real-time, Advanced Analytics and Recommendations using Machine Learning, Natural Language Processing, Graph Processing, and Approximations with Apache Spark, Stanford CoreNLP, and Twitter Algebird

Agenda

Intro

Live, Interactive Recommendations Demo
    Spark ML, GraphX, Streaming, Kafka, Cassandra, Docker

Types of Similarity
    Euclidean vs. Non-Euclidean Similarity
    User-to-User Similarity
    Content-based, Item-to-Item Similarity (Amazon)
    Collaborative-based, User-to-Item Similarity (Netflix)
    Graph-based, Item-to-Item Similarity Pathway (Spotify)

Similarity Approximations at Scale
    Twitter Algebird
    MinHash and Bucketing
    Locality Sensitive Hashing (LSH)

Netflix Recommendations: From Ratings to Real-Time
    DVD-Ratings-based $1M Netflix Prize (2009)
    Streaming-based "Trending Now" (2016)

Q & A

Related Links

https://github.com/fluxcapacitor/pipeline/wiki

http://cdn.oreillystatic.com/en/assets/1/event/105/Algebra%20for%20Scalable%20Analytics%20Presentation.pdf

http://static.echonest.com/BoilTheFrog/

http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf

http://blog.echen.me/2011/10/24/winning-the-netflix-prize-a-summary/

http://www.cc.gatech.edu/~zha/CSE8801/CF/kdd-fp074-koren.pdf

Distributed Computing Maryland
Jailbreak Brewing Company
9445 Washington Blvd N Ste F · Laurel, MD