Real-time Aggregation, Approximation, Similarities, and Recommendations at Scale

Hosted By
Prasad S.

Details

Agenda

Live, Interactive Recommendations Demo - NiFi, Kafka, Stanford CoreNLP, Docker, Word2Vec, LDA, Twitter Algebird, Spark Streaming, SQL, ML, GraphX.

Deep Dive (advancedspark.com)

Types of Similarity - Euclidean vs. Non-Euclidean Similarity, Jaccard Similarity, Cosine Similarity, LogLikelihood Similarity, Edit Distance

Text-based Similarities and Analytics - Word2Vec, LDA Topic Extraction, TextRank

Similarity-based Recommendations - User-to-User, Content-based, Item-to-Item (Amazon), Collaborative-based, User-to-Item (Netflix), Graph-based, Item-to-Item "Pathways" (Spotify)

Aggregations, Approximations, and Similarities at Scale - Twitter Algebird, MinHash and Bucketing, Locality Sensitive Hashing (LSH), BloomFilters, CountMin Sketch, HyperLogLog

Q & A
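
For readers new to the agenda topics, the following minimal Python sketch illustrates two of the listed similarity measures (Jaccard and cosine) and how MinHash signatures can approximate Jaccard similarity. It is not taken from the talk or from Twitter Algebird. The function names, the SHA-1 based hash family, and the default of 128 hash functions are assumptions chosen purely for illustration.

```python
import hashlib
import math


def jaccard(a, b):
    """Exact Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets are identical
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_u * norm_v)


def _seeded_hash(seed, item):
    """Hypothetical hash family: SHA-1 of 'seed:item', truncated to 64 bits."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over the set."""
    items = set(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [
        min(_seeded_hash(seed, item) for item in items)
        for seed in range(num_hashes)
    ]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    a = {"spark", "kafka", "nifi", "docker"}
    b = {"spark", "kafka", "graphx", "docker"}
    print("exact Jaccard:   ", jaccard(a, b))
    print("MinHash estimate:", estimate_jaccard(minhash_signature(a),
                                                minhash_signature(b)))
    print("cosine:          ", cosine([1.0, 2.0, 0.0], [2.0, 1.0, 1.0]))
```

The estimate tightens as the number of hash functions grows. Signatures have a fixed length regardless of set size, which is what makes the bucketing and LSH techniques on the agenda practical for large collections.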

Bio

Chris Fregly is a Principal Data Solutions Engineer for the newly formed IBM Spark Technology Center, an Apache Spark Contributor, and a Netflix Open Source Committer.

Chris is also the founder of the global Advanced Apache Spark Meetup and author of the upcoming book, Advanced Spark @ advancedspark.com (http://advancedspark.com/).

Previously, Chris was a Data Solutions Engineer at Databricks and a Streaming Data Engineer at Netflix.

When Chris isn’t contributing to Spark and other open source projects, he’s creating book chapters, slides, and demos to share knowledge with his peers at meetups and conferences throughout the world.

Related Links

https://github.com/fluxcapacitor/pipeline/wiki

http://cdn.oreillystatic.com/en/assets/1/event/105/Algebra%20for%20Scalable%20Analytics%20Presentation.pdf

http://static.echonest.com/BoilTheFrog/

http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf

http://blog.echen.me/2011/10/24/winning-the-netflix-prize-a-summary/

http://www.cc.gatech.edu/~zha/CSE8801/CF/kdd-fp074-koren.pdf

Large Language Models
Washington Road and Ivy Lane, Princeton, NJ 08544 · Princeton, NJ