
Streaming for Personalization Datasets at Netflix

  • AMD Auditorium

    One AMD Place, Sunnyvale, CA

  • Abstract:

    Streaming applications have historically been complex to design and implement because of the significant infrastructure investment they require. However, recent active development across various streaming platforms has eased the transition to stream processing and lets analytics applications and experiments consume near-real-time data without massive development cycles. In this session, we will present our experience with stream processing of unbounded datasets in the personalization space. The datasets included, but were not limited to, the stream of playback events (all of Netflix’s plays worldwide) that is used as feedback for all personalization algorithms. These datasets, when ultimately consumed by our machine learning models, directly affect the customer’s personalized experience, which means that the impact is high and the tolerance for failure is low. We’ll talk about the experiments we ran to compare Apache Spark and Apache Flink, the impact we had on our customers, and (most importantly) the challenges we faced.

    Speaker bio:

    Shriya is an engineer on the personalization analytics team at Netflix. She has been writing a framework on top of Spark batch processing that provides a generic way of producing the various datasets required by our machine learning algorithms. She is also now exploring streaming as a mechanism to provide data that is as accurate as batch but is updated more frequently, so that the personalized experience of Netflix users can be refreshed more than once a day.


    6:00 pm -- 6:30 pm light dinner + networking

    6:30 pm -- 6:35 pm introduction

    6:35 pm -- 7:40 pm main talk + Q&A

    7:40 pm -- 8:00 pm networking

    8:00 pm -- 8:30 pm closing

    8:30 pm -- office closed


Our Sponsors

  • Alpine Data Labs

    Organizing, venue, security, food, drinks, and travel expenses

  • Yelp

    Venue, food and drinks, security, and video recording

  • BlackRock

    Venue, video recording, food and drinks

  • AppDynamics

    Venue and food

  • GoPro

    Venue, food and drinks

  • Netflix

    Event organizing and venue; food and drinks

  • Paxata

    Venue, food, video recording
