1) Netflix’s Personalization Infrastructure, 2) Secondary Index w/ Apache Drill

Name: 1) Netflix’s Personalization Infrastructure, 2) Secondary Index w/ Apache Drill
Start: 2018-11-14T18:30:00-08:00
End: 2018-11-14T20:30:00-08:00
Location: MapR Technologies

Hosted by Pritesh and Ellen F.

Bay Area Apache Drill User Group

Details

We’ve restarted the Bay Area Apache Drill User Group meetup!

Join us on Nov 14th for an evening of presentations and discussions on what’s new with Apache Drill, how users put it to work and what’s next.

We’ll have speakers, networking, food and, of course, Drillers!

We will also have a WebEx session - see the bottom of the description for WebEx details.

Talk #1: Netflix's Personalization Infrastructure

Netflix uses machine learning algorithms and A/B tests to drive all of the content recommendations for our members. To improve the quality of personalized recommendations, we need historical fact data for each user, which is then used to generate the features required by the machine learning models. Building a fact store at Netflix's ever-evolving scale is non-trivial. We use Spark and Scala extensively, along with a variety of compression techniques, to store and retrieve data efficiently. Netflix is experimenting with Apache Drill to determine how it could help speed up some of the query patterns we care about.

Speaker Bio:
Nitin Sharma works on the Personalization Infrastructure team at Netflix. His primary focus is building distributed infrastructure at Netflix to enable personalized recommendations at scale. He is passionate about large-scale distributed systems, search platforms, and performance optimization. He is an active open source contributor to Apache Solr and a few other Apache projects.

Talk #2: Accelerating SQL queries in NoSQL databases using Apache Drill and Secondary Indexes

NoSQL databases that support secondary indexes have historically lagged behind the RDBMS world in exposing that functionality to ANSI SQL queries. In this talk, we will describe new functionality in Apache Drill that bridges this gap by providing index planning and execution against NoSQL databases that support secondary indexes. A reference implementation with MapR-DB JSON will be used to describe the use cases. Queries that use indexed columns in the WHERE clause, ORDER BY, GROUP BY, and joins can leverage secondary indexes. Examples of each of these query types will be shown in a demo.

Speaker Bio:
Aman Sinha is a software engineer at MapR, a PMC member of Apache Drill (and its chair for 2017-18), and a PMC member of Apache Calcite. His work focuses on SQL query optimization and query processing in parallel/distributed relational databases, NoSQL, and Hadoop-based systems.

WebEx

Apache Drill Meetup Nov 2018
Hosted by Pritesh Maker

Wednesday 6:15 pm | 3 hours | (UTC-08:00) Pacific Time (US & Canada)
Meeting number: 282 618 904
Password: ApacheDrill
https://mapr.webex.com/mapr/j.php?MTID=m308db320096bad85acaf6b2cd6ebd4a6

Join by phone
1-877-668-4493 Call-in toll-free number (US/Canada)
1-650-479-3208 Call-in toll number (US/Canada)
Access code: 282 618 904
