1) Netflix’s Personalization Infrastructure, 2) Secondary Index w/ Apache Drill
Details
We’ve restarted the Bay Area Apache Drill User Group meetup!
Join us on Nov 14th for an evening of presentations and discussions on what’s new with Apache Drill, how users put it to work and what’s next.
We’ll have speakers, networking, food and, of course, Drillers!
We will also have a WebEx session; see the bottom of the description for WebEx details.
Talk #1: Netflix's Personalization Infrastructure
Netflix uses machine learning algorithms and A/B tests to drive all of the content recommendations for our members. To improve the quality of the personalized recommendations, we need historical fact data for each user, which is then used to generate the features required by the machine learning models. Building a fact store at Netflix's ever-evolving scale is non-trivial. We use Spark and Scala extensively, along with a variety of compression techniques, to store and retrieve data efficiently. Netflix is experimenting with Apache Drill to determine how it could help speed up some of the query patterns we are interested in.
Speaker Bio:
Nitin Sharma works on the Personalization Infrastructure team at Netflix. His primary focus is on building distributed infrastructure at Netflix to enable personalized recommendations at scale. He is passionate about large-scale distributed systems, search platforms, and performance optimization. He is an active open source contributor to Apache Solr and a few other Apache projects.
Talk #2: Accelerating SQL queries in NoSQL databases using Apache Drill and Secondary Indexes
NoSQL databases that support secondary indexes have historically lagged behind the RDBMS world in exposing that functionality for ANSI SQL queries. In this talk, we will describe new functionality in Apache Drill that bridges this gap by providing index planning and execution against NoSQL databases that support secondary indexes. A reference implementation with MapR-DB JSON will be used to describe the use cases. Queries with indexed columns in the WHERE clause, ORDER BY, GROUP BY and joins are able to leverage secondary indexes. Examples of all of these query types will be shown in a demo.
Speaker Bio:
Aman Sinha is a software engineer at MapR, a PMC member of Apache Drill (and chair for the year 2017-18), and a PMC member of Apache Calcite. His work focuses on SQL query optimization and query processing in parallel/distributed relational databases, NoSQL, and Hadoop-based systems.
WebEx
Apache Drill Meetup Nov 2018
Hosted by Pritesh Maker
Wednesday 6:15 pm | 3 hours | (UTC-08:00) Pacific Time (US & Canada)
Meeting number: 282 618 904
Password: ApacheDrill
https://mapr.webex.com/mapr/j.php?MTID=m308db320096bad85acaf6b2cd6ebd4a6
Join by phone
1-877-668-4493 Call-in toll-free number (US/Canada)
1-650-479-3208 Call-in toll number (US/Canada)
Access code: 282 618 904
