ML Inside Presto Distributed SQL Query Engine


Details
Presto is an open-source distributed SQL query engine used by Facebook in our Hadoop warehouse. It is typically about 10x faster than Hive and can be extended to a number of other use cases. One of these extensions adds SQL functions to create machine learning models and make predictions with them. The aim is to significantly reduce the time it takes to prototype a model by moving the construction and testing of the model into the database.
Christopher Berner works as a software engineer at Facebook on the Presto team. He wrote the ML functionality and has worked on the query planner, type system, bytecode generator, and many other pieces of Presto. Before Presto, he worked on the newsfeed ranking team, developing machine learning models.
