Delta Lake w/ Michael Armbrust + Spark @ Salesforce

Name: Delta Lake w/ Michael Armbrust + Spark @ Salesforce
Start: 2019-09-25T18:00:00-07:00
End: 2019-09-25T20:00:00-07:00
Location: Salesforce.com Inc.

Hosted by Denny L. and Judy N.

Seattle Spark+AI Meetup

Details

Join us for a fun data engineering event focusing on the open-source project Delta Lake and Salesforce's search ranking powered by Spark.

The Delta Lake session will feature Michael Armbrust, a committer and PMC member of Apache Spark and the original creator of Spark SQL. He leads the team at Databricks that designed and built Structured Streaming and Delta Lake.

Salesforce's search relevance team will present on search ranking optimization at scale with Spark. Using Spark and Kubernetes to analyze billions of records, Salesforce delivers accurate search results to millions of customers worldwide every day.

Agenda:
6:00pm-6:30pm: Welcome
6:30pm-7:00pm: Spark @ Salesforce: Our Search for Insights
7:00pm-7:45pm: Open Source Reliability for Data Lakes w/ Apache Spark
7:45pm-8:00pm: Q&A and Wrap up

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Session: Delta Lake: Open Source Reliability for Data Lakes w/ Apache Spark

Speaker: Michael Armbrust

Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake offers ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

In this talk, we will cover:

All technical aspects of Delta Lake features
What’s coming
How to get started using it
How to contribute

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Session: Spark @ Salesforce: Our Search for Insights

Speakers: Rama Raman, Callie Anderson

Spark is an integral tool in developing Salesforce Search, our most used functionality across Salesforce’s platform and products and arguably the largest enterprise search implementation worldwide. Salesforce Search fields millions of queries for thousands of organizations over billions of records every day. The computing efficiency and flexibility of Spark running on an in-house Kubernetes cluster enable model training at scale to optimize the ranking of records for every query.
