
March Presentation Night - Spark 2.3, Cloud, and Use Cases

Hosted by Joseph Kambourakis

Details

Big thanks to our sponsor Amazon for providing a location, pizza, and beer! Please bring a photo ID with you to get into the building. RSVP closes 48 hours before the meetup so that we can provide Amazon with the attendee list.

Quick Intro
What's new with Apache Spark 2.3

1st Speaker Ion Stoica
Title: Apache Spark, Past, Present, and Future

Abstract: In this talk Ion will recall the humble beginnings of Apache Spark, its rapid rise in popularity and the reasons behind this rise, and will wrap up with some exciting future directions. Ion will also discuss some of the more interesting applications built on top of Apache Spark, such as large-scale genomics data processing.

Bio: Ion Stoica is the Executive Chairman of Databricks, which he co-founded in 2013 to commercialize Spark and other related big data technologies developed at AMPLab (UC Berkeley). Between 2013 and 2016 he served as the founding CEO of Databricks. He is also a Professor in the EECS Department at UC Berkeley, where he currently leads RISELab, a 10+ faculty lab that focuses on developing technologies for enabling intelligent, real-time decisions on live data with strong security (https://rise.cs.berkeley.edu/). Previously, he was a co-director of AMPLab (https://amplab.cs.berkeley.edu/) and the PI (Principal Investigator) for several highly successful open source projects: Apache Spark, Apache Mesos, and Alluxio (formerly Tachyon). He is an ACM Fellow and has received numerous awards, including the SIGOPS Hall of Fame Award (2015), the SIGCOMM Test of Time Award (2011), and the ACM Doctoral Dissertation Award (2001).

2nd Talk:
The AltaStata toolkit creates a secure and compliant Data Lake on top of an organization's cloud account (AWS or Azure). Using AltaStata, end users and programs transparently and collaboratively manage and process encrypted Big Data. AltaStata is integrated with Spark and Hadoop for fast parallel processing and streaming.

2nd Speaker:
Serge Vilvovsky is a software architect with 18+ years of experience spanning Big Data analytical tools and sensitive data processing. Serge works at MIT LL as a Cybersecurity Consultant and has been a guest speaker at MIT courses for decision makers in cybersecurity.

3rd Talk:
Ed Huang (@c4pt0r) is the Co-Founder and CTO of PingCAP, the company behind TiDB, a popular open source NewSQL distributed database. His talk will focus on how TiDB integrates with and leverages Spark (via TiSpark) to provide an all-in-one hybrid transactional and analytical processing database for its users.

Boston Data Technology (Boston Data Group/BDT)
Amazon Cambridge Offices
101 Main St. · Cambridge, MA