Fast big data analytics with Spark on Tachyon in Baidu

Name: Fast big data analytics with Spark on Tachyon in Baidu
Start: 2015-05-28T18:00:00-07:00
End: 2015-05-28T21:00:00-07:00
Location: IBM Foster City

Hosted By

Haoyuan L.

Details

This Tachyon Meetup (https://www.meetup.com/Tachyon) is a chance to interact with other Tachyon (http://tachyon-project.org/) users and developers, and features two presentations:

a) Shaoshan Liu from Baidu (http://baidu.com/) will share lessons they learned from Tachyon deployments in production.

b) Haoyuan Li from Tachyon Nexus (http://www.tachyonnexus.com/) will share exciting news related to Tachyon.

Food will be available starting at 6:00 PM, and presentations will begin at 6:30 PM. Special thanks to IBM for hosting.

Fast big data analytics with Spark on Tachyon in Baidu

Abstract:

In this talk we will focus on how Tachyon can help improve the efficiency of big data analytics (ad-hoc queries) within Baidu.

Baidu currently runs a production Tachyon cluster with 100 nodes and over 2 PB of storage space. This cluster mainly serves as the cache layer for our Big Data Analytics engine. We first introduce the Big Data Analytics infrastructure within Baidu. Then we explain why we started using Tachyon a few months ago, as well as the problems we encountered when we started using it. Next, we delve into the details of how Tachyon helps accelerate our Big Data Analytics pipeline in its current state. Finally, we discuss the new features we would like to see and our plan to scale further.

Bio:

Shaoshan Liu is currently a Senior Architect at Baidu U.S.A. working on Big Data Infrastructure. Before Baidu, he worked at LinkedIn and Microsoft. Shaoshan has a Ph.D. from UC Irvine.

Tachyon Recent Development

Abstract:

Tachyon 0.6.4 was recently released, with significant improvements to the overall system. This talk will touch on features recently developed for Tachyon, including the alpha release of Tiered Storage in 0.6.4.

Bio:

Haoyuan Li is the founder and CEO of Tachyon Nexus (http://www.tachyonnexus.com/). He is also a Computer Science Ph.D. candidate in AMPLab at UC Berkeley, where he co-created Tachyon, an open source memory-centric distributed storage system, and a founding committer of Apache Spark.

Agenda

6:00 – 6:30 Food & Networking
6:30 – 7:30 Talks
7:30 – 7:45 Q&A
7:45 – 8:00 Wind down

Tweet about this! https://twitter.com/TachyonProject/status/598234199305228288

Tachyon Survey: http://goo.gl/forms/dTwa9pRhqB
