An Introduction to Tachyon - The Next Evolution in Fast Big Data Processing

  • July 23 · 6:30 PM
  • Alpine

Memory is the key to fast big data processing. Many have recognized this, and frameworks such as Spark and Shark already exploit memory performance. With these advancements, big data storage is becoming a critical bottleneck in many workloads.

In this talk, we introduce Tachyon, a memory-centric, fault-tolerant distributed file system that enables reliable file sharing at memory speed across cluster frameworks such as Spark and MapReduce. Tachyon achieves memory speed and fault tolerance by using memory aggressively and leveraging lineage information. It caches working-set files in memory and lets different jobs, queries, and frameworks access the cached files at memory speed. As a result, Tachyon avoids going to disk to load datasets that are read frequently.
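The idea of combining an in-memory cache with lineage can be illustrated with a toy sketch. This is not Tachyon's API or implementation: the class `LineageStore`, its methods, and the example paths are all hypothetical names chosen for illustration. The sketch assumes that source files live in some durable storage, while derived files are held only in memory together with a record of how they were produced, so a lost file can be recomputed rather than reloaded from a replica.

```python
from typing import Any, Callable, Dict, List, Tuple


class LineageStore:
    """Toy model (hypothetical, not Tachyon's API): an in-memory file cache
    that recovers lost derived files by replaying their lineage."""

    def __init__(self) -> None:
        self.durable: Dict[str, Any] = {}  # stand-in for persistent storage
        self.memory: Dict[str, Any] = {}   # in-memory cache of file contents
        # path -> (transform, input paths) describing how the file was made
        self.lineage: Dict[str, Tuple[Callable[..., Any], List[str]]] = {}
        self.recomputed: List[str] = []    # paths rebuilt from lineage

    def put_durable(self, path: str, data: Any) -> None:
        """Store a source file durably and cache it in memory."""
        self.durable[path] = data
        self.memory[path] = data

    def put_derived(self, path: str, transform: Callable[..., Any],
                    inputs: List[str]) -> None:
        """Compute a derived file, keep it in memory only, record lineage."""
        self.lineage[path] = (transform, list(inputs))
        self.memory[path] = transform(*[self.read(p) for p in inputs])

    def read(self, path: str) -> Any:
        """Serve from memory if cached; otherwise reload or recompute."""
        if path in self.memory:
            return self.memory[path]
        if path in self.durable:
            data = self.durable[path]
        elif path in self.lineage:
            transform, inputs = self.lineage[path]
            data = transform(*[self.read(p) for p in inputs])
            self.recomputed.append(path)
        else:
            raise FileNotFoundError(path)
        self.memory[path] = data
        return data

    def lose_memory(self) -> None:
        """Simulate a failure that wipes every cached file."""
        self.memory.clear()


store = LineageStore()
store.put_durable("/logs/raw", "b a b c")
store.put_derived("/logs/words", lambda text: text.split(), ["/logs/raw"])
store.put_derived("/logs/count", lambda words: len(words), ["/logs/words"])

store.lose_memory()                # all in-memory copies are gone
count = store.read("/logs/count")  # rebuilt by replaying the lineage chain
```

After the simulated failure, the first read of `/logs/count` replays the chain from the durable source file. Later reads are served from memory again, which is the behavior the abstract describes for frequently read datasets.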

Tachyon is Hadoop-compatible: existing Spark and MapReduce programs can run on top of it without any code changes. The project is open source and is deployed at multiple companies. It has more than 40 contributors from over 10 institutions, including Yahoo, Intel, Red Hat, and Alibaba. The project is also part of the Fedora distribution.


Bio:

Haoyuan Li is a Computer Science Ph.D. candidate in the AMPLab at UC Berkeley, working with Prof. Scott Shenker and Prof. Ion Stoica on big data and cloud computing. He leads Tachyon, an open source memory-centric distributed file system enabling reliable file sharing at memory-speed across cluster frameworks. Before Berkeley, he worked at Conviva and Google, and studied at Cornell University and Peking University.


  • James L.

    presentation was great.

    1 · July 24

  • Paul E.

    Presentation by HY was fast and efficient - just like Tachyon. I enjoyed learning more about the technology and look forward to seeing more uses of it with non-HDFS file systems.

    July 24

  • Alex O.

    Hi Joel,
    Do you plan to record the session?
    Thx.

    July 21

    • AS

      It can be broadcast via laptop as a Fuzemeeting webinar so people can join live, and then recorded and published afterwards. Yeah, could have a streaming link as well as an MP4 to hand out.

      1 · July 23

    • Joel S H.

      We can try to record as well with gotowebinar.

      July 23

  • richard v.

    I'm not going to San Francisco, not at that time of day.
    With traffic it's an hour and a half to two hours just to get there, and then there's parking. No way. What's wrong with the South Bay?

    July 12

    • Kimo C.

      Why don't you try BART?

      1 · July 17

    • al f.

      How about 1900 S Norfolk St. in San Mateo or 599 Fairchild Dr. in Mountain View? But that's just good for me.

      1 · July 23
