Bay Area Hadoop User Group (HUG) May Meetup

Hello Hadoopers,

The agenda is available for the May 19th meeting:

* 6:00 - 6:15 - Socializing and Beers (Gates open at 5:45)
* 6:15 - 6:30 - What's new with Pig? Alan Gates, Yahoo!
* 6:30 - 7:00 - HBase and Pig: The Hadoop ecosystem at Twitter, Dmitriy Ryaboy, Twitter
* 7:00 - 7:30 - Extraordinarily rapid and robust data analysis with Cascalog, Nathan Marz, BackType
* 7:30 - 7:45 - Apache Hadoop Release Plans for [masked], Tom White, Cloudera
* Q&A, Open Discussion, and a small surprise

Session details are available below. Looking forward to seeing you there!
Register today for Hadoop Summit 2010: June 29th, Hyatt, Santa Clara, CA

Dekel

HBase and Pig: The Hadoop ecosystem at Twitter

Twitter makes extensive use of Hadoop, HBase, and Pig to power its analytics infrastructure. In this talk, we will describe our data flow pipeline, go over the new Pig-HBase integration, and introduce Elephant Bird, the recently open-sourced collection of libraries we use for working with Protocol Buffers, Hadoop, HBase, Pig, and Thrift.

Dmitriy Ryaboy is an engineer at Twitter and a Pig committer; he previously worked at Lawrence Berkeley National Laboratory. Dmitriy holds a bachelor's degree in Computer Science from UC Berkeley and a master's in Very Large Information Systems (it's a real thing) from Carnegie Mellon University. You can follow him on Twitter, where he goes by @squarecog.

Extraordinarily rapid and robust data analysis with Cascalog, Nathan Marz, BackType

Cascalog is an interactive query language for Hadoop with a focus on simplicity, expressiveness, and flexibility, intended for analysts and developers alike. Cascalog eschews SQL syntax in favor of a simpler and more expressive syntax based on Datalog. With this added expressiveness, Cascalog can query existing data stores "out of the box", with no data "importing" or "under the hood" configuration required. Because Cascalog sits on top of Clojure, a powerful JVM-based language and interactive shell, adding a new operation to a query is as simple as defining a new function.

In this presentation, Nathan will introduce Cascalog and how it's used at BackType. Nathan will show how the Datalog syntax provides more robustness and flexibility than SQL-based languages. Finally, Nathan will demonstrate how the Cascalog, Clojure, and Cascading stack can be leveraged by advanced users who wish to build more complex queries and libraries in Java and Clojure for data processing, data mining, and machine learning.

Nathan is the Lead Engineer at BackType, where he is building technology for real-time search and analytics of online social media. He has been using Hadoop extensively since 2008, both for data warehousing and as the basis for scalable, data-intensive applications. Nathan makes heavy use of technologies like Cascading and Clojure to simplify the development of complex applications on top of Hadoop. Nathan writes a blog at

Apache Hadoop Release Plans for 0.21.0

Tom will give a short update on the progress of the release, and explain the work that has been done on compatibility with 0.20.


  • Daniel M.

    My name is Dan Murray and I work at Focus Ready Market Research in Chicago. We are holding 4 NoSQL Focus Groups, 2 in Santa Clara on 4/10 and 2 in San Francisco on 4/11. We are paying all attendees of any of these sessions a $250 cash honorarium. We are looking to talk with developers who work with any NoSQL database. No one will try to sell you anything or ask about any confidential information. This is only about giving your opinion. Anybody interested can call me at [masked] or fill out the info at the following link and we will contact you. https: //


    April 3, 2013

  • Shyam S.

    We need more in-depth presentation on Hadoop internals for developers.

    May 20, 2010

  • Gerald W.

    Great meeting -- especially liked the Cascalog presentation/demo. Would really like more chance to chat with the attendees.

    May 20, 2010

  • Stephen O.

    Yahoo, thank you for being such a great Host! Good updates and demo.

    May 20, 2010

  • KC L.

    Excellent presentations, but missing sodas this time!

    May 19, 2010

  • KC L.

    I'm looking forward to meeting as many of the 260+ Hadoop attendees as possible tomorrow!

    May 18, 2010

Our Sponsors

  • Yahoo! Inc.

    Meeting space, pizza and drinks are sponsored by the Yahoo! Hadoop team.
