Location info: For those driving, there is street parking and a parking garage across from our building. Please enter through our main entrance (under the skybridge) or via the skybridge from the parking garage. Our reception is on the 2nd floor, and our team will direct you to our atrium. Our doors will remain open until 7:45pm to accommodate latecomers.
This meetup focuses on scalability and the technologies that enable handling large amounts of data: distributed systems, Hadoop, HBase, distributed NoSQL databases, and more!
We are *heavily* focused on deep, technical talks. No marketing pitches and no light use-case discussions. We want to see architecture diagrams and code, and hear real stories from the trenches.
Besides distributed systems and Big Data, we're also interested in hearing about high-performance engineering techniques and futuristic technologies.
We've had great success in the past, and are growing quickly! Previous guests were from Facebook, Twitter, LinkedIn, Amazon, Cloudant, Microsoft, MongoDB, and others.
This month's guests:
Topic: Building Zulily’s Data Platform using Hadoop and Google BigQuery
Speakers: Sudhir Hasbe, director of big data, data services and BI at Zulily (https://www.linkedin.com/in/shasbe), and Paul Newson (https://www.linkedin.com/profile/view?id=971812).
Abstract: Zulily, with 4.1 million customers and projected 2014 revenues of over 1 billion dollars, is one of the largest e-commerce companies in the U.S. “Data-driven decision making” is part of our DNA. Growth in the business has triggered exponential growth in data, which required us to redesign our data platform. The Zulily data platform is the backbone for all analytics and reporting, as well as for the data services (APIs) used by various applications in the organization. This session will provide a technical deep dive into our data platform and share key learnings, including our decision to build a Hadoop cluster in the cloud.
Topic: Delivering personalization and recommendations using Hadoop in cloud
Speakers: Steve Reed is a principal engineer at zulily, the author of dropship (https://github.com/zulily/dropship), and former Geek of the Week (http://www.geekwire.com/2013/steve-reed/). Dylan Carney is a software engineer at zulily. They both work on personalization, recommendations, and improving your shopping experience.
Abstract: Working on personalization and recommendations at zulily, we've come to lean heavily on on-premise Hadoop clusters to get real work done. Hadoop is a robust and fascinating system, with myriad knobs to turn and settings to tune, and knowing the ins and outs of obscure Hadoop properties can be crucial for the health and performance of your Hadoop cluster. (To wit: How big is your fsimage? Is your secondary namenode daemon running, and did you know it's not really a secondary namenode at all?)
But what if it didn't always have to be this way? Google Compute Engine (GCE) and other cloud platforms make promises of easier, faster, and easier-to-maintain Hadoop installations. Join us as we share what we've learned from our years of Hadoop use, along with an overview of what we've been able to adapt, learn, and unlearn while moving to GCE.
Topic: Apache Optiq in Hive
Speaker: Julian Hyde, Principal, Hortonworks
Abstract: Tez is making Hive faster, and now cost-based optimization (CBO) is making it smarter. A new initiative in Hive introduces cost-based optimization for the first time, based on the Optiq framework. Optiq's lead developer Julian Hyde shows the improvements that CBO is bringing to Hive. For those interested in Hive internals, he gives an overview of the Optiq framework and shows some of the improvements that are coming to future versions of Hive.
Our format is flexible: We usually have 2 speakers who talk for ~30 minutes each and then do Q+A plus discussion (about 45 minutes per talk), finishing by 8:45.
There will of course be beer afterwards, hosted by Greythorn!
Zulily (http://maps.google.com/maps?f=q&hl=en&q=2601+Elliot+Ave%2C+Seattle%2C+WA%2C+Seattle%2C+WA%2C+us), 2601 Elliot Ave, Seattle, WA
Paddy Coyne's: http://www.paddycoynes.com/
Drankin' map: http://bit.ly/ZO5Fxs
Doors open 30 minutes ahead of show-time. Please show up at least 15 minutes early out of respect for our first speaker.