Past Meetup

Apache Mesos for Apache Kafka and Apache Accumulo

126 people went

Details

5:30pm - 6:00pm - Registration

6:00 - 7:00 - Working with Apache Kafka, Apache HDFS, Apache Accumulo and More!

This talk will give a brief overview of Apache Mesos and then focus on running systems like Apache Kafka, Apache Hadoop (specifically HDFS), and Apache Accumulo on Mesos. Apache Mesos is the operating system for your data center and has traditionally been used for short-lived tasks and for long-lived tasks that didn't require data persistence. In this talk we will go through what is involved in making Mesos work with long-lived tasks that require data persistence, and some of the amazing benefits you can gain from doing so.

Speaker Bio: Joe Stein (http://www.linkedin.com/in/charmalloc) is an Apache Kafka committer and PMC member. Joe is the Founder and Principal Architect of Big Data Open Source Security LLC (http://www.stealth.ly/), a professional services and product solutions company. Joe has been a professional developer, architect, and technologist for 15 years, having built back-end systems that supported over one hundred million unique devices a day and processed trillions of events. He blogs and hosts a podcast about Hadoop and related systems at All Things Hadoop (http://www.allthingshadoop.com/) and tweets as @allthingshadoop (https://twitter.com/allthingshadoop).

7:00 - 8:00 - Survey of Accumulo techniques for indexing data

This talk will go over table design and row key design approaches for indexing large amounts of data in Apache Accumulo. We'll give an overview of how to store geographical data, entity-relationship graphs, natural language text, numbers, and more in Accumulo. This will serve as a starting point for learning how to store different types of data in Accumulo effectively, and will showcase Accumulo's capabilities for handling a variety of situations.

Speaker Bio: Donald Miner (https://twitter.com/donaldpminer) is an avid user of Apache Hadoop and a practitioner of data science. He serves as Chief Technology Officer at ClearEdge IT Solutions (http://www.clearedgeit.com/), a company that provides Big Data professional services. He is the author of the O’Reilly book MapReduce Design Patterns, which is based on his experiences as a MapReduce developer. Donald has architected and implemented a number of mission-critical, large-scale Hadoop systems within the U.S. Government and Fortune 500 companies. He received his PhD in Computer Science from the University of Maryland, Baltimore County, where he focused on Machine Learning and Multi-Agent Systems. He lives in Maryland with his wife and two young sons.

8:00 - 9:00 - Cocktail hour (food & beverage) hosted by Bloomberg