
Hadoop-DC: Apache Spark & Real Time Analytics

Details

Time for another Hadoop-DC meetup on Thursday, February 26th! The event will be held at EMC and food will be provided. Attendees can park in the garage on any level where a space is not reserved (P1 is generally the easiest). There will be an automated parking validation machine in the main lobby.

The schedule will be:

6:00 - 6:05 Welcoming and Introductions

6:05 - 6:30 Networking

6:30 - 8:30 Presentations

The two presentations confirmed for this event are "Integrating Apache Spark and Apache HBase" by Ted Malaska and "Real Time Analytics" by Prasun Sinha. There may be more presentations; more info will follow as we get closer to the meetup.

Abstracts:

"Integrating Apache Spark and Apache HBase"

As Spark replaces MapReduce, we need to work out how to do HBase batch processing with Spark. This talk will review use cases for how we might want Spark to interact with HBase for batch processing, and then explain how this can be done with projects like SparkOnHBase (incubating in Cloudera Labs) as well as with "basic" HBase.
Ted Malaska is a Solutions Architect at Cloudera and a Spark contributor.

"Real Time Analytics"

Now that IT and line-of-business executives have started to operationalize Hadoop- and MPP-based batch big data analytics, it's time to harness the next wave of innovation in data processing: analytics over real-time streaming data. This session will provide an overview and discussion of the business value, use cases, and architectural considerations of integrating real-time streaming analytics into your enterprise big data roadmap. It will also provide some details of one such solution.

Hadoop-DC
EMC
8444 Westpark Dr (Suite 100) · McLean, VA