Spark GraphX and Streaming

Hosted by Brian Husted

Details

Overview

Please join us for an insightful evening to learn about Spark graph processing and Spark Streaming integration with NiFi. There will be some excellent talks, demos, and software shared during this event. We will have free pizza, and there will be a selection of uniquely crafted beers available for purchase. I hope to see many of you there as we continue to understand the impact of this game-changing analytic platform. This event is being sponsored by Tetra Concepts (http://www.tetraconcepts.com).

Agenda

Networking & Happy Hour: 5pm - 6pm

Spark GraphX: 6pm - 7pm

Spark Streaming integration with NiFi: 7pm - 8pm

Dr. Brad Rees

Spark GraphX

Introduction and Overview with Follow-along Graph Analytic Examples

This talk will provide an introduction to writing graph analytics using the Spark GraphX framework. GraphX extends the Spark RDD model to simplify graph construction and processing, and its API provides an easy means of switching between graph and tabular processing. The talk will progress from the basics to complex graph processing.

This talk assumes a basic familiarity with Spark and Scala. Sample datasets and code will be posted prior to the meeting so that audience members can follow along.

Presentation: http://files.meetup.com/16621912/GraphX-Meetup.pdf

Spark Shell Scripts: http://files.meetup.com/16621912/spark-shell-scripts.txt

About the Speaker

Dr. Brad Rees is a senior software developer and the Director of Engineering at Tetra Concepts (http://www.tetraconcepts.com). Brad has nearly three decades of experience developing complex analytic systems, two decades of which have been in graph-based data mining algorithms. Dr. Rees recently received his Ph.D. in Computer Science from the Florida Institute of Technology in Melbourne, FL. His dissertation was entitled "FastEgoClustering: Detecting Overlapping Communities in Complex Networks Using Ego-based Knowledge".

Mark Payne

Spark Streaming Integration with NiFi

This presentation will focus on how NiFi can stream data to and from Apache Spark (specifically Spark Streaming), regardless of the size of the individual data records; some of the features of NiFi; and why, as a Spark user, you should care. Apache NiFi is a new top-level Apache project and is a dataflow platform designed with Big Data and the IoT in mind. The project was developed for eight years by the National Security Agency before it was granted to the Apache Software Foundation in November of 2014.

About the Speaker

A dataflow enthusiast, Mark worked for several years as the Lead Developer for a platform that has revolutionized the way in which data is ingested, processed, tracked, and distributed throughout the NSA's Global Enterprise. In 2014, the U.S. Government open-sourced this platform as Apache NiFi. Since then, Mark has co-founded Onyara, Inc. (now Hortonworks). He now leverages Apache NiFi and lessons learned from working with one of the largest data-driven organizations in the world to help other companies and organizations improve how they handle their ever-increasing volumes of data.

Distributed Computing Maryland
Jailbreak Brewing Company
9445 Washington Blvd N Ste F · Laurel, MD