Joint Seattle Spark and Graph Meetup Extravaganza


Details
We've got ourselves a pretty fun joint session between the Seattle Spark Meetup and Seattle Graph Meetup groups! We have two speakers at this session talking about using Dato's GraphLab Create SGraph and working with Spark GraphX. This is an introductory session - more details below!
Sponsored by StreamSets
Jon Natkins, Field Engineer at StreamSets, will provide a quick demo of the StreamSets/Spark integration.
The “Pretty Fast Stick” in GraphLab Create: Crunching Through >100 billion Edges in Minutes with SGraph
Jay (Haijie) Gu, Dato (http://dato.com/)
Last month, Dato/GraphLab gave GraphLab Create’s graph analytics toolkit some love by hitting it with the “pretty fast stick.” The result? It is now able to crunch through the Common Crawl webgraph in mere minutes on a single machine. At 3.5 billion nodes and 128 billion edges, this is the largest publicly available graph. Previously, Facebook reported crunching through 1 trillion edges in 4 minutes on 200 machines on a private graph dataset. No publicly available distributed system has been able to process a data set on the scale of Common Crawl. In this talk, we discuss some of the technical underpinnings of our approach. Live demo included!
Short bio:
Jay is a co-founder of Dato (formerly known as GraphLab), where he is currently a software engineer. Previously, Jay studied machine learning on big graphs at Carnegie Mellon University, developing advanced methods to construct, partition and represent graphs.
Graph Analytics using GraphX on a Heterogeneous Graph
Anikate Singh, Concur (http://www.concur.com)
A quick primer on loading row-wise data into Spark, cleansing it, and formatting it for loading into a graph. The talk will also showcase built-in GraphX APIs that allow simple analytical queries to be run on the graph.
Short bio:
Anikate is a Data Sciences Engineer at Concur, leading the company's Data Sciences engineering efforts. Prior to joining Concur, Anikate was a researcher at the University of Washington DataLab and a Senior Member Technical Staff at Oracle.
