DataFrames and Spark SQL in network analytics


Details
At this meetup we are focusing on two higher-level APIs: DataFrames and Spark SQL. Gábor and Dániel, two core engineers from Lynx Analytics, will share their experience with these APIs through an introductory talk and a presentation of their use cases.
Let's meet at 6:30pm at Prezi! Talks start at 7pm, as usual.
What is a Spark DataFrame?
Gábor Fehér, Lynx Analytics
https://www.linkedin.com/in/gfeher42
DataFrames in Apache Spark allow you to run good old SQL queries over terabytes of data distributed across a cluster of machines. Now you know roughly what a DataFrame is, but if you come to the talk, you will get to see them in action and learn much more about how to use this API.
Gábor Fehér is a software engineer at Lynx Analytics. After leaving Google, Gábor dedicated his life to maximising the performance of the company’s Spark-based flagship product, the LynxKite big graph analytics system.
Spark SQL in network analytics
Dániel Darabos, Lynx Analytics
https://www.linkedin.com/in/darabos
This talk will introduce a complex real-world Spark application, the LynxKite big graph analytics system. Why is it built on Spark? What was straightforward to do and what did we have to spend significant effort on? You can ask any questions about life with Spark, but the focus will be on our integration of Spark SQL: the motivation, the implementation, and the results.
Dániel Darabos is a software engineer at Lynx Analytics. Leaving behind his life at Google, Dániel now toils endlessly on pushing the boundaries of the Scala type system with LynxKite.
