Software Analytics with Graphs & Using Cypher in Apache Spark

Hosted by Michael Hunger and 2 others

Details

This time we will have two really cool talks.

First Markus will demonstrate how to do Software Analytics using Jupyter Notebooks and Neo4j.

Then Stefan will talk about using Cypher for Apache Spark (CAPS) to bring streaming, relational, and graph data together for data processing.

Software Analytics with Jupyter, Pandas, jQAssistant, and Neo4j

Abstract
As developers, we often feel that there might be something wrong with the way we develop software. Unfortunately, a gut feeling alone isn’t enough for the complex, interconnected problems in software systems. We need solid, understandable arguments to gain budgets for improvement projects. And we can help ourselves! Every step in the development or use of software leaves valuable, digital traces. With clever analysis, these data can show us root causes of problems in our software and deliver new insights – understandable and actionable for everybody.

In this meetup, I talk about the analysis of software data using a digital notebook approach. This allows you to express your gut feelings explicitly, step by step, with the help of hypotheses, explorations, and visualizations. I also show how open source data analysis tools (Jupyter, Pandas, jQAssistant, and Neo4j) work together to spot problems in Java applications and their environment. We have a look at knowledge loss, worthless code parts, and many more real-life analyses, completely automated from raw data to visualizations for management. Come over and learn how you can do your first data analysis in software development!

About The Speaker

Markus Harrer is a software engineer at INNOQ and passionate about improving the way we do software development. He specializes in the analysis of software data such as source code, application performance data, or version control repositories to reveal the underlying problems behind the symptoms we face on the surface. Markus shares his thoughts and experience on how to create automated, data-driven, reproducible analyses of software data on his blog https://feststelltaste.de as well as at conferences and meetups.

Combining SparkSQL and Cypher queries, and table/graph functions.

Choose the right language for the job: eliminate cumbersome multi-joins for connected-data traversals by using super-concise Cypher patterns for sub-graph detection and graph projection; use the power of table projection, grouping, aggregation in SparkSQL, all in one application.

Feel free to “dismantle your graph”: expose your graph nodes or relationships as dataframes, or as Hive tables.

  • Graph technology meets Big Data and Spark Analytics
  • Property graphs: the superset data model
  • Graph, relational and document data, interwoven
  • Lift, split, combine, and create new graphs, from any data source
  • Get your data fit to exploit graph compute, without losing any of your existing tools

About the Speaker

Stefan is Product Manager at Neo4j for the Cypher Query Engine and a Specification Engineer in Neo4j's Query Language Standards and Research team.

He's a senior computer scientist with a background in distributed systems and transaction processing and has been working on enterprise application integration, large-scale climate data management, and scalable overlay networks. At Neo4j, he played a key role in the design of the Cypher graph query language, helped build the first cost-based planner for property graph databases and pioneered the architecture of Cypher for Apache Spark and Neo4j Morpheus.

Stefan is passionate about computer language design and how languages as a medium enable access to new technology. He's now working on international standardization and specification of property graph query languages and related topics, as well as continuously exploring how to expand the scope and applicability of graph technology in a way that makes it easily accessible to users.

Based in Berlin, Germany, he enjoys spending time with his family in nature and playing a good game of Go once in a while.

Graph Database - DACH (Germany, Austria, Switzerland )