Abstracted Interfaces for Domain-driven Dev & Geospatial Data Analytics in Spark

90 people went

Details

Schedule:

6:00 - Doors & Food
6:30 - Talk 1
7:15 - Talk 2
8:00 - Wrap & Chat

**Talk 1: Abstracted Interfaces for Domain-driven Development**
Speaker: Soren Larson, Data Scientist & AI Engineer
Abstract:
Abstraction is ever increasing. Services we used to have to build and maintain in low-level languages are now abstracted into the cloud, with sturdy SLAs and frameworks that accommodate most of our use cases. Stakeholders are changing, too. As enterprise interfaces become more consumer-oriented in their outlook and design, stakeholders and even users of these interfaces can reasonably lack some of the skills previously needed to shape and manage large pieces of data. With fewer resources needed to spin up a reliable and powerful big data system, we can spend more time on the nuances of data manipulation and offer nontechnical stakeholders interfaces more expressive than before, with performance secured by the guardrails of our increasingly sturdy infrastructure. I'll talk about one type of interface I've found success with, and what enterprise B2B design might say about the future of data engineering.

**Talk 2: Using Spark for real-time telemetry and geospatial data analytics at scale**
Speaker: Dillon Bostwick, Solutions Engineer @ Databricks
Abstract:
We will use public NYC neighborhood data stored in Azure Blob Storage and telemetry streams from Azure Event Hubs to analyze routes through NYC. This will open up a discussion of how the Magellan geospatial analytics library uses Spark’s Catalyst optimizer to perform spatial joins, and of how we can use Databricks Delta to improve performance as we build an optimized real-time pipeline at scale. Finally, we will discuss how we can leverage Azure Databricks to move the application from development to production.
Technologies discussed:
Azure Blob Storage, Azure Event Hubs, Azure Databricks, Magellan, Spark Catalyst optimizer, Spark Structured Streaming, Databricks Delta
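
For readers unfamiliar with the term, a spatial join of the kind mentioned in the Talk 2 abstract matches each point (say, a telemetry reading) to the polygon (say, a neighborhood) that contains it. The sketch below illustrates the idea in plain Python with a ray-casting point-in-polygon test and a naive nested loop. It is not Magellan's or Spark's implementation, and it uses neither library's API; the names `point_in_polygon` and `spatial_join`, and the square "neighborhoods" with made-up coordinates, are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y), e.g. (longitude, latitude)
Polygon = List[Point]         # vertices in order; the last edge closes the ring


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: cast a ray in the +x direction and count edge crossings."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


def spatial_join(
    points: Dict[str, Point], regions: Dict[str, Polygon]
) -> List[Tuple[str, Optional[str]]]:
    """Naive join: pair each point id with the first region containing it."""
    joined = []
    for point_id, point in points.items():
        match = next(
            (name for name, poly in regions.items() if point_in_polygon(point, poly)),
            None,  # the point falls outside every region
        )
        joined.append((point_id, match))
    return joined


# Hypothetical data: two adjacent unit-free squares standing in for neighborhoods.
regions = {
    "A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
points = {"p1": (1.0, 1.0), "p2": (3.0, 1.0), "p3": (5.0, 5.0)}

result = spatial_join(points, regions)
print(result)
```

The nested loop compares every point against every polygon, which is the cost that a query optimizer or spatial index aims to avoid at scale; that is the kind of concern the talk's discussion of the Catalyst optimizer addresses.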