On interactive code exploration; real-time stream processing at scale


Details
Agenda:
6:30pm: Pizza + Networking
7:00pm: Real-time Store Visit Predictions at Scale.
Luca Giovagnoli, Software Engineer, Core-ML team at Yelp.
7:30pm: Finding Stranger Things: The Lost Art of Exploring Your Code. Suchakra Sharma, Staff Scientist at ShiftLeft Inc.
------------------
Title: Real-time Store Visit Predictions at Scale.
Talk abstract:
This talk aims to inspire attendees with a multidisciplinary Flink application, where different fields have come together with a graceful synergy. You will hear about geospatial clustering algorithms, a gradient boosting ML model, and cutting-edge stream-processing technology - all in the same talk! And, if you are wondering, you can incorporate all this into your SOA using Async I/O!
After introducing our product use-case (real-time notifications for nearby local businesses), we’ll dive into the big data challenges. The talk describes a Visit Detection algorithm we built to cluster raw GPS pings into Visits, using Flink state management and custom processing constructs (custom Windows, Triggers and Evictors). Finally, we will discuss a real-time machine learning model that predicts the correct nearby business, leveraging Flink’s Async I/O at scale.
Flink enabled us to scale complex algorithms to thousands of operations per second and to power hundreds of thousands of daily push notifications. It proved to be a clearly superior option: its performance brought Yelp substantial cost savings and allowed us to move away from Python alternatives that scaled poorly.
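For readers new to the idea of clustering raw GPS pings into Visits, here is a minimal, single-machine sketch in plain Python. It is not the algorithm presented in the talk and uses no Flink APIs. The `Visit` type, the `detect_visits` function, and the `radius_m` and `min_duration_s` thresholds are assumptions chosen purely for illustration. It implements a simple stay-point heuristic: consecutive pings that stay within a radius of the running centroid form a cluster, and a cluster that lasts long enough is reported as a visit.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import Iterable, List, Optional, Tuple

Ping = Tuple[float, float, float]  # (unix_seconds, latitude, longitude)


@dataclass
class Visit:
    start: float
    end: float
    lat: float
    lon: float
    num_pings: int


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6_371_000.0  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def centroid(cluster: List[Ping]) -> Tuple[float, float]:
    """Arithmetic mean of the cluster's coordinates (fine for small areas)."""
    n = len(cluster)
    return sum(p[1] for p in cluster) / n, sum(p[2] for p in cluster) / n


def close_cluster(cluster: List[Ping], min_duration_s: float) -> Optional[Visit]:
    """Turn a finished cluster into a Visit if the user stayed long enough."""
    if not cluster or cluster[-1][0] - cluster[0][0] < min_duration_s:
        return None
    lat, lon = centroid(cluster)
    return Visit(cluster[0][0], cluster[-1][0], lat, lon, len(cluster))


def detect_visits(
    pings: Iterable[Ping],
    radius_m: float = 50.0,         # hypothetical threshold
    min_duration_s: float = 300.0,  # hypothetical threshold
) -> List[Visit]:
    visits: List[Visit] = []
    cluster: List[Ping] = []
    for ping in sorted(pings):  # process pings in time order
        if cluster:
            c_lat, c_lon = centroid(cluster)
            if haversine_m(c_lat, c_lon, ping[1], ping[2]) > radius_m:
                # The user moved away: close the current cluster.
                visit = close_cluster(cluster, min_duration_s)
                if visit is not None:
                    visits.append(visit)
                cluster = []
        cluster.append(ping)
    # Close whatever is left once the input is exhausted.
    visit = close_cluster(cluster, min_duration_s)
    if visit is not None:
        visits.append(visit)
    return visits


sample_pings = [
    (0, 37.7750, -122.4190),
    (120, 37.7751, -122.4191),
    (400, 37.7750, -122.4189),
    (460, 37.7900, -122.4000),  # far from the first three pings
    (500, 37.8000, -122.3900),
]
sample_visits = detect_visits(sample_pings)
```

In a streaming setting, the per-user `cluster` list would presumably live in keyed state and be closed by a trigger rather than by reaching the end of a list. That mapping is a guess at the general shape, not a description of the speaker's design.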
Speaker bio:
Luca works as a Software Engineer on the Core ML team at Yelp. While part of his job involves data mining massive data sets at Yelp, he is also responsible for designing and scaling the backend for analyzing high-volume geospatial data from millions of Yelp’s mobile users. He was previously the main open-source contributor to Yelp’s asynchronous client Fido.
------------------
Title: Finding Stranger Things: The Lost Art of Exploring Your Code.
Talk abstract:
Coding without debugging is like riding a bike, blind, downhill, in the dead of night without any brakes. It may be dangerous, but it's also unheard of ;-) The rigor we apply to runtime debugging of applications can also be applied before or after compile-time using static analysis tools. Not knowing what your application does is a recipe for disaster. The only recourse a modern developer has is to sift through a gazillion lines of code, manually searching for a dangerous pattern or an exotic coding blunder that may have seeped in. In this talk, we discover how code can be represented in a graphical format [1] which can then be queried interactively to find common security or performance bugs. We will use the Code Property Graph (CPG) [2], a data structure designed for interactive as well as automated queries. We will explore a sample program's control and data flow and see potential cases of security bugs that can be modeled and discovered in our interactive investigations.
[1] https://pdfs.semanticscholar.org/ae6d/dcba8c848dd0a30a30c5a895cbb491c9e445.pdf
[2] https://www.sec.cs.tu-bs.de/pubs/2014-ieeesp.pdf
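To give a flavor of what querying code as a graph can look like, here is a toy sketch in plain Python. It is not the Code Property Graph schema from [2], nor any real tool's query language. The node kinds, the `DATA_FLOW` and `CFG` edge labels, the sample program, and the `data_flow_paths` helper are all hypothetical, invented for illustration. The query asks whether a value returned by a source call can reach a sink call along data-flow edges.

```python
from collections import deque
from typing import Dict, List, Tuple

# Toy graph for this hypothetical four-line program:
#   buf = read_input()
#   clean = sanitize(buf)
#   log(clean)
#   system(buf)        # buf reaches system() without sanitization
NODES: Dict[int, Tuple[str, str]] = {
    1: ("CALL", "read_input"),
    2: ("IDENTIFIER", "buf"),
    3: ("CALL", "sanitize"),
    4: ("IDENTIFIER", "clean"),
    5: ("CALL", "system"),
    6: ("CALL", "log"),
}

# (source node, destination node, edge label)
EDGES: List[Tuple[int, int, str]] = [
    (1, 2, "DATA_FLOW"),  # read_input() defines buf
    (2, 3, "DATA_FLOW"),  # buf is passed to sanitize()
    (3, 4, "DATA_FLOW"),  # sanitize() defines clean
    (4, 6, "DATA_FLOW"),  # clean is passed to log()
    (2, 5, "DATA_FLOW"),  # buf is passed to system()
    (1, 3, "CFG"),        # control-flow order of the calls
    (3, 6, "CFG"),
    (6, 5, "CFG"),
]


def data_flow_paths(source_name: str, sink_name: str) -> List[List[int]]:
    """Return every acyclic DATA_FLOW path from a source call to a sink call."""
    successors: Dict[int, List[int]] = {}
    for src, dst, label in EDGES:
        if label == "DATA_FLOW":
            successors.setdefault(src, []).append(dst)

    starts = [
        node for node, (kind, code) in NODES.items()
        if kind == "CALL" and code == source_name
    ]
    paths: List[List[int]] = []
    queue = deque([start] for start in starts)
    while queue:  # breadth-first search over partial paths
        path = queue.popleft()
        kind, code = NODES[path[-1]]
        if len(path) > 1 and kind == "CALL" and code == sink_name:
            paths.append(path)
            continue
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # avoid revisiting nodes
                queue.append(path + [nxt])
    return paths


tainted = data_flow_paths("read_input", "system")
sanitized = data_flow_paths("sanitize", "system")
```

Here `tainted` holds the single path through nodes 1, 2 and 5, while `sanitized` is empty because nothing flows from `sanitize` to `system` in this toy graph. A real CPG layers syntax, control flow, and data dependencies in one structure, so queries can combine all three.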
Speaker bio:
Suchakra is currently a Staff Scientist at ShiftLeft Inc., where he plays around with large graphs and hunts bugs. He completed his Ph.D. in Computer Engineering at École Polytechnique de Montréal, where he worked on eBPF and hardware-assisted tracing techniques to advance systems performance analysis. On related topics, he has delivered talks and training at venues such as USENIX LISA, All Systems Go, Papers We Love, and Tracing Summit. He also developed one of the first hardware-trace-based VM analysis techniques. More information about him can be found at https://suchakra.wordpress.com/about
