December Presentation Night

Hosted by Matthew Farrellee and 3 others

Details

Thank you to SmarterTravel for sponsoring both the space and food and drink for this event!

Rough schedule:

  • 6:00 - 6:40: Food and mingling
  • 6:40 - 6:50: Opening remarks & Sponsor pitch
  • 6:50 - 6:55: Lightning talk
  • 7:00 - 7:30: Feature talk
  • 7:40 - 8:10: Feature talk

Lightning Talk: "An update on Flintrock: A faster, better spark-ec2" - Nick Chammas

In this lightning talk, Nick will give a quick update on his progress with Flintrock, a hopeful successor to spark-ec2.

Feature Talk: "Cloud Security Monitoring and Spark Analytics" - Andre Mesarovic, ThreatStack

Andre will describe how ThreatStack uses Spark to produce roll-ups from a stream of Linux process events. An auditd-like agent installed on customers’ AWS instances sends a constant stream of kernel events to ThreatStack’s servers. These events are routed to RabbitMQ, and a process writes them to S3 in batched JSON format. On a fixed interval, a Spark job reads the S3 objects, performs a number of aggregations, and stores the results in a Postgres database.
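
As a rough illustration of the roll-up step described above, the following plain-Python sketch counts events per host and executable from one batch of JSON lines. It is a stand-in for the Spark aggregation, not ThreatStack's actual job: the field names (`host`, `exe`, `ts`), the choice of a per-pair count, and the sample data are all assumptions made for the example.

```python
import json
from collections import Counter

# Hypothetical batch of JSON-encoded process events, one per line,
# standing in for the contents of a single S3 object.
batch = "\n".join([
    json.dumps({"host": "web-1", "exe": "/usr/bin/ssh", "ts": 1}),
    json.dumps({"host": "web-1", "exe": "/usr/bin/ssh", "ts": 2}),
    json.dumps({"host": "web-1", "exe": "/bin/ls", "ts": 3}),
    json.dumps({"host": "db-1", "exe": "/usr/bin/psql", "ts": 4}),
])


def roll_up(lines):
    """Count events per (host, exe) pair, skipping blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        counts[(event["host"], event["exe"])] += 1
    return counts


rollup = roll_up(batch.splitlines())
for (host, exe), n in sorted(rollup.items()):
    print(host, exe, n)
```

In a Spark job, the same idea would typically be expressed as a group-by and count over the parsed records, with the resulting rows written to the database.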

Feature Talk: "Feedback Loops Between Ingest Processing & Analytics" - John Hugg, VoltDB

In this talk, John Hugg, Founding Engineer of VoltDB, will show how a fast data solution like VoltDB can be combined with a powerful analytic solution like Apache Spark to enable continuous and adaptable processing of events.
The demonstration will use a clickstream analysis example to illustrate this pattern. VoltDB is used to segment and process individual clicks in real time, based on models generated from periodic batch processing. Spark, and specifically MLlib, is used to build a clustering model based on historical event data. That model is continuously regenerated and loaded into VoltDB, where it can be applied to raw data.
This continuous loop, where models or rules are generated continuously, loaded into the event processing system and applied to live data, is a powerful tool with applications in fraud detection, segmentation and engagement.

This approach will be contrasted with approaches based on Storm and Spark Streaming.
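
The feedback loop described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, with neither MLlib nor VoltDB involved. The use of k-means, the two session features, the starting centroids, and the function names `build_model` and `nearest` are all assumptions made for the example; the talk description says only that a clustering model is built.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def build_model(history, centroids, rounds=10):
    """Batch step: refine centroids with a few k-means iterations."""
    for _ in range(rounds):
        groups = [[] for _ in centroids]
        for p in history:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else old
            for g, old in zip(groups, centroids)
        ]
    return centroids


# Hypothetical historical features per session: (pages viewed, minutes on site).
history = [(1, 1), (2, 1), (1, 2), (9, 10), (10, 9), (10, 10)]
model = build_model(history, centroids=[(0.0, 0.0), (5.0, 5.0)])

# Real-time step: segment each incoming click using the latest model.
segments = [nearest(click, model) for click in [(2, 2), (8, 9)]]
print(model, segments)
```

In the pattern the talk describes, the batch step would be rerun periodically on fresh history and the resulting model reloaded into the event-processing system, so that live events are always scored against a recent model.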

Boston Data Technology (Boston Data Group/BDT)
SmarterTravel
226 Causeway St. · Boston, MA