Nikos Fertakis on The Dataflow Model


Details
The third Athenian Papers We Love meetup will feature Nikos Fertakis presenting on The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing (https://research.google.com/pubs/archive/43864.pdf), by Akidau et al. (Google) [2015].
Talks
• Nikos Fertakis on The Dataflow Model:
As Adrian Colyer put it on his Morning Paper blog: "Akidau et al. set out a strong manifesto for modern data processing, based on the notion of accepting uncertainty and incompleteness."
I think Dataflow is a really interesting framework for processing infinite streams. And make no mistake, infinite streams are all around us, even though we are used to splitting them into artificial segments (batches) to simplify how we process them.
The problems that can be solved this way include:
• log joining pipelines
• session-based analysis for search, ads, analytics, social, and YouTube
• billing pipelines
• aggregate statistics calculations
• abuse detection pipelines
• recommendation generation
But most of all, what I found especially interesting about this paper was that it gave me the mental tools to frame the data processing problem and its various nuances. All in all, I think it should make for an interesting topic!
Bio
Nikos Fertakis (twitter: @nikosfertakis, github: @greenonion) is a software engineer and tech lead of Skroutz's Search team. He holds a BSc in Computer Science from the University of Athens and an MSc in Machine Learning from University College London.
Details
Doors open at 6:30pm; the presentations will begin around 7pm.
The talk will be in Greek, unless there are non-Greek speakers in the audience. In that case the talk can be in English.
Enter the building from the entrance right beside the parking lot and climb to the second floor.
After the presentation we will open the floor to discussion and questions.
We hope you will read the paper before the meetup, but don't stress if you can't. If you have any questions, please add them to this event's thread.
We have a GitHub repo which we use for coordinating Papers We Love Athens: https://github.com/papers-we-love/athens. We will be opening an issue before each meetup to serve as a call for presentations. If you have any ideas or questions about the meetup, please open an issue.
