Frank McSherry - Skew Strikes Back: New Developments in [...] Join Algorithms
We're really happy to have Frank McSherry as our speaker for our next meet up! Frank is well known for his work on Differential Privacy and on Timely Dataflow.
Frank is going to talk new developments in relational joins by guiding us through the excellent "Skew Strikes Back: New Developments in the Theory of Join Algorithms" (https://arxiv.org/pdf/1310.3314v2.pdf)
This time we'll meet at ETH, like the old times, and you'll get to network and chat about research over pizza and beer!
Evaluating the relational join is one of the central algorithmic and most well-studied problems in database systems. A staggering number of variants have been considered including Block-Nested loop join, Hash-Join, Grace, Sort-merge for discussions of more modern issues). Commercial database engines use finely tuned join heuristics that take into account a wide variety of factors including the selectivity of various predicates, memory, IO, etc. In spite of this study of join queries, the textbook description of join processing is suboptimal. This survey describes recent results on join algorithms that have provable worst-case optimality runtime guarantees. We survey recent work and provide a simpler and unified description of these algorithms that we hope is useful for theory-minded readers, algorithm designers, and systems implementors.
Frank McSherry received his PhD from the University of Washington, working with Anna Karlin on spectral analysis of data. He then spent twelve years at Microsoft Research's Silicon Valley research center, working on topics ranging from differential privacy to data-parallel computation. He currently works at ETH Zürich's Systems Group on scalable stream processing and related topics.