This is a use case talk from the folks at Yahoo!.
In Yahoo!'s advertising and data platforms, we instrument and mine vast amounts of data to gain insight into advertising effectiveness and to measure and improve advertising reach. Over the last couple of quarters, we have been prototyping Shark for some of these use cases and enhancing it to work effectively with our datasets.
In this talk, we first discuss some of our contributions to the Spark and Shark open source ecosystem, such as the JDBC server, column pruning, and columnar compression, and then outline what we are planning down the road. We then discuss some of our use cases that involve advanced algorithms, and how we implement these algorithms on top of Spark and Shark to provide interactive, insightful analytics to our data scientists.