Personalizing web traffic is a hard problem: data sets easily reach millions of rows, and the infrastructure needed to handle them can quickly become excessive. I'll present a fully serverless approach to serving recommendations for web traffic using AWS Lambda and PySpark on Amazon EMR (Elastic MapReduce). We'll look at how to use Lambda functions to programmatically create an EMR cluster, then take a short trip down PySpark lane to look at a simple recommendation engine built on collaborative filtering.
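As a rough illustration of the first half of the talk, here is a minimal sketch of a Lambda handler that launches a short-lived EMR cluster to run a PySpark job. It is not the speaker's implementation. The cluster name, instance types, release label, the default IAM role names, and the SCRIPT_URI and LOG_URI environment variables are all assumptions made for this example; only the boto3 `run_job_flow` call and its parameter names come from the AWS SDK.

```python
import os


def build_cluster_request(script_uri, log_uri, release_label="emr-5.30.0"):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All concrete values (names, instance types, roles) are illustrative
    assumptions, not settings taken from the talk.
    """
    return {
        "Name": "recommendations-batch",  # hypothetical cluster name
        "ReleaseLabel": release_label,
        "LogUri": log_uri,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Shut the cluster down once the step finishes, so nothing
            # keeps running (or billing) between jobs.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [{
            "Name": "train-recommender",  # hypothetical step name
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
            },
        }],
        # Default EMR role names; assumed to already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def lambda_handler(event, context):
    """Lambda entry point: start the cluster and return its ID."""
    import boto3  # imported here so the request builder can be tested offline

    emr = boto3.client("emr")
    request = build_cluster_request(os.environ["SCRIPT_URI"], os.environ["LOG_URI"])
    response = emr.run_job_flow(**request)
    return {"cluster_id": response["JobFlowId"]}
```

Keeping the request builder separate from the handler means the cluster configuration can be inspected without AWS credentials. The PySpark script at SCRIPT_URI would hold the collaborative filtering job itself.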
About The Speaker:
Rob Harrigan is a data engineer at CBS Interactive 24/7 Sports in Brentwood. He holds a BS from Rochester Institute of Technology (2011), an MBA from the University of Tennessee at Martin (2013), and a PhD in Electrical Engineering from Vanderbilt University (2017). He has a passion for integrating computer science and statistics to solve real-world problems with data.