April 24, 2013 · 6:30 PM
Please note the address of the venue for this meetup
We are happy to announce this meetup as part of Big Data Week.
6.30pm Welcome, networking, free beer+pizza
7pm Talks Start
“Approximate Methods for Scalable Data Mining” by Andrew Clegg, Data Scientist & Tech Manager of Analytics Team @Pearson. Probabilistic data structures let you trade accuracy for scalability by allowing a small, measurable amount of error in return for huge improvements in efficiency. Andrew’s talk provides an overview with use cases.
“Storm + Trident and The Holistic Architecture: Using Hadoop for batch and Storm for real time” by Yodit Stanton, Freelance Data Scientist, Developer & Systems Architect. Computing arbitrary functions on an arbitrary dataset in real time is a daunting problem. There is no single tool that provides a complete solution. Instead, you have to use a variety of tools and techniques to build a complete Big Data system. A Holistic Architecture may solve the problem of computing arbitrary functions on arbitrary data in real time by decomposing the problem into three layers: the batch layer, the serving layer, and the speed layer.
"PredictionIO - An Open Source Scalable Machine Learning Architecture" by Simon Chan Product Lead @ Prediction.IO To deal with big data in a production environment, a horizontally non-blocking and scalable system is needed. PredictionIO provides a flexible architecture for data engineers to evaluate algorithms and apply them to real applications. The whole stack is built on top of open source software while PreodictionIO itself is an open source Scala project. Simon will introduce the system design and answer any questions you may have as developers or data scientists.
9.30pm-ish meetup ends
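
As a taster for the first talk, here is a minimal sketch of one well-known probabilistic data structure, a Bloom filter. It is not taken from Andrew's talk: the class name, the sizing parameters, and the choice of SHA-256 with double hashing are all assumptions made for illustration. It shows the trade described in the abstract: membership checks never miss an item that was added, but they may occasionally report an item that was not, at a rate you pick up front.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative set-membership sketch (hypothetical class, not from the talk).

    No false negatives; false positives occur at roughly `error_rate`
    as long as no more than `expected_items` items are added.
    """

    def __init__(self, expected_items: int, error_rate: float) -> None:
        # Standard sizing formulas: m = -n ln(p) / (ln 2)^2, k = (m / n) ln 2.
        self.num_bits = max(
            1, math.ceil(-expected_items * math.log(error_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Example: remember 10,000 user ids with a target error rate of about 1%.
seen = BloomFilter(expected_items=10_000, error_rate=0.01)
for i in range(10_000):
    seen.add(f"user-{i}")

# Count how many ids that were never added are wrongly reported as present.
false_positives = sum(f"other-{i}" in seen for i in range(10_000))
```

With these assumed settings the filter occupies roughly 12 KB regardless of how long the ids are, which is the kind of efficiency gain the abstract refers to.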