December SF Hadoop Users Meetup


Details
The December SF Hadoop User Group meetup will be held Wednesday, December 10, from 6:00pm to 8:00pm. This meetup will be hosted by 0xdata at Citizen Space, 425 2nd St, Suite 100, San Francisco. Food and drinks will be served.
Presentation: "Distributed Deep Learning for Classification and Regression Problems Using H2O"
Arno Candel - Physicist & Hacker, 0xdata
Abstract:
Deep Learning has been dominating recent machine learning competitions with better predictions. Unlike the neural networks of the past, modern Deep Learning methods have cracked the code for training stability and generalization. Deep Learning is not only the leader in image and speech recognition tasks, but is also emerging as the algorithm of choice for the highest predictive performance in traditional business analytics. This talk introduces Deep Learning and its implementation concepts in the open-source H2O in-memory prediction engine. Designed to solve business-critical problems on distributed compute clusters, it offers advanced features such as adaptive learning rate, dropout regularization, parameter tuning and a fully featured R interface. World-record performance on the classic MNIST dataset and best-in-class accuracy on a high-dimensional eBay text classification problem and other relevant datasets (including Kaggle) showcase the power of this game-changing technology. A whole new ecosystem of Intelligent Applications is emerging with Deep Learning at its core.
Prior to joining 0xdata as Physicist & Hacker, Arno was a founding Senior MTS at Skytree, where he designed and implemented high-performance machine learning algorithms. He has over a decade of experience in HPC with C++/MPI and had access to the world’s largest supercomputers as a Staff Scientist at SLAC National Accelerator Laboratory, where he participated in US DOE scientific computing initiatives. While at SLAC, he authored the first curvilinear finite-element simulation code for space-charge dominated relativistic free electrons and scaled it to thousands of compute nodes. He also led a collaboration with CERN to model the electromagnetic performance of CLIC, a ginormous e+e- collider and potential successor of the LHC. Arno has authored dozens of scientific papers and was a sought-after academic conference speaker. He holds a PhD and a Master's, summa cum laude, in Physics from ETH Zurich. Arno was named a 2014 Big Data All-Star by Fortune Magazine.
Presentation: "H2O: Big Data Machine Learning for Everyone"
Yan Zou - Head of Product and Data Science, 0xdata (H2O.ai)
H2O is an open-source machine learning platform. In this talk, I will use examples to demonstrate how any analytics professional can use H2O to gain instant insights from big data, and then introduce a few of H2O's powerful algorithms, namely Random Forest, Gradient Boosting, and Deep Learning. The audience will see that H2O is not only very easy to use, but also powerful in the algorithms it implements. I'll also talk about the scoring and deployment model for H2O, focusing on H2O's NanoFast Scoring engine, and how companies such as Cisco are able to run 60K+ ensemble models on 160M+ rows and 1000+ features on a 4-machine cluster in 2 days.
