Past Meetup

Scalable Machine Learning for Big Data using R + H2O

29 people went


Tentative Agenda
6:30pm - 7:00pm: Pizza, Drinks, Socialize
7:00pm - 8:00pm: Distributed Generalized Linear Modeling with Tom Kraljevic
8:00pm - 8:30pm: Questions and Closing Remarks

H2O is for data scientists and business analysts who need scalable and fast machine learning. Unlike traditional analytics tools, H2O provides a combination of extraordinary math and high-performance parallel processing with unrivaled ease of use. H2O speaks the language of data science with support for R, Python, Scala, Java, and a robust REST API. At this Meetup, Tom Kraljevic will be speaking on Generalized Linear Modeling (GLM), the sliced bread of data science. It's transparent, it's flexible, and it allows response variables to follow different distributions and to be connected to the model by different link functions.
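To make the idea of "different distributions and link functions" concrete, here is a minimal sketch in plain Python. It is not H2O code and does not use the H2O API. The `LINKS` table and `predict_mean` helper are hypothetical names made up for illustration. It assumes the canonical link for each of three common families.

```python
import math

# Each family maps to (link g, inverse link g^-1), using the canonical link.
# g turns the mean response into the linear predictor; g^-1 goes back.
LINKS = {
    "gaussian": (lambda mu: mu, lambda eta: eta),            # identity link
    "binomial": (lambda mu: math.log(mu / (1.0 - mu)),       # logit link
                 lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "poisson": (math.log, math.exp),                         # log link
}


def predict_mean(family, coefficients, intercept, features):
    """Return the expected response: g^-1(intercept + coefficients . features)."""
    eta = intercept + sum(c * x for c, x in zip(coefficients, features))
    _, inverse_link = LINKS[family]
    return inverse_link(eta)


if __name__ == "__main__":
    row = [3.0]
    for family in LINKS:
        print(family, predict_mean(family, [0.5], -1.0, row))
```

The linear part of the model is the same for every family. Only the inverse link changes, which is why the same coefficients can describe a real-valued mean, a probability, or a count rate.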

In this talk we present an implementation of distributed GLM in the open-source math engine H2O. We also take a peek into regularization and ADMM, a technique that has been popularized by Stephen Boyd and is gaining ground among data science practitioners. We tie the theory up with a showcase of the power of GLM on 16 nodes over all 20 years of Airline Flight data, predicting which airports to avoid in your upcoming summer travels!
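For readers new to ADMM, the sketch below shows the textbook ADMM iteration for an L1-regularized (lasso) least-squares problem in plain Python. It is a single-machine toy, not H2O's distributed implementation. The helper names (`soft_threshold`, `solve`, `lasso_admm`) and the tiny data set are invented for illustration. The defaults `rho=1.0` and `iters=500` are arbitrary assumptions.

```python
def soft_threshold(v, kappa):
    """Shrink v toward zero by kappa (the proximal operator of kappa * |v|)."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def lasso_admm(A, b, lam, rho=1.0, iters=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 using ADMM."""
    m, n = len(A), len(A[0])
    # A^T A + rho * I and A^T b stay fixed across iterations.
    gram = [[sum(A[k][i] * A[k][j] for k in range(m)) + (rho if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(iters):
        # x-update: ridge-like least-squares solve.
        x = solve(gram, [atb[i] + rho * (z[i] - u[i]) for i in range(n)])
        # z-update: soft-thresholding enforces sparsity.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # u-update: accumulate the scaled dual residual.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    b = [2.0, -1.0, 1.0]
    for lam in (0.0, 0.5, 100.0):
        print(lam, lasso_admm(A, b, lam))
```

Splitting the problem into a smooth least-squares step and a cheap thresholding step is what makes ADMM attractive for regularized models. A larger `lam` pushes more coefficients to exactly zero.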

This is a fantastic opportunity to join your fellow Big Data enthusiasts and spend some time with one of the most talked-about analytics startups.

Speaker Bio:
Tom Kraljevic is VP of Engineering at H2O. Before joining H2O, Tom was Co-founder & CTO at Luminix, where he and the team developed a cutting-edge offline mobile application for Salesforce users. This involved a healthy blend of user-experience focus and deep dives into various technologies. Prior to Luminix, Tom was a Principal Engineer at Azul Systems, where he worked in both the JVM and System Software teams. Tom served as the technical leader for the distributed management application team, appliance security, and tools for distributed debugging. Tom’s experience at systems and chip startup companies involved straddling the hardware-software boundary.

Tom led pre-silicon verification infrastructure development for a terabit networking switch fabric chipset at Abrizio (acquired by PMC-Sierra). He also developed architectural CPU simulators, debuggers, and toolchains at Chromatic Research (acquired by ATI). Tom got his start in technology at Intel, spending internships and co-ops in both the Portland (MD6) and Santa Clara (MD7) microprocessor design groups. A lesser-known fact about Tom: while at Intel, he found the now-famous Pentium floating-point divide (FDIV) bug using the testing harness and methodology he developed, several months before it was independently discovered outside Intel and subsequently gained worldwide attention. Tom has an MS degree in Electrical Engineering from the University of Illinois at Urbana-Champaign and a BSE degree in Computer Engineering from the University of Michigan at Ann Arbor.