Distributed Random Forest

Hosted By
Todd Holloway and srisatish

Details

Purdue Professor Jan Vitek will be stopping by to discuss his implementation of random forest in the open-source package H2O. Come join us!

Schedule
6:30 - 7:00pm Social (food + drinks served)
7:00 - 8:30pm Talks
8:30 - 9:00pm Social

Abstract
Random Forest is a promising algorithm for Big Data Science. In this talk Jan will discuss the design and implementation of Distributed Random Forest (DRF) in H2O, an implementation that scales to multiple nodes and big datasets. Unbalanced data and missing features pose unique problems for classification algorithms, and proper handling can lead to greater predictive power. A short demo presenting the power of DRF on datasets will spice up the evening.

Jan Vitek
"Jan researched nuances of R at Purdue as a Programming Languages geek. Jan paired with 0xdata to make a better world for Math. He is on sabbatical with 0xdata and is a full Professor of Computer Science at Purdue. Jan's students are solving some of the hardest problems in Programming Language and Virtual Machine Implementations. Jan is a hacker: he developed Distributed Random Forest for H2O.

Professor Vitek works on the foundations and implementation of programming languages and has an interest in program analysis, real-time systems, object-oriented software engineering, and information security. His research is conducted in the Secure Software Systems (S3) Lab, which he co-founded.

Dr. Vitek was born in Czechoslovakia and educated in Switzerland. He has authored over 100 papers and edited books on mobile objects and secure Internet programming. He has served on program committees for international conferences such as PLDI, OOPSLA, ECOOP, POPL, ESOP, ICALP, RTSS, and RTAS."

SF Data Mining
Trulia
116 New Montgomery St, 9th Floor · San Francisco, CA