Past Meetup

A High-Performance Implementation of Bayesian Clustering

21 people went


Join us to learn about the frontier of Bayesian methods and probabilistic machine learning: Bayesian nonparametrics. We will introduce both the mathematical and computational aspects of the most important such model, the Chinese Restaurant Process (CRP). The CRP is a fully probabilistic, generative model of clustering. First we will explore the theory and applications through both equations and plain-language graphical examples. Then we’ll tell you our story of woe and ultimate triumph in translating this powerful paper into software for big data through seven rounds of performance-optimization rewrites in Scala. Even if you’re not using the CRP, these time-honored strategies for speeding up complex calculations will help you unleash your statistical models on big data.
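
For readers new to the CRP, its generative story can be sketched in a few lines: each new customer joins an existing table with probability proportional to the number of people already seated there, or opens a new table with probability proportional to a concentration parameter. The Python sketch below is only an illustration of that textbook process, not the Scala implementation from the talk; the function name `sample_crp` and the parameter names `alpha` and `seed` are assumptions made for this example.

```python
import random


def sample_crp(n_customers, alpha, seed=None):
    """Draw one seating arrangement from a CRP (illustrative sketch).

    alpha is the concentration parameter and must be positive; larger
    values make new tables (clusters) more likely.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i
    for _ in range(n_customers):
        # Existing table k gets weight table_sizes[k];
        # the option "open a new table" gets weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # customer opens a new table
        else:
            table_sizes[k] += 1        # customer joins table k
        assignments.append(k)
    return assignments, table_sizes


if __name__ == "__main__":
    assignments, table_sizes = sample_crp(20, alpha=1.0, seed=42)
    print("table of each customer:", assignments)
    print("table sizes:", table_sizes)
```

Because popular tables attract more customers, the draw tends to produce a few large clusters and a tail of small ones, and the number of clusters is not fixed in advance.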

Cibo Technologies is a Computational Agronomy startup using daily, planetary-scale agricultural models to improve farms, commodity trading and global supply chains. Our team blends mechanistic biological models with data science, remote sensing and functional programming to maximize both farmers' harvests and environmental sustainability.

Ryan Richt leads software engineering and data science at Cibo. Ryan holds a BA in Math and an MBA from Washington University in St. Louis. After bootstrapping a genomics startup, Ryan spent 6 years at Monsanto leading a Scala distributed data mining team in Biotech R&D (read: GMOs), cloud transformation in IT and ultimately IT’s Monsanto Labs. Ryan is passionate about functional programming, Bayesian stats and distributed systems.