Next Meetup

Machine Learning Best Algorithms: Gradient Boosting Machines (GBM)
RSVP on Eventbrite, see link below. We'll have a main talk (30 mins) and 3 excellent lightning talks about the machine learning algorithm that usually achieves the best accuracy on structured/tabular data (e.g. in industry/business applications or in Kaggle competitions):

Better than Deep Learning: Gradient Boosting Machines (GBM)
by Szilard Pafka, PhD
Chief Scientist, Epoch

Abstract: With all the hype about deep learning and "AI", it is not well publicized that for structured/tabular data widely encountered in business applications it is actually another machine learning algorithm, the gradient boosting machine (GBM), that most often achieves the highest accuracy in supervised learning tasks. In this talk we'll review some of the main GBM implementations available as R and Python packages, such as xgboost, h2o, lightgbm etc., discuss some of their main features and characteristics, and see how tuning GBMs and creating ensembles of the best models can achieve the best prediction accuracy for many business problems.

Bio: Szilard studied Physics in the 90s and obtained a PhD by using statistical methods to analyze the risk of financial portfolios. He worked in finance, then more than a decade ago moved to become the Chief Scientist of a tech company in Santa Monica doing everything data (analysis, modeling, data visualization, machine learning, data infrastructure etc). He is the founder/organizer of several meetups in the Los Angeles area (R, data science etc) and the data science community website. He is the author of a well-known machine learning benchmark on github (1000+ stars), a frequent speaker at conferences (keynote/invited at KDD, R-finance, Crunch, eRum and contributed at useR!, PAW, EARL etc.), and he has developed and taught graduate data science and machine learning courses as a visiting professor at two universities (UCLA and CEU in Europe).

Lightning talks:

1. Not your father’s objective function - weird things you can do with xgboost
by Peter Foley
Vice President, Analytics at 605

XGBoost makes it easy to predict a variety of different outcome types -- binary, continuous, ranked, categorical, count -- and also supports custom objective functions for specialized needs. Those custom objective functions let you do some even weirder stuff that you might not expect. I’ll give examples of hacking custom objectives to fit vector-valued outcomes and local linear regressions, and give tips on how to plug your own weird functions into xgboost to take advantage of its powerful and fast tree construction.

Bio: Peter leads the analytics and data science team, and manages 605’s behavioral modeling and experimentation. 605 offers unique, independent television audience measurement and analytics to build better marketing and programming initiatives within the media and entertainment industries.

2. TBD (GBM model interpretability)
by Michael Tiernay
Senior Data Scientist at Netflix

TBD

3. Why did lightGBM become my 1st choice in Kaggle?
by Hang Li
Hulu

Xgboost became the most popular algorithm in Kaggle competitions 3 or 4 years ago. Recently, another open source GBM implementation (lightGBM) was introduced to the Kaggle community by Microsoft. Due to its good performance, it has become my 1st choice for Kaggle competitions. In this talk I will briefly introduce some of the nice features of lightGBM.

Bio: Hang is a Competition Master at Kaggle. He has participated in over 30 data science competitions. He has a strong passion for using machine learning techniques to solve real-world problems.

Timeline:
– 6:00pm arrival, pizza+drinks and networking
– 7:00pm talks start

You must have a confirmed RSVP, and please arrive by 6:55pm at the latest.

Please RSVP here on Eventbrite:


2500 Broadway · Santa Monica, CA

    What we're about

    This meetup group aims to bring together data professionals and businesses where data and analytics play a central role. Topics: data warehouses, data pipelines, ETL, business analytics, MPP, big data, business intelligence, OLAP, data visualization, dashboards - using tools ranging from open source to commercial products for large enterprises.
