Data Science Classroom: Ensemble Methods

  • December 17, 2013 · 6:30 PM

For our December Meetup, we're thrilled to bring you the next in our occasional "Data Science Classroom" series, about foundational topics in statistics and machine learning. This time, we have Jay Hyer, GMU graduate student, introducing ensemble learning, an important set of techniques that combine the results from a large number of learners to get better predictions. 

NOTE: This event will be at a different venue -- the gorgeous offices of Gallup, in an old Masonic temple, conveniently located between the Gallery Place and Metro Center metro stations.


6:30pm -- Networking, Empanadas, and Refreshments

7:00pm -- Introduction

7:15pm -- Presentations and discussion

8:30pm -- Adjourn for Data Drinks (Fado, 808 7th St.)

Abstract:  This presentation will review how ensemble learning differs from more traditional machine learning techniques in its approach to modeling. The discussion will cover three popular methods: boosting, bagging and stacking.  The talk will finish with an overview of modern advancements and applications of ensemble methods.
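As a rough illustration of why combining many learners can improve predictions, here is a minimal sketch in Python. It is not taken from the talk. It assumes an idealized setting in which every learner is independently correct with the same probability, which real bagged or boosted models do not satisfy exactly; the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_learners: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent learners is correct.

    Assumption (idealized): each learner is right with probability p_correct,
    independently of the others. Real ensemble members are correlated.
    """
    if n_learners % 2 == 0:
        raise ValueError("use an odd number of learners to avoid tied votes")
    k_min = n_learners // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_learners, k) * p_correct**k * (1 - p_correct) ** (n_learners - k)
        for k in range(k_min, n_learners + 1)
    )


if __name__ == "__main__":
    # Accuracy of a majority vote over 1, 11, and 101 weak learners (p = 0.6).
    for n in (1, 11, 101):
        print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the vote becomes more accurate as learners are added, provided each learner is better than chance. Bagging, boosting and stacking can be read as different ways of building and combining learners so that their errors do not all coincide.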

Bio: Jay K. Hyer is currently pursuing a PhD in Computational Science and Informatics at George Mason University, where he also earned his MS in Statistics in 2009. Jay works as a data scientist for a DC-based research, technology and consulting firm, where he contributes to the development and delivery of technology solutions for institutes of higher education. Follow Jay on Twitter at @aDataHead.



This event is sponsored by Intridea, Cloudera, Statistics.com, Elder Research, and MemSQL.


  • Tommy J.

    One of my favorite presentations to date. I admit, I'm a sucker for classroom-style presentations, especially on algorithms.

    1 · December 22, 2013

  • Jay K.

    Thanks again to Harlan and the DSDC crew, Gallup, and the audience. Without everyone there wouldn't have been a meetup to present at!

    Please find below a link to a PDF version of my slides. I also added a further reading / R package slide.

    2 · December 20, 2013

  • John K.

    Thanks for the talk Jay; you are a very good explainer.

    December 20, 2013

    • Jay K.

      Thank you very much John! I appreciate the compliment and opportunity.

      December 20, 2013

  • Brand N.

    Fahad, I am planning on teaching a data science class at GMU in January 2014:

    and just wrote a Blog for Data Science DC:

    that may help you learn how to become a data scientist.

    You might also consider becoming a data journalist:

    2 · November 26, 2013

    • Azad N.

      Definitely yes. However, if given the option, I would prefer to register for this course at GMU, as it would count toward my PhD credits and would also be very helpful for my research.

      December 17, 2013

    • cpboc

      If anyone is interested, Stanford is hosting a free introductory course on Statistical Learning (https://class.stanford...). You can earn a Statement of Achievement signed by Profs. Hastie and Tibshirani, who are also the co-authors of two of the books Mr. Hyer recommended.

      3 · December 19, 2013

  • Domenico

    How does one recommend a speaker?

    December 18, 2013

    • Harlan H.

      Click the big Contact button on the left sidebar! Thanks!

      December 18, 2013

    • Sean Moore G.

      You can also tweet @DataCommunityDC, @HarlanH, or @SeanMGonzalez with the handle of the speaker.

      December 19, 2013

  • Fei X.

    Very neat talk. I am wondering if Jay would post the slides later?

    1 · December 18, 2013

  • John K.

    Yes. Invite Jay back to talk on other things. Excellent presentation.

    1 · December 18, 2013

  • Shuli P.


    1 · December 18, 2013

  • Rick J.

    Excellent. One of the best meetup talks I've attended. Good job!

    1 · December 17, 2013

  • Valerie

    Sans serif fonts are a lot more readable in presentations!

    2 · December 17, 2013

    • Abhijit

      Yeah but looks like LaTeX defaults were used :-(

      December 17, 2013

  • Fahad K.

    Hey Guys, This is finals week for me. I won't be able to make it tonight. So sorry for changing plans later but I really hope to attend the meetups for DSDC next time.

    December 17, 2013

  • David M.

    Hey all,

    I've been looking forward to this meetup since it was posted, but I'm not sure if I'll be able to make it. Will there be a recording of the presentation? Or maybe the slides will be made available? I really want to come but I'm not sure if I can float it tonight. In any event, thanks for setting all this up!

    December 17, 2013

  • Shalina

    I can't make it tonight. I would like to get the presentation.

    December 17, 2013

  • Fahad K.

    Hello Everyone! I am currently a student seeking advice on how to become a data scientist, what path to take and how to get there. Is this a good meetup to start for beginners in data science? I also looked at the Computational Data Sciences program at GMU, however, after talking to one of the professors (Prof. Kirk Borne) I have learned that the CDS program is no longer offered? Can someone confirm this?

    November 22, 2013

    • Fahad K.

      Thank you guys for all of your detailed answers! That is A LOT of ground to cover, so I really appreciate your replies regarding DS in general. Looking forward to meeting many of you at the meetups!

      November 24, 2013

    • Qishen H.

      IMO, Data Scientist = serious Math/Stat + serious computer science = MS in Math/Stat + MS or BS in Computer Science.

      December 17, 2013

  • Harlan H.

    We're looking forward to seeing everyone tomorrow! Also, Jay has written a list of ensemble learning resources that he recommends, and they're on the DC2 blog:

    December 16, 2013

    • Fahad K.

      These programs look like an ambitious goal to aim for, Thanks for the link Andy.

      December 5, 2013

  • Brand N.

    Continued: An example of the "Intermediate" or "Advanced" track would be my recent data science audit using Ben Shneiderman's "8 Golden Rules of Data Science" for Drug Smuggling: Global and Local Detection:

    I look forward to your feedback to help design the course.

    November 28, 2013

  • Brand N.

    I added a section on Prerequisites:

    Please tell me if you understand these previous GMU lectures for first year engineering and computer science students.

    Also, please see if you can follow, and even recreate, the example I did today using Data Science Central, which is offering a new Data Science Book and Apprenticeship:

    To recreate it, you would need the spreadsheet, a free tool like Silver Spotfire, and my dashboard. This would be an example of the "Intro" track, which does not require data preparation or coding.

    November 28, 2013

  • John K.

    What's the easiest/simplest way to begin to work with/understand ensembles?

    November 6, 2013

    • Alex P.

      ^ if you're impatient go to 47:40.

      1 · November 6, 2013

    • John K.

      watching right now; very nice. thanks.

      November 6, 2013

  • Jim D.

    Ensembles can give multi-perspective insights with biomedical classification and clustering applications. I am sure this is true for other subject matter domains as well. Ensembles are of high interest to me right now and so I am happy that this meeting has been organized. Thank you!

    1 · November 6, 2013

  • Amrinder A.

    Great! Ensemble methods are specifically an area of interest for me, and I am glad that there is a presentation on this topic. Great venue too! PS: I am not coming there just for empanadas. OK, well, maybe I am.

    November 5, 2013
