Hadoop and Machine Learning

The Big Data buzzword represents the convergence of massive, dynamic data sets with powerful techniques for effective use of that data. At the core is a scale-out architecture involving distributed computation and storage.

Machine learning algorithms are being adapted to work effectively in distributed environments. Eron will examine a few frameworks and tools for scale-out machine learning, discuss how the Hadoop framework is evolving to support a greater diversity of machine learning algorithms, and touch on the use of the cloud for hosting scale-out machine learning projects.

The session will consist of a presentation, some demonstrations, and open discussion.

About the speaker:

Eron Wright is a Director of Engineering at EMC. He works on the ViPR product, delivering data services for petabyte-scale storage with integrated Hadoop capabilities. He was formerly a developer at Microsoft, working on the Windows Azure platform. In his spare time, he is a machine learning hobbyist and welcomes our new sentient overlords!


  • Eron W.

    Thanks everyone for attending! I will post some links soon.

    1 · February 7

    • Anand G.

      Can you please post the presentation? I missed the meetup unfortunately.

      February 11

    • Stephen

      Any chance you can post the links?

      March 23

  • Stephen

    Eron's talk was totally awesome and very informative!

    (Eron - don't sweat it about preparation, you talking ad-lib is far better than most prepared talks)

    March 23

  • Qiming H.

    You are more than welcome to join the following data center product briefing


    February 21

  • Dan B.

    People, last night I presented at the
    Title: Use Postgres and MADlib Logistic Regression for Stockmarket Predictions
    Here is the github repo:

    1 · February 7

  • Song C.

    Hi guys, I have started to post my Hadoop code at https://github.com/songcui and song-cui.blogspot.com/ and will post more in the future. Most of the algorithms try to solve problems covered in the Stanford courses CS 246 and CS 246H, which focus on mining massive datasets. I hope you find it useful, and your comments are welcome!

    2 · February 7

    • Vamshideep

      Thanks, Song, for posting it before the presentation!

      February 7

  • anne

    Free IEEE Computer Society presentation, Feb 11, hosted by Cadence in San Jose

    February 2

  • Mahi

    Interested in learning Hadoop and machine learning.

    February 1
