
Have presentations and discuss Machine Learning scalability

In this meetup, hosted by SAP, we will have two presentations, each followed by a Q&A session.


A short introduction to the SAP HANA Co-Innovation Lab will be given by Mike Kemelmakher.

The first presentation will cover a real-life example of scaling a particular ML algorithm (the Naive Bayes classifier) using Hadoop.

This simple algorithm can pose challenges when applied to web-scale data. The problems arise from the size of the training sets, the fact that the model is bigger than the available physical memory of a single node, and the amount of computation required.
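
To make the challenge concrete, here is a minimal sketch of Naive Bayes training expressed as a map step and a reduce step over (label, token) counts. It is plain single-process Python, not Hadoop code and not the presenter's implementation; the function names (map_counts, reduce_counts, train, classify), the whitespace tokenizer, and the add-one smoothing parameter alpha are all assumptions made for illustration. The model is essentially the table of counts, with one entry per distinct (label, token) pair, which is why it can outgrow the memory of a single node on web-scale data.

```python
import math
from collections import Counter
from itertools import chain


def map_counts(label, text):
    """Map step: emit ((label, token), 1) for every token of one document."""
    for token in text.lower().split():
        yield (label, token), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per (label, token) key."""
    totals = Counter()
    for key, n in pairs:
        totals[key] += n
    return totals


def train(docs):
    """Build the model: per-(label, token) counts and per-label document counts."""
    emitted = chain.from_iterable(map_counts(label, text) for label, text in docs)
    feature_counts = reduce_counts(emitted)
    label_counts = Counter(label for label, _ in docs)
    return feature_counts, label_counts


def classify(text, feature_counts, label_counts, alpha=1.0):
    """Return the label with the highest smoothed log-probability."""
    vocab = {token for _, token in feature_counts}
    total_docs = sum(label_counts.values())
    tokens_per_label = Counter()
    for (label, _), n in feature_counts.items():
        tokens_per_label[label] += n

    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        denom = tokens_per_label[label] + alpha * len(vocab)
        for token in text.lower().split():
            # Counter returns 0 for unseen pairs; alpha keeps the log finite.
            score += math.log((feature_counts[(label, token)] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set (made up for this sketch).
docs = [
    ("sports", "football match goal"),
    ("sports", "goal score team"),
    ("tech", "hadoop cluster node"),
    ("tech", "memory node model"),
]
feature_counts, label_counts = train(docs)
predicted = classify("goal team", feature_counts, label_counts)
```

In a real Hadoop job the map and reduce steps would run on different machines and the count table would be partitioned by key; this sketch keeps everything in one dictionary purely to show the shape of the computation.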

It will be a technical presentation aimed at people interested in scalable Machine Learning.

Partial disclosure was kindly permitted by SimilarGroup, the company that powers the SimilarWeb web analytics site.

Presenter: David Gruzman, BigDataCraft.com


SAP will present the company's activities in the field of Predictive Analytics and give an overview of the SAP HANA Predictive Analysis tool.

Accurately predict and act on big data, while extending insights across the business. See how integrating SAP Predictive Analysis with SAP HANA can help you expose never-before-seen opportunities and risks.

Presenter: John MacGregor, Head of SAP Centre for Predictive Analytics (the presentation will be given in English)


After both presentations we will open the floor for discussion.

The event will be hosted by SAP HANA Co-Innovation Lab.
Address: SAP Labs IL, Hatidhar 15, Raanana


  • Gilad B.

    Technology and science are intermittent, within the process of implementing an algorithm. It seems you guys @ BigDataCraft.com are big data experts, no doubt about it. However, sometimes a good scientific understanding may greatly reduce the complexity of the target problem. For example, as one of the audience mentioned, separating the problem by language may reduce the problem size dramatically. Having said that, even the simplest NB classification algorithm has its own golden tips & tricks, e.g. term weighting, smoothing, and removing hapax legomena terms. Keeping them in mind may well lead to a much simpler problem, and hence to a simpler solution.

    1 · March 15, 2013

    • David G.

      Indeed most of our expertise is in big data machinery and not in pure ML.
      I completely agree that there are ways to significantly reduce model size, and indeed many improvements in results were achieved by playing with feature weights and regularization.
      We intentionally wanted to build an NB engine that can cope with very large models, so as not to be constrained by size. For example, adding even a fraction of the n-grams leads to a many-fold increase in model size.
      Specifically in this case, removing features with small weights led to very bad results. It looks like in internet data most of the information sits in the long tail.

      March 16, 2013

  • Tomer Y.

    Interesting indeed!
    Does anyone use R packages for building map-reduce jobs on Hadoop, such as RHadoop or RHipe, and want to share their experience with us?

    March 15, 2013

  • Evgeny B.

    A very interesting topic. Thanks, David

    March 15, 2013

  • Ran L.

    Presentations were good, and John MacGregor was a real delight; a bit more 'networking' and open discussion would've been great.
    Oh, and the guy REALLY DID INVENT parallel coordinates! :)
    http://en.m.wikipedia.org/wiki/Parallel_coordinates

    1 · March 14, 2013

  • Ofir M.

    Interesting presentations, a bit too short

    March 14, 2013

  • Dan

    David - looking forward to hearing and seeing you again!

    March 8, 2013
