What makes machine learning easy to program and what makes it fast?

Hosted by Arshak Navruzyan
Details

First Topic - Fast libsvm learning and cross-validation on dense datasets (https://github.com/clinicalpersona/cross_svm)

cross_svm is an open source modification of libsvm, the widely used learning program for Support Vector Machines. It is usually at least 10x faster than libsvm in cross-validation mode on dense datasets (dense datasets are those with a relatively small fraction of zero elements). Such datasets are common in genomics applications.
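To make the notion of a dense dataset concrete, here is a minimal Python sketch. It is an illustration only, not code from cross_svm: the helper names `density` and `to_libsvm_line` and the sample values are hypothetical. It computes the fraction of non-zero elements and writes each sample in libsvm's sparse text format (`label index:value ...`, with 1-based indices and zeros omitted).

```python
def density(rows):
    """Return the fraction of non-zero entries in a list-of-lists matrix."""
    total = sum(len(r) for r in rows)
    nonzero = sum(1 for r in rows for v in r if v != 0)
    return nonzero / total if total else 0.0


def to_libsvm_line(label, row):
    """Format one sample as a libsvm text line: 1-based indices, zeros omitted."""
    feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(row, start=1) if v != 0)
    return f"{label} {feats}".rstrip()


# Hypothetical toy data: two samples with four features each.
X = [[0.5, 1.2, 0.0, 3.1],
     [2.0, 0.7, 1.1, 0.4]]
y = [1, -1]

print(density(X))  # 7 of 8 entries are non-zero
for label, row in zip(y, X):
    print(to_libsvm_line(label, row))
```

When nearly every entry is non-zero, as here, the index:value encoding saves little space, since almost every feature still has to be listed along with its index.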

Dejan Miljkovic is Senior Software Engineer at Argyle Data, a company that provides real-time fraud analytics at network speed and Hadoop scale. Prior to Argyle Data, Dr. Miljkovic worked in the e-commerce, power systems and process control industries. His current interests are in machine learning and high performance algorithm implementations.

Ljubomir Buturovic is CEO and co-founder of Clinical Persona, a consulting company providing machine learning solutions to clients in the life science industry. Prior to Clinical Persona, Dr. Buturovic served as Chief Scientist at Pathwork Diagnostics, leading the informatics team which delivered machine learning predictive algorithms for two FDA-cleared, clinically validated genomic tests for cancer.

Second Topic - What makes machine learning easy to program and what makes it fast?

The new Mahout DSL has two aims. The first is to make it easy to program distributed machine learning algorithms using a math-like notation. The second is to let such programs run fast by allowing alternative back-end computational engines. The primary back-end for Mahout is currently Spark, but work is also under way on the h2o system. I will talk about how these back-ends help achieve these two goals, with particular attention to how speed is achieved.
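The idea of math-like notation decoupled from the engine that runs it can be sketched with a toy example in Python. This is not the Mahout DSL or its API: every class and function name below is hypothetical. The expression `A.T @ A` only builds a small tree, and two interchangeable evaluators then compute it. One follows the tree literally, while the other recognizes the transpose-times-self pattern and accumulates row outer products without materializing the transpose.

```python
class Expr:
    """Base class: operators build an expression tree instead of computing."""
    def __matmul__(self, other):
        return MatMul(self, other)

    @property
    def T(self):
        return Transpose(self)


class Leaf(Expr):
    def __init__(self, rows):
        self.rows = rows


class Transpose(Expr):
    def __init__(self, child):
        self.child = child


class MatMul(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def eval_naive(e):
    """Back-end 1: evaluate the tree exactly as written."""
    if isinstance(e, Leaf):
        return e.rows
    if isinstance(e, Transpose):
        return transpose(eval_naive(e.child))
    return matmul(eval_naive(e.left), eval_naive(e.right))


def eval_fused(e):
    """Back-end 2: rewrite A.T @ A as a sum of row outer products."""
    if (isinstance(e, MatMul) and isinstance(e.left, Transpose)
            and e.left.child is e.right):
        a = eval_fused(e.right)
        n = len(a[0])
        out = [[0] * n for _ in range(n)]
        for row in a:  # one pass over the rows, no transpose built
            for i in range(n):
                for j in range(n):
                    out[i][j] += row[i] * row[j]
        return out
    if isinstance(e, Leaf):
        return e.rows
    if isinstance(e, Transpose):
        return transpose(eval_fused(e.child))
    return matmul(eval_fused(e.left), eval_fused(e.right))


A = Leaf([[1, 2], [3, 4], [5, 6]])
gram = A.T @ A  # builds a tree; nothing is computed yet
print(eval_naive(gram))
print(eval_fused(gram))
```

Both evaluators return the same matrix. The point of the sketch is that the program text stays the same while the execution strategy changes underneath it.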

Ted Dunning is Chief Applications Architect at MapR Technologies, a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects, and a mentor for the Apache Storm, DataFu, Flink and Optiq projects. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (LifeLock), and he has 24 patents issued to date and a dozen pending. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.

SF GenAI LLMs Group
A9
130 Lytton Avenue, Palo Alto, CA