Using Existing Math Libraries with Spark

Name: Using Existing Math Libraries with Spark
Start: 2015-06-24T17:30:00-05:00
End: 2015-06-24T20:30:00-05:00
Location: Orbitz

Hosted By

Dean W.

Details

Brian Spector from the Numerical Algorithms Group (NAG) will discuss using existing math libraries on Spark. Brian is a Technical Consultant at NAG, where he has begun implementing the NAG Library's 1600 mathematical routines for Big Data applications. He will share the pitfalls and successes he has encountered while using a numerical library in a distributed computing environment. Today's mathematical algorithms require all relevant data to be in memory at run time for efficiency. Because this differs from how Spark works with data, we must rethink our algorithms for Big Data applications. As an example, we will review the simple linear regression problem and find that it is not so simple to run on hundreds of GBs of data. We'll touch on efficient algorithms for Big Data applications and the importance of scaling as you increase the number of worker nodes. Other topics covered include starting a Spark EC2 instance and the steps required to use existing libraries on Spark.
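To make the linear regression point concrete, here is a minimal sketch of one common way to fit a line when the data cannot sit in memory in one place. It is not taken from the talk and does not use the NAG Library. Each partition is reduced to five running sums, and only those small summaries are combined. The function names (partition_stats, combine, fit_line) and the toy data are hypothetical, and plain Python lists stand in for Spark partitions.

```python
from functools import reduce

def partition_stats(rows):
    """Reduce one partition of (x, y) pairs to five running sums."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)

def combine(a, b):
    """Merge two partition summaries by adding them element-wise."""
    return tuple(p + q for p, q in zip(a, b))

def fit_line(partitions):
    """Fit y = slope * x + intercept from per-partition summaries."""
    zero = (0.0, 0.0, 0.0, 0.0, 0.0)
    n, sx, sy, sxx, sxy = reduce(combine, map(partition_stats, partitions), zero)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Toy data (hypothetical): points on y = 2x + 1, split across three "partitions".
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]
slope, intercept = fit_line(partitions)
print(slope, intercept)
```

On a real cluster the same shape would plausibly map onto RDD.mapPartitions for the per-partition sums and RDD.reduce for the merge. Raw sums of squares can lose precision on large or badly scaled data, so a production version would likely need a more numerically careful formulation.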

Wednesday, June 24, 2015, 5:30 PM to 8:30 PM CDT

Orbitz

500 W Madison St · Chicago, IL

Chicago Spark Users

