Big Data Science Meetup on Julia

Hosted by Sanhita Sarkar

Details

6:30 P.M. - 7:00 P.M. Networking/Introduction

7:00 P.M. - 7:25 P.M. Session 1

Topic: Julia - a fresh approach to numerical computing and data science

Speaker: Tony Kelman, Julia Computing Inc.

Abstract: Julia is a high-level, high-performance dynamic programming language for numerical computing and data science. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s Base library, largely written in Julia itself, also integrates mature, best-of-breed open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing, to name a few. In addition, the Julia developer community contributes external packages through Julia’s built-in package manager; there are over 700 packages today (http://pkg.julialang.org/). Jupyter notebook, a collaboration between the Jupyter and Julia communities, provides a powerful browser-based graphical notebook interface to Julia.

In this talk, the audience will get a feel for Julia, its syntax and language features, and why they are a good fit for data science. The demos will be based on Jupyter notebooks, and show how to load datasets, analyze, and plot. We will show some of Julia's built-in parallel computing features, and highlight interoperability with other data ecosystems via the Julia packages PyCall, JavaCall, ProtoBuf, Thrift, and Elly (an HDFS/Yarn client).

Website: http://julialang.org/

Speaker Bio: Tony Kelman is a core contributor to the Julia programming language, and Software Engineer at Julia Computing Inc. He was previously a PhD student at UC Berkeley conducting research on optimization-based control theory.

7:25 P.M. - 7:30 P.M. Q/A

7:30 P.M. - 7:50 P.M. Session 2

Topic: Combining Julia, Scala and Spark for machine learning with CALSTATDN Model

Speakers: Shyam Sarkar and Ayush Sarkar, AyushNet

Abstract: The Internet of Things (IoT) has led to a wide variety of applications producing massive amounts of data in nearly every field, such as natural ecosystems, bioinformatics, smart cities, factories, automobiles, airplanes, and others. There is demand for efficient methods for near-real-time collection, processing, and analysis of data. The purpose of this presentation is to introduce a new model called CALSTATDN, which iterates over a sequence of computing stages based on a Calculus (CAL) based model, a Statistics (STAT) based model, and database normalization (DN). This computing model reduces the processing time of machine learning by several orders of magnitude. The model is applicable to any system with sensors, intelligent devices, and other internet sources, and significantly reduces the time needed to process and analyze streaming data for insight.

Julia is a high-level, high-performance dynamic programming language for scientific computing, with easy-to-write syntax. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Spark is a fast and general processing engine compatible with Hadoop data. It is designed to perform both batch processing (similar to MapReduce) and newer workloads such as streaming, interactive queries, and machine learning with MLlib, analyzing terabytes of data.

This talk will explore the possibilities of combining Spark, Scala, and Julia to create a new scientific computing platform for implementing the CALSTATDN model.

Speaker Bios: Dr. Shyam Sarkar is an entrepreneur in the Big Data industry. He has thirty years of experience in database research and development, and holds multiple patents and publications. Ayush Sarkar is a software developer with experience in mathematical modeling and analysis. The CALSTATDN model was jointly invented by Dr. Shyam Sarkar and Mr. Ayush Sarkar.

7:50 P.M. - 8:00 P.M. Networking

Big Data Science
MapR Technologies
350 Holger Way · San Jose, CA