Past Meetup

"Mathematical Shape of Big Data" @ Hadoop Summit, 2014

120 people went


5:00 P.M. - 6:30 P.M. Meetup Cocktail Reception

– Almaden Ballroom – Hilton San Jose

6:30 P.M. - 6:45 P.M. Introduction

6:45 P.M. - 7:20 P.M. Session 1

Title: Mathematical Shape of Big Data and Future Challenges of Data Sciences

Speakers: Dr. Shyam Sarkar and Dr. Sanhita Sarkar, Organizers of Big Data Science

Abstract: New multimodal big data sets from biological systems, human social systems, financial systems, and many other systems are gathered even before data scientists have any hypothesis. Some scientists assert the need for an underlying global theory on a par with the invention of calculus. A broader theoretical model should integrate the various techniques and tools being developed. Machine learning currently serves as a standard technique for big data analysis. Many machine learning methods are most effective when working with data matrices or sets of vectors. In many cases, data sets are too complex and apparently do not fit this framework. Some researchers are trying to find structure in unstructured data so that machine learning algorithms can be applied. Structures in big data should be found using techniques of calculus even before machine learning algorithms can be applied as techniques of statistics. A sequence of successive applications of such techniques of calculus and statistics may very well lead to a general theoretical model.

7:20 P.M. - 7:30 P.M. Q/A

7:30 P.M. - 8:05 P.M. Session 2

Title: Bayesian Network with R and Hadoop

Speaker: Ofer Mendelevitch, Director of Data Sciences, Hortonworks

Abstract: A Bayesian network is an intuitive graphical model that effectively models various real-world problems in fields such as genetic research, medicine, robotics, document classification, image processing, and gaming. In this talk Ofer will provide an overview of Bayesian networks and describe how to use R for learning and inference with Bayesian network models, covering challenges and solutions for large-scale inference with Bayesian networks using R and Hadoop.

Speaker Bio: Ofer Mendelevitch is Director of Data Sciences at Hortonworks, where he is responsible for professional services involving data science with Hadoop. Prior to joining Hortonworks, Ofer served as Entrepreneur in Residence at XSeed Capital, where he developed an investment strategy around big data. Before XSeed, Ofer served as VP of Engineering at Nor1, and before that he was Director of Engineering at Yahoo!, where he led multiple engineering and data science teams responsible for R&D of large-scale computational advertising projects, including CTR prediction (with Hadoop), a new front-end ad-serving system, and sales tools.

8:05 P.M. - 8:15 P.M. Q/A

8:15 P.M. - 8:30 P.M. Break

8:30 P.M. - 9:05 P.M. Session 3

Title: The Triumph of Power Law and other stories from big data

Speaker: Sri Satish Ambati, CEO of 0xdata

Abstract: Data represents the universe and its phenomena. The universe is sparse, and networks follow the power law. In this talk we look at examples of the empirical structure of human-to-human and machine interactions and seek patterns and topological structure. New tools like H2O and techniques like ADMM allow us to get beyond the curse of dimensionality and look for interactions among dimensions. The pathology of big data begins by using eigen as a lens and ends with a story that transforms innovation and enterprises. Making the world a better place with data & science is a (stochastic) process to be continued...

Speaker Bio: Sri is co-founder and CEO of 0xdata (@hexadata), the builders of H2O. H2O democratizes big data science and makes Hadoop do math for better predictions. Before 0xdata, Sri spent time scaling R over big data with researchers at Purdue and Stanford. Prior to that, Sri co-founded Platfora and was the Director of Engineering at DataStax. Before that, Sri was Partner & Performance Engineer at the Java multi-core startup Azul Systems, tinkering with the entire ecosystem of enterprise apps at scale. Before that, Sri was on sabbatical pursuing Theoretical Neuroscience at Berkeley. Prior to that, Sri worked on a NoSQL trie-based index for semistructured data at the in-memory index startup RightOrder.

Sri is known for his knack for envisioning killer apps in fast-evolving spaces and assembling stellar teams to productize that vision. A regular speaker on the Big Data, NoSQL, and Java circuit, Sri leaves a trail at @srisatish.

9:05 P.M. - 9:15 P.M. Q/A

9:15 P.M. - 10:00 P.M. Networking
