Big Data Science Meetup @ Strata Conference

 

Link for Strata Conference: http://oreil.ly/1d2X5PK

5:30 P.M. - 6:00 P.M.  Welcome

6:00 P.M. - 6:25 P.M.  Session 1

Title: ADAM: Big Data Processing and Storage for Genomics

Speaker: Frank Austin Nothaft, Graduate Student at UC-Berkeley

Abstract: By using cloud computing services or large computing clusters to process genomic data, we can significantly decrease the cost and latency of genomic analysis. However, current genomics data formats and processing pipelines were introduced prior to many significant advances in cloud and cluster computing technologies. Through the careful design of new file formats, we can unlock the advantages of distributed computing and also ensure that the file formats can easily be optimized for future computing advances. In this talk, we introduce ADAM, a set of file formats and command line tools for processing genome data on clusters and in the cloud. Using 100 nodes from Amazon Web Services, ADAM performs genetic processing steps such as marking duplicates and sorting 40 to 50 times faster.

Speaker Bio: Frank Austin Nothaft is a graduate student in the AMP and ASPIRE labs at UC-Berkeley, advised by Prof. David Patterson. His current focus is on high performance computer systems for bioinformatics, and he is involved in the ADAM, avocado, and FireBox projects. Prior to Berkeley, Frank worked at Broadcom in Irvine, CA on high performance electronic design automation. Frank has a Bachelor of Science with Honors in Electrical Engineering from Stanford University.

6:25 P.M. - 6:30 P.M.  Q/A


6:30 P.M. - 7:15 P.M.  Session 2

Title:  All models are wrong, but some models are useful

Speaker: SriSatish Ambati, Founder and CEO of 0xData (@hexadata)

Abstract: The promise of big data is better predictions. There is no best model that works for all of your data. Model predictive performance is domain specific. What works in one data domain sometimes has very little consequence in another. Data science needs to get closer to the business and unlock value.

Ensembles are here to stay! Users want a buffet of algorithms that try to "lock-pick" the data for its secrets. Time is eventually the key limiter. Data science efforts have to make the best of the budget for experimentation and use some kind of co-evolutionary technique that picks the "Champion" model of models for your data. Robust automation and fast analytics can speed up large parts of data smithy. In this talk we discuss ensemble techniques of boosting & trees that, when applied to use cases, lead to substantially better predictions.

Speaker Bio: Sri is co-founder and CEO of 0xdata (@hexadata), the builders of H2O. H2O democratizes big data science and makes Hadoop do math for better predictions. Before 0xdata, Sri spent time scaling R over big data with researchers at Purdue and Stanford. Prior to that, Sri co-founded Platfora and was the Director of Engineering at DataStax. Before that, Sri was Partner & Performance Engineer at the Java multi-core startup Azul Systems, tinkering with the entire ecosystem of enterprise apps at scale. Before that, Sri was on sabbatical pursuing Theoretical Neuroscience at Berkeley. Prior to that, Sri worked on a NoSQL trie-based index for semistructured data at the in-memory index startup RightOrder.

Sri is known for his knack for envisioning killer apps in fast-evolving spaces and assembling stellar teams towards productizing that vision. A regular speaker on the BigData, NoSQL and Java circuit, Sri leaves a trail @srisatish.

7:15 P.M. - 7:25 P.M.  Q/A

7:25 P.M. - 7:30 P.M.  Break

 

7:30 P.M. - 8:15 P.M.  Session 3

Title: Real World Big Data Prescriptive Analytics

Speaker: Nick Gonzalez, Pentaho

Abstract: Today's large and convoluted data landscape, coupled with the abundance of available computing resources, presents unique opportunities for data scientists around the world. To remain competitive in this landscape, we must go beyond generating predictions to generating solutions from big data that are driven by actions derived from data-driven predictions. And we have to do this as fast as possible. This is the real world of big data prescriptive analytics.

Performing prescriptive analytics that is both accurate and responsive on big data is simultaneously our most valuable tool and our biggest challenge. Solving this challenge involves building intelligence and automation into data preparation and acquisition, leveraging distributed architectures to increase accuracy while reducing processing time, automating the descriptive analytics process, building intelligent workflows that minimize human error while maximizing human creativity, plus a whole lot more.

This talk will address each one of these challenges and present technical solutions and algorithms to address them.  By the end of this presentation each individual solution will come together in a symphony of code and hardware to form a unified automated process that is the backbone of a successful big data prescriptive analytics solution.

Speaker Bio:

Nick Gonzalez wrote his first multiplayer video game at the age of 8 on a Tandy TRS-80. He was the youngest programmer ever to lead R&D efforts for one of the top video game publishers in the world, he built targeted advertising systems for the web before we knew what to call them, and he built several technology companies (one of which was acquired by Microsoft before he was 25). Nick has held several titles in the past 18 years, including Chief Software Architect for EA Sports, where he architected Tiger Woods PGA Tour for Xbox 360 and PS3; Founder and CTO of two companies, where he built a big data behavioral analysis system that personalizes video games in real time, optimizing the user experience and driving revenue; and, most recently, VP & Chief Data Scientist at Pentaho, where he is laying the foundation for the next generation of prescriptive analytical technologies. Predictive algorithms, artificial intelligence, and large scale distributed systems have played a central role in almost every project Nick has ever been involved in.

As a self-proclaimed modern philosopher, Nick spends his nights and weekends contemplating the simplicity of the human mind and devising ways to replicate it in a machine with as few lines of LISP code as humanly possible without leaving Emacs. When he is not solving ridiculous problems or staring at a computer screen, he is reading math and philosophy books, chillin with his wife listening to music, or hanging out with his four kids playing basketball and watching kung-fu movies.

8:15 P.M. - 8:25 P.M.  Q/A

8:25 P.M. - 8:30 P.M.  Break

 

8:30 P.M. - 9:30 P.M.  Networking

  • Randy B.

    If we're currently on the waitlist...should we even bother to drive over to the convention center and try to get in? It would be useful if you could indicate what the chance is of someone on the waitlist to get in.
    Thanks.

    February 10, 2014

    • Rajesh T.

      I was just struck by the peculiarity of a gathering of "Predictive Analysts" asking a Chance question to others. Please I am not an organizer. I think one of the organizers can answer your question better. Thanks.

      February 10, 2014

    • Shyam S.

      Try to get in very early and grab a place to sit. When there are so many people in the waiting list, we cannot assure anything.

      February 10, 2014

  • Frederick F. Kautz I.

    Can you please stream this online?

    February 10, 2014

  • Alistair M.

    How likely am I to be able to hear the talks if I'm currently on the waitlist? (I haven't purchased Strata tickets)

    February 9, 2014

  • Joe F

    I will be at Strata. Can I get confirmed please? I am on the waitlist.

    February 9, 2014

  • Dave

    P

    February 9, 2014

  • Wojtek P.

    Do you have to have a Strata conference ticket to go to this meetup? Conference is sold out...

    February 9, 2014

    • Shyam S.

      No you do not need Strata Conference ticket..

      February 9, 2014

  • Vinay N.

    Sorry guys something came up so can't come.

    February 9, 2014

  • dkuldeep11

    looking forward to presentations

    February 8, 2014

  • He-Man

    Interested

    February 2, 2014

  • jitakshi

    Hi there

    January 27, 2014

  • Varun L B.

    Meets us @ Strata Booth # 211

    InfoObjects
    InfoObjects is an open-source focused IT services company. We are headquartered in Santa Clara, CA with offshore development center in India. InfoObjects specializes in training, implementation and support of Hadoop and related technologies. We strongly believe that all Big Data problems can be solved using pure open-source software and provide solutions around Apache stack of Hadoop. Please visit us at www.infoobjects.com for more details , you can also meet me at Strata just ask for me and i will be happy to share more information.

    Regards,
    Varun Berry

    January 25, 2014

  • Biz S.

    Bizsmart provides you the best Training experience. Our trainers are realtime experts in their respective domains. Visit us at http://www.bizsmart.in

    January 3, 2014

  • David

    Looking for help revolutionizing search.

    January 1, 2014

  • SVR T.

    SVR Technologies provides you the best online learning experience that you ever had. Our trainers are having a real time experience of more than 8 years, they are working professionals and industry level experts. We offer all IT software courses for the lowest price with a quality assured training. Visit us at http://svrtechnologies.com/datastage_training.html

    December 21, 2013
