Fast-Data (2nd Meetup event)

Name: Fast-Data (2nd Meetup event)
Start: 2016-06-21T18:00:00-04:00
End: 2016-06-21T20:45:00-04:00
Location: McLean Auditorium

Hosted by Harsha B.

Fast Data DC (NoVA/MD/DC)

Details

AGENDA

6:00 – 6:30 Networking and food

6:30 – 6:45 Welcome & Introductions

6:45 – 7:30 Dean Wampler (Lightbend): Scala and the JVM as a big data platform: Lessons from Apache Spark

7:30 – 8:15 Ryan Zotti and Subbu Thiruppathy (Capital One): Building Real-time Targeting Capabilities on AWS

8:15 – 8:45 Close & Wrap-up. Networking.

Dean Wampler, Ph.D., is the Architect for Big Data Products and Services in the Office of the CTO at Lightbend (http://lightbend.com/). He will talk about:

Apache Spark (http://spark.apache.org/) is implemented in Scala, and its user-facing Scala API is very similar to Scala’s own collections API. The power and concision of this API are bringing many developers to Scala. The core abstractions in Spark have created a flexible, extensible platform for applications like streaming, SQL queries, machine learning, and more.

Scala’s uptake reflects the following advantages over Java:

A pragmatic balance of object-oriented and functional programming.

An interpreter mode, which allows the same sort of exploratory programming that Data Scientists have enjoyed with Python and other languages. Scala-centric “Notebooks” are also now available.

A rich collections library that enables composition of operations for concise, powerful code.

Tuples are naturally expressed in Scala and very convenient for working with data.

Pattern Matching makes data deconstruction fast and intuitive.

Type inference provides safety and feedback to the developer, yet requires minimal typing of actual type signatures.

Scala idioms lend themselves to the construction of small domain specific languages, which are useful for building libraries that are concise and intuitive for domain experts.
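As a rough illustration of several of the points above (composable collection operations, tuples, pattern matching, and type inference), here is a minimal sketch of a word count written against Scala’s standard collections. It is not taken from the talk or from Spark itself; the names ScalaFeaturesSketch and wordCounts and the sample input are hypothetical, chosen only for this example.

```scala
// Hypothetical example: plain Scala collections only, no Spark dependency.
object ScalaFeaturesSketch {

  // Composes collection operations into a small pipeline.
  // Each result entry is a (word, count) tuple; no dedicated class is needed.
  def wordCounts(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.toLowerCase.split("\\s+")) // split each line into words
      .filter(_.nonEmpty)                              // drop empty tokens
      .groupBy(identity)                               // word -> all its occurrences
      .map { case (word, occurrences) => (word, occurrences.size) } // pattern match on the tuple

  def main(args: Array[String]): Unit = {
    val lines = Seq("fast data", "fast streams") // type inferred as Seq[String]
    val counts = wordCounts(lines)               // type inferred as Map[String, Int]

    // Sort by word and deconstruct each tuple with a pattern.
    counts.toSeq.sortBy { case (word, _) => word }.foreach {
      case (word, n) => println(s"$word: $n")
    }
  }
}
```

The same chain of flatMap, filter, and map calls is close to how an equivalent job would read in Spark’s Scala API, which is the similarity the abstract refers to.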

Using these and other examples from the Spark project, this talk discusses the strengths and weaknesses of Scala and the JVM for Big Data, and how we might improve both to make them better tools for our needs.

Speaker Bio: Dean Wampler, Ph.D., is the Architect for Big Data Products and Services in the Office of the CTO at Lightbend (http://lightbend.com/), where he focuses on the evolving Fast Data stack for streaming applications based on Spark (http://spark.apache.org/), Kafka (http://kafka.apache.org/), Mesos (http://mesos.apache.org/), the Lightbend Reactive Platform (http://lightbend.com/platform), and other tools. Dean is also an advocate for Functional Programming and Scala (http://scala-lang.org/). He contributes to several open source projects and he co-organizes the Chicago-Area Scala Enthusiasts (CASE) (http://meetup.com/chicagoscala/), the Chicago Spark Users, and the Chicago-area Hadoop User Group (CHUG) (https://www.meetup.com/Chicago-area-Hadoop-User-Group-CHUG/) Meetups. He also helps organize a number of conferences, including Strata + Hadoop World (http://conferences.oreilly.com/strata) and GOTO Chicago (http://gotocon.com/). Dean is the author of Programming Scala, 2nd Edition (http://shop.oreilly.com/product/0636920033073.do) and Functional Programming for Java Developers (http://shop.oreilly.com/product/0636920021667.do), and the co-author of Programming Hive (http://shop.oreilly.com/product/0636920023555.do), all from O'Reilly. He lurks on Twitter, @deanwampler (http://twitter.com/deanwampler).

Ryan Zotti and Subbu Thiruppathy are engineers at Capital One and will talk about:

The Fast Marketing team at Capital One is experimenting with various technologies to enable lightning-fast promotional content that visitors will see when they visit Capital One’s website looking to apply for a credit card. In this presentation we’ll first talk about some of the technologies that we’re exploring, such as the Akka-based Play framework and H2O, a popular open source machine learning library. We’ll then conclude with a quick demo, followed by a few tips and tricks that we learned along the way.

Speaker Bios:

Ryan Zotti is a data engineer at Capital One, where he focuses on putting fast, scalable, open source Big Data applications into production and in the cloud on AWS. Ryan has experience with technologies such as Hadoop, Spark, Storm, Flink, Akka, and Kafka. In his spare time, Ryan likes to work on solving difficult machine learning problems. For example, he is currently building a self-driving remote-controlled car with a Raspberry Pi and Google’s TensorFlow, and he recently co-authored a machine learning research paper with Yale researchers that was published in The Lancet, one of the world’s oldest and best-known medical journals.

Subbu Thiruppathy is a Software Engineer at Capital One, where he focuses on putting fast, scalable, in-house Java APIs into production and in the cloud on AWS. Subbu has experience with technologies such as Java, Spark, Spring, Hibernate, and AWS.
