Big Data Streaming Platform Ecosystem

Hosted By
Slim B. and Srini P.

Details

Please join us for an exciting evening to learn more about real-time streaming analytics from Reza Farivar (https://www.linkedin.com/in/reza-farivar-5209465), a Data Engineering Manager at Capital One.

Sponsor:

• Capital One (http://www.CapitalOne.com) is hosting the event. Prashant Mehrotra (https://www.linkedin.com/in/mehrotraprashant), Director of Big Data Engineering at Capital One, is sponsoring pizza and drinks.

Schedule:

5:30 pm - 6:00 pm: Networking, pizza and drinks

6:00 pm - 6:05 pm: Welcome and kickoff by Slim Baltagi

6:05 pm - 7:00 pm: Talk by Reza Farivar

Talk description:

The Big Data Streaming Systems landscape is constantly changing, and many of the competing projects are complementary in nature. The current state of the art is to mix and match multiple systems to arrive at a complete end-to-end solution.

In this session, we present one such architecture, which is gaining popularity in the community. In this architecture, Apache NiFi serves as a scalable first-stage data-gathering system. Once the streaming data is collected from multiple geographical locations, it is staged in Apache Kafka. Then, either a true stream processing engine such as Apache Storm or a micro-batch streaming engine such as Apache Spark Streaming performs real-time processing (filtering, database lookups and joins, projections, etc.) to format the incoming data. The results are then stored in a time-series database such as Druid, which keeps the incoming data in time windows to manage storage requirements. Finally, an analytics framework such as Apache Spark is used to run queries or machine learning tasks on the past window.
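To make the data flow concrete, here is a minimal Python sketch of the two middle stages described above: the real-time processing step (filtering, lookup join, projection) and time-windowed storage. It is an in-memory illustration only and does not use NiFi, Kafka, Storm, Spark or Druid. The event fields, the REGION_BY_SENSOR lookup table, and the window and retention sizes are all assumptions made for the example, not details from the talk.

```python
from collections import defaultdict

# Hypothetical lookup table standing in for the "database lookup" step.
REGION_BY_SENSOR = {"s1": "us-east", "s2": "eu-west"}


def process(events):
    """Filter, enrich via lookup, and project each event."""
    for event in events:
        if event["value"] is None:
            # Filtering: drop incomplete records.
            continue
        region = REGION_BY_SENSOR.get(event["sensor"])
        if region is None:
            # Join: drop events with no matching lookup row.
            continue
        # Projection: keep only the fields needed downstream.
        yield {"ts": event["ts"], "region": region, "value": event["value"]}


def store_in_windows(records, window_seconds=60, keep_windows=2):
    """Group records into fixed time windows and keep only the newest ones."""
    windows = defaultdict(list)
    for record in records:
        window_start = record["ts"] - record["ts"] % window_seconds
        windows[window_start].append(record)
    newest = sorted(windows)[-keep_windows:]
    return {start: windows[start] for start in newest}


# Made-up sample events (timestamps in seconds).
raw_events = [
    {"ts": 5, "sensor": "s1", "value": 1.0},
    {"ts": 30, "sensor": "s2", "value": None},
    {"ts": 65, "sensor": "s2", "value": 2.5},
    {"ts": 70, "sensor": "s9", "value": 4.0},
    {"ts": 130, "sensor": "s1", "value": 3.0},
]

windows = store_in_windows(process(raw_events))
```

In this sketch the oldest window is discarded once more than keep_windows exist, which loosely mirrors how a time-windowed store bounds its storage. A real deployment would delegate both stages to the systems named above.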

We present a hands-on demo of what this architecture looks like in action and share some best practices.

Speaker Bio:

Reza Farivar is a Data Engineering Manager at Capital One, where he works on Big Data / Fast Data Cloud Computing platforms.

Before joining Capital One, he was a senior software engineer at Yahoo working on Big/Fast Data platforms including Apache Storm and Spark. He completed both his PhD and postdoctoral work at the University of Illinois at Urbana-Champaign, with his research focusing on Big Data and Cloud platforms, programming models and the application of these technologies in diverse domains including finance, machine learning and bioinformatics. He holds a special interest in the application of specialized hardware accelerators such as GPUs in big data computing platforms.

He is also a Research Assistant Professor in the Computer Science department of the University of Illinois, where he has been involved in research and in teaching courses (including on coursera.org) on Cloud Computing, Big Data and Operating Systems since 2011.

Chicago Advanced Analytics Meetup
Capital One
77 W Upper Wacker Dr · Chicago, IL