Hadoop Summit San Jose: Enterprise-Grade Streaming Under 2ms on Hadoop


Details
This talk is part of Hadoop Summit 2016 San Jose. If you are attending the summit, come to our talk! If you are interested in attending, you can purchase tickets here (http://2016.hadoopsummit.org/san-jose/register/).
Enterprise-Grade Streaming Under 2ms on Hadoop
What if we have reached the point where open source can handle massively difficult streaming problems with enterprise-grade durability? We set out to answer this question, not with opinion or speculation, but in a rigorous and complete way. We needed not just to build a prototype or explore all the exciting new tools, but to create an open source-based, enterprise-ready product that can transparently replace an enormously expensive proprietary solution.
Today, we present Capital One's novel solution for real-time decisioning on Apache Apex. With an analysis of the dominant streaming frameworks, we show how Apex provides unique capabilities ensuring less than 2ms latency in an enterprise-grade solution on Hadoop.
We'll cover the following:
• A detailed dive into the business requirements of a new real-time decisioning platform for model building, feature computation, and model scoring.
• A survey and analysis of the leading open source technologies for stream processing, and the tradeoffs we considered when selecting our technology stack.
• Our solution, based on Apache Apex, which provides unparalleled performance on Hadoop and meets the stringent performance, scalability, and durability requirements necessary for enterprise-grade decision making.
Speaker:
Ilya Ganelin, Capital One
Ilya is a roboticist turned data engineer. After a few years at the University of Michigan building self-discovering robots and another few years working on embedded DSP software with cell phones and radios at Boeing, he landed in the world of big data at the Capital One Data Innovation Lab. Ilya is an active contributor to the core components of Apache Spark and a PPMC member of Apache Apex with the goal of learning what it takes to build a next-generation distributed computing platform. Ilya is an avid bread maker and cook, skier, and race-car driver.

