addressalign-toparrow-leftarrow-rightbackbellblockcalendarcameraccwcheckchevron-downchevron-leftchevron-rightchevron-small-downchevron-small-leftchevron-small-rightchevron-small-upchevron-upcircle-with-checkcircle-with-crosscircle-with-pluscrossdots-three-verticaleditemptyheartexporteye-with-lineeyefacebookfolderfullheartglobegmailgooglegroupshelp-with-circleimageimagesinstagramlinklocation-pinm-swarmSearchmailmessagesminusmoremuplabelShape 3 + Rectangle 1ShapeoutlookpersonJoin Group on CardStartprice-ribbonShapeShapeShapeShapeImported LayersImported LayersImported Layersshieldstartickettrashtriangle-downtriangle-uptwitteruserwarningyahoo

Ingest and Extract Big Data Natively in Hadoop with Apache Apex


5:45pm - Food & Network

6:00pm - Talk 1: Architecture of DataTorrent Ingestion Solution, presented by Pramod Immaneni

6:40pm - Q&A

6:50pm - Talk 2: Ingesting Data from Kafka, presented by Vlad Rozov

7:30pm - Q&A

7:40pm - Food & Network

Free T-Shirts.

Abstract: Ingesting and extracting data from Hadoop can be a frustrating, time consuming activity for many enterprises. Apache Apex Data Ingestion is a standalone big data application that simplifies the collection, aggregation and movement of large amounts of data to and from Hadoop for a more efficient data processing pipeline. Apache Apex Data Ingestion makes configuring and running Hadoop data ingestion and data extraction a point and click process enabling a smooth, easy path to your Hadoop-based big data project. 

During the meetup, we will first cover the basics of Apache Apex platform and tools provided by DataTorrent to make applications like Data Ingestion possible. Then we will focus on the Ingestion application and specific ingestion flows such as ingesting unbounded data from Kafka to JDBC with couple of processing operators, namely Transform and Enrichment. 


Pramod Immaneni is Apache Apex PMC member and senior architect at DataTorrent, where he works on Apache Apex and specializes in big data platform and applications. Prior to DataTorrent, he was a co-founder and CTO of Leaf Networks LLC, eventually acquired by Netgear Inc, where he built products in core networking space and was granted patents in peer-to-peer VPNs. 

Vlad Rozov is Apache Apex PMC member and back-end engineer at DataTorrent where he focuses on the buffer server, Apex platform network layer, benchmarks and optimizing the core components for low latency and high throughput. Prior to DataTorrent Vlad worked on distributed BI platform at Huawei and on multi-dimensional database (OLAP) at Hyperion Solutions and Oracle.

For deeper engagement with Apache Apex - follow ApacheApexpresentationsrecordings, download (communitysandbox), Apache Apex releasesdocs

Join or login to comment.

Our Sponsors

People in this
Meetup are also in:

Sign up

Meetup members, Log in

By clicking "Sign up" or "Sign up using Facebook", you confirm that you accept our Terms of Service & Privacy Policy