CD in Big Data Pipelines with Amaterasu & Spark story telling


Details
18:00 - 18:30 - Mingling
18:30 - 19:15 - Continuously Deploying Big Data Pipelines with Amaterasu - Yaniv Rodenski & Eyal Ben Ivri
19:15 - 20:00 - Evolving a premium raw data product from a simple Spark script in 3 months - Avi Perez @ AppsFlyer
Talk Title: Continuously Deploying Big Data Pipelines with Amaterasu
https://github.com/shintoio/amaterasu
http://photos3.meetupstatic.com/photos/event/7/0/5/6/600_456628758.jpeg
Abstract:
In the last few years, the DevOps movement has introduced groundbreaking approaches to the way we manage the lifecycle of software development and deployment. Today organizations aspire to fully automate the deployment of microservices and web applications with tools such as Chef, Puppet and Ansible. However, the deployment of data-processing pipelines remains a relic from the dark ages of software development. Processing large-scale data pipelines is the main engineering task of the Big Data era, and we believe it should be treated with the same respect and craftsmanship as any other software. That is why we created Amaterasu - an open source framework that takes care of the specific needs of Big Data applications in the world of continuous delivery. In this session we will take a look at how you can build Big Data applications using your framework of choice, such as Apache Spark or Apache Flink, without sacrificing proper development methodologies.
Bio:
Yaniv Rodenski (https://www.linkedin.com/in/yanivrodenski) is a software developer, a speaker, an author, a Jedi and an all-around nerd. Yaniv has been developing software as a hobby from a young age, and professionally since 1997. He is the co-author of “Pro Couchbase Server” (Apress) and specializes in Big Data, NoSQL and distributed systems. Yaniv is also the co-organizer of the Melbourne Hadoop and Big Data meetup groups.
Eyal Ben Ivri (https://www.linkedin.com/in/eyalbenivri) is a senior consultant for Sela, with expertise in the Big Data field, and is a totally way better Jedi than Yaniv. Eyal is probably best described as a big-data enthusiast, with knowledge of many different programming languages, public-cloud platforms and NoSQL databases.
Seriously - way better Jedi than Yaniv. Look it up.
Talk Title: Evolving a premium raw data product from a simple Spark script in 3 months
Abstract:
The AppsFlyer system processes 11 billion daily events, which generate more than 5TB of compressed data every day on Amazon S3. Providing our clients with a simple yet scalable view of their portion of our raw data was essential for maintaining both our competitive advantage and that of our clients. In this talk I will tell the story of the journey we took, from understanding our clients' needs, through providing them with incremental big data solutions, to producing a polished, production-grade product, which eventually turned out to be their own growth engine to scale and excel on top of AppsFlyer's evolving data infrastructure.
Bio:
Avi Perez (https://www.linkedin.com/in/xperetz) - Development Team Lead at AppsFlyer
