Bigdata Orchestration using Airflow, Big data Infrastructure and Data Science

Name: Bigdata Orchestration using Airflow, Big data Infrastructure and Data Science
Start: 2017-12-14T18:30:00Z
End: 2017-12-14T20:30:00Z
Location: ACIA - Aon Ireland

Hosted by Paul F.

Dublin Big data and IoT Meetup

Details

Guest Speaker, Joshua Robinson, Pure Storage

Joshua Robinson (https://www.linkedin.com/in/joshuarobinson80/) will give a talk on data science.
This talk will overview Pure Storage’s streaming big data analytics pipeline, which uses open source technologies like Spark and Kafka to process over 30 billion events per day and provide real-time feedback in under five seconds. This pipeline is supported by Pure Storage’s FlashBlade as a shared storage solution, which enables a streaming use case as well as on-demand batch analytics.

This pipeline illustrates the use case for big data analytics technologies, the lessons learned from this project, and the underlying elastic infrastructure that provides flexible scaling, agility, and simplicity across multiple application clusters.

Joshua is a founding engineer of FlashBlade at Pure Storage.
He has spearheaded the big data strategy for FlashBlade.
FlashBlade is an enterprise, hardware-backed storage product that provides very high IOPS (reads and writes).

Guest Speaker, TBD, Aon Centre for Innovation and Analytics (ACIA)

ACIA will speak about how they leverage cloud-based infrastructure to support their growing data science needs.
More details will be released tomorrow!

Paul Foran, Organizer of Meetup

I will talk about how I use Apache Airflow (a Python-based data pipeline and scheduling system) to interact with big data systems in the cloud (for example, on AWS: S3, EMR, Spark and Redshift).
Apache Airflow can be used to schedule pretty much anything, from jobs that train models right through to data ingestion.

I will go through the various elements of working with Airflow, such as stabilizing the environment, building dynamic DAGs, and interacting with custom or generic RESTful APIs (such as a metadata API system) to help onboard new ingestion systems into an ETL pipeline.
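
As a rough illustration of the dynamic DAG idea (not taken from the talk), the sketch below generates one small task graph per data source from a list of metadata records. It deliberately uses only the Python standard library rather than the Airflow API, so it runs anywhere with Python 3.9+. The `SOURCES` records, the task naming scheme and the `build_pipeline` / `build_all` helpers are all hypothetical. In Airflow, a loop of the same shape would typically create one DAG object per record instead of a plain dictionary.

```python
from graphlib import TopologicalSorter

# Hypothetical records, standing in for what a metadata API might return.
SOURCES = [
    {"name": "orders", "target": "redshift"},
    {"name": "clicks", "target": "s3"},
]


def build_pipeline(source):
    """Return a task graph for one source.

    Each key is a task name; its value is the set of tasks it depends on.
    """
    name = source["name"]
    extract = f"extract_{name}"
    transform = f"transform_{name}"
    load = f"load_{name}_to_{source['target']}"
    return {extract: set(), transform: {extract}, load: {transform}}


def build_all(sources):
    """Build one pipeline per source, keyed by a generated pipeline id."""
    return {f"ingest_{s['name']}": build_pipeline(s) for s in sources}


pipelines = build_all(SOURCES)

for pipeline_id, graph in pipelines.items():
    # Resolve a valid execution order from the dependency graph.
    order = list(TopologicalSorter(graph).static_order())
    print(pipeline_id, "->", order)
```

Onboarding a new ingestion source in this sketch means adding one metadata record; no pipeline code changes.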

Beer and pizza will be provided by ACIA!
