Apache Flink... Don’t Cross The Streams! Modern Data Science Workflows

Hosted By
John M. and Uli B.
Details

It gives us great pleasure to announce our July Meetup at Bank of Ireland @boistartups, where we will have an awesome line-up. In the first slot, John Gorman (https://www.linkedin.com/in/johnpgorman?authType=NAME_SEARCH&authToken=UeK1&locale=en_US&trk=tyah&trkInfo=clickedVertical%3Amynetwork%2CentityType%3AentityHistoryName%2CclickedEntityId%3Amynetwork_11662340%2Cidx%3A0) will speak on Apache Flink and "Fast Data!" In the second slot, Vincent De Stoecklin (https://www.linkedin.com/in/vincentdestoecklin?authType=NAME_SEARCH&authToken=n5As&locale=en_US&trk=tyah&trkInfo=clickedVertical%3Amynetwork%2CclickedEntityId%3A75956606%2CauthType%3ANAME_SEARCH%2Cidx%3A1-1-1%2CtarId%3A1466415558018%2Ctas%3AVi) will speak on Modern Industrialised Data Science Workflows. As you can see, this Meetup goes deeper into data processing and use cases for good data science architectures.

Please note there will be a post-event party (http://entanon.com/event?eventid=232112186) at the Marker Hotel to launch the Travel meets Big Data conference, which takes place in November.

Our full agenda is as follows:

Apache Flink... Don’t Cross The Streams! by John Gorman (https://www.linkedin.com/in/johnpgorman?authType=NAME_SEARCH&authToken=UeK1&locale=en_US&trk=tyah&trkInfo=clickedVertical%3Amynetwork%2CentityType%3AentityHistoryName%2CclickedEntityId%3Amynetwork_mynetwork_11662340%2Cidx%3A1), Senior Data Consultant with Eberon.

Along with the arrival of Big Data, a parallel, less well-known but significant change in the way we process data has occurred. Data is getting faster! Business models are changing radically based on the ability to be first to know insights and act appropriately to keep the customer, prevent the breakdown or save the patient. In essence, knowing something now is overriding knowing everything later. Stream processing engines allow us to blend event streams from different internal and external sources to gain insights in real time. This talk will discuss the need for streaming, the business models it can change, the new applications it allows and why Apache Flink enables these applications. Apache Flink is a top-level Apache project for real-time stream processing at scale. It is a high-throughput, low-latency, fault-tolerant, distributed, stateful stream processing engine. Flink has associated polyglot APIs (Scala, Python, Java) for manipulating streams, a Complex Event Processor for monitoring and alerting on the streams, and integration points with other big data ecosystem tooling.

Modern Industrialised Data Science Workflows by Vincent De Stoecklin (https://www.linkedin.com/in/vincentdestoecklin?authType=NAME_SEARCH&authToken=n5As&locale=en_US&trk=tyah&trkInfo=clickedVertical%3Amynetwork%2CclickedEntityId%3A75956606%2CauthType%3ANAME_SEARCH%2Cidx%3A1-1-1%2CtarId%3A1466415558018%2Ctas%3AVi) of Dataiku (http://www.dataiku.com/)

The talk will focus on presenting standard production architectures for data products, and share best practices for connecting a design environment (data lab, prototyping use cases) efficiently with a production environment (where workflows are run and monitored). We will take examples from Dataiku clients to show the different types of architectures and how they allow companies to address different types of use cases:
Using a real time API to deploy machine learning models for real time prediction - example for dynamic pricing with AramisAuto
Deploying and monitoring a (batch) industrialized data product to identify monthly churners - example from Coyote
Hybrid Batch + Real Time scoring architectures - example from Fraud Detection in Healthcare

So as always, the event hashtag is #HUGIreland, so do include us if you are tweeting about it! Please note that after July 11th we will be taking a break for our collective vacations in August, but we plan to return with a special event in partnership with Oracle as sponsor, so stay tuned for developments!! Looking forward to seeing you on July 11th, so do RSVP today... chat to ya then!!

Data Engineering and Data Architecture Group (DEDAG)