
Details

Event description:

Exponea is a full-stack, omni-channel, real-time marketing cloud. At Exponea, we build a wide range of practical AI applications, from predictions and recommendations to simple simulated annealing. Whatever the application, each one needs data, and a lot of it, which Exponea can provide efficiently.

A major issue when building any AI application or ML model is data preprocessing. The problem arises when you need to process high-volume datasets or high-velocity data streams. We build such data pipelines mostly with Spark (specifically PySpark) and Python, but we have also adopted many other tools.

In this talk, we will go through the steps we took to build such pipelines. We will show you how to get Spark running easily and cover basic data wrangling with PySpark and Spark Streaming. Finally, we will use our data pipeline in a real application and close by talking about the joys and sorrows of resource management.

About speaker:

Matus Cimerman

1+ year in data science @Exponea; before that, BI intern and other stuff @Orange.

Finishing a master's degree at FIIT STU; thesis: Data stream analysis

https://github.com/cimox

https://www.linkedin.com/in/matúš-cimerman-4b08b352/

https://twitter.com/MatusCimerman

https://www.facebook.com/matus.cimerman

Registration:

@Meetup.com group's event here (https://www.meetup.com/PyData-Bratislava/events/239255882/) & @Eventbrite registration here (https://www.eventbrite.com/e/pydata-bratislava-meetup-3-building-ai-data-pipelines-using-pyspark-tickets-33765397212) (if you use both, your seat is guaranteed). You can also find our event @Facebook here (https://www.facebook.com/events/1879974522214572/).

[Disclaimer: If you only mark "going" on the @Facebook event, we can't guarantee your seat]

Language of the event: Python, Slovak & English

PyData Bratislava [Python Data Enthusiasts and Users, Data Scientists & Statisticians of all levels from Slovakia]
