PyData Bratislava Meetup #3 [Building AI data pipelines using PySpark]

![PyData Bratislava Meetup #3 [Building AI data pipelines using PySpark]](https://secure.meetupstatic.com/photos/event/8/7/1/0/highres_494734576.jpeg?w=750)
Details
Event description:
Exponea is a full-stack, omni-channel, real-time marketing cloud. At Exponea, we build a wide range of practical AI applications, from predictions and recommendations to simple simulated annealing. Whatever the application, each one needs data, and a lot of it, which Exponea can provide efficiently.
A major issue when building any AI application or ML model is data preprocessing. The problem becomes acute when you need to process high-volume datasets or high-velocity data streams. We build such data pipelines mostly with Spark (specifically PySpark) and Python, though we have adopted many other tools as well.
In this talk we will go through the steps we took to build such pipelines. We will show you how to get Spark running easily and cover basic data wrangling with PySpark and Spark Streaming. Finally, we will use our data pipeline in a real application and close the talk with the joys and sorrows of resource management.
About the speaker:
Matus Cimerman
1+ year in data science @Exponea; before that, BI intern and other stuff @Orange.
Finishing a master's degree at FIIT STU; thesis: Data stream analysis
https://www.linkedin.com/in/matúš-cimerman-4b08b352/
https://twitter.com/MatusCimerman
https://www.facebook.com/matus.cimerman
Registration:
@Meetup.com group's event here (https://www.meetup.com/PyData-Bratislava/events/239255882/) & @Eventbrite registration here (https://www.eventbrite.com/e/pydata-bratislava-meetup-3-building-ai-data-pipelines-using-pyspark-tickets-33765397212) (if you use both, your seat is guaranteed). You can also find our event @Facebook here (https://www.facebook.com/events/1879974522214572/).
[Disclaimer: If you only mark "going" on the @Facebook event, we can't guarantee your seat]
Language of the event: Python, Slovak & English
------------------------------------
PyData Bratislava [Python Data Enthusiasts and Users, Data Scientists & Statisticians of all levels from Slovakia]
------------------------------------
This meetup group is for Data Scientists, Statisticians, Economists and Data Enthusiasts using Python for data analysis and data visualization. The goals are to provide Python enthusiasts a place to share ideas and learn from each other about how best to apply the language and tools to ever-evolving challenges in the vast realm of data management, processing, analytics, and visualization.
PyData is a group for users and developers of data analysis tools to share ideas and learn from each other. We gather to discuss how best to apply Python tools, as well as those using R and Julia, to meet the evolving challenges in data management, processing, analytics, and visualization. PyData groups, events, and conferences aim to provide a venue for users across all the various domains of data analysis to share their experiences and their techniques. PyData is organized by NumFOCUS.org, a 501(c)(3) non-profit in the United States.
The PyData Code of Conduct governs this meetup. To discuss any issues or concerns relating to the code of conduct or the behavior of anyone at a PyData meetup, please contact the organizer or NumFOCUS Executive Director Leah Silen (+1 512-222-5449; leah@numfocus.org).
You can find our PyData Bratislava Facebook group here: https://www.facebook.com/groups/1813599648877946/
Our PyData Bratislava Twitter account is here: https://twitter.com/PyDataBA
Our PyData Bratislava LinkedIn group is here: https://www.linkedin.com/groups/13506080
GapData Institute (http://www.gapdata.org) (GDI) is a nonprofit, nonpartisan research institution harnessing the power of data & the wisdom of economics for the public good.
|| Data. Think. Change. ||
NumFOCUS (http://www.numfocus.org/) is a 501(c)(3) nonprofit that supports and promotes world-class, innovative, open source scientific computing. The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. We accomplish this mission through our educational programs and events as well as through fiscal sponsorship of open source data science projects. We aim to increase collaboration and communication within the scientific computing community.
