Please register here -> http://bit.ly/bdx10-19
In this workshop we will put together an end-to-end pipeline for data analytics, AI-based customer segmentation and a recommender system, using JupyterLab, Jupyter Notebook, MinIO, Spark, Airflow, Elasticsearch, Logstash, Kibana, Kafka, Postgres, ClickHouse, Apache Arrow, TensorFlow, TensorFlow Extended (TFX), Keras, Docker and Kubernetes.
During this workshop, Nat will explain the purpose and function of each component and spend some time playing with each of them, while incrementally building the full data pipeline. Alongside the analysis of each component, he will summarise dos and don'ts for real-world data pipelines, aiming for a pragmatic approach to analytics and data science, away from hype and developer-community marketing.
Natalino Busa is currently the founder of SELECT COUNT(*) [http://selectcountstar.com/], a data engineering services company in Singapore. Nat has bootstrapped R&D teams and delivered digital products and data-driven applications for the banking, retail, infotainment and telecom domains. He previously worked as lead engineer and scientist for Philips and ING Bank in the Netherlands and DBS Bank in Singapore, and is currently Chief Scientist at Teko in Vietnam. He is an author for O'Reilly Media on Spark and data science applications. He is passionate about data science, data engineering, distributed computing, man-machine interfaces, AI and cloud computing.
Software Engineers, DevOps, DataOps, ML Engineers, Data Scientists
- Intermediate understanding of Python
- Basic understanding of SQL
- Basic understanding of Data Engineering
- Basic understanding of Data Science
- Basic understanding of Google Cloud Platform