What we're about

An online meetup community for practitioners, developers, aspiring and professional data engineers and data scientists, who are interested in learning about data + AI. Join this group to connect with fellow enthusiasts and to learn more about open source projects including Apache Spark, Delta Lake, MLflow, Koalas, TensorFlow and PyTorch.

We host three types of live online meetups which we'll call out in the title of each event. Most meetups will be recorded and the videos will be posted here: https://dbricks.co/youtube-meetups

Interviews: Interview style with time for Q&A, no slides

Tech Talks: Presentation, slides, demo and time for Q&A

Workshops: Tutorials with time for Q&A

Join us on Slack if you’re interested in Delta Lake: https://dbricks.co/DeltaSlack and/or MLflow: https://dbricks.co/MLflowSlackInvite

Upcoming events (2)

MLflow Integration with PyCaret and PyTorch

Online event

Join us for virtual tech talks at the Data + AI Meetup about MLflow integration with PyCaret and PyTorch, sponsored by the Databricks MLflow Team. The talks will be broadcast live simultaneously on YouTube and LinkedIn.

Agenda:
9:00 - 9:05 AM: Introduction & Announcements
9:05 - 9:35 AM: Machine Learning made easy with PyCaret and MLflow
9:40 - 10:10 AM: Reproducible AI using MLflow and PyTorch

Quick links:
MLflow: https://mlflow.org/
PyCaret: https://pycaret.org/
PyTorch: https://pytorch.org/

Talk One

Title: Machine Learning made easy with PyCaret and MLflow
Presenter: Moez Ali
Abstract: PyCaret is an open source, low-code machine learning library in Python that allows you to go from preparing your data to deploying your model within minutes in your choice of environment. This talk is a practical demo of using PyCaret in your existing workflows to supercharge your data science team's productivity.

Bio: Moez Ali is a seasoned data scientist with a decade of experience working with data in healthcare, education, and professional consulting. He is an active member of the open source community, and he created and open-sourced PyCaret in 2020.

Talk Two

Title: Reproducible AI using MLflow and PyTorch
Presenter: Geeta Chauhan
Abstract: Model reproducibility is becoming the next frontier for successfully building and deploying AI models in both research and production scenarios. In this talk, we will show you how to use PyTorch and MLflow to build reproducible AI models and workflows that can be shared across your teams, with traceability, to speed up collaboration on AI projects.

Bio: Geeta Chauhan leads AI Partnership Engineering at Facebook AI with expertise in building resilient, anti-fragile, large-scale distributed platforms for startups and Fortune 500s. As a core member of the PyTorch team, she leads TorchServe and many partner collaborations for building a solid PyTorch ecosystem and community.

Continuous Integration and Continuous Delivery with Delta Lake

Join us for the final session in a four-part series with Salesforce Engineering.

Abstract: As we build our Engagement Delta Lake on Databricks Workspace, one of the challenges is how to automate the integration testing of our Spark jobs in the CI/CD pipeline. We came up with two designs to tackle the challenge: Namespace Deployment and Scenario-Based Testing. In this talk, we will discuss the rationale behind and the implementation of the two designs.

Part 1: Engagement Activity Delta Lake Recording: https://youtu.be/a7_I1Qi1LoU
Part 2: Boost Delta Lake Performance with Data Skipping and Z-Order Recording: TBD
Part 3: Global Synchronousness and Ordering in Delta Lake
RSVP: https://www.meetup.com/data-ai-online/events/276975214/

-----------------
Speakers
-----------------

Zhidong Ke, Software Engineer PMTS, Salesforce
Zhidong is passionate about designing distributed systems, real-time/batch data processing, and building applications.

Yifeng Liu, Software Engineer LMTS, Salesforce
Yifeng is a software engineer with extensive experience in big data processing and distributed systems, and is interested in building high-volume, high-complexity, low-latency data pipelines and frameworks.

Aaron Zhang, Software Engineering PMTS, Salesforce
Aaron is an experienced software engineering leader focused on engineering secure, fault-tolerant, high-volume systems built on microservices.

Heng Zhang, Software Engineering PMTS, Salesforce
Heng is a software engineer who is interested in and specializes in microservices, distributed systems, and big data.

Past events (68)

MLflow Integration from Azure ML and Algorithmia

Online event
