Data Engineers Meetup: PySpark 101 & using LLMs in data Pipelines

Name: Data Engineers Meetup: PySpark 101 & using LLMs in data Pipelines
Start: 2025-02-24T17:30:00-05:00
End: 2025-02-24T20:00:00-05:00
Location: Prefect

Hosted By

Rahul S. and 3 others

Details

Join us at Prefect to talk about Data Processing with Apache Spark!

Data Engineers DC is a professional group that meets monthly to discuss all things data engineering, including open data, data gathering, data munging, and the creation, storage, and maintenance of datasets. We combine presentations with hands-on workshops, always seeking to make our data munging lives easier.

---
Location:
Prefect - 2112 Pennsylvania Avenue NW
Washington, DC 20037

Note: Bring a photo ID; building security requires one.

Agenda:
5:30-6:15pm: Food & Networking
6:15-6:30pm: Introductions
6:30-7:00pm: PySpark 101 - Mike Jadoo
7:00-7:30pm: Using LLMs in Data Pipelines - Rahul Singh

Talk: PySpark 101 - Mike Jadoo

Unlock PySpark for big data. This beginner-friendly presentation introduces Apache Spark, a fast, scalable distributed computing framework. The talk covers the fundamentals of PySpark, including:

• Apache Spark Overview – Understand the core concepts and benefits of Spark for big data processing.
• PySpark Essentials – Learn about RDDs (Resilient Distributed Datasets) for distributed computation, DataFrames for optimized, structured data handling, and querying data with SQL.
• Machine Learning with MLlib – Explore the basics of MLlib, Spark’s scalable machine learning library for analytics and predictive modeling.

Perfect for beginners in data engineering and analytics, this talk will equip you with the foundational skills to process and analyze large datasets efficiently using PySpark.

Talk: Using LLMs in Data Pipelines - Rahul Singh

---

Data Engineers DC is a program of DC2. Learn more at www.dc2.org

---

Fun meetup syntax bug:
Link: https://forms.gle/8YSBJr5LGsfr3qKMA
Text formatted as a link: link
Link formatted as a link: https://forms.gle/8YSBJr5LGsfr3qKMA
(Data processing is hard. If you've found bugs like this in your pipeline, sign up using the above link to talk about it for five minutes!)
