About TorontoAI
TorontoAI is a vibrant, inclusive community of engineers, builders, founders, and curious minds passionate about making AI infrastructure more accessible, human-centered, and scalable.
We host bi-weekly in-person socials, tech meetups, and hands-on webinars to connect people across disciplines — from DevOps to Data Science, from students to senior architects. Whether you're deploying LLMs in production or just exploring what Databricks does, you're welcome here.
🤝 We’re Building More Than a Meetup
In a world dominated by virtual everything, we believe in real, human-to-human connection.
TorontoAI is a space to:
- Share ideas over coffee
- Spark collaborations face-to-face
- Meet people who understand your stack and your journey
- Build your network beyond LinkedIn likes
💬 What We Talk About:
- Scalable AI & LLM infrastructure (Kubernetes, GPUs, vLLM, Ollama, LangChain)
- Databricks, Snowflake, Fivetran, dbt — building the modern data stack
- MLOps, LLMOps, DevOps — the operational glue of AI systems
- Real-world engineering stories, founder spotlights, and tool breakdowns
🌈 Who We Welcome:
- DevOps, SREs & Platform Engineers moving into data/AI
- Data Engineers, Analysts & ML practitioners
- Founders, freelancers, and technologists in transition
- Students and early-career professionals seeking real-world exposure
We’re committed to creating a welcoming, diverse, and equity-focused space where all voices matter — no gatekeeping, no rockstars, just good humans building cool stuff.
📍 Based in Toronto, open to the world
📅 Join an event — and be part of something human, helpful, and hands-on.
Upcoming events: 5
Panel Discussion: Intellectual Property in the Era of Vibe Coding
401 Bay Street, Meet at Bay Street Entrance, Toronto, ON, CA
When code becomes commoditized, what actually gets protected?
This is an in-person panel discussion hosted at the Dipchand LLP office, bringing together experts from the legal, AI, and strategy domains to explore how intellectual property is evolving in the age of AI-assisted development and “vibe coding.”
Why this matters:
AI tools are rapidly commoditizing software development. The barrier to building products is dropping, shifting the focus from writing code to owning ideas, data, and systems. This creates new challenges around ownership, licensing, and long-term defensibility.
Key discussion areas:
- Whether code still holds value as intellectual property
- Ownership of AI-generated code and outputs
- What developers and companies should protect beyond code (data, workflows, architecture)
- Enterprise risks including compliance, governance, and data exposure
- How organizations build defensibility when building becomes easy
Speakers:
- Stephano Salani, Intellectual Property Lawyer, Dipchand LLP
- Yulia Pavlova, PhD, Applied AI and Governance Leader, RBC Borealis AI
- Mohit Rajhans, AI Consultant, ThinkStart.ca
Event details:
Date: May 20, 2026
Time: 5:30 PM to 7:30 PM EDT
Location: Dipchand LLP Office, Toronto, ON
What to expect:
Panel discussion, networking, and Q&A session
RSVP: https://luma.com/0ktn86g3
2 attendees
Build Your First AWS Data Lake in 60 Minutes — Live, with Real Code
Online. A free, hands-on community session for data engineers and folks breaking into data.
If you've read the AWS docs but never actually built the pipeline end-to-end — or you've shipped pieces of it but never seen how they all fit together — this session is for you. We'll go from an empty AWS account to a working data lake in 90 minutes. Live build, real code, real data.
### What we'll build together
The pipeline that's underneath every "data lake on AWS" project — once you see it once, you see it everywhere:
```
S3 raw CSV → Glue Crawler → Glue Catalog → Glue ETL Job
→ S3 curated Parquet → Glue Crawler → Athena
```
Concretely, you'll watch:
- A raw CSV (Kaggle Crude Oil historical data, ~6,400 rows) land in S3
- A Glue Crawler infer the schema and register a table in the Glue Catalog
- An Athena query against that CSV — count rows, filter, aggregate, with no servers to manage
- A PySpark Glue ETL job transform the CSV into partitioned Parquet (columnar, compressed, ~10× cheaper to scan)
- A second crawler register the Parquet table
- The same query, run again — and a side-by-side comparison of "data scanned" between CSV and Parquet
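One detail worth seeing behind the curated step: Hive-style partition paths are what let the second crawler register partitions and let Athena prune them. A minimal sketch of that key layout in plain Python (the partition columns here are hypothetical; the lab's actual scheme may differ):

```python
from datetime import date

def curated_key(prefix: str, d: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 key (year=/month= folders),
    the layout Glue ETL jobs typically write and crawlers re-discover."""
    return f"{prefix}/year={d.year}/month={d.month:02d}/{filename}"

# Example: where one output file of the curated Parquet table would land
key = curated_key("curated/crude_oil", date(2024, 1, 15), "part-0000.parquet")
print(key)  # curated/crude_oil/year=2024/month=01/part-0000.parquet
```

Athena can then skip whole `year=`/`month=` folders when a query filters on those columns, instead of scanning every object under the prefix.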
That last comparison is the punchline. Parquet vs CSV is the difference between a $5 query and a $0.50 query at scale. Seeing it in the Athena UI lands differently than reading about it.
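The arithmetic behind that punchline is just Athena's pay-per-data-scanned billing. A back-of-envelope sketch, assuming the published $5-per-TB on-demand rate (verify the rate for your region):

```python
PRICE_PER_TB = 5.00        # assumed Athena on-demand rate, USD per TB scanned
TB = 1024 ** 4             # bytes in a tebibyte

def query_cost(bytes_scanned: int) -> float:
    """Cost of a single Athena query, billed on bytes actually scanned."""
    return bytes_scanned / TB * PRICE_PER_TB

csv_scan = TB              # full scan over 1 TB of raw CSV
parquet_scan = TB // 10    # columnar + compressed: ~10x less data read

print(f"CSV:     ${query_cost(csv_scan):.2f}")      # CSV:     $5.00
print(f"Parquet: ${query_cost(parquet_scan):.2f}")  # Parquet: $0.50
```

The multiplier compounds with query volume: the same 10x scan reduction applies to every query every dashboard runs against the table.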
### What you'll leave with
- The full lab open-sourced — Terraform for the IAM + sandbox + Glue + Athena setup, the PySpark ETL script, and a step-by-step walkthrough you can re-run on your own AWS account.
- A working mental model of the shape of every data lake project: land raw → catalog → transform → catalog again → query. The dataset is just the variable.
- Practical IAM patterns most tutorials skip — region-locking, prefix-scoped S3 access, Glue role policies that don't accidentally grant the world. The kind of thing that actually shows up in production reviews.
- A take-home assignment: run the same pipeline against a Kaggle dataset of your choice. Bring it to the next session for feedback.
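For a flavor of the prefix-scoped IAM pattern mentioned above, here is a sketch of an S3 policy for the Glue job role. Bucket and prefix names are hypothetical placeholders, not the lab's actual resources:

```python
import json

# Grant object access only under the raw/ and curated/ prefixes,
# and restrict ListBucket to those same prefixes via a condition key.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadRawWriteCurated",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [
                "arn:aws:s3:::my-datalake-bucket/raw/*",
                "arn:aws:s3:::my-datalake-bucket/curated/*",
            ],
        },
        {
            "Sid": "ListOnlyThosePrefixes",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-datalake-bucket",
            "Condition": {"StringLike": {"s3:prefix": ["raw/*", "curated/*"]}},
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Note what is absent: no `s3:*`, no bucket-wide `Resource`. A job that only ever touches `raw/` and `curated/` cannot list or read anything else, which is exactly the kind of scoping that comes up in production reviews.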
### Who this is for
- Data engineers who've shipped pieces of a data lake but want to see the whole pipeline end-to-end
- Career changers moving into data engineering — this is a portfolio-grade project you can talk about in interviews
- Bootcamp grads and self-taught engineers who can write SQL and Python but haven't seen how Glue, Athena, and S3 actually fit together
- Backend engineers picking up data work and wanting a fast on-ramp to the AWS data stack
- Anyone whose manager said "we should look at building a data lake" and now it's on your plate
If you've never opened the AWS console, you'll still follow along — we explain every click. If you've been doing this for years, you'll probably still pick up the IAM scoping pattern.
### Format
- Live on Microsoft Teams — questions in chat, full screen-share, no slides
- 90 minutes — same length as the lab itself
- Recording shared with everyone who registers
- Open Q&A throughout, not just at the end
### About your host
Chandan Kumar — founder of beCloudReady and organizer of TorontoAI, a 10K+ member community of AI and data builders. Twenty-plus years across software, cloud, and data engineering. Has trained and placed 500+ engineers across Canada and the US. Maintainer of open-source labs and the db-agent project (presented at AAAI-25).
26 attendees
Past events: 275