About TorontoAI
TorontoAI is a vibrant, inclusive community of engineers, builders, founders, and curious minds passionate about making AI infrastructure more accessible, human-centered, and scalable.
We host bi-weekly in-person socials, tech meetups, and hands-on webinars to connect people across disciplines — from DevOps to Data Science, from students to senior architects. Whether you're deploying LLMs in production or just exploring what Databricks does, you're welcome here.
🤝 We’re Building More Than a Meetup
In a world dominated by virtual everything, we believe in real, human-to-human connection.
TorontoAI is a space to:
- Share ideas over coffee
- Spark collaborations face-to-face
- Meet people who understand your stack and your journey
- Build your network beyond LinkedIn likes
💬 What We Talk About:
- Scalable AI & LLM infrastructure (Kubernetes, GPUs, vLLM, Ollama, LangChain)
- Databricks, Snowflake, Fivetran, dbt — building the modern data stack
- MLOps, LLMOps, DevOps — the operational glue of AI systems
- Real-world engineering stories, founder spotlights, and tool breakdowns
🌈 Who We Welcome:
- DevOps, SREs & Platform Engineers moving into data/AI
- Data Engineers, Analysts & ML practitioners
- Founders, freelancers, and technologists in transition
- Students and early-career professionals seeking real-world exposure
We’re committed to creating a welcoming, diverse, and equity-focused space where all voices matter — no gatekeeping, no rockstars, just good humans building cool stuff.
📍 Based in Toronto, open to the world
📅 Join an event — and be part of something human, helpful, and hands-on.
Upcoming events (4)

Panel Discussion: Intellectual Property in the Era of Vibe Coding
401 Bay Street, Meet at Bay Street Entrance, Toronto, ON, CA
When code becomes commoditized, what actually gets protected?
This is an in-person panel discussion hosted at the Dipchand LLP office, bringing together experts from legal, AI, and strategy domains to explore how intellectual property is evolving in the age of AI-assisted development and “vibe coding.”
Why this matters:
AI tools are rapidly commoditizing software development. The barrier to building products is dropping, shifting the focus from writing code to owning ideas, data, and systems. This creates new challenges around ownership, licensing, and long-term defensibility.
Key discussion areas:
- Whether code still holds value as intellectual property
- Ownership of AI-generated code and outputs
- What developers and companies should protect beyond code (data, workflows, architecture)
- Enterprise risks including compliance, governance, and data exposure
- How organizations build defensibility when building becomes easy
Speakers:
- Stephano Salani, Intellectual Property Lawyer, Dipchand LLP
- Yulia Pavlova, PhD, Applied AI and Governance Leader, RBC Borealis AI
- Mohit Rajhans, AI Consultant, ThinkStart.ca
Event details:
Date: May 20, 2026
Time: 5:30 PM to 7:30 PM EDT
Location: Dipchand LLP Office, Toronto, ON
What to expect:
Panel discussion, networking, and Q&A session
RSVP - https://luma.com/0ktn86g3
2 attendees
Genie, Agent Bricks, or Build Your Own on Databricks Lakebase
Online
For Data Engineering leaders deciding whether to adopt Genie, build a custom text-to-SQL stack, or wire something in between.
90 minutes. Live build, not slides. Real workspace, real data, a real LLM call across an HTTP boundary you control.
What you will see
I'll go from an empty Databricks workspace to a working text-to-SQL agent that:
- Joins live OLTP rows in Lakebase (managed Postgres) with pre-aggregated gold tables in Unity Catalog Delta — through Lakehouse Federation, in a single query.
- Generates SQL via a pluggable LLM endpoint — Databricks Model Serving, OpenAI-compatible APIs, or a self-hosted vLLM on a neo-cloud GPU — switched with one environment variable.
- Validates every SQL string before execution with a SELECT-only safety guardrail that catches the Databricks-specific destructive ops generic validators miss (`OPTIMIZE`, `VACUUM`, `ZORDER`, `COPY`); a minimal sketch of this check follows the list.
- Is auditable end-to-end: one question = one LLM call, one SQL statement, one execution. No autonomous loops, no surprise bills.
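To give a flavor of that guardrail, here is a minimal sketch in Python using a simple keyword-denylist approach. The function name and keyword list are illustrative only, not the actual db-agent implementation:
```
import re

# Statements that mutate or reorganize data. The last four are
# Databricks-specific maintenance/ingest commands that generic
# "SELECT-only" validators often don't know about.
BLOCKED_KEYWORDS = {
    "INSERT", "UPDATE", "DELETE", "MERGE", "DROP", "ALTER", "CREATE",
    "TRUNCATE", "GRANT", "REVOKE",
    "OPTIMIZE", "VACUUM", "ZORDER", "COPY",
}

def is_safe_select(sql: str) -> bool:
    """Allow a single SELECT (or WITH ... SELECT) statement, nothing else."""
    stripped = sql.strip().rstrip(";")
    # Reject multi-statement payloads outright.
    if ";" in stripped:
        return False
    # Must start with SELECT or a CTE.
    if not re.match(r"(SELECT|WITH)\b", stripped, re.IGNORECASE):
        return False
    # Reject any blocked keyword appearing as a whole word.
    tokens = re.findall(r"[A-Za-z_]+", stripped.upper())
    return not BLOCKED_KEYWORDS.intersection(tokens)

assert is_safe_select("SELECT * FROM sales.gold.daily_revenue")
assert not is_safe_select("OPTIMIZE sales.gold.daily_revenue ZORDER BY (day)")
```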
What you will leave with
- A decision framework for db-agent vs Genie vs Agent Bricks for your specific use case — including when not to build.
- The companion open-source db-agent repo (presented at AAAI-25, ships a Databricks Apps deployment variant) and a quick-lab repo with a step-by-step build.
- A reference architecture diagram and the actual code — pipeline orchestrator is ~60 lines of Python, safety validator is ~30.
- Specific gotchas that cost me a half-day each: federation database options, Lakebase token rotation, Streamlit/Apps reverse-proxy traps, context-window blowouts on real catalogs.
Who this is for
- Heads of Data, Data Engineering Managers, Staff and Principal Data Engineers.
- Teams already on Databricks (or evaluating) who are being asked: "Can we put an AI agent on top of this?"
- Anyone making a build-vs-buy call between Genie, Agent Bricks, and a custom text-to-SQL stack — and wants to make it with their eyes open.
- Demo of the Reference Architecture explained here - https://becloudready.com/blog/text-to-sql-databricks-lakebase-db-agent
This is a technical session. We'll read code. Bring your senior engineers.
Agenda
- The architecture in one slide (5 min)
- Lakebase + Unity Catalog + Lakehouse Federation — why both data planes, and what breaks (15 min)
- The agent pipeline — schema → prompt → LLM → validate → execute (20 min)
- The SQL safety guardrail — what generic SELECT-only validators miss on Databricks (10 min)
- The pluggable LLM layer — live swap from a hosted API to a self-hosted vLLM on a neo-cloud GPU (15 min); a sketch of the swap follows the agenda
- db-agent vs Genie vs Agent Bricks — when to use which, and why (10 min)
- Q&A (15 min)
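For a taste of that swap before the session: the pluggable layer is just an OpenAI-compatible client whose endpoint comes from the environment, so Databricks Model Serving, a hosted API, and a self-hosted vLLM server all look identical to the pipeline. A minimal sketch assuming the `openai` Python client; the environment variable names are illustrative, and the session's version collapses this to a single variable:
```
import os
from openai import OpenAI

# Point LLM_BASE_URL at Databricks Model Serving, any OpenAI-compatible
# API, or a self-hosted vLLM server; nothing else in the pipeline changes.
client = OpenAI(
    base_url=os.environ["LLM_BASE_URL"],   # e.g. http://my-vllm-host:8000/v1
    api_key=os.environ["LLM_API_KEY"],
)

def generate_sql(question: str, schema: str) -> str:
    """One question -> one LLM call that returns a single SQL string."""
    resp = client.chat.completions.create(
        model=os.environ["LLM_MODEL"],      # e.g. the served model's name
        messages=[
            {"role": "system",
             "content": f"Write one SELECT statement for this schema:\n{schema}"},
            {"role": "user", "content": question},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()
```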
About the Speaker
Chandan Kumar — founder of BeCloudReady, organizer of the TorontoAI community (10K+ members), and a Databricks Partner. Maintainer of the open-source db-agent text-to-SQL agent, presented at AAAI-25. Runs the Databricks Lakehouse Bootcamp and works with engineering teams on getting AI agents into production against real data.
18 attendees
Build Your First AWS Data Lake in 60 Minutes — Live, with Real Code
Online
A free, hands-on community session for data engineers and folks breaking into data.
If you've read the AWS docs but never actually built the pipeline end-to-end — or you've shipped pieces of it but never seen how they all fit together — this session is for you. We'll go from an empty AWS account to a working data lake in 90 minutes. Live build, real code, real data.
### What we'll build together
The pipeline that's underneath every "data lake on AWS" project — once you see it once, you see it everywhere:
```
S3 raw CSV → Glue Crawler → Glue Catalog → Glue ETL Job
→ S3 curated Parquet → Glue Crawler → Athena
```
Concretely, you'll watch:
- A raw CSV (Kaggle Crude Oil historical data, ~6,400 rows) land in S3
- A Glue Crawler infer the schema and register a table in the Glue Catalog
- An Athena query against that CSV — count rows, filter, aggregate, with no servers to manage
- A PySpark Glue ETL job transform the CSV into partitioned Parquet (columnar, compressed, ~10× cheaper to scan); a sketch of this step follows the list
- A second crawler register the Parquet table
- The same query, run again — and a side-by-side comparison of "data scanned" between CSV and Parquet
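If you want to preview the core transform, here is a minimal PySpark sketch of the CSV-to-partitioned-Parquet step. It's a plain Spark rendering for readability (the lab's actual job runs under Glue), and the bucket paths and `Date` column are placeholders:
```
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

# Placeholder paths; the lab wires these to your own sandbox bucket.
RAW = "s3://my-datalake-bucket/raw/crude_oil/"
CURATED = "s3://my-datalake-bucket/curated/crude_oil/"

df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(RAW)
)

# Derive a partition column, then write columnar, compressed Parquet.
(
    df.withColumn("year", F.year(F.to_date(F.col("Date"))))
      .write
      .mode("overwrite")
      .partitionBy("year")
      .parquet(CURATED)
)
```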
That last comparison is the punchline. Parquet vs CSV is the difference between a $5 query and a $0.50 query at scale. Seeing it in the Athena UI lands differently than reading about it.
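The comparison is also scriptable: Athena reports scanned bytes per query via `get_query_execution`. A hedged boto3 sketch, with placeholder database, table, and output-location names; run it once against the CSV table and once against the Parquet table:
```
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

def scanned_bytes(sql: str) -> int:
    """Run a query and return how many bytes Athena scanned."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "crude_oil_db"},   # placeholder
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]
        if state["Status"]["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    return state["Statistics"]["DataScannedInBytes"]

csv_bytes = scanned_bytes("SELECT avg(price) FROM raw_crude_oil")
parquet_bytes = scanned_bytes("SELECT avg(price) FROM curated_crude_oil")
print(f"CSV: {csv_bytes:,} bytes vs Parquet: {parquet_bytes:,} bytes")
```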
### What you'll leave with
- The full lab open-sourced — Terraform for the IAM + sandbox + Glue + Athena setup, the PySpark ETL script, and a step-by-step walkthrough you can re-run on your own AWS account.
- A working mental model of the shape of every data lake project: land raw → catalog → transform → catalog again → query. The dataset is just the variable.
- Practical IAM patterns most tutorials skip — region-locking, prefix-scoped S3 access, Glue role policies that don't accidentally grant the world (one such policy is sketched after this list). The kind of thing that actually shows up in production reviews.
- A take-home assignment: run the same pipeline against a Kaggle dataset of your choice. Bring it to the next session for feedback.
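To make the prefix-scoped pattern concrete: rather than `s3:*` on `*`, the ETL role gets only the actions it needs on specific prefixes. A boto3 sketch with placeholder bucket and role names (the lab expresses the same idea in Terraform):
```
import json
import boto3

iam = boto3.client("iam")

# Read raw, write curated, nothing else; all scoped to one bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::my-datalake-bucket/raw/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": "arn:aws:s3:::my-datalake-bucket/curated/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::my-datalake-bucket",
            # Region-locking is a separate condition (aws:RequestedRegion)
            # on the broader role policy; omitted here for brevity.
        },
    ],
}

iam.put_role_policy(
    RoleName="glue-etl-lab-role",               # placeholder role
    PolicyName="datalake-prefix-scoped-s3",
    PolicyDocument=json.dumps(policy),
)
```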
### Who this is for
- Data engineers who've shipped pieces of a data lake but want to see the whole pipeline end-to-end
- Career changers moving into data engineering — this is a portfolio-grade project you can talk about in interviews
- Bootcamp grads and self-taught engineers who can write SQL and Python but haven't seen how Glue, Athena, and S3 actually fit together
- Backend engineers picking up data work and wanting a fast on-ramp to the AWS data stack
- Anyone whose manager said "we should look at building a data lake" and now it's on your plate
If you've never opened the AWS console, you'll still follow along — we explain every click. If you've been doing this for years, you'll probably still pick up the IAM scoping pattern.
### Format
- Live on Microsoft Teams — questions in chat, full screen-share, no slides
- 90 minutes — same length as the lab itself
- Recording shared with everyone who registers
- Open Q&A throughout, not just at the end
### About your host
Chandan Kumar — founder of BeCloudReady and organizer of TorontoAI, a 10K+ member community of AI and data builders. Twenty-plus years across software, cloud, and data engineering. Has trained and placed 500+ engineers across Canada and the US. Maintainer of open-source labs and the db-agent project (presented at AAAI-25).
23 attendees
Past events (275)


