About TorontoAI
TorontoAI is a vibrant, inclusive community of engineers, builders, founders, and curious minds passionate about making AI infrastructure more accessible, human-centered, and scalable.
We host bi-weekly in-person socials, tech meetups, and hands-on webinars to connect people across disciplines — from DevOps to Data Science, from students to senior architects. Whether you're deploying LLMs in production or just exploring what Databricks does, you're welcome here.
🤝 We’re Building More Than a Meetup
In a world dominated by virtual everything, we believe in real, human-to-human connection.
TorontoAI is a space to:
- Share ideas over coffee
- Spark collaborations face-to-face
- Meet people who understand your stack and your journey
- Build your network beyond LinkedIn likes
💬 What We Talk About:
- Scalable AI & LLM infrastructure (Kubernetes, GPUs, vLLM, Ollama, LangChain)
- Databricks, Snowflake, Fivetran, dbt — building the modern data stack
- MLOps, LLMOps, DevOps — the operational glue of AI systems
- Real-world engineering stories, founder spotlights, and tool breakdowns
🌈 Who We Welcome:
- DevOps, SREs & Platform Engineers moving into data/AI
- Data Engineers, Analysts & ML practitioners
- Founders, freelancers, and technologists in transition
- Students and early-career professionals seeking real-world exposure
We’re committed to creating a welcoming, diverse, and equity-focused space where all voices matter — no gatekeeping, no rockstars, just good humans building cool stuff.
📍 Based in Toronto, open to the world
📅 Join an event — and be part of something human, helpful, and hands-on.
Upcoming events
5

How to Land an AI Cloud Engineer Job in 2026
Online
What "AI Cloud Engineer" actually means as a job in 2026. AWS, Kubernetes, Terraform, CI/CD, plus LLMOps, RAG, vector DBs, GPU infra, AgentOps. Live demo (git push → EKS cluster → running container), Q&A, and a recording for everyone who registers.
What I'll cover in 1 hour:
- Why "DevOps Engineer" job postings are being quietly renamed — and to what
- The foundation that didn't change: AWS, Kubernetes, Terraform, CI/CD, SRE
- The new layer: LLMOps, RAG, vector DBs, GPU infra, AgentOps
- Live demo: git push → EKS cluster → running container
- Open Q&A
Speaker:
Chandan Kumar — Founder of beCloudReady. Toronto-based DevOps engineer and educator. Trained 500+ engineers now working at startups and enterprises across Canada and the US.
Register:
https://us06web.zoom.us/webinar/register/9517812218196/WN_sOpim9dyQO2waJ5ERHkHrw
17 attendees
Genie, Agent Bricks, or Build Your Own on Databricks Lakebase
Online
Data Engineering leaders deciding whether to adopt Genie, build a custom text-to-SQL stack, or wire something in between.
90 minutes. Live build, not slides. Real workspace, real data, a real LLM call across an HTTP boundary you control.
What you will see
I'll go from an empty Databricks workspace to a working text-to-SQL agent that:
- Joins live OLTP rows in Lakebase (managed Postgres) with pre-aggregated gold tables in Unity Catalog Delta — through Lakehouse Federation, in a single query.
- Generates SQL via a pluggable LLM endpoint — Databricks Model Serving, OpenAI-compatible APIs, or a self-hosted vLLM on a neo-cloud GPU — switched with one environment variable.
- Validates every SQL string before execution with a SELECT-only safety guardrail that catches the Databricks-specific destructive ops generic validators miss (`OPTIMIZE`, `VACUUM`, `ZORDER`, `COPY`).
- Is auditable end-to-end: one question = one LLM call, one SQL statement, one execution. No autonomous loops, no surprise bills.
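To make the guardrail idea concrete, here is a minimal sketch of a SELECT-only check. It is not the db-agent validator: the blocklist, the function name `validate_select_only`, and the fail-closed design (reject anything that looks suspicious, even inside a string or comment) are assumptions made for illustration. The Databricks commands in the list are the ones named above.

```python
import re

# Hypothetical blocklist, not the db-agent implementation: generic write/DDL
# keywords plus the Databricks commands named in the session description.
BLOCKED = (
    "INSERT", "UPDATE", "DELETE", "MERGE", "DROP", "ALTER", "CREATE",
    "TRUNCATE", "GRANT", "REVOKE", "OPTIMIZE", "VACUUM", "ZORDER", "COPY",
)
BLOCKED_RE = re.compile(r"\b(" + "|".join(BLOCKED) + r")\b", re.IGNORECASE)


def validate_select_only(sql: str) -> tuple[bool, str]:
    """Return (ok, reason). Fails closed: anything suspicious is rejected."""
    # Tolerate trailing semicolons, but nothing after them.
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False, "empty statement"
    # A semicolon left in the body means more than one statement.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    # The statement has to open with SELECT or WITH (for CTEs).
    first = re.match(r"[A-Za-z_]+", statement)
    if first is None or first.group(0).upper() not in ("SELECT", "WITH"):
        return False, "statement must start with SELECT or WITH"
    # Whole-word scan over the raw text; no attempt to parse strings or comments.
    match = BLOCKED_RE.search(statement)
    if match:
        return False, f"blocked keyword: {match.group(1).upper()}"
    return True, "ok"
```

Because the scan runs over the raw text, a harmless query that selects a column literally named `copy` is rejected too. That trade-off keeps the check short and hard to bypass with quoting tricks; a production validator would more likely use a real SQL parser.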
What you will leave with
- A decision framework for db-agent vs Genie vs Agent Bricks for your specific use case — including when not to build.
- The companion open-source db-agent repo (presented at AAAI-25, ships a Databricks Apps deployment variant) and a quick-lab repo with a step-by-step build.
- A reference architecture diagram and the actual code — pipeline orchestrator is ~60 lines of Python, safety validator is ~30.
- Specific gotchas that cost me a half-day each: federation database options, Lakebase token rotation, Streamlit/Apps reverse-proxy traps, context-window blowouts on real catalogs.
Who is this for
- Heads of Data, Data Engineering Managers, Staff and Principal Data Engineers.
- Teams already on Databricks (or evaluating) who are being asked: "Can we put an AI agent on top of this?"
- Anyone making a build-vs-buy call between Genie, Agent Bricks, and a custom text-to-SQL stack — and wants to make it with their eyes open.
Demo of the reference architecture explained here: https://becloudready.com/blog/text-to-sql-databricks-lakebase-db-agent
This is a technical session. We'll read code. Bring your senior engineers.
Agenda
- The architecture in one slide (5 min)
- Lakebase + Unity Catalog + Lakehouse Federation — why both data planes, and what breaks (15 min)
- The agent pipeline — schema → prompt → LLM → validate → execute (20 min)
- The SQL safety guardrail — what generic SELECT-only validators miss on Databricks (10 min)
- The pluggable LLM layer — live swap from a hosted API to a self-hosted vLLM on a neo-cloud GPU (15 min)
- db-agent vs Genie vs Agent Bricks — when to use which, and why (10 min)
- Q&A (15 min)
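The pipeline on the agenda (schema → prompt → LLM → validate → execute) can be sketched in a few lines. This is an illustrative outline, not the db-agent orchestrator: the backends are offline stubs that return canned SQL, `LLM_BACKEND` is an assumed environment variable name, the validation step is deliberately simplified, and an in-memory SQLite database stands in for the Databricks workspace.

```python
import os
import sqlite3
from typing import Callable, Dict


# Hypothetical backends. Real ones would call a hosted API or a self-hosted
# vLLM server; these stubs return canned SQL so the sketch runs offline.
def hosted_backend(prompt: str) -> str:
    return ("SELECT region, SUM(amount) AS total FROM sales "
            "GROUP BY region ORDER BY region")


def self_hosted_backend(prompt: str) -> str:
    return "SELECT COUNT(*) AS n FROM sales"


BACKENDS: Dict[str, Callable[[str], str]] = {
    "hosted": hosted_backend,
    "vllm": self_hosted_backend,
}


def pick_backend() -> Callable[[str], str]:
    # LLM_BACKEND is an assumed variable name, not one taken from db-agent.
    return BACKENDS[os.environ.get("LLM_BACKEND", "hosted")]


def describe_schema(conn: sqlite3.Connection) -> str:
    """Step 1: collect CREATE TABLE statements as schema context."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema: str, question: str) -> str:
    """Step 2: combine schema and question into one prompt."""
    return f"Schema:\n{schema}\n\nWrite one SELECT statement answering: {question}"


def is_select_only(sql: str) -> bool:
    """Step 4 (simplified): a single statement that starts with SELECT."""
    body = sql.strip().rstrip(";")
    return body.upper().startswith("SELECT") and ";" not in body


def answer(conn: sqlite3.Connection, question: str,
           llm: Callable[[str], str]) -> list:
    """One question = one LLM call, one SQL statement, one execution."""
    prompt = build_prompt(describe_schema(conn), question)
    sql = llm(prompt)                       # Step 3: the only LLM call
    if not is_select_only(sql):
        raise ValueError(f"rejected SQL: {sql}")
    return conn.execute(sql).fetchall()     # Step 5: the only execution


def demo_connection() -> sqlite3.Connection:
    """Tiny stand-in database so the sketch is runnable end to end."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5)],
    )
    return conn
```

Calling `answer(demo_connection(), "Total by region?", pick_backend())` runs the whole chain once. Swapping the model is a matter of setting `LLM_BACKEND` before the call; nothing else in the pipeline changes.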
About the Speaker
Chandan Kumar, founder of beCloudReady, organizer of the TorontoAI community (10K+ members), and a Databricks Partner. Maintainer of the open-source db-agent text-to-SQL agent, presented at AAAI-25. Runs the Databricks Lakehouse Bootcamp and works with engineering teams on getting AI agents into production against real data.
35 attendees
Build Your First AWS Data Lake in 60 Minutes — Live, with Real Code
Online
Build a working AWS Data Lake from scratch, live, in 60 minutes. Free hands-on session for data engineers and anyone breaking into data.
Full lab + screenshots: https://becloudready.com/learn/roadmaps/aws-data-engineer
WHAT YOU'LL WATCH US BUILD, LIVE
- A raw CSV (Kaggle Crude Oil price data, ~6,400 rows) land in S3
- A Glue Crawler infer the schema and register a table in the Glue Catalog
- An Athena SQL query run against that CSV — count, filter, aggregate, with zero servers to manage
- A PySpark Glue ETL job turn the CSV into partitioned Parquet — columnar, compressed, about 10x cheaper to scan
- A second crawler register the Parquet table
- The same query run again — with a side-by-side look at "data scanned": CSV vs Parquet
That last part is the punchline. At scale, Parquet vs CSV is the difference between a $5 query and a 50-cent query. Seeing it live in the Athena UI hits different than reading about it.
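The arithmetic behind that punchline is simple enough to sketch. The per-terabyte rate and the byte counts below are assumptions chosen to reproduce the $5 vs 50-cent figures; check current Athena pricing for your region before relying on them.

```python
TB = 10 ** 12  # bytes per terabyte (decimal; an assumption for this sketch)


def query_cost_usd(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Cost of one query when billing is proportional to bytes scanned."""
    return bytes_scanned / TB * usd_per_tb


# Hypothetical scan sizes: the same query over CSV, then over Parquet
# that lets the engine read roughly a tenth of the bytes.
csv_bytes = 1 * TB
parquet_bytes = csv_bytes // 10

csv_cost = query_cost_usd(csv_bytes)
parquet_cost = query_cost_usd(parquet_bytes)
print(f"CSV: ${csv_cost:.2f}  Parquet: ${parquet_cost:.2f}")
```

The "data scanned" number shown in the Athena UI is the only input here, which is why the side-by-side comparison in the demo translates directly into cost.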
WHAT YOU'LL LEAVE WITH
- The full lab, open-sourced. Terraform for the IAM, sandbox, Glue, and Athena setup. The PySpark ETL script. A step-by-step walkthrough you can re-run on your own AWS account.
- A clear mental model of every data lake project: land raw, catalog, transform, catalog again, query. The dataset is just the variable.
- IAM patterns most tutorials skip — region-locking, prefix-scoped S3 access, Glue role policies that don't accidentally grant access to the world. The stuff that shows up in real production reviews.
- A take-home assignment: run the same pipeline on a Kaggle dataset of your choice, and bring it to the next session for feedback.
- Full lab instructions and screenshots: https://becloudready.com/learn/roadmaps/aws-data-engineer
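As a rough picture of the IAM patterns mentioned above, the sketch below builds a policy document with prefix-scoped S3 access and a region lock. It is not the lab's Terraform: the function name, the bucket, prefix, and region values, and the choice of a Deny statement for region-locking are all assumptions for illustration.

```python
import json


def lake_policy(bucket: str, prefix: str, region: str) -> dict:
    """Build an IAM policy document scoped to one S3 prefix and one region."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Object reads and writes only under the given prefix
                "Sid": "PrefixScopedObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {   # Listing allowed on the bucket, but only for that prefix
                "Sid": "PrefixScopedList",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {   # Region lock: deny Glue and Athena calls outside one region
                "Sid": "RegionLock",
                "Effect": "Deny",
                "Action": ["glue:*", "athena:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": region}
                },
            },
        ],
    }


# Hypothetical names, for illustration only.
policy = lake_policy("my-data-lake", "raw/", "ca-central-1")
print(json.dumps(policy, indent=2))
```

The point of the pattern is that the role can touch one prefix in one bucket and nothing else, so a misconfigured job cannot read or overwrite unrelated data.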
WHO THIS IS FOR
- Data engineers who've shipped parts of a data lake but want to see the full pipeline end-to-end
- Career changers moving into data engineering — this is a portfolio-grade project you can talk about in interviews
- Bootcamp grads and self-taught engineers who know SQL and Python but haven't seen how Glue, Athena, and S3 connect
- Backend engineers picking up data work who want a fast on-ramp to the AWS data stack
- Anyone whose manager said "we should build a data lake" and now it's on your plate
Never opened the AWS console? You'll still follow along — we explain every click. Been doing this for years? You'll still pick up the IAM scoping pattern.
FORMAT
- Live on Microsoft Teams — full screen-share, no slides
- 60 minutes — same length as the lab itself
- Recording shared with everyone who registers
- Open Q&A throughout, not just at the end
ABOUT YOUR HOST
Chandan Kumar — founder of beCloudReady (https://www.becloudready.com) and organizer of TorontoAI (https://www.toronto-ai.org), a 10,000+ member community of AI and data builders. Twenty-plus years across software, cloud, and data engineering. Maintainer of open-source labs and the db-agent project, presented at AAAI-25 (https://github.com/db-agent/db-agent).
Full hands-on lab, with screenshots and instructions: https://becloudready.com/learn/roadmaps/aws-data-engineer
8 attendees
Past events
280

