About us
Welcome to our AI Meetup! We are a passionate community dedicated to building and learning about artificial intelligence. Whether you're an expert or just starting out, join us to share knowledge, collaborate on projects, and explore the fascinating world of AI together.
We'll be getting different events off the ground, both locally (NY) and virtually.
We'll cover AI topics such as Machine Learning (ML), Large Language Models (LLMs), Deep Learning, Data Engineering, MLOps, Python, Computer Vision, Natural Language Processing (NLP), the latest AI developments, and more!
Questions? Reach out to Sage Elliott on LinkedIn: https://www.linkedin.com/in/sageelliott/
Upcoming events
7

AI Book Club: AI Systems Performance Engineering
· Online
February's book is "AI Systems Performance Engineering"!
This is a casual discussion, not a structured presentation. Sometimes the discussion even drifts away from the chapters, but feel free to grab the mic to help steer it back.
Feel free to join the discussion even if you have not read the book chapters! :)
Want to discuss the contents during the reading week? Join the Flyte MLOps Slack group and search for the "ai-reading-club" channel: https://slack.flyte.org/
-------------------------------------------------
About the book:
Title: AI Systems Performance Engineering
Author: Chris Fregly
Published: November 2025
https://learning.oreilly.com/library/view/ai-systems-performance/9798341627772/
Chapters:
1. Introduction and AI System Overview
2. AI System Hardware Overview
3. OS, Docker, and Kubernetes Tuning for GPU-based Environments
4. Tuning Distributed Networking Communication
5. GPU-Based Storage I/O Optimizations
6. GPU Architecture, CUDA Programming, and Maximizing Occupancy
7. Profiling and Tuning GPU Memory Access Patterns
8. Occupancy Tuning, Warp Efficiency, and Instruction-Level Parallelism
9. Increasing CUDA Kernel Efficiency and Arithmetic Intensity
10. Intra-Kernel Pipelining, Warp Specialization, and Cooperative Thread Block Clusters
11. Inter-Kernel Pipelining, Synchronization, and CUDA Stream-Ordered Memory Allocations
12. Dynamic Scheduling, CUDA Graphs, and Device-Initiated Kernel Orchestration
13. Profiling, Tuning, and Scaling PyTorch
14. PyTorch Compiler, OpenAI Triton, and XLA Backends
15. Multinode Inference, Parallelism, Decoding, and Routing Optimizations
16. Profiling, Debugging, and Tuning Inference at Scale
17. Scaling Disaggregated Prefill and Decode for Inference
18. Advanced Prefill-Decode and KV Cache Tuning
19. Dynamic and Adaptive Inference Engine Optimizations
20. AI-Assisted Performance Optimizations and Scaling Toward Multimillion GPU Clusters
Book Description
Elevate your AI system performance capabilities with this definitive guide to unlocking peak efficiency across every layer of your AI infrastructure. In today's era of ever-growing generative models, AI Systems Performance Engineering equips professionals with actionable strategies to co-optimize hardware, software, and algorithms for high-performance and cost-effective AI systems. Authored by Chris Fregly, a performance-focused engineering and product leader, this comprehensive resource transforms complex systems into streamlined, high-impact AI solutions.
Inside, you'll discover step-by-step methodologies for fine-tuning GPU CUDA kernels, PyTorch-based algorithms, and multinode training and inference systems. You'll also master the art of scaling GPU clusters for high performance, distributed model training jobs, and inference servers.
- Codesign and optimize hardware, software, and algorithms to achieve maximum throughput and cost savings
- Implement cutting-edge inference strategies that reduce latency and boost throughput in real-world settings
- Utilize industry-leading scalability tools and frameworks
- Profile, diagnose, and eliminate performance bottlenecks across complex AI pipelines
- Integrate full stack optimization techniques for robust, reliable AI system performance
Whether you're an engineer, researcher, or developer, AI Systems Performance Engineering offers a holistic roadmap for building resilient, scalable, and cost-effective AI systems that excel in both training and inference.
8 attendees
Scalable Research Agents with Tavily, LangGraph, Flyte - AI Workshop
· Online
Build your own scalable research agents with Tavily, LangGraph, and Flyte.
In this live workshop we’ll build a scalable ReAct research agent using Tavily, LangGraph, and Flyte.
We’ll cover:
- Quick intro to AI Agents
- Tavily - for agentic web search
- LangGraph - for our agentic framework and ReAct pattern
- Flyte 2.0 for scalable orchestration and durability
- How to extend into your own agentic use cases
We’ll be building with the Flyte 2.0 SDK and I’ll show what that platform looks like, but you’ll be able to run the agents locally with or without Flyte cluster access.
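As a rough illustration of the ReAct pattern named above (reason, act, observe, repeat), here is a minimal sketch in plain Python. It does not use Tavily, LangGraph, or Flyte: `fake_model` and `fake_search` are hypothetical stand-ins for an LLM and a web search tool, so this only shows the shape of the loop and is not the workshop's code.

```python
from typing import Callable


def fake_search(query: str) -> str:
    """Hypothetical stand-in for a web search tool; makes no network calls."""
    return f"stub result for: {query}"


# Registry of tools the agent may call, keyed by action name.
TOOLS: dict[str, Callable[[str], str]] = {"search": fake_search}


def fake_model(question: str, observations: list[str]) -> dict:
    """Hypothetical stand-in for an LLM: search once, then answer."""
    if not observations:
        return {"action": "search", "input": question}
    return {"action": "finish", "input": f"Answer based on: {observations[-1]}"}


def react_agent(question: str, max_steps: int = 5) -> str:
    """Run the reason -> act -> observe loop until the model finishes."""
    observations: list[str] = []
    for _ in range(max_steps):
        step = fake_model(question, observations)  # reason: pick next action
        if step["action"] == "finish":
            return step["input"]
        tool = TOOLS[step["action"]]               # act: look up the tool
        observations.append(tool(step["input"]))   # observe: record its output
    return "Stopped: step limit reached"


answer = react_agent("What is Flyte?")
print(answer)
```

In a real agent, the model call and the search tool would be replaced by actual services, and the step limit guards against loops that never reach a final answer.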
https://github.com/flyteorg/flyte-sdk/
Hosted by Sage Elliott - AI Engineer, Union.ai
11 attendees
Model Context Protocol (MCP) - AI Build & Learn #1
· Online
Welcome to AI Build & Learn, a weekly AI engineering stream where we pick a new topic and learn by building together.
This episode covers Model Context Protocol (MCP), an open standard that helps AI models connect to tools, data sources, and apps through a consistent interface.
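To make "a consistent interface" a little more concrete: MCP messages are exchanged as JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below only builds such a request as a Python dict and serializes it. The tool name `get_weather` and its `city` argument are hypothetical, and no MCP server or SDK is involved.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments, for illustration only.
request = make_tool_call(1, "get_weather", {"city": "New York"})
wire = json.dumps(request)
print(wire)
```

Because every tool is called through the same message shape, a client can work with different servers without tool-specific glue code.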
Resources
- GitHub: https://github.com/sagecodes/ai-build-and-learn
- Events Calendar: https://luma.com/ai-builders-and-learners
- Slack (Flyte AI Slack): https://slack.flyte.org/
- Hosted by Sage Elliott: https://www.linkedin.com/in/sageelliott/
In this stream
- MCP concepts overview
- Hands-on demo
- Discussion + practical examples
Community challenge (optional)
Try spending 30–90 minutes during the week learning or building something related to MCP, then share what you’re working on in Slack.
Note on Flyte / Union.ai
You may see Flyte used in some demos. Flyte is an open-source AI orchestration platform maintained by Union.ai for building scalable, durable, and observable AI workflows. You do not need to use Flyte to participate.
- Union.ai: https://www.union.ai/
- Flyte: https://flyte.org/
Drop a comment with ideas for future topics (agents, RAG, MLOps, robotics, frameworks, and more).
1 attendee
Past events
2



