
What we’re about
This Meetup group supports the SF Bay ACM Chapter. You can join the SF Bay Chapter itself by coming to a meeting; most meetings are free, and membership is only $20/year!
The chapter has both educational and scientific purposes:
- the science, design, development, construction, languages, management and applications of modern computing.
- communication between persons interested in computing.
- cooperation with other professional groups.
Our official bylaws will be available soon on the About Us page of our web site. See below for our Code of Conduct.
Videos of past meetings can be found at http://www.youtube.com/user/sfbayacm
Official web site of SF Bay ACM:
http://www.sfbayacm.org/
Join or renew your membership via the official web site above.
Article IX: Code of Conduct - from the ACM Professional Chapter Code of Conduct
1. Harassment or hostile behavior is unwelcome, including speech that intimidates, creates discomfort, or interferes with a person’s participation or opportunity for participation in a Chapter meeting or Chapter event. Harassment in any form, including but not limited to harassment based on alienage or citizenship, age, color, creed, disability, marital status, military status, national origin, pregnancy, childbirth- and pregnancy-related medical conditions, race, religion, sex, gender, veteran status, sexual orientation, or any other status protected by the laws of the jurisdiction in which the Chapter meeting or Chapter event is being held, will not be tolerated. Harassment includes the use of abusive or degrading language, intimidation, stalking, harassing photography or recording, inappropriate physical contact, sexual imagery, and unwelcome sexual attention. A response that the participant was “just joking,” or “teasing,” or being “playful,” will not be accepted.
2. Anyone witnessing or subject to unacceptable behavior should notify a chapter officer or ACM Headquarters.
3. Individuals violating these standards may be sanctioned or excluded from further participation at the discretion of the Chapter officers or responsible committee members.
Upcoming events (4)
- Deploying & Scaling LLMs in the Enterprise: Architecting Multi-Agent AI Systems (online; link visible for attendees)
Deploying and Scaling Large Language Models in the Enterprise: Architecting Multi-Agent AI Systems Integrating Vision, Data, and Responsible AI
LOCATION ADDRESS (update: this event is virtual)
If you want to join remotely, you can submit questions via Zoom Q&A. The Zoom link:
https://acm-org.zoom.us/j/97422303746?pwd=XGkOzZpT1w2Y6OMfxqw2s1IQYov1Dh.1
Join via YouTube:
https://youtube.com/live/
AGENDA
7:00 SFBayACM upcoming events, introduce the speaker
7:15 speaker presentation starts
8:15 - 8:30 finish, depending on Q&A
Join SF Bay ACM Chapter for an insightful discussion on:
Abstract:
Large Language Models (LLMs) are rapidly reshaping enterprise AI, but real-world deployments demand far more than fine-tuning and API calls. They require sophisticated architectures capable of scaling inference, integrating multi-modal data streams, and enforcing responsible AI practices—all under the constraints of enterprise SLAs and cost considerations.
In this session, I’ll deliver a deep technical dive into architecting multi-agent AI systems that combine LLMs with computer vision and structured data pipelines. We’ll explore the topics below (brief illustrative code sketches follow the list):
- Multi-Agent System Design: Architectural patterns for decomposing enterprise workflows into specialized LLM-driven agents, including communication protocols, context sharing, and state management.
- Vision-Language Integration: Engineering methods to fuse embeddings from computer vision models with LLM token streams for tasks such as visual question answering, document understanding, and real-time decision support.
- Optimization for GPU Inference: Detailed strategies for memory optimization, quantization, mixed-precision computation, and batching to achieve high throughput and low latency in LLM deployment on modern GPU hardware (e.g., NVIDIA A100/H100).
- Observability and Responsible AI: Techniques for building observability layers into LLM pipelines—capturing token-level traces, detecting drift, logging model confidence—and implementing fairness audits and risk mitigation protocols at runtime.
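To make the multi-agent bullet concrete, here is a minimal Python sketch of the pattern: specialized agents communicate through a shared context under a simple fixed-order orchestrator protocol. All names (Agent, Orchestrator, call_llm) are illustrative assumptions, not the speaker's architecture, and the LLM call is stubbed.

```python
from dataclasses import dataclass, field

def call_llm(system_prompt: str, history: list) -> str:
    """Stub standing in for a real LLM call (hypothetical helper)."""
    return f"{system_prompt}: saw {len(history)} prior message(s)"

@dataclass
class Message:
    sender: str
    content: str

@dataclass
class Context:
    """Shared state that every agent in the workflow can read and extend."""
    history: list = field(default_factory=list)

class Agent:
    """One specialized worker; its system prompt defines its role."""
    def __init__(self, name: str, system_prompt: str):
        self.name = name
        self.system_prompt = system_prompt

    def act(self, ctx: Context) -> Message:
        reply = call_llm(self.system_prompt, ctx.history)
        msg = Message(self.name, reply)
        ctx.history.append(msg)  # context sharing: output becomes shared state
        return msg

class Orchestrator:
    """Runs agents in a fixed pipeline, one simple communication protocol."""
    def __init__(self, agents: list):
        self.agents = agents

    def run(self, task: str) -> str:
        ctx = Context(history=[Message("user", task)])
        msg = ctx.history[0]
        for agent in self.agents:
            msg = agent.act(ctx)
        return msg.content

pipeline = Orchestrator([Agent("extractor", "Extract order details"),
                         Agent("resolver", "Decide the refund action")])
print(pipeline.run("Customer requests a refund for order #123"))
```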
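For the vision-language bullet, one widely used fusion pattern projects vision-encoder features into the LLM's embedding space and prepends them to the token embeddings. This is a hedged PyTorch sketch with made-up dimensions, not the method presented in the talk.

```python
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    """Maps vision-encoder features into the LLM embedding space."""
    def __init__(self, vision_dim=768, llm_dim=4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, image_feats):          # (batch, n_patches, vision_dim)
        return self.proj(image_feats)        # (batch, n_patches, llm_dim)

def fuse(image_feats, token_embeds, projector):
    """Prepend projected image tokens to the text token embeddings."""
    image_tokens = projector(image_feats)                   # (B, P, D)
    return torch.cat([image_tokens, token_embeds], dim=1)   # (B, P+T, D)

# Toy shapes only; a real pipeline would take these from a vision encoder
# and from the LLM's own embedding layer.
proj = VisionProjector()
fused = fuse(torch.randn(2, 16, 768), torch.randn(2, 10, 4096), proj)
print(fused.shape)  # torch.Size([2, 26, 4096])
```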
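For the GPU-inference bullet, here is a minimal sketch of two of the named techniques, mixed precision and batching, using the Hugging Face transformers API. The model name is a small stand-in; the talk's actual serving stack is not stated.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # half precision on GPU only

model_name = "gpt2"  # stand-in; an enterprise stack would serve a far larger LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # gpt2 defines no pad token
tokenizer.padding_side = "left"             # decoder-only models batch with left padding

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype).to(device)

prompts = ["Summarize the returns policy:", "Draft a shift-handover note:"]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(device)

with torch.inference_mode():  # no autograd bookkeeping: less memory, more throughput
    out = model.generate(**batch, max_new_tokens=32,
                         pad_token_id=tokenizer.eos_token_id)

print(tokenizer.batch_decode(out, skip_special_tokens=True))
```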
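And for the observability bullet, a sketch of token-level trace capture: transformers can return per-step scores from generate(), from which per-token log-probabilities can be derived and logged. Illustrative only; a production pipeline would ship these records to a metrics or tracing backend.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The service level objective is", return_tensors="pt")
with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=8,
        return_dict_in_generate=True,
        output_scores=True,          # keep per-step logits for tracing
        pad_token_id=tok.eos_token_id,
    )

# Map each generated token id to its log-probability under the model.
scores = model.compute_transition_scores(out.sequences, out.scores,
                                         normalize_logits=True)
prompt_len = inputs["input_ids"].shape[1]
for token_id, logprob in zip(out.sequences[0, prompt_len:], scores[0]):
    # In production these records would be emitted to the observability layer.
    print(f"{tok.decode(int(token_id))!r} logprob={logprob.item():.3f}")
```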
Drawing on practical examples from large-scale enterprise deployments across retail, healthcare, and finance, I’ll discuss the engineering trade-offs, tooling stacks, and lessons learned in translating research-grade LLMs into production-grade systems.
This talk is designed for AI engineers and researchers eager to understand the technical complexities—and solutions—behind scaling multi-modal, responsible AI systems that deliver real business value.
Speaker Bio:
Dhanashree is a Senior Machine Learning Engineer and AI Researcher with over a decade of experience designing and deploying advanced AI systems at scale. Her expertise spans architecting multi-agent solutions that integrate Large Language Models (LLMs), computer vision pipelines, and structured data to solve complex enterprise challenges across industries including retail, healthcare, and finance.
At Albertsons, Deloitte, and Fractal, Dhanashree has led the development of production-grade AI applications, focusing on optimization, model observability, and responsible AI practices. Her work includes designing scalable inference architectures for LLMs on modern GPU infrastructures, building hybrid pipelines that fuse vision and language models, and engineering systems that balance performance with ethical and regulatory considerations.
She collaborates with research institutions such as the University of Illinois, engages actively with the research community, and frequently speaks on bridging advanced AI research and production systems.
https://www.linkedin.com/in/dhanashreelele/
- Designing for Scale, Reliability, and Resiliency: Real-World Lessons (Valley Research Park, Mountain View, CA)
Designing for Scale, Reliability, and Resiliency: Real-World Lessons from Building High-Throughput Systems
LOCATION ADDRESS (Hybrid, in person or by zoom, you choose)
Valley Research Park
319 North Bernardo Avenue
Mountain View, CA 94043
Don't use the front door. When facing the front door, turn right along the front of the building. Turn left around the building corner. The 2nd door should be open and have a banner and event registration.
If you want to join remotely, you can submit questions via Zoom Q&A. The Zoom link:
https://acm-org.zoom.us/j/94270873151?pwd=DFGIb9xhn5GPv8iJD9Bxt1Ya2qJHmN.1
Join via YouTube:
https://youtube.com/live/
AGENDA
6:30 Door opens, food and networking (we invite honor system contributions)
7:00 SFBayACM upcoming events, introduce the speaker
7:15 speaker presentation starts
8:15 - 8:30 finish, depending on Q&A
Join SF Bay ACM Chapter for an insightful discussion on:
Talk Description:
As modern software systems grow in complexity and scale, the demand for architectures that are not just fast—but also reliable, resilient, observable, and auditable—has never been greater. In this talk, we'll dive into practical strategies and real-world patterns for designing and operating large-scale distributed systems.
Topics include (illustrative sketches follow the list):
- Traffic segmentation and routing strategies across multi-cluster environments
- Patterns for achieving high availability and failover across global infrastructure
- Monitoring and observability at scale: what to measure, how to alert
- Auditing for compliance, trust, and debugging
- Common failure modes and how to build for graceful degradation
- Real examples from mission-critical production systems
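As a concrete illustration of the traffic-segmentation bullet above, here is a minimal Python sketch of weighted routing on a stable hash of a routing key, so a given key consistently lands on the same cluster. Cluster names and weights are made up; this is not the speaker's system.

```python
import hashlib

CLUSTERS = [("us-west", 50), ("us-east", 30), ("canary", 20)]  # weight = % of traffic

def route(key: str) -> str:
    """Pick a cluster for this key, stable across processes and restarts."""
    # md5 gives a deterministic mapping (unlike Python's salted hash()).
    bucket = int(hashlib.md5(key.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for name, weight in CLUSTERS:
        cumulative += weight
        if bucket < cumulative:
            return name
    return CLUSTERS[-1][0]

print(route("user-12345"))  # the same user id always routes to the same cluster
```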
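And for the graceful-degradation bullet, a minimal circuit-breaker sketch: after repeated failures, calls short-circuit to a fallback instead of piling onto a struggling dependency. Thresholds and names are illustrative assumptions.

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None => circuit closed (normal operation)

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()           # degraded-mode response
            self.opened_at = None           # half-open: retry the real call
            self.failures = 0
        try:
            result = fn()
            self.failures = 0               # success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback()

breaker = CircuitBreaker()
# Usage (fetch_from_primary and cached_response are hypothetical):
# breaker.call(fetch_from_primary, cached_response)
```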
Attendees will walk away with architectural insights, tools, and mental models to apply to their own systems, whether working in startups or enterprises.
***
Speaker Bio:
I’m a Senior Software Engineer at DoorDash and previously led platform initiatives at Conviva, where I built scalable, fault-tolerant systems handling tens of millions of sessions daily for customers like Disney, HBO, and Sky. My work has spanned everything from routing frameworks and disaster recovery to monitoring pipelines and SLA enforcement. I’m passionate about making infrastructure reliable and maintainable, and I enjoy sharing lessons learned from real-world systems.
https://www.linkedin.com/in/karanluniya
---
Valley Research Park is a coworking research campus of 104,000 square feet hosting 60+ life science and technology companies. VRP has over 100 dry labs, wet labs, and high-power labs sized from 125 to 15,000 square feet. VRP manages all of the traditional office elements: break rooms, conference rooms, outdoor dining spaces, and recreational spaces.
As a plug-and-play lab space, once companies have secured their next milestone and are ready to expand, VRP has 100+ labs ready to expand into.
https://www.valleyresearchpark.com/
- Taming Tech Debt for Platform Reliability (Valley Research Park, Mountain View, CA)
LOCATION ADDRESS (Hybrid, in person or by zoom, you choose)
Valley Research Park
319 North Bernardo Avenue
Mountain View, CA 94043
Don't use the front door. When facing the front door, turn right along the front of the building. Turn left around the building corner. The 2nd door should be open and have a banner and event registration.
If you want to join remotely, you can submit questions via Zoom Q&A. The Zoom link:
https://acm-org.zoom.us/j
Join via YouTube:
https://youtube.com/live/
AGENDA
6:30 Door opens, food and networking (we invite honor system contributions)
7:00 SFBayACM upcoming events, introduce the speaker
7:15 speaker presentation starts
8:15 - 8:30 finish, depending on Q&A
Join SF Bay ACM Chapter for an insightful discussion on:
Talk Description:
Site Reliability Engineering (SRE) at Google is a job function, a mindset, and a set of engineering practices focused on ensuring the reliability, scalability, and efficiency of production systems. The term was coined at Google in the early 2000s.
This talk aims to reframe the conversation around technical debt. Rather than viewing it as a mere backlog of chores, I will present a methodical framework for identifying and prioritizing it as a strategic opportunity. Drawing from my experiences at various companies, including Google, I will share practical insights on how to transform tech debt from an impediment into a catalyst for organizational velocity. I will also discuss how to apply SRE principles to measure the reliability of infrastructure as we tackle technical debt, and I will share a few anecdotes and practical methods for addressing this problem with engineering and automation.
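As background for the SRE-measurement point above, here is a small sketch of the standard error-budget arithmetic (my illustration, not code from the talk): an SLO implies a budget of allowed failures, and the remaining budget can gate how much capacity goes to reliability work versus feature or debt work.

```python
def error_budget_remaining(slo: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent in a window.

    slo    -- availability target, e.g. 0.999 for "three nines"
    total  -- total requests observed in the window
    failed -- failed requests in the window
    """
    allowed = (1.0 - slo) * total  # failures the SLO budget permits
    if allowed <= 0:
        return 0.0
    return max(0.0, 1.0 - failed / allowed)

# A 99.9% SLO over 1,000,000 requests budgets 1,000 failures;
# 250 failures spend a quarter of it, leaving 0.75.
print(error_budget_remaining(0.999, 1_000_000, 250))  # 0.75
```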
Speaker Bio:
Saurabh Phaltane is a Senior Site Reliability Engineer at Google, where he focuses on designing and optimizing web-scale infrastructures for resilience and high performance. With over a decade of experience in SRE and distributed systems, Saurabh specializes in building reliable systems through robust infrastructure automation and comprehensive observability. Prior to his work at Google, Saurabh gained valuable experience as an SRE at Okta. He is also a thought leader in SRE and cloud technologies, a mentor for startup entrepreneurs through the Google for Startups program, and a frequent speaker on the topic of foundational thinking for scalable and reliable system infrastructure.
---
Valley Research Park is a coworking research campus of 104,000 square feet hosting 60+ life science and technology companies. VRP has over 100 dry labs, wet labs, and high-power labs sized from 125 to 15,000 square feet. VRP manages all of the traditional office elements: break rooms, conference rooms, outdoor dining spaces, and recreational spaces.
As a plug-and-play lab space, once companies have secured their next milestone and are ready to expand, VRP has 100+ labs ready to expand into.
https://www.valleyresearchpark.com/