Details

Inference is no longer just a deployment detail. It is becoming the core systems problem in AI.
As models get larger, workloads get more complex, and real-world expectations move toward low latency, high throughput, and sustainable cost, the conversation is shifting from “can it run?” to “can it serve well at scale?”
That is exactly what this meetup is about.

On 14 March 2026, Red Hat AI, NeevCloud, and HPE are bringing together the vLLM community in Pune for a focused afternoon on modern inference systems, practical engineering lessons, and hands-on exploration.

This is for people building with LLMs in production, working on model serving infrastructure, optimizing GPU utilization, reducing latency, improving token economics, or exploring new patterns like semantic routing and disaggregated serving.

What to expect
• Technical talks grounded in real inference challenges
• Practical discussions on performance, architecture, and serving tradeoffs
• Sessions around vLLM, semantic routing, and production-minded inference design
• A hands-on workshop to go beyond slides and get closer to the system
• Time to connect with engineers, maintainers, and practitioners working on the next wave of inference infrastructure

Agenda

12:30 PM to 01:00 PM - Registration and opening remarks

01:00 PM to 01:30 PM - Keynote: Why inference matters

01:30 PM to 02:00 PM - vLLM technical introduction

02:00 PM to 02:30 PM - vLLM semantic routing

02:30 PM to 03:00 PM - Break and pizza

03:00 PM to 03:30 PM - Hands-on workshop

03:30 PM to 04:00 PM - Project Sardeenz

04:00 PM to 04:30 PM - Technical session by HPE

04:30 PM to 06:00 PM - Technical session by NeevCloud

What to bring
• Your laptop with SSH installed (GPU instances will be provided by the organizers)
• A government-issued photo ID (required for venue entry)
• Questions, curiosity, and a strong interest in how inference systems are evolving

A few important notes
• Registration closes 24 hours before the event
• Unregistered attendees will not be allowed at the venue
• The agenda may change slightly as we finalize demos and live discussions

If you care about how AI systems actually serve, scale, and perform in the real world, this meetup will be worth your time. See y'all on the 14th!
