About us
A monthly meetup for engineers curious about how LLMs actually run in production.
Whether you're just getting started with vLLM or optimising CUDA kernels for fun, you're welcome here :)
We dig into the systems-level work behind fast, cheap AI inference: GPU architecture, KV cache management, benchmarking, and everything in between.
The format is TBC, but the point is to learn and connect!
Upcoming events
No upcoming events
