AI Meetup at TechAlley
Details
Hi everyone, we are welcoming back Joshua Stacey for this weekend's AI Meetup at TechAlley. Click RSVP to see venue details and register. As usual, TechAlley starts at 10:00am and our meetup starts at 11:00am. It is FREE to attend.
Want to know the pros and cons of cloud vs. local LLM inference? This talk will help you peek behind the hype!
Every developer building with LLMs hits the same fork: call an API or run the model yourself. The marketing from both sides makes it sound simple. It isn't. This is a ground-level deep dive from someone who does both, working with Cursor/Claude/Codex and benchmarking dozens of models.
We'll look at benchmarks across current-generation open models, including Qwen 3.5, Gemma 3, Gemma 4, and more, compared head-to-head against frontier API models. We'll cover the real costs, the content policy friction, the context window lies, and what it actually takes to keep a self-hosted fleet alive.
Whether you're a solo dev picking a stack, a founder evaluating costs at scale, or just curious what running your own inference looks like — come with questions.
