AWS Bedrock Mantle — Distributed Inference Engine in Amazon Bedrock
Details
Join us for a deep dive into AWS Bedrock Mantle — Amazon Bedrock's next-generation distributed inference engine for large-scale ML model serving.
Mantle powers the new OpenAI API-compatible endpoints (Chat Completions & Responses APIs) and introduces a fundamentally new architecture with higher default quotas, async inference for long-running workloads, stateful conversations, and a zero-operator-access security model.
Whether you're migrating from OpenAI or building new GenAI applications, this session will show you how Mantle fits into your strategy.
What You'll Learn:
- What is Project Mantle and how it differs from bedrock-runtime
- Migration simplicity — just change the base URL in your OpenAI SDK
- Higher default quotas (100M TPM, 10K RPM) and async inference
- Stateful conversations and the zero-operator-access security model
- Live code demos using the OpenAI SDK with Bedrock via Mantle
- Best Practices how to use Mantel API endpoints.
Who Should Attend:
- Engineers building GenAI applications on AWS
- Teams currently using OpenAI APIs evaluating migration paths
- Solutions architects designing high-throughput inference workloads
- Anyone interested in the next generation of model serving on AWS
Agenda:
- 12:00 PM — Arrival & Networking
- 12:10 PM — Introduction & Overview: What is Project Mantle?
- 12:30 PM — Positioning & Customer Value
- 12:50 PM — Break
- 1:00 PM — Key Benefits
- 1:20 PM — Architecture Deep Dive
- 1:40 PM — Code Demos & Limitations
- 1:55 PM — Q&A / Open Discussion
Speaker: Luis Salcido — Sr. Technical Account Manager, AWS
Location: WeWork One Glenwood — 1 Glenwood Ave, Raleigh, NC 27603 (9th floor). Parking available at Glenwood South parking deck.
