Small Model Inference for AI search, RAG, and Document Processing
Details
Running RAG or AI search in production? Then you already know: inference infrastructure becomes expensive and complicated fast.
Superlinked Inference Engine (SIE) is an open-source inference stack designed to simplify that entire layer. It runs on your own infrastructure, including Google Cloud Platform (GCP).
With just three SDK calls, you can run a complete inference pipeline across 85+ state-of-the-art models on your own GPUs.
SIE also includes the infrastructure layer out of the box:
load balancing
autoscaling
monitoring & observability
Terraform configs for GKE and EKS
The same codebase scales from your laptop to a 100-node cluster without changing your workflow, whether deployed locally, on Kubernetes, or on Google Cloud Platform.
Open source under Apache 2.0, with inference and infrastructure included end to end.
Explore the project and star it on GitHub: https://github.com/superlinked/sie
Speaker
Filip Makraduli - Superlinked (Founding ML Developer Relations Engineer)
Machine Learning Engineer working in Developer Relations at Superlinked, shaping open source technical strategy, shipping product features, and working on small LLMs. Bridging ML research with production systems and developer communities. I specialise in building end-to-end AI solutions, leading developer relations initiatives, and crafting AI/ML engineering solutions that transform research in…
Hosts
Marko Paloski - Netcetera (Senior System Engineer)
Jasmina Dimitrievska
DevOps Engineer / Solutions Architect with 5+ years of experience in application development, integration, automation, and planning and designing system architecture for on-premises and cloud infrastructure. Leading project workflows with the latest DevOps practices in the software industry. Engaging in public speaking activities to share knowledge and advocate for technological advancements.
Dimitar Mileski - GDG Skopje (GDG Organizer, PhD Student in Computer Science)
Junior Teaching Assistant at Ss. Cyril and Methodius University – Skopje, Faculty of Computer Science and Engineering (FCSE); PhD Student in Computer Science; MSc in Cloud Computing
Complete your event RSVP here: https://gdg.community.dev/events/details/google-gdg-skopje-presents-small-model-inference-for-ai-search-rag-and-document-processing/.
