MOPS - Meetup #5

Details
Hi Pugaholics!
We are pleased to invite you to the fifth edition of the MOPS meetup, taking place for the first time at our new location: the NVIDIA office in Warsaw!
This time, we kindly ask you to complete the following form before coming: https://forms.gle/b7uNo6b5qjSvu6UJ9
You’ll receive an email invitation to register for the meeting at NVIDIA; registration is required for admission to the office.
The format is similar to previous editions: practical talks of about 30 minutes each, followed by question-and-answer sessions, with networking afterward.
The entire event and all talks will be held in English.
Key details:
- Location: NVIDIA office, Varso 2, 10th floor, ul. Chmielna 73, Warsaw
- Insightful talks 💬
- Knowledge exchange during networking with pizza 🍕
Plan of the meeting:
18:00-20:00 - Presentations:
- Small Models, Big Impact: The power of SLMs for On-Device App Intelligence using MLX (Prince Canuma)
- NVIDIA NIM: Deploying Generative AI at the Speed of Light (Mateusz Szczęsny)
- Building Your Own OpenAI: Self-Hosting LLM Inference Engines (Vladimir Alekseichenko)
20:00-22:00 - Pizza + Networking
Presentations:
- Small Models, Big Impact: The power of SLMs for On-Device App Intelligence using MLX - This talk explores Small Language Models (SLMs) for on-device applications, covering:
  - SLM advantages
  - The Apple MLX framework
  - Building native apps with SLMs
  - Real-world applications
  - Privacy and performance considerations
  - The future of SLM-powered app development
  After this talk, you’ll have insights into leveraging open-source SLMs for intelligent, privacy-respecting on-device applications. (A short illustrative code sketch follows the talk list below.)
- NVIDIA NIM: Deploying Generative AI at the Speed of Light - In the realm of generative AI, deploying models efficiently and at scale is paramount. In this session, we will explore in depth how NIMs work: NVIDIA's cutting-edge set of cloud-native microservices designed to streamline and accelerate the deployment of AI models across various infrastructures. Moreover, we will delve into the role of NIM in optimizing model performance using NVIDIA TensorRT engines. Attendees will gain insight into NIM's architecture and how NIM integrates with on-premises infrastructure, cloud providers, and the tools that facilitate the deployment of inference microservices. (A short client-side sketch follows the talk list below.)
- Building Your Own OpenAI: Self-Hosting LLM Inference Engines - Learn to run LLMs locally and scale them to handle thousands of users. I'll share my experience from DataWorkshop, guiding you on how to start with LLMs on a CPU using tools like Ollama and then boost performance with a GPU. You'll learn to manage latency, throughput, and memory in production systems. I'll cover techniques such as continuous batching, PagedAttention, and memory optimization with tools like vLLM.
  I'll offer practical tips, pitfalls to avoid, and advice for deploying LLMs, whether you're new to AI or an experienced ML engineer. (A short vLLM sketch follows the talk list below.)
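To give a flavour of the on-device SLM topic, here is a minimal sketch of generating text with a small model in Apple's MLX ecosystem using the mlx-lm package. This is our illustration, not material from the speaker, and the model id is an assumption; any MLX-converted SLM should work.
```python
# Minimal sketch: running a small language model on-device with Apple MLX
# via the mlx-lm package (pip install mlx-lm). Requires an Apple Silicon Mac.
# The model id below is illustrative; swap in any MLX-converted SLM.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-0.5B-Instruct-4bit")  # illustrative 4-bit SLM
prompt = "In one sentence, why do small language models matter for on-device apps?"

# Generation runs entirely on local unified memory; no data leaves the device.
text = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(text)
```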
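As a hint of what NIM looks like from the client side: NIM containers expose an OpenAI-compatible API, so once a NIM is running you can query it with a standard OpenAI client. The sketch below assumes a NIM is already serving locally on port 8000; the model id is illustrative and depends on which NIM you deploy.
```python
# Minimal sketch: querying a locally running NIM through its OpenAI-compatible
# endpoint (pip install openai). Assumes a NIM container is already up and
# listening on http://localhost:8000/v1; the model id is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local NIM endpoint
    api_key="not-used",                   # a local deployment does not check this
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # illustrative model id
    messages=[{"role": "user", "content": "Explain NVIDIA NIM in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)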
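And as a preview of the self-hosting talk, here is a minimal vLLM sketch. vLLM implements continuous batching and PagedAttention under the hood, so batching several prompts in one call already exercises both; the model id is illustrative and assumes a GPU with enough memory.
```python
# Minimal sketch: offline batched inference with vLLM (pip install vllm).
# vLLM schedules these prompts with continuous batching and manages the KV
# cache with PagedAttention. The model id is illustrative; pick one that
# fits your GPU memory.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # illustrative model id
params = SamplingParams(temperature=0.7, max_tokens=128)

prompts = [
    "Explain continuous batching in one paragraph.",
    "What problem does PagedAttention solve?",
]
outputs = llm.generate(prompts, params)  # prompts are batched in a single call
for out in outputs:
    print(out.outputs[0].text)
```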
Speakers:
- Prince Canuma - Prince is a distinguished ML Engineer specializing in MLOps, Data Science, Computer Vision, and NLP. Dubbed the "King of Apple MLX," Prince has become a pivotal figure in Apple's on-device ML ecosystem through his innovative work with the MLX framework. He created MLX-VLM, MLX-Embeddings, and FastMLX, showcasing his expertise in LLMs, RAG, and efficient AI model implementation on Mac. Prince's contributions have significantly advanced the field of cloud and on-device machine learning deployment.
- Mateusz Szczęsny - Mateusz began his career in the tech industry as a Software Engineer, gaining experience and transitioning into an MLOps role. This journey allowed him to develop both technical and soft skills, setting the stage for his current position at NVIDIA as a Senior MLOps Engineer. At NVIDIA, he is responsible for leading a technical team and developing the internal MLOps software stack. He is now involved in the development of NVIDIA Inference Microservices (NIMs).
- Vladimir Alekseichenko - Vladimir, CEO of DataWorkshop, has 10+ years of ML experience. He advises Fortune 500 companies on AI/ML integration across industries. As the creator of the DataWorkshop community and a conference organizer, he educates thousands on practical ML and LLMs. Vladimir hosts the AI podcast Biznes Myśli and has authored five courses, with graduates working at top tech companies.
Please find us on LinkedIn.
Hope to see you there!
