DataPhilly Talks: From Efficient Deep Learning to Open Multilingual Vision LLMs

Name: DataPhilly Talks: From Efficient Deep Learning to Open Multilingual Vision LLMs
Start: 2025-06-26T18:00:00-04:00
End: 2025-06-26T20:00:00-04:00
Location: ZeroEyes

Hosted By

Tetyana Y. and 3 others

Details

🚀 Join Us for DataPhilly’s June Talks at ZeroEyes! 🚀
We will learn about optimizing AI: From Efficient Deep Learning to Open Multilingual Vision-LLMs.
Our host is ZeroEyes, a company that delivers a proactive, human-verified visual gun detection and situational awareness solution that integrates into existing digital security cameras to stop mass shootings and gun-related violence.
Our food sponsor is Liberty Personnel, widely recognized as the finest direct placement and contract recruiting firm in the region.

Event Schedule:
Doors open at 6:00 pm ET
6:00 - 6:30 Event start and networking, DataPhilly intro
6:30 - 7:15 Dave Ramsey: "The Neural Net Diet Plan: Cutting Parameters and GFLOPs, not performance", followed by Q&A
7:15 - 8:00 Karthik reddy Kanjula: “Maya: An Instruction Finetuned Multilingual Multimodal Model”, followed by Q&A
After 8:00 Networking time

Speakers:
Dave Ramsey, machine learning engineer at ZeroEyes: "The Neural Net Diet Plan: Cutting Parameters and GFLOPs, not performance".
How to make deep learning models faster and smaller using techniques like distillation, pruning, quantization, Ghost convolutions, and reparameterization. Perfect for anyone looking to deploy efficient AI without losing accuracy.

Abstract: This talk dives into the practical techniques used to shrink, speed up, and streamline deep learning models for real-world deployment. We'll explore convolutions, model distillation, pruning, quantization (including QAT), Ghost convolutions, and reparameterization strategies—each offering a different angle on how to maintain accuracy while cutting down size and latency. You'll also get a look at how these methods fit into modern workflows, from training to edge deployment, and how neural architecture search (NAS) can automate some of the process. Whether you're building for mobile, embedded systems, or just want faster inference, this talk offers a toolkit of optimization techniques grounded in real R&D experience.

Speaker Bio: Dave C. Ramsey is a machine learning engineer and jazz trumpet player based in Philadelphia. He began his journey into AI during the COVID shutdown, when a FastAI course hooked him on the parallels between machine learning programming and jazz composition. At ZeroEyes, his work includes cutting-edge model optimization and image generation using tools like Stable Diffusion and computer vision models. Dave has also helped organize DataPhilly meetups, running FastAI study groups from time to time and helping others break into the field. His work blends creativity and precision, with a passion for building efficient, high-performing models for real-world deployment.

Karthik reddy Kanjula, AI Researcher (Cohere Labs Community, WCUPA), https://www.x.com/karthik_kanjula: “Maya: An Instruction Finetuned Multilingual Multimodal Model”.
Maya – A Multimodal Multilingual Vision-LLM built in the open. Maya is completely open source, open weight and open dataset, designed to handle 8 languages, cultural diversity, and nuanced real-world contexts in vision-language models.

Abstract: The rapid development of large Vision-Language Models (VLMs) has led to impressive results on academic benchmarks, primarily in widely spoken languages. However, significant gaps remain in the ability of current VLMs to handle low-resource languages and varied cultural contexts, largely due to a lack of high-quality, diverse, and safety-vetted data. Consequently, these models often struggle to understand low-resource languages and cultural nuances in a manner free from toxicity. To address these limitations, we introduce Maya, an open-source Multimodal Multilingual model. Our contributions are threefold: 1) a multilingual image-text pretraining dataset in eight languages, based on the LLaVA pretraining dataset; 2) a thorough analysis of toxicity within the LLaVA dataset, followed by the creation of a novel toxicity-free version across eight languages; and 3) a multilingual image-text model supporting these languages, enhancing cultural and linguistic comprehension in vision-language tasks.

Speaker Bio: I'm Karthik, an AI Researcher, and I hold a master’s degree in computer science from West Chester University of Pennsylvania. Over the years, I’ve immersed myself in AI, computer vision, and data science. Currently, I lead the team for Maya, an innovative open-source multimodal and multilingual AI project from the Cohere Labs community. My background includes collaborative research with Penn State University focused on edge computing, and my research journey began by analyzing abnormal epilepsy seizures.

🚊🚙 We will follow up with parking and transit information a couple of days before the event. Please note, by RSVPing to this event you agree to our Code of Conduct.
Looking forward to seeing you there! 🚀

Events in Conshohocken, PA
Tags: Artificial Intelligence, Machine Learning, Data Science, Intellectual Discussions, Professional Networking