Skip to content

July 24 - Women in AI

Network event
103 attendees from 39 groups hosting
July 24 - Women in AI

Details

Hear talks from experts on cutting-edge topics in AI, ML, and computer vision!

When

Jul 24, 2025 at 9 - 11 AM Pacific

Where

Online. Register for the Zoom

Exploring Vision-Language-Action (VLA) Models: From LLMs to Embodied AI

This talk will explore the evolution of foundation models, highlighting the shift from large language models (LLMs) to vision-language models (VLMs), and now to vision-language-action (VLA) models. We'll dive into the emerging field of robot instruction following—what it means, and how recent research is shaping its future. I will present insights from my 2024 work on natural language-based robot instruction following and connect it to more recent advancements driving progress in this domain.

About the Speaker

Shreya Sharma is a Research Engineer at Reality Labs, Meta, where she works on photorealistic human avatars for AR/VR applications. She holds a bachelor’s degree in Computer Science from IIT Delhi and a master’s in Robotics from Carnegie Mellon University. Shreya is also a member of the inaugural 2023 cohort of the Quad Fellowship. Her research interests lie at the intersection of robotics and vision foundation models.

Farming with CLIP: Foundation Models for Biodiversity and Agriculture

Using open-source tools, we will explore the power and limitations of foundation models in agriculture and biodiversity applications. Leveraging the BIOTROVE dataset. The largest publicly accessible biodiversity dataset curated from iNaturalist, we will showcase real-world use cases powered by vision-language models trained on 40 million captioned images. We focus on understanding zero-shot capabilities, taxonomy-aware evaluation, and data-centric curation workflows.

We will demonstrate how to visualize, filter, evaluate, and augment data at scale. This session includes practical walkthroughs on embedding visualization with CLIP, dataset slicing by taxonomic hierarchy, identification of model failure modes, and building fine-tuned pest and crop monitoring models. Attendees will gain insights into how to apply multi-modal foundation models for critical challenges in agriculture, like ecosystem monitoring in farming.

About the Speaker

Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia. During her PhD and Postdoc research, she deployed multiple low-cost, smart edge & IoT computing technologies, such as farmers, that can be operated without expertise in computer vision systems. The central objective of Paula’s research has been to develop intelligent systems/machines that can understand and recreate the visual world around us to solve real-world needs, such as those in the agricultural industry.

Multi-modal AI in Medical Edge and Client Device Computing

In this live demo, we explore the transformative potential of multi-modal AI in medical edge and client device computing, focusing on real-time inference on a local AI PC. Attendees will witness how users can upload medical images, such
as X-Rays, and ask questions about the images to the AI model. Inference is executed locally on Intel's integrated GPU and NPU using OpenVINO, enabling developers without deep AI experience to create generative AI applications.

About the Speaker

Helena Klosterman is an AI Engineer at Intel, based in the Netherlands, Helena enables organizations to unlock the potential of AI with OpenVINO, Intel's AI inference runtime. She is passionate about democratizing AI, developer experience, and bridging the gap between complex AI technology and practical applications.

The Business of AI

The talk will focus on the importance of clearly defining a specific problem and a use case, how to quantify the potential benefits of an AI solution in terms of measurable outcomes, evaluating technical feasibility in terms of technical challenges and limitations of implementing an AI solution, and envisioning the future of enterprise AI.

About the Speaker

Milica Cvetkovic is an AI engineer and consultant driving the development and deployment of production-ready AI systems for diverse organizations. Her expertise spans custom machine learning, generative AI, and AI operationalization. With degrees in mathematics and statistics, she possesses a decade of experience in education and edtech, including curriculum design and machine learning instruction for technical and non-technical audiences. Prior to Google, Milica held a data scientist role in biotechnology and has a proven track record of advising startups, demonstrating a deep understanding of AI's practical application.

Photo of Stuttgart AI, Machine Learning and Computer Vision Meetup group
Stuttgart AI, Machine Learning and Computer Vision Meetup
See more events
FREE