Aug 29 - Visual Agents Workshop Part 3: Teaching Machines to See and Click

Name: Aug 29 - Visual Agents Workshop Part 3: Teaching Machines to See and Click
Start: 2025-08-29T18:00:00+02:00
End: 2025-08-29T19:00:00+02:00

111 attendees from 44 groups hosting

Hosted By Rome AI, Machine Learning and Computer Vision Meetup


Details

Welcome to the three-part Visual Agents Workshop virtual series: your hands-on opportunity to learn about visual agents, including how they work, how to develop them, and how to fine-tune them.

Date and Time

Aug 29, 2025 at 9 AM Pacific

Register for the Zoom

Part 3: Teaching Machines to See and Click - Model Finetuning

From Foundation Models to GUI Specialists

Foundation models, such as Qwen2.5-VL, demonstrate impressive visual understanding, but they require specialized training to master GUI interactions. In this final session, you'll transform a general-purpose vision-language model into a GUI specialist that can navigate interfaces with human-like precision.

We'll explore modern fine-tuning strategies specifically designed for GUI tasks, from selecting the right architecture to handling the unique challenges of coordinate prediction and multi-step reasoning. You'll implement training pipelines that can handle the diverse formats and platforms in your dataset, evaluate models on metrics that actually matter for GUI automation, and deploy your trained model in a real-world testing environment.
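As a rough illustration of what a GUI-specific evaluation metric can look like, the sketch below scores click predictions by checking whether each predicted point lands inside the target element's bounding box. It is not taken from the workshop materials: the function names, the normalized (0 to 1) coordinate convention, and the box layout are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# Assumed conventions (hypothetical, not from the workshop):
# points are (x, y), boxes are (x_min, y_min, x_max, y_max),
# all normalized to the range 0..1 relative to the screenshot size.
Point = Tuple[float, float]
Box = Tuple[float, float, float, float]


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges inclusive)."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def click_accuracy(predictions: Sequence[Point], targets: Sequence[Box]) -> float:
    """Fraction of predicted clicks that land inside their target element."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(point_in_box(p, b) for p, b in zip(predictions, targets))
    return hits / len(predictions)


if __name__ == "__main__":
    predicted_clicks = [(0.50, 0.50), (0.90, 0.90)]
    target_boxes = [(0.40, 0.40, 0.60, 0.60), (0.00, 0.00, 0.10, 0.10)]
    print(click_accuracy(predicted_clicks, target_boxes))
```

A hit-or-miss metric like this tends to reflect whether an automated click would actually succeed, which a plain coordinate-distance error does not capture for small or oddly shaped targets.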

About the Instructor

Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG, Agents, and Multimodal AI.

Events in Artificial Intelligence

Computer Vision, Machine Learning, Data Science, Open Source

Online event

FREE