New benchmarks in vision-language models for real-world use. Google Research

Hosted By
Sophia A.
Details

In this talk, Yonatan Bitton, a Research Scientist at Google Research, will present VisIT-Bench and WHOOPS!, two benchmarks designed to elevate the field of vision-language models. VisIT-Bench focuses on real-world applications and includes 592 test queries spanning 70 different "instruction families." It goes beyond traditional benchmarks like VQAv2 and COCO by covering tasks ranging from simple object recognition to game playing and creative generation.

Notably, VisIT-Bench features a reference-free auto-evaluation method that aligns closely with human judgments; under this evaluation, even the current top-performing models surpass a GPT-4 reference in only 27% of cases.
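
The abstract doesn't spell out the evaluation protocol, but the core idea behind a win rate like the 27% figure, an automatic judge comparing each candidate response against a GPT-4 reference response, can be sketched as below. This is a hypothetical illustration: the `win_rate` helper and the toy judge are assumptions for exposition, not the benchmark's actual code or judge.

```python
# Minimal sketch of pairwise win-rate computation, assuming a judge
# callable that decides whether the candidate beats the reference.
# The judge here is a toy stand-in, NOT VisIT-Bench's GPT-4-based judge.
from typing import Callable, Iterable, Tuple


def win_rate(
    pairs: Iterable[Tuple[str, str, str]],
    judge: Callable[[str, str, str], bool],
) -> float:
    """Fraction of instructions where the judge prefers the
    candidate model's response over the reference response."""
    wins, total = 0, 0
    for instruction, candidate, reference in pairs:
        total += 1
        if judge(instruction, candidate, reference):
            wins += 1
    return wins / total if total else 0.0


# Toy example: a judge that naively prefers the longer answer.
pairs = [
    ("Describe the image.", "A cat on a mat.", "A tabby cat sits on a woven mat."),
    ("Is this scene unusual?", "Yes, the cat wears a hat, which is odd.", "Yes."),
]
toy_judge = lambda ins, cand, ref: len(cand) > len(ref)
print(f"win rate: {win_rate(pairs, toy_judge):.0%}")  # -> win rate: 50%
```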

Meanwhile, WHOOPS! aims to test visual commonsense through deliberately unusual images. It introduces specialized tasks and evaluates both zero-shot and end-to-end models against these challenges. Both benchmarks are designed to be dynamic and open for participation, encouraging ongoing development and evaluation in the field of vision-language models.

BuzzRobot
Online event