Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries

Name: Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries
Start: 2026-04-21T18:30:00+03:00
End: 2026-04-21T19:30:00+03:00
Location: Politehnica Business Tower

Hosted by lutu.adrian.catalin

Bucharest Deep Learning

Details

Bucharest Deep Learning is back with another exciting session! Join us for a deep dive into reasoning and RLVR benchmarking with Marius Dragoi, who will present his recent paper on evaluating the true limits of LLMs.

The Talk: Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm for improving Large Language Models on reasoning tasks. However, assessing the actual reasoning boundary of these models relies heavily on the Pass@k metric, which can be highly misleading: given a large enough number of trials, a model can land on a correct answer by random guessing alone, and Pass@k still counts that problem as solved.
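To make the guessing effect concrete, here is a minimal sketch. It assumes each sample is an independent guess that is correct with a fixed probability p; the function name and the value p = 0.05 are illustrative and not taken from the paper.

```python
def pass_at_k_from_guessing(p: float, k: int) -> float:
    """Probability that at least one of k independent samples is correct,
    when every sample is correct with the same probability p."""
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Hypothetical per-sample chance of guessing the right answer.
    p = 0.05
    for k in (1, 8, 64, 256):
        # The value climbs toward 1 as k grows, even though p never changes.
        print(f"k={k:>3}  pass@k={pass_at_k_from_guessing(p, k):.3f}")
```

Under these assumptions, a model that only guesses still approaches a perfect Pass@k score once k is large, which is the failure mode the talk addresses.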

The paper introduces Cover@τ, a novel metric that better measures the improvement given by RLVR. Cover@τ highlights a clear trade-off between the variety of problems solved and the reliability of the model's problem solving. This new approach reveals a different ranking of popular RLVR algorithms and provides a much more accurate perspective on true reasoning boundaries.
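The breadth-depth trade-off can be sketched in a few lines. This announcement does not spell out the definition, so the code below assumes one reading: Cover@τ is the fraction of problems for which at least a fraction τ of the sampled completions is correct. The function name and the toy counts are hypothetical.

```python
from typing import Sequence


def cover_at_tau(correct_counts: Sequence[int], n_samples: int, tau: float) -> float:
    """Fraction of problems whose per-problem success rate is at least tau.

    correct_counts[i] is the number of correct completions (out of n_samples)
    for problem i. This is an assumed reading of Cover@tau, not a quoted one.
    """
    if not correct_counts:
        return 0.0
    covered = sum(1 for c in correct_counts if c / n_samples >= tau)
    return covered / len(correct_counts)


if __name__ == "__main__":
    # Toy data: four problems, 64 samples each (illustrative numbers only).
    counts = [64, 32, 1, 0]
    for tau in (1 / 64, 0.5, 1.0):
        print(f"tau={tau:.3f}  cover={cover_at_tau(counts, 64, tau):.2f}")
```

With a small τ the metric rewards breadth, since a single lucky completion is enough to count a problem. With a large τ it rewards depth, since only problems the model solves reliably count.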

Logistics:

Date & Time: Tuesday, April 21 | 18:30 - 19:30
Location: FMI New Building (Politehnica Business Tower)
Address: Bulevardul Iuliu Maniu, nr. 15G, Etaj 5, Room 503
