[PDG 434] Scaling LLM Test-Time Compute (can be More Effective than Parameters)

![[PDG 434] Scaling LLM Test-Time Compute (can be More Effective than Parameters)](https://secure.meetupstatic.com/photos/event/3/f/5/1/highres_527536209.webp?w=750)
Details
Link to article: https://arxiv.org/pdf/2408.03314
Title: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Track: Scaling Laws
Content: We study how much LLM performance can improve when the model is allowed to use additional inference-time compute. Analyzing two strategies, search guided by a verifier reward model and adaptive refinement of the model's responses, we find that their effectiveness varies with prompt difficulty. Allocating test-time compute "compute-optimally" per prompt improves efficiency by more than 4× over a best-of-N sampling baseline, and in a FLOPs-matched comparison, a smaller model augmented with test-time compute can outperform a model 14× its size.
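As a warm-up for the discussion, here is a minimal Python sketch of the best-of-N baseline that the paper's compute-optimal strategy is measured against: sample several candidate answers and keep the one a verifier scores highest. The `generate` and `verifier_score` callables are hypothetical stand-ins for an LLM sampling call and a reward model, not APIs from the paper.

```python
from typing import Callable


def best_of_n(
    prompt: str,
    n: int,
    generate: Callable[[str], str],
    verifier_score: Callable[[str, str], float],
) -> str:
    """Sample n candidate answers and return the one the verifier scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: verifier_score(prompt, c))
```

The compute-optimal strategy studied in the paper goes beyond this fixed-budget baseline by adapting the test-time strategy and budget to each prompt's estimated difficulty, which is where the reported 4×+ efficiency gain comes from.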
Slack link: ml-ka.slack.com, channel: #pdg. Please join us; if you cannot join the Slack, please message us here or at mlpaperdiscussiongroupka@gmail.com.
In the Paper Discussion Group (PDG) we discuss recent and fundamental papers in the area of machine learning on a weekly basis. If you are interested, please read the paper beforehand and join us for the discussion. If you have not fully understood the paper, you can still participate; everyone is welcome! You can join the discussion or simply listen in. The discussion is held in German or English, depending on the participants.