Guided Speculative Inference for Efficient Test-Time Alignment of LLMs [ONLINE]

Name: Guided Speculative Inference for Efficient Test-Time Alignment of LLMs [ONLINE]
Start: 2025-08-07T20:30:00+02:00
End: 2025-08-07T21:30:00+02:00

Hosted By

Munich N. and Tomas R.

Details

Deriving compute-efficient methods for steering LLMs toward high-reward outputs at inference time is an important line of research in test-time scaling. In this talk, I’ll introduce Guided Speculative Inference (GSI), a new algorithm that uses speculative drafts from a small auxiliary model and a reward-likelihood tilt to provably approximate the optimal reward-regularized policy of a larger model. I will begin by motivating test-time scaling and reviewing prior approaches like soft best-of-n and reward-guided speculative decoding. Then I’ll describe the GSI algorithm, its theoretical guarantees, and its strong empirical gains on reasoning benchmarks.
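
For readers who want a concrete picture of the ingredients named in the abstract, here is a minimal Python sketch of soft best-of-n selection combined with a reward-likelihood tilt. It is an illustration under stated assumptions, not the algorithm presented in the talk. It assumes the "reward-regularized policy" is the standard KL-regularized target, proportional to pi_B(y|x) * exp(beta * r(x, y)), and that drafts from the small model pi_S are reweighted by the likelihood ratio pi_B / pi_S. The function names, the `beta` parameter, and all numbers are hypothetical.

```python
import math
import random


def tilted_scores(rewards, logp_base, logp_draft, beta):
    """Assumed tilt: r + (1/beta) * (log pi_B - log pi_S) for each draft.

    Weighting drafts by exp(beta * tilted score) then equals
    (pi_B / pi_S) * exp(beta * r), the importance weight for the
    KL-regularized target when drafts come from pi_S.
    """
    return [
        r + (lb - ld) / beta
        for r, lb, ld in zip(rewards, logp_base, logp_draft)
    ]


def soft_best_of_n(candidates, scores, beta, rng=random):
    """Sample one candidate with probability proportional to exp(beta * score)."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(beta * (s - top)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical drafts from a small model, with made-up rewards and log-likelihoods.
drafts = ["draft A", "draft B", "draft C"]
rewards = [0.2, 0.9, 0.5]
logp_base = [-12.0, -15.0, -11.0]   # log pi_B(y | x), illustrative only
logp_draft = [-10.0, -14.0, -13.0]  # log pi_S(y | x), illustrative only

scores = tilted_scores(rewards, logp_base, logp_draft, beta=4.0)
choice = soft_best_of_n(drafts, scores, beta=4.0)
```

As `beta` grows, the selection approaches plain best-of-n on the tilted score; as it shrinks, the likelihood-ratio term dominates and the selection follows the larger model more closely.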

Jonathan Geuter is a PhD student in Applied Mathematics at Harvard University. He previously obtained Bachelor's and Master's degrees in Mathematics from TU Berlin.

His research interests lie in statistical machine learning, optimal transport, generative modelling, LLMs, and test-time scaling methods.

Munich🥨NLP

Online event

FREE