Self Verify / Self Check - Techniques for Better LLM Agents


This month (note the new date!) we revisit some of the key ideas for getting LLM Agents to improve their outputs using self-checking and verification.
"Building a Self Checking Coding Agent with LangGraph and Gemini Flash" - Sam Witteveen
In this talk Sam will look at some of the techniques used to get Agents to check themselves and their intermediate outputs to improve the final results they produce. These techniques are commonly used in papers and systems like AlphaCodium, GitHub Workspaces and Devin. Sam will also show how these ideas benefit from fast, small models like Gemini Flash.
"Cryptic Checking Systems" - Martin Andrews
Martin will describe a rather unusual research topic that he's been delving into since the beginning of the year... It may be an esoteric NLP task, but cryptic crosswords offer a rich testing ground for reasoning about word problems - and (now) for checking the output of LLMs (both large-scale, like Gemini Pro, and local-scale) using a proof-checking system.
