Details

This week's topic: Semi-structured natural language for LLMs

As described in Thoughtworks Technology Radar Vol. 29.

We’ve had success in various applications using a semi-structured natural language for LLMs. Structured inputs, such as a JSON document, are clear and precise and give the model an indication of the type of response being sought. Constraining the response in this way helps narrow the problem space and can produce more accurate completions, particularly when the structure conforms to a domain-specific language (DSL) whose syntax or schema is provided to the model. We’ve also found that augmenting the structured input with natural language comments or notations produces a better response than either natural language or structured input alone. Typically, natural language is simply interspersed with structured content when constructing the prompt. As with many LLM behaviors, we don’t know exactly why this works, but our experience shows that putting natural language comments in human-written code also improves the quality of output for LLM-based coding assistants.
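To make the idea concrete for the discussion, here is a minimal sketch of one way to intersperse natural language with structured content when building a prompt. It is an illustration only, not the Radar authors' method: the function names `build_prompt` and `strip_comments`, the `//` comment convention, and the bug-report template are all assumptions made up for this example.

```python
import json


def build_prompt(template: dict, notes: dict, task: str) -> str:
    """Render a JSON-like template with a natural-language comment
    above each field that has an entry in `notes` (hypothetical helper)."""
    lines = ["{"]
    items = list(template.items())
    for i, (key, value) in enumerate(items):
        if key in notes:
            # The comment is plain English guidance aimed at the model.
            lines.append(f"  // {notes[key]}")
        comma = "," if i < len(items) - 1 else ""
        lines.append(f"  {json.dumps(key)}: {json.dumps(value)}{comma}")
    lines.append("}")
    return f"{task}\n\nFill in this template:\n" + "\n".join(lines)


def strip_comments(text: str) -> str:
    """Drop the `//` comment lines so the remainder parses as plain JSON."""
    kept = [ln for ln in text.splitlines() if not ln.strip().startswith("//")]
    return "\n".join(kept)


if __name__ == "__main__":
    # Hypothetical template: structured fields plus free-text hints.
    template = {"title": "", "severity": "low | medium | high", "steps": []}
    notes = {
        "severity": "Pick 'high' only if data loss is possible.",
        "steps": "One short imperative sentence per step.",
    }
    print(build_prompt(template, notes, "Summarize the bug report below as JSON."))
```

The structure tells the model what shape of answer is wanted, while the comments carry guidance that is awkward to express in a schema. Whether this actually improves output for a given model is something to measure rather than assume.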

The Zoom link will be added about 10 minutes before the event starts.

Discussion Resources:

LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain by Emanuele Musumeci, Michele Brienza, Vincenzo Suriani, Daniele Nardi, and Domenico Daniele Bloisi
https://arxiv.org/html/2402.14871v1

Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning by Xin Su, Tiep Le, Steven Bethard, and Phillip Howard
https://arxiv.org/html/2311.08505v2

Decoding the Powerhouse: A Deep Dive into Semi-structured and Multi-modal RAG by Pankaj Pandey
https://medium.com/@pankaj_pandey/decoding-the-powerhouse-a-deep-dive-into-semi-structured-and-multi-modal-rag-845f0e015987

Semistructured Data: Challenges and Solutions by Tim Filzinger
https://konfuzio.com/en/semistructured-data/
