One-Pass to Reason + The (Hyper-spherical) Dynamics of Hallucinations | Two 30-Min Talks

Details
This will be a journal club event.
Two Talks:

  1. One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning (NeurIPS 2025 Workshop) Link to Paper
  2. The (Hyper-spherical) Geometry and Dynamics of Hallucinations.
    Based on the papers:
    “A Geometric Taxonomy of Hallucination in LLMs”, Marin 2026 https://arxiv.org/pdf/2602.13224v3
    “How Transformers Reject Wrong Answers: Rotational Dynamics of Factual Constraint Processing”, Marin 2026 https://arxiv.org/abs/2603.13259
    “Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement”, Kersting et al. 2026 https://arxiv.org/pdf/2605.05103

Speakers

  1. Ritesh Goru, Member of Technical Staff at DevRev
  2. Connor Favreau, PhD, Principal Data Scientist at Central Health

Abstract

1. Fine-tuning Large Language Models (LLMs) on multi-turn reasoning datasets requires N (number of turns) separate forward passes per conversation due to reasoning token visibility constraints, as reasoning tokens for a turn are discarded in subsequent turns. We propose duplicating response tokens along with a custom attention mask to enable single-pass processing of entire conversations. We prove our method produces identical losses to the N-pass approach while reducing time complexity from O(N^3) to O(N^2) and maintaining the same memory complexity for a transformer-based model. Our approach achieves significant training speedup while preserving accuracy.
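
As a rough illustration of the idea in abstract 1, here is a minimal sketch of how a single-pass visibility mask might be built. The layout is an assumption, not taken from the abstract: each turn is laid out as prompt (h), reasoning (t), a response copy that sees the reasoning (r_in, used for the loss), and a duplicated response copy that does not (r_out, used as context for later turns). All names (build_layout, build_mask, the segment labels) are hypothetical, and position IDs and loss selection are left out.

```python
# Segment kinds in the ASSUMED single-pass layout of each turn.
H, T, R_IN, R_OUT = "h", "t", "r_in", "r_out"

# Same-turn visibility: which key kinds each query kind may attend to.
SAME_TURN_VISIBLE = {
    H: {H},
    T: {H, T},
    R_IN: {H, T, R_IN},   # response copy conditioned on the reasoning
    R_OUT: {H, R_OUT},    # duplicated response copy that never sees reasoning
}
# Earlier-turn visibility: reasoning is discarded in later turns, so only the
# prompt and the reasoning-free response copy stay visible.
EARLIER_TURN_VISIBLE = {H, R_OUT}


def build_layout(turn_lengths):
    """turn_lengths: list of (n_h, n_t, n_r) token counts per turn.

    Returns one (kind, turn, offset) tuple per token in the single-pass
    sequence; response tokens appear twice (r_in and r_out)."""
    layout = []
    for turn, (n_h, n_t, n_r) in enumerate(turn_lengths):
        for kind, n in ((H, n_h), (T, n_t), (R_IN, n_r), (R_OUT, n_r)):
            layout.extend((kind, turn, off) for off in range(n))
    return layout


def build_mask(layout):
    """mask[q][k] is True where query token q may attend to key token k."""
    n = len(layout)
    mask = [[False] * n for _ in range(n)]
    for q, (q_kind, q_turn, _) in enumerate(layout):
        for k in range(q + 1):  # never look ahead in the layout
            k_kind, k_turn, _ = layout[k]
            if k_turn < q_turn:
                mask[q][k] = k_kind in EARLIER_TURN_VISIBLE
            else:
                mask[q][k] = k_kind in SAME_TURN_VISIBLE[q_kind]
    return mask


# Two turns with (prompt, reasoning, response) lengths (2, 3, 2) and (2, 2, 1).
layout = build_layout([(2, 3, 2), (2, 2, 1)])
mask = build_mask(layout)
print(len(mask), len(mask[0]))  # 15 15
```

Under these assumptions, every prompt, reasoning, and r_in token sees exactly the tokens it would see in its own pass of the N-pass setup, which is the property the abstract's loss-equivalence claim relies on. The duplication adds only the response tokens a second time, so the sequence length stays linear in the number of turns.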

2. RAG and LLM-as-Judge systems are among the most common approaches used for hallucination mitigation and detection. However, these methods are not foolproof, often requiring additional layers of validation while lacking deterministic or directly measurable notions of semantic correctness. They can also introduce substantial computational and architectural overhead.
A growing class of hallucination detection techniques instead grounds hallucination measurement in specific geometric or dynamical properties of embedding space, derived either from the hidden representations of the model itself or from separate embedding models.
This presentation surveys these embedding-based approaches, including hyperspherical geometry, semantic regionality, trajectory-based methods, and emerging physics-inspired dynamical techniques for understanding and detecting hallucinations in large language models.

Info
Austin Deep Learning Journal Club is a group for committed machine learning practitioners and researchers alike. The group typically meets on the first Tuesday of each month to discuss research publications. The publications are usually ones that laid the foundation for ML/DL or that explore novel, promising ideas, and they are selected by vote. Participants are expected to read the publications so they can contribute to the discussion and learn from others. This is also a great opportunity to showcase your implementations and get feedback from other experts.

Sponsors:
Thank you to Station Austin for sponsoring Austin Deep Learning. Station Austin is the center of gravity for entrepreneurs in Texas. They bring together the best entrepreneurs in the state and connect them with their first investors, employees, mentors, and customers. To sign up for a Station Austin membership, click here.
