Details

This will be a journal club event.
Two Talks:

  1. RoPE-LIME: RoPE-Space Locality + Sparse-K Sampling for Efficient LLM Attribution
  2. Efficient AI Inference: Beyond Low-Bit Compute

Speakers

  1. Ritesh Goru and Isaac Picov (original authors of the paper)
  2. Kunle Olutomilayo, PhD

Abstract

1. Explaining closed-source Large Language Model (LLM) outputs is challenging because API access prevents gradient-based attribution, while perturbation methods are costly and noisy when they depend on regenerated text. We introduce Rotary Positional Embedding Linear Local Interpretable Model-agnostic Explanations (RoPE-LIME), an open-source extension of gSMILE that decouples reasoning from explanation: given a fixed output from a closed model, a smaller open-source surrogate computes token-level attributions from probability-based objectives (negative log-likelihood and divergence targets) under input perturbations. RoPE-LIME incorporates (i) a locality kernel based on Relaxed Word Mover’s Distance computed in RoPE embedding space for stable similarity under masking, and (ii) Sparse-K sampling, an efficient perturbation strategy that improves interaction coverage under limited budgets. Experiments on HotpotQA (sentence features) and a hand-labeled MMLU subset (word features) show that RoPE-LIME produces more informative attributions than leave-one-out sampling and improves over gSMILE while substantially reducing closed-model API calls.
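The core loop the abstract describes — perturb the input, score a fixed output with a surrogate's negative log-likelihood, and aggregate the deltas into token attributions — can be illustrated with a minimal sketch. This is not the authors' implementation: the `surrogate_nll` callable, the mask-averaging aggregation, and all parameter names here are hypothetical stand-ins, and the real method additionally uses a RoPE-space locality kernel that this toy omits.

```python
import random

def sparse_k_masks(n_tokens, k, budget, seed=0):
    """Sample perturbation masks that each drop exactly k tokens.

    Unlike leave-one-out (which only probes single-token effects),
    fixing k masked tokens per sample spreads a limited perturbation
    budget over multi-token interactions.
    """
    rng = random.Random(seed)
    masks = []
    for _ in range(budget):
        dropped = set(rng.sample(range(n_tokens), k))
        masks.append([i not in dropped for i in range(n_tokens)])
    return masks

def attribute(tokens, surrogate_nll, k=2, budget=64):
    """Toy attribution: a token's score is the average increase in the
    surrogate's NLL of the fixed output over masks that drop it.

    `surrogate_nll(kept_tokens)` is a hypothetical callable: it should
    return the open-source surrogate's negative log-likelihood of the
    (fixed) closed-model output given only the kept input tokens.
    """
    n = len(tokens)
    base = surrogate_nll(tokens)          # NLL with the full input
    scores = [0.0] * n
    counts = [0] * n
    for mask in sparse_k_masks(n, k, budget):
        kept = [t for t, keep in zip(tokens, mask) if keep]
        delta = surrogate_nll(kept) - base
        for i, keep in enumerate(mask):
            if not keep:                  # charge the delta to dropped tokens
                scores[i] += delta
                counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(scores, counts)]
```

Because the surrogate is only ever queried on perturbed inputs while the closed model's output stays fixed, the expensive API is not called inside the loop at all — which is the cost saving the abstract claims.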

2. The talk will examine the main drivers of inference efficiency, including compute, memory traffic, latency, and energy, and provide an overview of low-bit quantization methods along with their practical challenges. Kunle will also use the final few minutes to briefly share selected milestones from his lab. The discussion will draw in part on foundational work by Mark Horowitz on the energy cost of computation and a recent survey by Kai Liu et al. on low-bit model quantization.
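To make the quantization discussion concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the simplest member of the low-bit family the talk surveys. It is illustrative only: production schemes add per-channel or per-group scales, clipping/calibration, and lower bit widths, which this toy omits.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= scale * q.

    Storing q (1 byte) instead of w (4 bytes for fp32) cuts weight
    memory traffic ~4x, one of the efficiency drivers in the talk.
    """
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0               # map the largest weight to +/-127
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp weights from int8 codes."""
    return [scale * v for v in q]
```

The round-trip error per weight is bounded by half the scale step, which is why outlier weights (which inflate `max_abs`, and hence the step) are one of the practical challenges low-bit methods must handle.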

Info
Austin Deep Learning Journal Club is a group for committed machine learning practitioners and researchers alike. The group typically meets on the first Tuesday of each month to discuss research publications, usually ones that laid the foundation for ML/DL or that explore novel, promising ideas. It is also a great opportunity to showcase your implementations and get feedback from other experts.

Sponsors:
Thank you to Station Austin for sponsoring Austin Deep Learning. Station Austin is the center of gravity for entrepreneurs in Texas. They bring together the best entrepreneurs in the state and connect them with their first investors, employees, mentors, and customers.
