Omar Khattab

13.5K posts

@lateinteraction

Asst professor @MIT CSAIL @nlp_mit. Research includes https://t.co/VgyLxl0oa1, https://t.co/ZZaSzaRaZ7 (@DSPyOSS), RLMs, and GEPA. Prev: CS PhD @StanfordNLP. Research @Databricks.

Cambridge, MA · Joined December 2022
3.4K Following · 34.8K Followers
will brown @willccbb
SOTA method for variance reduction:
[image]
1 reply · 0 reposts · 6 likes · 163 views
Omar Khattab @lateinteraction
In my biased view, they are the kinds of things that help start mini-fields around new problems and new algorithmic paradigms. Stay tuned.
1 reply · 0 reposts · 20 likes · 711 views
Omar Khattab @lateinteraction
We're gearing up to release two research efforts I've been extremely excited about for quite some time. Y'all will really love these.
6 replies · 4 reposts · 72 likes · 1.6K views
Omar Khattab retweeted
ACM Conference on AI and Agentic Systems
🎤 Keynote announcement: @trq212 (Thariq Shihipar), Member of Technical Staff on Claude Code at @AnthropicAI, is keynoting #CAIS2026. Thariq's "Lessons from Building Claude Code" series on Skills, prompt caching, tool design, and "unhobbling" is required reading for anyone building agentic systems. We're thrilled to have him.
📍 San Jose · May 26–29
🔗 caisconf.org
[image]
2 replies · 18 reposts · 101 likes · 14.2K views
Omar Khattab @lateinteraction
which is the lowest I’ve seen if this is on a single-core or even few-core CPU
1 reply · 0 reposts · 7 likes · 1.6K views
John Kim @johnkimdw
I’m thrilled to share that I’ll be starting my CS PhD at @NorthwesternU this fall, advised by @ManlingLi_! I’ll be researching trustworthy AI and spatial intelligence to build reliable AI systems that are grounded in the physical world.

I’m also happy to announce that I was awarded the @NSF GRFP fellowship, which will support my PhD for 3 years! This wouldn’t have been possible without my wonderful mentors @nunompmoniz, @Meng_CS, @frank_liu_01, @NoahZiems, and countless others who’ve guided me throughout my undergrad.

And so… I guess I won’t be leaving the Midwest :)
5 replies · 4 reposts · 74 likes · 9.5K views
Omar Khattab @lateinteraction
@hxiao TTC for embeddings already has a name: late interaction :D
2 replies · 0 reposts · 32 likes · 1.8K views
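
For readers outside IR: "late interaction" is the ColBERT-style scoring scheme, where queries and documents are encoded into per-token embeddings and relevance is computed at query time as a sum of per-query-token maximum similarities (MaxSim), so extra inference-time compute buys finer-grained matching. A minimal sketch of that scoring step, with toy normalized arrays standing in for any real token-level encoder:

    import numpy as np

    def maxsim_score(query_vecs, doc_vecs):
        # Late interaction (ColBERT-style MaxSim): every query token is
        # compared against every document token; each query token keeps
        # its best match, and the per-token maxima are summed.
        sim = query_vecs @ doc_vecs.T  # (q_tokens, d_tokens) cosine sims
        return float(sim.max(axis=1).sum())

    # Toy usage with random, L2-normalized "token embeddings".
    rng = np.random.default_rng(0)
    q = rng.normal(size=(3, 8)); q /= np.linalg.norm(q, axis=1, keepdims=True)
    d = rng.normal(size=(5, 8)); d /= np.linalg.norm(d, axis=1, keepdims=True)
    print(maxsim_score(q, d))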
Han Xiao @hxiao
Another thought after ICLR is Test-Time Compute (TTC) for embedding models. My thesis: given a trained embedding model, can we improve retrieval quality by spending more compute at inference (multiple encoding rounds, conditional branching, if-else gates), all purely from the embedding geometry, training-free, with no priors and no LLM helping?

Despite Noam Brown saying small models like GPT-2 wouldn't benefit from TTC, I still wanted to explore whether embedding models can "think longer." Here, instead of generating LLM tokens, we re-encode, compare, gate, and amplify query vectors based on what the first-pass retrieval tells us.

So the agent designs embedding "programs" (each a DAG over the model's own vectors, with branches and gates) and runs them on 3 retrieval benchmarks × 2 models (jina-v5 nano & small). And here you go: some interesting TTC programs that only use the given embedding model.
[image]
3 replies · 4 reposts · 36 likes · 4.6K views
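
The kind of training-free "program" Han describes can be pictured as a gated re-querying step: encode once, retrieve, and only if the first-pass scores look weak, blend the query vector toward the centroid of its top hits and retrieve again. A minimal sketch under those assumptions; the "encode" function, the gate threshold, and the blending weight are illustrative stand-ins, not the actual programs or models from the post:

    import numpy as np

    def ttc_retrieve(query, encode, doc_vecs, k=10, gate=0.6, alpha=0.7):
        # One training-free TTC "program": encode, retrieve, and re-query
        # only when a gate on the first-pass scores fires. `encode` maps
        # text to an L2-normalized vector; it, `gate`, and `alpha` are
        # hypothetical stand-ins for illustration.
        q = encode(query)                      # first encoding round
        scores = doc_vecs @ q
        top = np.argsort(-scores)[:k]          # first-pass retrieval

        if scores[top[0]] < gate:              # if-else gate on geometry
            centroid = doc_vecs[top].mean(axis=0)
            q = alpha * q + (1 - alpha) * centroid  # amplify toward hits
            q /= np.linalg.norm(q)
            scores = doc_vecs @ q
            top = np.argsort(-scores)[:k]      # second retrieval round
        return top, scores[top]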
Omar Khattab retweeted
MIT CSAIL @MIT_CSAIL
MIT PhD student Alex Zhang (@a1zhang) explains how AI models are "mismanaged geniuses" that could take on a much wider range of tasks. Full video: tinyurl.com/bddd5vdx
5 replies · 49 reposts · 445 likes · 76.2K views
Omar Khattab retweeted
DSPy @DSPyOSS
"yo dawg, i heard you like RLMs and GEPA, so i put GEPA in your RLM so you can RLM while you GEPA"
Sam Hogan 🇺🇸 @samhogan
[quoted tweet: Sam Hogan's HALO announcement, reproduced in full in the retweet below]
11 replies · 42 reposts · 617 likes · 48.6K views
Omar Khattab retweeted
Sajjadur Rahman @subZero_saj
🥁 We are thrilled to announce the keynote speakers for DASHSys Workshop @ VLDB 2026. 🧵
1 reply · 1 repost · 9 likes · 2.3K views
Omar Khattab retweeted
Sam Hogan 🇺🇸 @samhogan
We’re introducing HALO 😇 Hierarchical Agent Loop Optimizer. HALO is an RLM-based agent optimization technique capable of recursively self-improving agents by analyzing their execution traces and suggesting changes. This work is inspired by the Mismanaged Genius Hypothesis proposed by @a1zhang and @lateinteraction earlier this month.

tl;dr: we improved performance on AppWorld (Sonnet 4.6) from 73.7 to 89.5 (+15.8) by giving HALO-RLM access to harness trace data and asking it to identify issues. The feedback from HALO surfaced failures in the harness such as hallucinated tool calls, redundant arguments in tools, refusal loops, and semantic correctness issues. Each issue mapped cleanly to a direct prompt update.

We then fed these findings into Cursor (Opus 4.6) and asked the coding agent to update the underlying harness. We repeated this trace -> HALO-RLM analysis -> code update loop until the score plateaued.

Today we’re open-sourcing the core HALO-RLM framework, evals, and data for further review.
[image]
58 replies · 122 reposts · 1.3K likes · 122.3K views
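
The loop in the post (trace -> HALO-RLM analysis -> code update, repeated until the score plateaus) has a simple skeleton. A hedged sketch, not the released framework: every callable and threshold below is a hypothetical stand-in.

    def halo_loop(run_benchmark, collect_traces, rlm_analyze, apply_patches,
                  eps=0.5, max_iters=10):
        # Skeleton of the trace -> RLM analysis -> code-update loop.
        # All four callables are assumed stand-ins:
        #   run_benchmark()        -> float score on the eval suite
        #   collect_traces()       -> execution traces from the harness
        #   rlm_analyze(traces)    -> suggested prompt/harness patches
        #   apply_patches(patches) -> applies them (e.g. via a coding agent)
        score = run_benchmark()
        for _ in range(max_iters):
            patches = rlm_analyze(collect_traces())
            if not patches:                    # nothing left to fix
                break
            apply_patches(patches)             # update the harness
            new_score = run_benchmark()
            if new_score - score < eps:        # stop once the score plateaus
                break
            score = new_score
        return score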
Omar Khattab retweeted
Pau @hugemensa
v2 for xtr-warp-rs is out, adding sharding support to the indices. The entire search pipeline has been rewritten around efficient transfers and new kernels that enable parallelization and scheduling optimizations, all while staying true to the WARP formula. Details below 👇
2 replies · 2 reposts · 15 likes · 5.7K views
Omar Khattab retweeted
Lakshya A Agrawal @LakshyAAAgrawal
Excited to share that my ICLR 2026 Oral Talk for GEPA is available on YouTube. I go deeper into why GEPA works better than prior optimization techniques and touch on many other aspects of GEPA! youtu.be/HbGah-uP1fI
[YouTube video preview]
Lakshya A Agrawal@LakshyAAAgrawal

Thrilled to present GEPA as an Oral Talk and Poster at ICLR 2026 this Friday in Rio! 🇧🇷
Apr 24: Oral Session 3A (Agents), 10:30 AM BRT, Amphitheater; Poster Session 4, 3:15 PM, Pavilion 3. x.com/LakshyAAAgrawa… Let's recap what's happened since we released GEPA last year 🧵

9 replies · 46 reposts · 240 likes · 29.4K views
Omar Khattab @lateinteraction
GEPA: GrEmlin PAreto prompt optimization?
1 reply · 0 reposts · 20 likes · 1.7K views