
Raymond Weitekamp
@raw_works
building tools for builders | founder @polySpectra | cofounder @cyprismaterials | cohort 1 @activatefellows @berkeleylab | PhD @caltech | AB @princeton | #rwri

BrowseComp-Plus, perhaps the hardest popular deep research task, is now solved at nearly 90%... and all it took was a 150M model ✨ Thrilled to announce that Reason-ModernColBERT did it again and outperforms all models (including models 54× bigger) on all metrics

This is how I run 5 agents concurrently in a Code Factory to write/ship 100% of our code. It uses Symphony from @alex_frantic (oss) + Codex Mac app + @linear Took me 2-3 days to set up and now it’s *cranking* github.com/openai/symphony
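The "5 agents concurrently" workflow above can be sketched with plain asyncio. This is a toy illustration only: the real setup uses Symphony + the Codex Mac app + Linear, and the agent names and the fake `run_agent` work function here are entirely made up for the sketch.

```python
import asyncio

# Hypothetical agent labels; the real factory dispatches to Codex sessions.
AGENTS = ["agent-1", "agent-2", "agent-3", "agent-4", "agent-5"]

async def run_agent(name, ticket):
    # Stand-in for handing one ticket to one coding-agent session.
    await asyncio.sleep(0.01)  # pretend the agent is working
    return f"{name} shipped {ticket}"

async def factory(tickets):
    # Fan the tickets out to the five agents and run them concurrently.
    jobs = [run_agent(a, t) for a, t in zip(AGENTS, tickets)]
    return await asyncio.gather(*jobs)

results = asyncio.run(factory([f"ticket-{i}" for i in range(5)]))
```

The point is just the shape: five independent tasks in flight at once, results gathered when all land.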

@raw_works @a1zhang @lateinteraction @badlogicgames @GeoffreyHuntley Can you clarify why your custom pi extension didn't work? Feels like you could register a custom tool + hooks to manage the jj lifecycle?


rlms (recursive language models) are wild man, seriously! gave it a 3,000-line django queryset file. asked it to find every class, categorize methods, and identify design patterns. so it started by writing the python code to slice it into chunks, called itself 9 times on the pieces, self-corrected a syntax error mid-run, and delivered a complete analysis in 5 iterations. found 13 classes, 70+ methods, 11 design patterns.

the architecture looks simple but honestly it's beautiful. how it works:

1. a python sandbox with the full doc as a context variable. the whole context just lives in a global python variable.
2. the main orchestrator llm outputs python code, and that code handles the slicing + analysis. the context splitting comes from the code itself.
3. the llm can call itself recursively on the chunks and keeps going until it's confident enough to set a final answer. the orchestrator just loops this whole thing until done.

it really looks simple but it's just a really smart way of dealing with context. no rag. no embeddings. no vector db. all it does is let the orchestrator llm be more like a programmer: an llm in a loop writing code to read what it can't fit in its context window. i'm diving more into this but it seems like a good strategy for dealing with context.
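The recursive pattern described above can be sketched in a few lines. This is a minimal toy, not the real system: `call_model` is a stub that just counts class definitions, where the real RLM has the orchestrator LLM emit arbitrary python against the context variable. `MAX_CHUNK` and the line-based splitter are assumptions for the sketch.

```python
MAX_CHUNK = 500  # assumed budget: max characters the "model" sees per call

def call_model(text):
    """Stub 'LLM'. Here it just counts class definitions in the chunk;
    the real orchestrator would write and run analysis code instead."""
    return sum(1 for line in text.splitlines()
               if line.lstrip().startswith("class "))

def rlm(context, max_chunk=MAX_CHUNK):
    # Base case: the context fits in the budget, answer directly.
    if len(context) <= max_chunk:
        return call_model(context)
    # Otherwise slice the context into line-aligned chunks...
    chunks, current = [], ""
    for line in context.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chunk:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    if len(chunks) <= 1:
        # Can't split further (one giant line); fall back to a direct call.
        return call_model(context)
    # ...recurse on each chunk and merge the sub-answers.
    return sum(rlm(chunk, max_chunk) for chunk in chunks)
```

The merge step here is a plain `sum` because the stub task is a count; for richer tasks the orchestrator's code would aggregate sub-answers however it sees fit before deciding it's confident enough to set a final answer.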

behold! on the occasion of my birthday, i've decided that i have enough coding agent session history to train a GEPA optimizer to replace myself entirely. rrm = recursive raymond model (results to follow)

GPT-5.4 Pro's overthinking is officially a feature, not a bug. If a simple 'Hi' costs you $80 in compute, the model isn't smarter—it's just inefficient. I'm sticking with Claude Opus 4.6 for logic until OpenAI figures out how to stop burning money on basic greetings. 💸