@amit4tek @trychroma @grok it’s funny how much money they dumped into media etc to make this bad video, just to have people not understand it bc the video and presentation are horrible 😭💀

@mikeyobrienv I posted some tips today, if you're interested, on what I was doing for the high hit rates (hermes wrote it💀)

Interesting read from Anthropic on harness design for long-running apps.
A lot of the parts they describe (loops, handoffs, evaluator separation, and runtime control) are exactly the layer ralph-orchestrator provides.
anthropic.com/engineering/ha…
github.com/mikeyobrien/ra…

@mikeyobrienv hey, you ever try deepseek-chat via the official endpoint in your ralphs? the ability to hit cache in loops is crazy. I sustained a 97% cache hit rate across like 40mil tokens of loops

@LottoLabs My latest camera app I built largely with qwen 27b. I had to finish it with Opus 4.6: while qwen was working, the Apple Log pipelines were pretty complex and new, so I tagged in Opus to close it out. Finalizing some bug fixes, then I'll post it for free to the AI community!

@Rahatcodes the hermes agent has some cool stuff you can use for this: you can use it as a communication layer and an operator that connects Claude, Codex, and the agent together and to you. additionally you can set up an inbox system w/ hooks that allows two-way comms between Claude Code and hermes

Before I go build this thing I want to know if someone has a tool for this:
When I start building a feature into a codebase I do this:
- Start planning with Claude
- Copy the plan over to codex and review
- Then some manual back and forth until me, codex, and claude agree on the plan
Ideally I'd like a terminal view that just seamlessly shares the context with both agents somehow

@imjszhang @rahulgs and whether internal mem systems that are run and retrieved by the same model running the agent can be trusted over long horizons. experimenting with external mem callable via API to pre-flight inject a kb into agents before tasks

seems obvious but:
things that are changing rapidly:
1. context windows
2. intelligence / ability to reason within context
3. performance on any given benchmark
4. cost per token
things that are not changing much:
1. humans
2. human behavior, preferences, affinities
3. tools, integrations, infrastructure
4. single core cpu performance
therefore,
ngmi:
1. "i found this method to cut 15% context"
2. "our method improves retrieval performance 10% by using hybrid search"
3. "our finetuned model is cheaper than opus at this benchmark"
4. "our harness does this better because we invented this multi agent system"
5. "we're building a memory system"
6. "context graphs"
7. "we trained an in house specialized rl model to improve task performance in X benchmark at Y% cost reduction"
wagmi:
1. product/ui
2. customer acquisition
3. integrations
4. fast linting, ci, skills, feedback for agents
5. background agent infra to parallelize more work
6. speed up your agent verification loops
7. training your users, connecting to their systems and working with their data, meeting them where they are

@imranye I just did a write-up about cache hit discounts with deepseek that's helpful for creating specific repeatable workflows, and it can help with more general agentic workflows as well: x.com/xoots1/status/…
xoots (@xoots1):
I ran 110 million tokens through the DeepSeek API in March. Autonomous agents. Research pipelines. Overnight coding sprints. 7,030 API calls. My bill was $6.84. Here’s how it worked, what breaks it, and how to set it up so you can do the same thing. 🧵
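The loop pattern behind those hit rates comes down to request shaping. Here's a minimal sketch of it, assuming an OpenAI-compatible client pointed at DeepSeek's official endpoint; the helper names (build_messages, shared_prefix_len) are mine for illustration, not part of any library. DeepSeek's automatic context caching matches on the identical leading span of a request, so everything stable goes first and everything that varies goes last.

```python
# Sketch of a cache-friendly agent loop for DeepSeek's automatic prefix caching.
# Assumption: an OpenAI-compatible client at https://api.deepseek.com with
# DEEPSEEK_API_KEY; the helpers below are illustrative names, not a real API.
import json

SYSTEM = "You are a research agent. Follow the workflow exactly."  # stable across iterations
TOOLS_DOC = "tools: search(q), read(url), write(path, text)"       # stable across iterations

def build_messages(history, task):
    """Stable content first, varying content last.

    The cache matches on the longest identical prefix of the request, so
    anything that changes between loop iterations must come after
    everything that doesn't."""
    return (
        [{"role": "system", "content": SYSTEM + "\n" + TOOLS_DOC}]
        + history                              # append-only: old turns stay byte-identical
        + [{"role": "user", "content": task}]  # only this part varies
    )

def shared_prefix_len(a, b):
    """Number of identical leading messages shared by two requests."""
    n = 0
    for x, y in zip(a, b):
        if json.dumps(x, sort_keys=True) != json.dumps(y, sort_keys=True):
            break
        n += 1
    return n
```

Per DeepSeek's docs, each response's usage object reports prompt_cache_hit_tokens and prompt_cache_miss_tokens, so you can log your hit rate per iteration; rewriting or reordering earlier turns (e.g. summarizing history mid-loop) resets the matched prefix and is what breaks the discount.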