Daniel Hails
136 posts


Does anyone know how 218 tool calls with 138 messages in @conductor_build only takes up ~14% of 1M context?
I don't see any traces of auto-compaction anywhere.
Is this some hack with only keeping the latest N tool calls/messages?
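For illustration only, here is roughly what a "keep the latest N" hack could look like. I don't know what @conductor_build actually does; `Message`, `trim_context` and the stub text are made-up names for this sketch. It keeps every message in place but swaps the body of older tool results for a short placeholder.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def trim_context(messages: list[Message], keep_last: int,
                 stub: str = "[tool output elided]") -> list[Message]:
    """Return a copy where only the newest `keep_last` tool results keep their content."""
    tool_idx = [i for i, m in enumerate(messages) if m.role == "tool"]
    # A slice ending at -0 would be empty, so handle keep_last == 0 explicitly.
    stale = set(tool_idx[:-keep_last]) if keep_last > 0 else set(tool_idx)
    return [Message(m.role, stub) if i in stale else m
            for i, m in enumerate(messages)]


history = [
    Message("user", "hi"),
    Message("tool", "A" * 1000),
    Message("assistant", "ok"),
    Message("tool", "B"),
    Message("tool", "C"),
]
trimmed = trim_context(history, keep_last=2)
```

Message count stays the same, so the transcript still reads in order; only the bulky old tool output goes.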


👀 I built something similar to test dynamic workers from @Cloudflare.
Not really about SLOs, more about treating deploys as an eval loop.
Each deploy is a bet.
If it stays live and passes evals, it sticks.
If it fails on real traces, it gets rolled back.
The traces get fed to a code agent.
It takes the current working island and tries to find the next hop from there.
With good observability, you can replay those traces in a simulated environment and test changes before going live again.
So production becomes more of a control loop.
You explore, revert safely, and iterate forward.
At some point this becomes continuous. Too much to manage manually.
So you need an infra harness to run it.
I can clean it up and share it if it resonates. I built it to learn more than because I needed it myself.
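A toy sketch of the loop described above, not the actual harness. Versions are plain integers, and `propose` (standing in for the code agent) and `evaluate` (standing in for evals on real or replayed traces) are hypothetical callables supplied by the caller.

```python
from typing import Callable

Propose = Callable[[int, list[str]], int]
Evaluate = Callable[[int], tuple[bool, list[str]]]


def control_loop(known_good: int, propose: Propose, evaluate: Evaluate,
                 max_steps: int) -> tuple[int, list[int]]:
    """Each deploy is a bet: keep it if evals pass, otherwise stay on the last good version."""
    live = known_good
    failed_traces: list[str] = []
    history = [known_good]
    for _ in range(max_steps):
        # The agent starts from the current working island plus any failure traces.
        candidate = propose(live, failed_traces)
        ok, traces = evaluate(candidate)
        if ok:
            live = candidate          # the bet sticks
            failed_traces = []
            history.append(candidate)
        else:
            # Rollback: `live` is left untouched, traces feed the next attempt.
            failed_traces = failed_traces + traces
    return live, history


def toy_propose(live: int, failed: list[str]) -> int:
    return live + 1 + len(failed)


def toy_evaluate(version: int) -> tuple[bool, list[str]]:
    return version % 3 != 0, [f"v{version} failed"]


result = control_loop(1, toy_propose, toy_evaluate, max_steps=4)
```

With the toy functions, version 3 fails its evals, so the loop stays on version 2 and the next proposal skips ahead to 4.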

Has anyone built an AI-powered regression monitor for rollbacks? Like a sentinel?
Instead of running a real healthcheck of your services, you have an agent monitor post-deployment, and if it finds a regression/bug, it rolls back & tries to fix it?
So sorta like the old "known good" version monitors its newer iteration and says "LGTM"?

IOCs:
models[.]litellm[.]cloud
checkmarx[.]zone
RSA Public Key (used to encrypt stolen data before exfil):
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAvahaZDo8mucujrT15ry+ 08qNLwm3kxzFSMj84M16lmIEeQA8u1X8DGK0EmNg7m3J6C3KzFeIzvz0UTgSq6cV
...
rn3JMF0xZyXNRpQ/fZZxl40CAwEAAQ==
(4096-bit RSA — fingerprint this for threat intel sharing)
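A minimal way to fingerprint a key like this for sharing, assuming you have the full base64 body (the one above is truncated, so it won't decode as shown). `spki_sha256_fingerprint` is a name made up for this sketch; it hashes the decoded DER bytes with SHA-256 using only the standard library.

```python
import base64
import hashlib


def spki_sha256_fingerprint(b64_body: str) -> str:
    """SHA-256 hex digest of the DER bytes behind a base64 public-key body."""
    # Strip the line breaks and spaces that PEM-style wrapping introduces.
    compact = "".join(b64_body.split())
    der = base64.b64decode(compact, validate=True)
    return hashlib.sha256(der).hexdigest()
```

Hashing the decoded bytes rather than the text means the fingerprint doesn't change with line wrapping.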

@Kautukkundan @sabeshbharathi @ritam5013 @c_engines > We discussed the engineering architecture behind OpenClaw and why it kinda sucks
Say more?

Hosted lob'sided 🦞 at Conscious Engines HQ
> More than 70% of attendees set up their first OpenClaw instance
> Everything from Mac Minis, RasPi to Cloud setups
> @sabeshbharathi and "Meridian" (his molty) facilitated the session
> @ritam5013 demo'ed "Summer" (Summer >> OpenClaw). Summer has her own blog btw
> We discussed the engineering architecture behind OpenClaw and why it kinda sucks
> What it means to be proactive
> oh, and best 2 setups won lob'sided Merch





@steveruizok Write a hook - that’s the best way to block anything you don’t want.

Some more from my list who are here on twitter and some of my favourites:
@samwhoo -- samwho.dev -- load balancing, bloom filters, hashing, memory allocation
@redblobgames -- redblobgames.com -- A* pathfinding, hexagonal grids, 2D visibility
@vicapow & @lewislehe -- setosa.io/ev -- eigenvectors, markov chains, PCA
@distillpub -- distill.pub -- feature visualization, gaussian processes
@JackSchaedler -- jackschaedler.github.io -- DSP, handwriting recognition
@KuninDaniel -- seeing-theory.brown.edu -- probability & stats
@worrydream -- worrydream.com/LadderOfAbstra… -- Ladder of Abstraction
@evanliin -- tinytpu.com -- TPU architecture
@gordic_aleksa -- aleksagordic.com/blog/matmul -- GPU matmul kernels
@puddingviz -- pudding.cool -- data-driven visual essays
@ncasenmare -- ncase.me -- trust, polygons, anxiety, spaced repetition

@SIGKITTEN @cursor_ai huggingface.co/sweepai/sweep-… is the best I've come across; although the lack of harness means I've not played with it much.

@CaptAlbertoR @vincent_koc @openclaw The session_list is a config issue - there’s some incantation in the config that will let you do it - I can try and dig it up.
Do you use sandboxes?

@vincent_koc @openclaw Hey Vincent, running a 7-agent multi-business setup. From the trenches:
sessions_list only returns 1 sesh, not all across agents
No per-model thinkingDefault, only global (#38171, #11120)
openai-codex 429s don't trigger fallback chain (#24102). Annoying for multi-model setups

Looking for input / feedback for anyone developing plugins or looking to develop plugins on @openclaw
- What hooks/features are you looking for?
- What is stopping you from plugins?
- Any key pain points?
- How do you test when developing?
DM's open or drop comments here :)

A few ones:
- Ability to access the callGatewayTool
- Make it possible to override existing tools
- Ability to inject messages (e.g. Ralph /loops are super painful at a plugin level)
- Better guidance around getting types to work when developing locally - think I’ve got a fix, but feels clumsy.
- Agent end hook - needs to be before the agent fully exits the loop so it’s possible to make it continue.

@DanNeidle We can only hope for a good retrospective -- but if it is a month I agree the likelihood of exploitation goes way up.
I've seen simple exploits hang around for years though, so it's sadly possible.

@djrhails one month would be a very long time. On average it takes 15 days for a vulnerability to be exploited, and this was a much stupider-than-average vulnerability.

I'd make a bet this is related to the GOV.UK One Login - so active since 9th Feb 2026?
x.com/DanNeidle/stat…
Dan Neidle @DanNeidle
I see some weird things but this takes the biscuit. A vulnerability in the Companies House website, that let anyone view the private dashboard of any one of the five million registered companies, see directors' personal details. And modify them.

@thesophiaxu You might be interested in github.com/folk-js/allio - gets full 'DOM' for apps rather than always parsing from screenshots.
I use that for my version of this.

@nap_borntoparty @modkin_mp @realmcore_ Appreciate the response - frankly I just think you should wear it as a badge of honour (everyone's building on the shoulders of giants), rather than see it as something to obfuscate - which may be a harsh read.

@djrhails @modkin_mp @realmcore_ It's not *actually* a fork but I get what you're saying. Opencode is not upstream git wise internally. We directly use code from the opencode server for our server. Our tui and the agent are completely built from scratch.
Do you think that qualifies as a fork?

@nap_borntoparty @modkin_mp @realmcore_ Come on man - it's okay to be a fork - it's not cool to deny it (your npm is literally a wrapper over a bun compiled opencode + some extra prompts).
I could be wrong - if so point me at the real code.

@modkin_mp @realmcore_ It's not a fork! The client-server architecture is heavily inspired. The TUI and the agent are completely different

GitHub is so ripe for disruption.
1. There's no local + remote CI check feedback loop. You have to push things up to a PR for the checks to run.
2. The comment/thread UI is awful: you just end up with a massive list of comments after every agentic review/CI check.
3. Why can't I ask Copilot to "fix X, Y, Z and push up a commit" directly in the UI?
Seems like OpenAI will come out with something here. @bcherny when Claude Code Hub?

@andy_matuschak I feel it - for sure! More boon than curse; but taking time to completely check out helps.

@_buggles My partner and I call them "open tabs" -- and the mental overload of multiple open tabs on a browser is the same.
The concept of "Blips" from @andy_matuschak is worth looking into as well.