Ethereal

12.2K posts

@inferencegod

rain man. optimizing agentic looping. top 352 on @aster_DEX connoisseur. trading autist.

Joined December 2021
591 Following · 1.7K Followers
Pinned Tweet
Ethereal
Ethereal@inferencegod·
i don't feed my agent tasks anymore. when the backlog runs dry, it researches and invents the next feature itself, then builds it. and it polices its own work before i ever see it.

autonomy-loop v0.5.1:
→ self-feeding: empty backlog? it proposes the next feature and keeps going, no prompt from me
→ the bite: it reverts its own fix and reruns the test. stays green? it caught nothing, rejected
→ self-mutation: it mutates its own changed lines so weak tests get caught before handoff
→ circuit breaker: it parks to me instead of looping forever
→ branch protection: it can never touch prod or edit away its own gates
→ upgrading is one command: /autonomy-upgrade
→ red-teamed, 77 tests green

two terminals. a builder, and a reviewer that trusts nothing. one repo. nobody driving.

free, mit, 151 people already running it.

/plugin marketplace add github.com/inferencegod/a…
/plugin install autonomy-loop@autonomy-loop
English
0
0
3
495
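The "bite" and "self-mutation" checks described in the pinned tweet can be sketched roughly as follows. This is an illustration only, not autonomy-loop's actual code: every name here (`bite_check`, `surviving_mutants`, `demo_runner`, the `MUTATIONS` table) is hypothetical, and the test suite is modelled as a plain callable that takes source text and returns True when green.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical interface: a test runner takes source text, returns True if green.
RunTests = Callable[[str], bool]


@dataclass(frozen=True)
class Verdict:
    accepted: bool
    reason: str


def bite_check(fixed_src: str, reverted_src: str, run_tests: RunTests) -> Verdict:
    """Accept a fix only if tests pass with it and fail without it."""
    if not run_tests(fixed_src):
        return Verdict(False, "tests fail with the fix applied")
    if run_tests(reverted_src):
        return Verdict(False, "tests stay green after revert: they caught nothing")
    return Verdict(True, "tests go red when the fix is reverted")


# Deliberately tiny mutation table (assumption: real tools use many more operators).
MUTATIONS = (("==", "!="), ("<=", ">"), ("+", "-"))


def surviving_mutants(lines: list[str], changed: Iterable[int], run_tests: RunTests) -> list[int]:
    """Return 0-based indices of changed lines whose mutant still passes the tests."""
    survivors = []
    for idx in changed:
        for old, new in MUTATIONS:
            if old not in lines[idx]:
                continue
            mutant = lines.copy()
            mutant[idx] = lines[idx].replace(old, new, 1)
            if run_tests("\n".join(mutant)):
                survivors.append(idx)  # tests did not notice the mutation
            break  # one mutation per line keeps the sketch short
    return survivors


def demo_runner(src: str) -> bool:
    """Toy suite: exec the source and check one behaviour of add()."""
    scope: dict = {}
    try:
        exec(src, scope)
        return scope["add"](2, 3) == 5
    except Exception:
        return False


FIXED = "def add(a, b):\n    return a + b"
BROKEN = "def add(a, b):\n    return a"

if __name__ == "__main__":
    print(bite_check(FIXED, BROKEN, demo_runner))
    print(surviving_mutants(FIXED.split("\n"), [1], demo_runner))
```

The design point in both checks is the same: a green run only counts as evidence if the same suite goes red when the change is removed or damaged.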
Ethereal
Ethereal@inferencegod·
no, thank you! you pointed right at why i built a portable gate. the verification isn’t claude-specific, it’s just logic on a diff. so i pulled it into a standalone binary that runs on any agent. cursor, copilot, codex. it reverts your fix to prove the test actually catches it, ratchets coverage so it can only go up, and checks every changed line is tested. green or it fails. open source drops tonight :-)
English
0
0
1
16
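Two of the gate rules in the reply above, the coverage ratchet and the "every changed line is tested" check, reduce to small set and comparison logic on a diff. The sketch below is a guess at that logic under stated assumptions, not the standalone binary itself: the function names are hypothetical, and changed and covered lines are assumed to arrive as per-file sets of line numbers.

```python
def ratchet(previous_floor: float, measured: float) -> float:
    """Return the new coverage floor, or raise if coverage dropped."""
    if measured < previous_floor:
        raise ValueError(
            f"coverage fell from {previous_floor:.1f}% to {measured:.1f}%"
        )
    return measured  # the floor only ever moves up


def untested_changes(
    changed: dict[str, set[int]], covered: dict[str, set[int]]
) -> dict[str, set[int]]:
    """Map each file to the changed lines that no test executed."""
    gaps = {}
    for path, lines in changed.items():
        missing = lines - covered.get(path, set())
        if missing:
            gaps[path] = missing
    return gaps


def gate(
    previous_floor: float,
    measured: float,
    changed: dict[str, set[int]],
    covered: dict[str, set[int]],
) -> float:
    """Green or it fails: raise on any gap, otherwise return the new floor."""
    gaps = untested_changes(changed, covered)
    if gaps:
        raise ValueError(f"changed lines without tests: {gaps}")
    return ratchet(previous_floor, measured)
```

For example, `gate(80.0, 82.5, {"a.py": {1, 3}}, {"a.py": {1, 2, 3}})` returns the new floor, while dropping line 3 from the covered set makes it raise.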
Ethereal
Ethereal@inferencegod·
yeah you’re right, subagents do get a genuinely fresh context, isolated tools, even worktree isolation and their own hooks. i was drawing the line in the wrong place. the distinction i actually mean is single-session vs separate sessions. a subagent, even a fork, is spawned and judged by the same parent agent in one run. mine are two independent claude processes with no shared parent deciding the verdict, they only see committed git state. the docs kind of point at this too, they send you to agent teams or background agents for cross-session stuff rather than subagents. honestly for a lot of setups subagents would do the job. i went heavier because i wanted the reviewer to be a process the builder cannot influence at all. thank you for the context!
English
1
0
1
15
Ethereal
Ethereal@inferencegod·
subagents share the parent’s context and run inside the same session, so the reviewer is still kind of grading its own homework. two terminals are two independent claude processes that can’t see each other’s reasoning, only the committed git state. the reviewer re-runs the gate from scratch and reverts the builder’s fix to confirm the test catches it. the separation is the point. you can’t red-team yourself in the same context window. it also means a crash in one doesn’t take the other down, and the whole handoff is just git. hope this helps
English
1
0
1
38
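One way to give a reviewer process "only the committed git state", as the two replies above describe, is to materialise a single commit into a detached worktree and review from there. The sketch below is an assumption about how that could be wired up, not how autonomy-loop does it: `review_commands` and `run_all` are hypothetical helpers, while `git rev-parse --verify` and `git worktree add --detach` are standard git commands.

```python
import subprocess
from pathlib import Path


def review_commands(repo: Path, commit: str, scratch: Path) -> list[list[str]]:
    """Git commands that give the reviewer a clean copy of one commit."""
    return [
        # Check that the ref names a real commit before reviewing anything.
        ["git", "-C", str(repo), "rev-parse", "--verify", f"{commit}^{{commit}}"],
        # Detached worktree: committed files only, none of the builder's
        # uncommitted edits or working-directory state.
        ["git", "-C", str(repo), "worktree", "add", "--detach", str(scratch), commit],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Because the handoff is just a commit hash, the reviewer never needs access to the builder's session, only to the repository.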
Matthew Schrager
Matthew Schrager@MatthewSchrager·
My current workflow is a /grill-to-goal skill based on @mattpocockuk’s /grill-with-docs that basically interviews you to produce detailed documentation about your feature, with clear acceptance criteria etc., along with a goal-ready prompt that references that documentation. Then just call /goal with that prompt. Works very nicely in my experience.
English
3
0
15
555
Peter Yang
Peter Yang@petergyang·
So I have Codex running on a /goal and it's been working for 2 hours but the problem is it's making a lot of wrong assumptions so I have to monitor and steer it constantly. Is this expected? Perhaps I should've had it make a detailed plan first?
English
29
1
43
6.2K
Ethereal
Ethereal@inferencegod·
the 2-hour-of-wrong-assumptions thing is the exact problem i built around. two issues stacked: nothing’s checking the assumptions, and there’s no second set of eyes. so i run a builder and an adversarial reviewer. the reviewer re-runs everything and reverts the builder’s own fix to confirm the test actually catches it. a green test that proves nothing gets thrown out. and when the task queue is ambiguous it researches and writes the plan first instead of charging in. you stop steering because the second agent is doing the steering.
English
0
0
0
80
Tim Tiefenbach
Tim Tiefenbach@TimTeaFan·
@gauravvohra This! Over like 500k the session limit burns down like nothing even if it’s just a short question regarding something earlier in the conversation.
English
0
0
0
68
Gaurav Vohra
Gaurav Vohra@gauravvohra·
That one long running Claude conversation that nukes all your limits every time you come back to it
English
4
0
20
1.1K
gk
gk@gkneuman·
Opus 4.8 is absolutely terrible. I do not understand how Anthropic hasn't said anything about Fable 5 yet. Is it coming back or should I cancel my subscription? @claudeai
English
38
5
227
21.6K
Elliot Arledge
Elliot Arledge@elliotarledge·
prompting opus 4.8 the same way i prompt fable has yielded interesting results. less vibes and more surgery iykyk
English
2
0
15
1.2K
3miry
3miry@3mireeee·
Idk why they don’t just give Mythos explicitly for cyber defence to US critical businesses and the military first, so Mythos can fix their shit, then have a phased rollout to smaller business and consumers after that. It just needs proper social planning so the walls are up first.
English
3
0
2
482
Conor Dart
Conor Dart@Conor_D_Dart·
Just in - Anthropic may have been given an impossible task. Reports suggest the White House wants Fable 5 to be impossible to jailbreak before it can be rereleased. ----  So in my opinion, I think this may cause more friction between the US government and Anthropic, but at least it's educating the government on the realities of how AI operates, and maybe this could cause something good to come out.
English
18
3
106
7.5K
corbin
corbin@corbin_braun·
if your cloud agent can test your software end to end in its VM you have reached end game. build infinite.
English
4
2
14
959
Pedro Domingos
Pedro Domingos@pmddomingos·
AI + Work from home = Pretend to work from home
English
18
3
73
4.7K
Omar
Omar@omarvvvr·
What’s the fastest way to stop wasting time?
English
19
0
14
498
Ethereal
Ethereal@inferencegod·
@warrioraashuu code is just language bro. being a founder is using your verbosity for net gain
English
0
0
0
2
aashuu ✦
aashuu ✦@warrioraashuu·
If AI wrote almost all your code, what actually makes you the founder?
English
39
1
14
1.4K
Thomas Trimoreau
Thomas Trimoreau@TTrimoreau·
AI writes the copy. AI writes the code. AI sketches the UI. What skill stays valuable now?
English
25
0
16
909
Jana
Jana@BratDotAI·
Are you using your Codex/Claude subscription to its full potential?
English
15
0
13
591
Rohan
Rohan@proxy_vector·
@BratDotAI Full potential is usually blocked by workflow design, not model quality. Most people still use these tools as faster autocomplete instead of handing them real context, constraints, and a review loop.
English
2
0
1
17
Justin Hammon
Justin Hammon@justinhammon_·
@BratDotAI Definitely trying! Multiple terminal windows open. Planning them all and then doing implementations when the 5 hour window resets lol
English
2
0
1
56