Ethereal

12.2K posts

@inferencegod

rain man. optimizing agentic looping. top 352 on @aster_DEX connoisseur. trading autist.

Joined December 2021
599 Following · 1.7K Followers
Pinned Tweet
Ethereal
Ethereal@inferencegod·
i don't feed my agent tasks anymore. when the backlog runs dry, it researches and invents the next feature itself, then builds it. and it polices its own work before i ever see it.

autonomy-loop v0.5.1:
→ self-feeding: empty backlog? it proposes the next feature and keeps going, no prompt from me
→ the bite: it reverts its own fix and reruns the test. stays green? it caught nothing, rejected
→ self-mutation: it mutates its own changed lines so weak tests get caught before handoff
→ circuit breaker: it parks to me instead of looping forever
→ branch protection: it can never touch prod or edit away its own gates
→ upgrading is one command: /autonomy-upgrade
→ red-teamed, 77 tests green

two terminals. a builder, and a reviewer that trusts nothing. one repo. nobody driving.

free, mit, 151 people already running it.

/plugin marketplace add github.com/inferencegod/a…
/plugin install autonomy-loop@autonomy-loop
1
1
8
3.7K
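The "bite" check from the pinned post above (revert the fix, rerun the test, reject if it stays green) can be sketched in a few lines. This is only an illustration of the idea, not the plugin's code: `fix_is_caught`, the toy `buggy_double`/`fixed_double` functions, and both example tests are hypothetical, and a real setup would revert a commit and rerun a test suite rather than swap Python callables.

```python
from typing import Callable

Impl = Callable[[int], int]
Check = Callable[[Impl], bool]


def fix_is_caught(check: Check, fixed: Impl, reverted: Impl) -> bool:
    """Accept a test only if it is green with the fix and red without it."""
    green_with_fix = check(fixed)
    green_without_fix = check(reverted)
    # A test that stays green after the fix is reverted caught nothing.
    return green_with_fix and not green_without_fix


# Hypothetical bug: an off-by-one in a doubling helper.
def buggy_double(x: int) -> int:
    return 2 * x + 1


def fixed_double(x: int) -> int:
    return 2 * x


def strong_check(impl: Impl) -> bool:
    # Pins the actual value, so it goes red when the fix is reverted.
    return impl(3) == 6


def weak_check(impl: Impl) -> bool:
    # Only checks the return type, so it stays green either way.
    return isinstance(impl(3), int)
```

Here `fix_is_caught(strong_check, fixed_double, buggy_double)` accepts the test, while the same call with `weak_check` rejects it because the test passes with or without the fix.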
Nebula
Nebula@NebulaAI·
Create any video you can imagine. - Ask your agent to make a script - Generate image stills - @ElevenLabs for VO and SFX - @grok, Veo 3.1, Seedance 2, Kling 3 and more - ffmpeg pieces it all together All within our sandbox. DM us for the prompt we used. Full ad at end👀
4
2
8
1.4K
Ethereal
Ethereal@inferencegod·
yeah you read it right, they coordinate through committed git state, not direct chat. the back-and-forth-forever thing is real for pure two-party loops, but two things stop it here. the reviewer isn’t just voting, it runs a mechanized gate (revert the fix and confirm the test goes red, coverage ratchet, patch coverage) so the verdict is pass/fail against real code, not an opinion that can ping-pong. and there’s a no-progress breaker: consecutive waves with no tree change park the loop to me instead of looping forever.

your “going too deep before feedback” point is fair though. i’ve actually watched the reviewer pull the builder back from a bad path a few times, so commit-then-review catches more than you’d think. real-time sharing getting feedback earlier is a real edge, i won’t pretend otherwise.

0.6.0 (live now) already adds a planner that grills the spec before the builder builds, basically the third perspective you’re describing, just upstream instead of a runtime vote. the coordination stuff you’re hitting is exactly what i’m building next, more coming tonight!
0
0
0
9
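The no-progress breaker mentioned in the reply above (consecutive waves with no tree change park the loop) might look roughly like this. It is a sketch under assumptions: `parked_at` is a hypothetical name, the tree state is stood in for by plain strings where a real loop would compare git tree hashes, and the threshold of 3 idle waves is a guess, since the post does not give a number.

```python
from typing import Iterable, Optional


def parked_at(tree_hashes: Iterable[str], max_idle_waves: int = 3) -> Optional[int]:
    """Return the wave index at which the loop parks, or None if it never does.

    Each item is the repository tree state after one wave (assumed to be a
    hash string). A wave is idle when its tree equals the previous wave's.
    """
    idle = 0
    previous: Optional[str] = None
    for wave, tree in enumerate(tree_hashes):
        if previous is not None and tree == previous:
            idle += 1
        else:
            idle = 0  # any change to the tree resets the counter
        if idle >= max_idle_waves:
            return wave  # hand control back to the human here
        previous = tree
    return None
```

For example, `parked_at(["a", "b", "b", "b", "b"])` parks at wave 4, after three waves in a row that left the tree unchanged, while a sequence that keeps changing never parks.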
Will Washburn
Will Washburn@willwashburn·
Very cool, will check it out. Looks like you have 1 builder and 1 reviewer and they don’t directly communicate, is that right? Does that ever get stuck for you? Or maybe I’m misunderstanding, but in my experiments I’ve seen the reviewer and builder go back and forth in perpetuity without a third vote. Found real-time context sharing to be useful in that regard as well, as opposed to one builder going down a path too deeply before getting feedback.
1
0
1
22
David K 🎹
David K 🎹@DavidKPiano·
Everyone hyping loops right now is going to absolutely lose their minds once they learn about state machines
20
8
147
7.9K
Will Washburn
Will Washburn@willwashburn·
@MatthewBerman imo loops still leave the human at the center. instead give an agent a set of high level intents and constraints, and have it write the loops and exit conditions. loop the loop, if you will. without the right guardrails it ends in slop for sure, but with them...
1
0
1
114
Matthew Berman
Matthew Berman@MatthewBerman·
My favorite loop: "Continue optimizing the code for speed. After each significant change, measure page-load performance across every page under the same repeatable test conditions. Continue until every page loads in under 50 ms." signals.forwardfuture.ai/loop-library/l…
Matthew Berman@MatthewBerman

Just launched Loop Library - a curated list of agent loops you can use right now. Find loops, submit your own, tokenmaxx!! signals.forwardfuture.ai/loop-library/

20
13
294
29.3K
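The loop quoted above has a measurable exit condition (every page under 50 ms), which is easy to express as code. The sketch below is only a simulation: `optimize_until`, `measure`, `improve`, and the `pages` timings are all hypothetical stand-ins, and the `max_rounds` cap is an added assumption so the loop cannot run forever.

```python
from typing import Callable, Dict, Optional, Tuple

Timings = Dict[str, float]  # page path -> load time in milliseconds


def optimize_until(
    measure: Callable[[], Timings],
    improve: Callable[[Timings], None],
    budget_ms: float = 50.0,
    max_rounds: int = 20,
) -> Optional[Tuple[int, Timings]]:
    """Alternate measure and improve until every page is under budget."""
    for round_no in range(max_rounds):
        timings = measure()
        if all(ms < budget_ms for ms in timings.values()):
            return round_no, timings
        improve(timings)
    return None  # gave up: the budget was not reached within max_rounds


# Simulated site; a real loop would run a repeatable page-load benchmark.
pages: Timings = {"/": 120.0, "/about": 80.0}


def measure() -> Timings:
    return dict(pages)


def improve(timings: Timings) -> None:
    # Stand-in for an optimization step: halve every page at or over budget.
    for path, ms in timings.items():
        if ms >= 50.0:
            pages[path] = ms / 2


result = optimize_until(measure, improve)
```

With these made-up numbers the loop exits on its third measurement, once "/" has dropped from 120 ms to 30 ms and "/about" from 80 ms to 40 ms.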
Ethereal
Ethereal@inferencegod·
@kevincodex dude post me up there wit u 🫪 haha congrats
0
0
2
40
Kevin
Kevin@kevincodex·
posted gitlawb in Ycomb hackernews and got some audience coming from Ycomb
12
13
115
5.6K
Ahmet Bilican
Ahmet Bilican@ahmetbilicanxyz·
@MatthewBerman What kind of tasks are you using these LLM-as-a-judge loops in? I guess it makes a huge difference which domain you work in to use this kind of judgement.
1
0
1
391
Matthew Berman
Matthew Berman@MatthewBerman·
I'm increasingly exploring using LLM-as-a-judge in loops to determine the goal. I continue to be surprised by how well it's able to get to a great end state. a few examples:
> "until it's simple enough"
> "until it's fast enough"
not everything has to be deterministically verifiable.
36
11
307
19.9K
Aanya
Aanya@xoaanya·
Programming sits on logic. Algorithms run on logic. Every AI model is logic. Machine learning is logic. Deep learning is logic. Compilers run on logic. Databases are logic. Cryptography is logic. Blockchain is logic. Data structures are logic. Optimization is logic. Networking protocols are logic. Robotics moves because of logic. Game engines run because of logic. Your entire tech stack survives on logic. You're still asking if we need logic for programming?
32
5
55
2.7K
Ethereal
Ethereal@inferencegod·
yes, and the unlock for me was pairing the judge with a deterministic gate. let it own the fuzzy goals (“simple enough”, “fast enough”), but keep a test that fails the moment correctness breaks, or the judge will confidently green-light a regression. judge for what has no right answer, gate for what does, same loop. built exactly that: an adversarial reviewer that re-runs the real gate from scratch before it's allowed to approve anything, plus a coverage floor it can't lower. x.com/inferencegod/s…
0
0
0
77
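The pairing described in the reply above (a judge for fuzzy goals, a deterministic gate for correctness, and a coverage floor that cannot be lowered) can be sketched as a single review function. Everything here is a hypothetical illustration: `Change`, `review`, and the boolean `judge_says_ok` field are invented names, with the boolean standing in for an LLM judge's verdict.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Change:
    tests_pass: bool     # result of re-running the real test suite from scratch
    coverage: float      # measured coverage, 0.0 to 1.0
    judge_says_ok: bool  # stand-in for an LLM judge's verdict on a fuzzy goal


def review(change: Change, coverage_floor: float) -> Tuple[bool, float]:
    """Return (approved, new_floor). The floor can rise but never fall."""
    # Deterministic gate first: the judge cannot override a red test
    # or a drop below the coverage floor.
    if not change.tests_pass or change.coverage < coverage_floor:
        return False, coverage_floor
    # Only then consult the judge about what has no single right answer.
    if not change.judge_says_ok:
        return False, coverage_floor
    # Ratchet: an approved change may raise the floor, never lower it.
    return True, max(coverage_floor, change.coverage)
```

Checking the gate before the judge means a confident judge cannot green-light a regression: `review(Change(False, 0.9, True), 0.8)` is rejected even though the judge approved.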
Jana
Jana@BratDotAI·
Are you using your Codex/Claude subscription to its full potential?
18
1
15
894
Anum 
Anum @anumness·
Claude's usage limits are really getting on my nerves now. I think I'm going to switch back to Codex soon.
24
0
29
2.7K
ege
ege@aegeantic·
agent writing code isn't exciting anymore; it writing the code, compiling, verifying, testing and repeating until it gets it right does
15
2
58
2.2K
Ethereal
Ethereal@inferencegod·
this is exactly it. the writing was never the hard part, the write-test-verify-repeat-until-green loop is. i built that into a claude code plugin: a builder writes code + a RED-GREEN test, an adversarial reviewer re-runs the whole gate and tears the diff apart, loops until it actually passes. won't fabricate a result either. MIT: github.com/inferencegod/a…
0
0
0
73