agusti

54.1K posts

@bleuonbase

Software Engineer.

Joined September 2013
22.6K Following · 35.6K Followers
Pinned Tweet
agusti
agusti@bleuonbase·
I'm not an ML researcher, but I got a bit nerd-sniped by OAI's new parameter-golf challenge, so I set up my pi-autoresearch loop on it, of course. I asked my clanker to research all the related papers that could help it come up with better ideas. It ended up making this Knowledge Base. It's nothing revolutionary, mostly notes and links to related papers. I hope it's useful to some: golf.agustif.com. Also, if you have any feedback or ideas, hmu.
OpenAI@OpenAI

Are you up for a challenge? openai.com/parameter-golf

24
10
392
42.4K
Rhys
Rhys@RhysSullivan·
@juemrami yeah i've got a good handle on it at this point, moreso what i was looking for is stylistic patterns, i.e. having a LiveLayer and a TestLayer being exported from the same place
1
0
8
1.5K
Rhys
Rhys@RhysSullivan·
What are some good large open source Effect codebases with established patterns in them? I've got two:
- github.com/AnswerOverflow… (effect v3)
- github.com/rhyssullivan/e… (effect v4)
They're both decent, but I want to increase the number of references I can give my agent.
34
16
439
40.8K
agusti
agusti@bleuonbase·
@JustJake AIs will bridge, won't replace
0
0
0
109
Jake
Jake@JustJake·
Very wrong, very dangerous.
You want your APIs to do the exact same thing, every time.
AI is great at many things; reproducibility is not one of them.
Naval@naval

AIs replace UIs and APIs.

211
238
5.4K
256.5K
agusti
agusti@bleuonbase·
@dboskovic it should be a fun (and productive) experience from what i've heard
0
0
0
219
agusti retweeted
David Boskovic
David Boskovic@dboskovic·
we were spending too much time "carrying water" between agents:
- here's a spec, build it
- address the pr comments
- fix the failing CI

so we made an agent for overseeing the SDLC e2e (Autobuild)

we invited 10 startups to use it last week in SF (NYC next week!)

what it does:
- plans out entire feature builds across dozens of PRs
- oversees the coding agents as they work
- babysits PRs and addresses human and agentic review
- conducts security, performance, and architectural reviews
- QAs the work and records videos of the outcomes
- monitors logs for issues after staging release
- collects ux feedback from humans and addresses it
- indexes all the concepts in your codebase
- automatically writes updates to your team about what shipped
- knows the current rollout state of features
- maintains running sandboxes with a full dev env
- dogfoods features before reporting success
- engages with you in slack as it builds
- automatically fixes reported issues
- nags you for PR reviews when needed
- optimizes your CI so it's not shit (big bottleneck for velocity)

we're planning on making this the most insane building experience for established companies, with a focus on quality/safety and human collaboration while accelerating velocity by 1-2 orders of magnitude

if you want to join us in NYC next week (Thur/Fri - May 7/8) or future workshops lmk - we're onboarding up to 50 companies at a time by helping you ship 12 weeks of roadmap in 2 days - a sort of reset on baseline velocity

no cost to attend beyond the inference you burn (you'll build a lot so not for the faint of heart)
7
20
66
8.5K
agusti
agusti@bleuonbase·
this was a really magical experience to be part of, can recommend.
6
0
10
319
agusti
agusti@bleuonbase·
@b_nnett omg this is supercool, thx for building it. been on my mind for a while how i could safely extend codex beyond one-day hacks on top that get reverted. ty Bennett for open sourcing it, will try it right away
1
0
2
2.9K
Bennett
Bennett@b_nnett·
Codex ++ | Now open source. Add your own tweaks, features, fix bugs. Anything. Here's my first tweak to add custom keyboard shortcuts:
Bennett tweet media
35
47
1.1K
858.6K
agusti
agusti@bleuonbase·
what would you do if you had unlimited gpt 5.5 for 72 hours?
4
0
8
299
agusti
agusti@bleuonbase·
the intoxicating power-trip of tasteful sponsored unlimited tokenmaxxing
1
0
2
160
agusti
agusti@bleuonbase·
Goodhart’s Law: Metrics get gamed. → Use multi-metric dashboards; rotate/refresh metrics; audit for gaming; tie to outcomes.
0
0
1
92
agusti
agusti@bleuonbase·
Conway’s Law: Architecture mirrors org silos. → Design target architecture first; align team boundaries to components/APIs.
1
0
2
68
agusti
agusti@bleuonbase·
Pareto (80/20): Chasing the last 20% burns time. → Define “good enough” exit criteria; ship at 80%; backlog the rest.
1
0
3
308