akira

4.8K posts

@realmcore_

Making an autonomous swe • @0xrandomlabs Incepto ne desistam Pax aeternum Memento Mori

elysium · Joined November 2021
770 Following · 9.6K Followers
Mario Zechner @badlogicgames
similarly great signals:
- managing agents is like managing a team of humans
- but i have review agents
- spec is all you need
9
7
285
7K
Josh @JoshPurtell
The story for harness opt in multi-agent settings is likely conclusive imo. Long horizon, tbd.
justin @justinsunyt

@JoshPurtell especially for multi-agent yes!! here raw intelligence !== output quality at all

3
0
6
1.4K
akira @realmcore_
@arb8020 Not necessarily true, but there's a trick to it
0
0
1
69
arb8020 @arb8020
you can refactor anything in two weeks
4
0
11
2.7K
akira @realmcore_
@tejasybhakta It's quite stable in general in how it handles problem solving. Its code is still too dense and abstracted, but mostly pretty good.
0
0
1
26
Tejas Bhakta @tejasybhakta
@realmcore_ It's the first OpenAI model I've thought is good. Concerningly good at writing kernels.
1
0
1
94
akira @realmcore_
Been a bit quiet around here. We were in fact cooking. Learned a ton about the models and automating the tools as well. Will share in due time. Needless to say, GPT 5.5 is a very very interesting model, and the shape over which autonomy works is jagged.
4
0
20
952
akira @realmcore_
What a great, great day to fight the goblins in the coding dungeon
0
0
6
332
akira reposted
Pranjali Awasthi @raidingAI
Announcing @slashyai iMessage bot. The first email client with a blue bubble. You can do anything on Slashy via text.
→ Draft & send emails in seconds
→ Get pinged the second a customer emails you
→ Schedule or reschedule anything instantly
→ Build automations that actually work
→ Send voice memos
→ Literally anything else you could want
And yes we still have a web/mobile/desktop app :) And yes we are still just $30 per month to get started
11
14
79
4.6K
arb8020 @arb8020
if i had a nickel for every time there was a large twitter presence who went to go work there to work on performance and then left within the year. i'd have two nickels. which isn't a lot. but it's weird that it happened twice.
1
0
9
1.2K
akira @realmcore_
I mean end to end dev setup actually! Certain dev setups lend themselves particularly well to specific models and, more generally, full autonomy. Ex: I notice 5.4 Xhigh has a strong bias towards intermediate type validation/transformation. If your codebase is written in a way where this is expected, then the resulting code will be much more acceptable than in a codebase where there is no validation or where it is centralized. Same for error handling patterns. Anything in particular you find 5.4 to be better for than 5.3?
0
0
0
58
Ryan Brewer @ryanbrewer
@realmcore_ We have free will for model choices and aren’t constrained to the newest one if that’s what you mean. We haven’t changed much in terms of prompting / setup
1
0
0
58
akira @realmcore_
@ryanbrewer Presumably you also have everything set up for this to be the case?
1
0
0
64
Garry Fan @VJain47
@realmcore_ If the labs had me they would’ve solved everything
1
0
2
36
akira reposted
akira @realmcore_
@Gana_L_ So that you, dear reader, can share in the black hole sun of economically viable super intelligence. (The alternative was all caps.)
0
0
0
15
Gana @Gana_L_
@realmcore_ Why are u talking like gpt 5.4?
1
0
1
37
akira @realmcore_
gpt 5.4
How do you guys get this model to not do random algorithmic garbage and just write straightforward procedural code? I do not see why it should be this hard for a model to write code like a first year college student. probably skill issue tbh
24
1
116
115.6K
akira @realmcore_
@merlindru Yeah, Opus is great at matching intent
0
0
1
54
merlin @merlindru
@realmcore_ main reason i use Opus for anything beyond very focused changes. i feel like if there's exactly one good way to do something, GPT-5.4 does beautifully. as soon as something is even slightly open ended it goes haywire. really hoping Spud fixes this (tmrw?)
1
0
0
163
akira @realmcore_
5.4 is so incredibly prone to overabstraction
5
0
38
5.4K