spion

29.2K posts

@spion

Fullstack SWE. ex-Apple. Prefer insightful discussion to debate. Rust, TypeScript, Effect, SolidJS, localfirst, devops, keto, stats/science, audio/DSP

London, England · Joined February 2008
1.2K Following · 1.4K Followers
spion
spion@spion·
@thdxr did LLMs cause open source software developers to re-evaluate if they should continue to publish open source?
0 replies · 0 reposts · 0 likes · 41 views
spion
spion@spion·
@onehappyfellow (Under the constraints of a non-GCed, high-performance language that is memory-safe, it's actually not bad at all. The issue is that GCs exist and are good)
1 reply · 0 reposts · 4 likes · 100 views
spion
spion@spion·
@onehappyfellow Rust is not tasteless, it's just that the constraints the language picked to start with were unfortunate.
1 reply · 0 reposts · 3 likes · 323 views
One Happy Fellow
One Happy Fellow@onehappyfellow·
rust is a good language, it's just tasteless
go is a programming language
28 replies · 7 reposts · 176 likes · 10.6K views
spion reposted
David Cramer
David Cramer@zeeg·
1) not surprising whatsoever
2) this is exactly what I keep saying about models not being powerful enough today

the fact that they can do so much with lossy compression is amazing, but there's no magic here

imo (for transformers) context windows need to be 1-2 orders of magnitude larger for the future people keep saying is reality, and even then the compute is probably not worth it
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

14 replies · 4 reposts · 103 likes · 9.8K views
Dmitrii Kovanikov
Dmitrii Kovanikov@ChShersh·
Everyone talks about scalability. Nobody talks about cpluplusability.
24 replies · 6 reposts · 166 likes · 6.8K views
spion
spion@spion·
@DanielW_Kiwi (things = code output. models were useful before that too for understanding and debugging and small tightly supervised changes)
1 reply · 0 reposts · 2 likes · 21 views
Daniel 🦔
Daniel 🦔@DanielW_Kiwi·
I'm really interested in knowing what causes this assessment vs the 1000x results others are claiming. It's very hard to understand the real differences here. Is it a difference in opinion on code quality? Is it a difference in driving the tools?
vaxry@vaxryy

All the AI talk, so I actually tried, but after 400 thousand tokens the result is pretty bad, I am writing this by hand. It will take days instead of 2 hours but at least it will work properly...

17 replies · 0 reposts · 19 likes · 2.1K views
spion
spion@spion·
@headinthebox @boleroo wishlist: language where types are versioned entities in append-only store; compiler generates bidirectional transforms between versions (you can override default), database data tagged with schema version, automatically project through transforms on read
0 replies · 0 reposts · 0 likes · 4 views
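A rough sketch of the wishlist above, in Python rather than a purpose-built language. Every name here (`SchemaLog`, `Migration`, `project`, the name-splitting example) is hypothetical. The transforms are hand-written, where the tweet imagines the compiler generating them by default.

```python
from dataclasses import dataclass
from typing import Any, Callable

Record = dict[str, Any]
Transform = Callable[[Record], Record]


@dataclass(frozen=True)
class Migration:
    """Bidirectional transform between schema version n and n + 1."""
    up: Transform    # version n     -> version n + 1
    down: Transform  # version n + 1 -> version n


class SchemaLog:
    """Append-only list of migrations; version 0 is the initial schema."""

    def __init__(self) -> None:
        self._migrations: list[Migration] = []

    def append(self, migration: Migration) -> int:
        """Register a new schema version and return its number."""
        self._migrations.append(migration)
        return len(self._migrations)

    def project(self, record: Record, stored: int, target: int) -> Record:
        """Carry a record from its stored version to the target version."""
        while stored < target:
            record = self._migrations[stored].up(record)
            stored += 1
        while stored > target:
            record = self._migrations[stored - 1].down(record)
            stored -= 1
        return record


# Hypothetical schema change: v0 has "name", v1 splits it into "first"/"last".
def split_name(r: Record) -> Record:
    first, _, last = r["name"].partition(" ")
    return {"first": first, "last": last}


def join_name(r: Record) -> Record:
    return {"name": f'{r["first"]} {r["last"]}'.strip()}


log = SchemaLog()
latest = log.append(Migration(up=split_name, down=join_name))

# Stored rows carry the schema version they were written with.
stored_row = {"schema_version": 0, "data": {"name": "Ada Lovelace"}}

# On read, project the row through the transforms to the version the reader expects.
current = log.project(stored_row["data"], stored_row["schema_version"], latest)
legacy = log.project(current, latest, 0)
```

Old rows are never rewritten in place. A reader on any version gets its own view by walking the migration chain up or down.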
spion
spion@spion·
@headinthebox @boleroo That's true, but you can't cheaply validate them unless you are the only user. And if there are other users and you come in with zero commitments, they may be less invested in committing their time too.
1 reply · 0 reposts · 1 like · 9 views
spion
spion@spion·
@headinthebox maybe you should then play for more than 10 minutes?
0 replies · 0 reposts · 2 likes · 88 views
Erik Meijer
Erik Meijer@headinthebox·
I mean, I get the Luddites who see their craft evaporate in front of their eyes. But if you played with one of the coding agents for just 10 minutes, it must be crystal clear what the future is.
13 replies · 2 reposts · 36 likes · 5.2K views
spion
spion@spion·
@zeeg brb writing complain/SKILL.md
0 replies · 0 reposts · 0 likes · 40 views
spion
spion@spion·
@zeeg Which does get me thinking. What if we define a user-agent? 😀
1 reply · 0 reposts · 0 likes · 40 views
David Cramer
David Cramer@zeeg·
where's all those billion dollar businesses built from gas town and other slop farms? oh
26 replies · 9 reposts · 359 likes · 43.8K views