difficultyang

2.7K posts

difficultyang
@difficultyang

More social alt of @ezyang

Joined Nisan 2022
57 Following · 2.9K Followers
kalomaze @kalomaze ·
@celestepoasts there is a genre of benchmarks that tests for problems that are "difficult" in extremely shallow ways. sometimes bad abstractions are just bad abstractions
1 · 0 · 43 · 1.9K
Shivers @thinkingshivers ·
One of the mysteries of AI labs to me, is their willingness to spend billions on making SOTA models, paired with their unwillingness to spend millions building good products for the models. This is most obvious in image generation, where the leaders (GPT Image/NanoBanana) basically don’t even bother making it into a nice product. Fuck it, just throw it into the chat app.
4 · 0 · 12 · 337
tautologer @tautologer ·
Kagi Translate going semi-viral shows once again that UX > raw capabilities overhang go brrrrr
2 · 1 · 51 · 1.6K
difficultyang @difficultyang ·
New eval just dropped (this should be a good kick in the pants for me to finally write this part of the PyTorch Internals blog post...)
difficultyang tweet media
1 · 0 · 28 · 2.1K
difficultyang @difficultyang ·
finishing up a vibe coding sprint feels a bit like coming out of a forge frenzy
1 · 0 · 10 · 543
difficultyang @difficultyang ·
opus, why do you always say you prefer codex's implementation :rofl:
3 · 0 · 11 · 1.4K
difficultyang @difficultyang ·
The absolute misery of asking an LLM to do something mathy and now there are BIPARTITE GRAPHS and CONNECTED COMPONENTS and GCDS and now I have to buckle up and do math
0 · 0 · 16 · 805
difficultyang @difficultyang ·
I notice LLMs do very poorly with FSDP "wrapping" style; just completely unable to trace the flow of execution when FSDP wrappers are involved
2 · 0 · 20 · 2.3K
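The tracing difficulty the tweet describes can be sketched with a toy stand-in for the wrapper pattern. This is not real FSDP and uses no PyTorch; `ShardedWrapper`, `Linear`, and the gather/reshard logging are all made-up illustrations of how a wrapper that intercepts `forward()` inserts extra frames between the caller and the inner module, which is what makes the execution flow hard to follow:

```python
# Toy illustration of "wrapping"-style APIs (NOT real FSDP).
# A wrapper intercepts forward(), does work before and after, then
# delegates -- so every call threads through wrapper frames before
# reaching the actual computation.

class Linear:
    """A stand-in 'module' with a plain forward pass."""
    def __init__(self, scale):
        self.scale = scale

    def forward(self, x):
        return x * self.scale

class ShardedWrapper:
    """Hypothetical wrapper: logs pre/post work around the inner forward.

    In real FSDP this is roughly where parameters would be all-gathered
    and later re-sharded; here we only log, to expose the call chain.
    """
    def __init__(self, inner, log):
        self.inner = inner
        self.log = log

    def forward(self, x):
        self.log.append("gather")     # work before delegating
        out = self.inner.forward(x)   # the "real" computation
        self.log.append("reshard")    # work after delegating
        return out

log = []
# Nest wrappers the way FSDP wraps submodules: the call now passes
# through two wrapper frames before reaching Linear.forward.
model = ShardedWrapper(ShardedWrapper(Linear(3), log), log)
result = model.forward(2)
print(result)  # 6
print(log)     # ['gather', 'gather', 'reshard', 'reshard']
```

The nested log order ('gather', 'gather', ..., 'reshard', 'reshard') is the point: the actual computation is sandwiched inside wrapper bookkeeping, and with many nested wrappers the flow from call site to real `forward` is exactly what a reader (human or LLM) has to reconstruct.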
difficultyang @difficultyang ·
Usually I am the one checking the LLMs. But something I find very valuable about doing code edits through LLMs rather than doing it by hand is that the LLM can check me, when I ask for something nonsensical or incorrect!
1 · 0 · 8 · 737
difficultyang @difficultyang ·
I like this version best, IMO
difficultyang tweet media
0 · 0 · 0 · 212
difficultyang @difficultyang ·
@tmuxvim That's a really insightful question. Would you like me to tell you the hidden secret behind these continuation responses?
0 · 0 · 12 · 539
tmuxvim @tmuxvim ·
has anyone else noticed that GPT-5.4 often ends its responses with, like, clickbait? it often promises to reveal "the one surprising X that will do Y" or something like that
tmuxvim tweet media
612 · 81 · 7.1K · 418K
Edward Z. Yang @ezyang ·
A question of intense interest to me is how compilers (and more specifically, compilers for deep learning) should evolve in the era of LLM coding. 🧵
13 · 18 · 266 · 21.8K
difficultyang @difficultyang ·
@rupanshusoi @ezyang A generation of Halide-pilled PhD students, and then it turned out to be very difficult to make work in the real world 😂
1 · 0 · 0 · 53
Rupanshu Soi @rupanshusoi ·
@ezyang One thought is that scheduling languages that decouple performance from correctness are probably important going forward. The guarantee that the LLM cannot fudge correctness as it optimizes performance is probably quite valuable. (But designing such a language is hard.)
1 · 0 · 1 · 376
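The decoupling Rupanshu describes is the Halide idea: the *algorithm* defines what is computed, and a *schedule* only chooses the loop structure, so an optimizer (human or LLM) can swap schedules without being able to change the answer. A minimal sketch in plain Python, with made-up names (`algorithm`, `schedule_simple`, `schedule_tiled`) standing in for a real scheduling language:

```python
# Correctness spec: what is computed (the "algorithm").
def algorithm(a, b):
    """Elementwise product-sum of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Schedule 1: one flat loop.
def schedule_simple(a, b):
    acc = 0
    for i in range(len(a)):
        acc += a[i] * b[i]
    return acc

# Schedule 2: tiled loop (e.g. for cache locality / parallelism).
# Only the iteration order changes, never the set of operations.
def schedule_tiled(a, b, tile=4):
    acc = 0
    for t in range(0, len(a), tile):
        for i in range(t, min(t + tile, len(a))):
            acc += a[i] * b[i]
    return acc

a = list(range(10))
b = list(range(10, 20))
spec = algorithm(a, b)
# Any schedule must reproduce the spec; an LLM tuning the schedule
# cannot "fudge" correctness, only performance.
assert schedule_simple(a, b) == spec
assert schedule_tiled(a, b) == spec
```

In a real scheduling language (Halide, TVM's scheduling primitives) this separation is enforced by the language itself rather than by assertions, which is the guarantee the tweet is pointing at.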
Łukasz | Wookash Podcast @wookash_podcast ·
ok, i'm getting conflicting AI reports here. some folks say that in their tech companies, engineers are being asked to increase their token usage, while others are restricting tokens for the highest spenders. what's going on? all companies are 1k+ employees
15 · 0 · 39 · 9.7K
Edward Z. Yang @ezyang ·
(3) LLMs are non-deterministic and slow. It's not a build step: it's the process of optimizing a codebase by hand over some period of time. The output is checked into VCS. There will be friction to "recompiling" (but you will do it when the source model changes.)
2 · 1 · 17 · 1.6K
difficultyang @difficultyang ·
g-d it anthropic, too much fuckin glazing LOL
difficultyang tweet media
1 · 0 · 32 · 4.7K
difficultyang @difficultyang ·
One of the crispest articulations I have for how models have gotten better compared to last year: if the sequence of things it should do is spelled out, they will reliably do it. This includes asking the LLM to revert things it did, or asking it to do it again with the plan.
1 · 0 · 15 · 1.1K
difficultyang @difficultyang ·
@main_horse I ended up doing an italicized disclosure, probably good enough for now 😂
0 · 0 · 1 · 140
difficultyang @difficultyang ·
I have been very picky about not using LLMs to write my blog posts, but I am very tempted to use it to write some boilerplate "expository" text
2 · 0 · 14 · 1.1K