Todd Fisher
@taf2
4.8K posts
Living and working on https://t.co/eminYBe5tr
Maryland, US · Joined August 2008
463 Following · 329 Followers
jason liu @jxnlco
When you gotta bike home from work but your codex needs to finish a task.
Todd Fisher @taf2
@GregKamradt So I ran an llm fine-tune over the weekend and it took many hours. Codex monitors the long-running process and can even send text message updates via CTM or any SMS-capable service. It’s really good at babysitting a long-running process.
Greg Kamradt @GregKamradt
When OAI employees say, “I let codex run all night…”, what framework do they use? How do you set up the task so it has enough work for 8 hours?
Todd Fisher @taf2
Best use case while outside walking for local AI on my phone - a personal pocket heater.
Ash DCosta @softwareweaver
@vllm_project Cool. Does this version work well with Codex? Codex CLI was giving me connection errors when using vLLM to host the Qwen 3.6 27B model, due to Responses API compatibility issues.
vLLM @vllm_project
vLLM v0.20.0 is here! 752 commits from 320 contributors (123 new). 🎉 Highlights: DeepSeek V4, Hunyuan v3 preview support, CUDA 13 / PyTorch 2.11 / Transformers v5 baseline, FA4 as default MLA prefill, TurboQuant 2-bit KV (4× capacity), vLLM IR foundation. Thread 👇
Todd Fisher retweeted
Stefan Streichsbier @s_streichsbier
As requested, I also asked GPT-5.5 (low) and Opus 4.7 (high) to fix the same bug. GPT-5.5 (low) identified the correct root cause in 4m 14s and produced an almost identical fix in 2m 47s, using 164k tokens in total. Opus 4.7 (high) “churned” for 6m 23s, using 87.7k tokens, and went down a completely wrong path. It’s not even close, but that’s not a surprise.
Stefan Streichsbier@s_streichsbier

I've completely changed my mind about 5.4 vs 5.5. Gave them the exact same task to investigate a fairly tricky bug. GPT-5.5 identified the bug and proposed a fix in 6m 59s using 117k tokens. GPT-5.4 took 8m 51s using 201k tokens, but it didn't find the bug and is asking for more information to investigate. Call me impressed.

Todd Fisher @taf2
Qwen models are good, but the censorship in these models can really cause problems with refusals on text that might be mistaken as political. Working on abliteration now to see if maybe these can be safer.
Todd Fisher @taf2
@LLMJunky Did neural net stuff in college 2001-2002, but considering where AI is now I can’t claim any real involvement until 2023, and even still I’d just consider myself a casual consumer of it.
am.will @LLMJunky
How long have you been in AI? Where the OGs at? 👇
Luce @lucyshow11
What’s up doc!!! 🤪
Tibo @thsottiaux
@raffichill Me too, me too. What a time to be alive
Raffi @raffichill
I let Codex use my computer today
Nate Berkopec @nateberkopec
If you're doing AI dev, you need to act like your system is rooted by North Korea. You cannot leave knives out in the kitchen, you cannot leave the passwords out on the counter. People are putting too much trust in alignment and not doing enough to "keep honest agents honest".
Todd Fisher retweeted
Ahmad @TheAhmadOsman
PRO TIP: vLLM telling you to use `--enforce-eager` to avoid OOM because CUDA Graphs “don’t have enough VRAM”? Don’t jump straight to eager mode. Try this first:
- lower `--max-model-len`, e.g. 4k
- let the CUDA Graph compile (which will be cached by torch.compile)
- restart, then raise context back up
You can keep the CUDA Graph performance gains without hitting OOM.
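The sequence Ahmad describes can be sketched as two `vllm serve` runs; the model name and context sizes below are illustrative, not from the thread.

```shell
# Step 1: shrink the context window so CUDA graph capture fits in VRAM.
vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 4096

# The torch.compile artifacts from this run are cached, so a restart
# can reuse them instead of recompiling from scratch.

# Step 2: restart with the full context, still without --enforce-eager.
vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 32768
```

The design idea: compilation, not steady-state serving, is the memory spike, so you pay for it once at a small context and then serve at the large one.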
Todd Fisher @taf2
They used my code to create 5.5 - I’m using it to replace itself
Todd Fisher @taf2
Doing data privacy cleanup using the new OpenAI model and set codex up with an SMS notify script via CTM to give me status updates… the codex session has been running for 20+ minutes, just cooking while I do other things.
Nandkishor @devops_nk
AWS Lambda in 40 seconds.
Todd Fisher @taf2
@TheAhmadOsman wait a second are you saying i can troll claude by mentioning HERMES.md ? 🤣
Ahmad @TheAhmadOsman
Anthropic is not a serious company lmao
Om Patel @om_patel5

THIS GUY LOST $200 IN ONE DAY BECAUSE THE STRING "HERMES.md" WAS IN HIS GIT COMMITS

HERMES.md is a real convention used in AI agent projects: it's a system prompt specification file, not some obscure edge case.

He's on Claude Max 20x at $200 a month. Yesterday Claude Code hit him with "you're out of extra usage" out of nowhere. His dashboard showed 13% weekly usage and 0% current session; 86% of his plan was sitting there untouched, but $200.98 in extra usage had already burned through what should have been covered by his subscription.

He tried logout & login, different models, fresh installs, and nothing worked. Anthropic support sent the AI bot (four rounds of the same scripted response) and eventually just gave up on him. So he started binary searching repos and commits manually on his own time until he found the trigger: the string "HERMES.md" in a recent git commit message. Uppercase, with the .md extension, anywhere in your commit history. That's it. Claude Code includes recent commits in its system prompt, and something server side flags HERMES.md and quietly routes you off your Max plan onto API rate billing.

> AGENTS.md? fine
> README.md? fine
> HERMES without .md? fine
> lowercase hermes.md? fine
> uppercase HERMES.md? you're getting charged API rates

He reported it. Anthropic support acknowledged the bug three times, called it an "authentication routing issue", thanked him for finding it, then refused to refund the $200. So the man pays $200 a month for Max, lost another $200 to a billing bug they confirmed, did Anthropic's QA work for free on his weekend, and got a "thank you for your patience" in return.

Check your commit history before Claude Code quietly drains your account too.
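The check the thread recommends — scanning your commit history for the exact case-sensitive string — is a one-liner. A minimal sketch, run from inside a clone:

```shell
# Case-sensitive search of every commit message (subject and body, all refs)
# for the exact string "HERMES.md"; prints matches, or "clean" if none found.
git log --all --format='%H %B' | grep -F 'HERMES.md' || echo "clean"
```

`grep -F` treats the pattern as a fixed string, so the `.` is not a wildcard, matching the case-and-extension-sensitive trigger described above.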