Daniel May

415 posts

@danielrmay

🇬🇧 in los angeles 🇺🇸 // prev @VALORANT @riotgames @amazon @bnpparibas etc.

west los angeles · Joined July 2009
663 Following · 464 Followers
Pinned Tweet
Daniel May @danielrmay
obligatory gpu pics :))
Daniel May tweet media
0 replies · 0 reposts · 1 like · 29 views
hugh madden @dangerm00se
so.. i've joined the qwen 3.5 fanclub. i'm running Qwen3.5-122B-A10B-GGUF/UD-Q6_K_XL split over an rtx 6000 300w version and a 5090.. and it's really just killing it. @sudoingX @LottoLabs hermes openclaw and opencode
1 reply · 0 reposts · 10 likes · 352 views
Daniel May @danielrmay
@dangerm00se @LottoLabs @sudoingX i run 122b @ int4 @ 70tps on 2xA6000s, fwiw. i was expecting yours to be much faster but my assumption is there's some tensor/workload parallelization cost you're eating
1 reply · 0 reposts · 1 like · 7 views
hugh madden @dangerm00se
Current numbers for the running 122B (sampled just now, 3 runs on /local122, thinking disabled for clean TTFT):
TTFT (content token): 0.149s avg (runs: 0.150s, 0.150s, 0.146s)
Generation speed: 67.43 tok/s avg (runs: 66.71, 67.15, 68.44 tok/s)
llama.cpp command line (live PID 2117750):
/home/turq/src/llama.cpp/build-dualarch/bin/llama-server -m /home/turq/models/Qwen3.5-122B-A10B-GGUF/UD-Q6_K_XL/Qwen3.5-122B-A10B-UD-Q6_K_XL-00001-of-00004.gguf --jinja --chat-template-file /home/turq/models/qwen3.5_chat_template.jinja -ngl 99 -c 262144 -np 1 -fa auto -ctk q4_0 -ctv q4_0 -sm layer -ts 3,1 --fit off --host 0.0.0.0 --port 18084
1 reply · 0 reposts · 1 like · 40 views
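The averages quoted above are easy to sanity-check. A minimal sketch (run values taken from the tweet) that recomputes the mean TTFT and generation speed over the three sampled runs:

```python
# Sanity-check of the quoted averages: three sampled runs for TTFT
# (time to first content token) and generation speed (tokens/second).
ttft_runs = [0.150, 0.150, 0.146]  # seconds
tps_runs = [66.71, 67.15, 68.44]   # tok/s

ttft_avg = sum(ttft_runs) / len(ttft_runs)
tps_avg = sum(tps_runs) / len(tps_runs)

print(f"TTFT avg: {ttft_avg:.3f}s")      # → TTFT avg: 0.149s
print(f"TPS avg: {tps_avg:.2f} tok/s")   # → TPS avg: 67.43 tok/s
```

Both reported averages match the per-run samples to the printed precision.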
Daniel May reposted
Eli @rats7
might be breaking an NDA by posting this but i got invited to the "ebay kitchen beta" and everyone needs to see this
Eli tweet media
475 replies · 2.1K reposts · 49.5K likes · 2.6M views
am.will @LLMJunky
I had a few people laughing at me because I bought an RTX 6000 Pro on credit without a plan. And you know what? They're right. It was FOMO. But there's a reason why.

Check out this email I got from Newegg when I inquired about getting some DDR5 ECC memory. These shortages are real and not going anywhere, anytime soon.

I was in a position where I could buy now and secure the best price possible, or risk watching GPUs and RAM continue to rise into unaffordability. For me, it felt like a now-or-never type of situation. Thankfully I have normal DDR5, board and CPU I can use. I have abandoned the Threadripper path. Onwards!
am.will tweet media
am.will@LLMJunky

Finally proud to announce that I've joined the GPU Minor Leagues. 2 x RTX 6000 Pro. I have six months to pay off the second GPU lol. You are all TERRIBLE influences.

23 replies · 0 reposts · 46 likes · 9K views
Daniel May @danielrmay
15-year-old constructs an AI courtroom of autonomous agents running llama3.1-8b on a 5070 Ti. Lack of inertia once again proving a lower barrier to entry than many anticipate
Daniel May tweet media
0 replies · 0 reposts · 0 likes · 26 views
Daniel May reposted
Skyler Miao @SkylerMiao7
M2.7 open weights coming in ~2 weeks. still actively iterating; just updated a new version yesterday, noticeably better on OpenClaw.
154 replies · 132 reposts · 2.2K likes · 288.4K views
shaurya @shauseth
seeing 18 yos try to grind startups is heartbreaking to me. when i was that age i would spend all day in the college library finding obscure books nobody will ever read. not a single care about what i will do with that knowledge. easily the most valuable time of my life
shaurya tweet media
50 replies · 130 reposts · 2.6K likes · 135.5K views
Daniel May @danielrmay
This is why enterprise GPUs come with best-in-class dies, strict power ratings, and improved thermal configurations. ECC and memory reliability can matter too. Thankfully for the rest of us, big enterprises cycle out this hardware regularly! (Soon to be less regular.)
Ivan Fioravanti ᯅ@ivanfioravanti

MacBook M5 Max is super powerful but under heavy load fans make a lot of noise and heat is quite high. Clearly not ok for sustained AI load (training, benchmarking).

0 replies · 0 reposts · 0 likes · 47 views
Daniel May @danielrmay
Very good isolation of the risk involved in corporate adoption when provider relationship, maturity, and internal capability aren't carefully considered
Mingta Kaivo 明塔 开沃@MingtaKaivo

Claude Code now runs scheduled tasks on cloud infra. @noahzweben just shipped it tonight.

I currently have 3 cron jobs running on AudioWave's repo:
1. Daily 6am: run pytest on the audio pipeline, post results to Slack
2. Every 4 hours: check Supabase row counts against expected thresholds
3. Weekly Monday 9am: dependency audit + PR with updates

Right now these run on a $5/mo DigitalOcean droplet I set up in January. 14 lines of crontab, a deploy key, and a shell script that breaks every time I change the project structure. If Claude Code's scheduler can point at my GitHub repo and run "check tests, alert on failure" without me maintaining infrastructure — that $5/mo droplet gets deleted tonight.

The real question: what's the token budget per scheduled run? If it's pulling from your Max plan allocation, a daily pytest + Slack notification probably costs ~2K tokens per run. That's ~60K tokens/month on one job. Three jobs = 180K. Fine on Max, brutal on Pro's 45-minute limit. Watching the pricing closely before migrating.

0 replies · 0 reposts · 0 likes · 30 views
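The token-budget arithmetic in the quoted post is easy to reproduce. A quick sketch, using the post's own assumption of ~2K tokens per run and treating each job as roughly one run per day:

```python
# Back-of-envelope token budget for scheduled agent runs, using the
# quoted post's assumption of ~2K tokens per run and ~30 runs/month
# per job (i.e. one run per day).
tokens_per_run = 2_000
runs_per_month = 30
jobs = 3

per_job = tokens_per_run * runs_per_month  # tokens/month for one daily job
total = per_job * jobs                     # tokens/month across all three

print(per_job)  # → 60000
print(total)    # → 180000
```

Note that the every-4-hours job would actually run ~6 times a day (~360K tokens/month on its own under the same per-run assumption), so the post's 180K figure is best read as a floor.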
Romy @Romy_Holland
how is it that no TSA agent has ever seen a breast pump before?
12 replies · 0 reposts · 107 likes · 2.6K views
Daniel May @danielrmay
a bit less facetiously: interacting with uis won't be replaced by agents but the construction of them will, and the barrier to making your own is going to significantly reduce too.
0 replies · 0 reposts · 0 likes · 11 views
Daniel May @danielrmay
there's a lot of confusion right now around what exactly openclaw is responsible for vs. the model. it might seem simple to familiar folks, but there are many excited consumers who have been exposed to openclaw but do not necessarily understand the ins and outs of agents or models or routing beyond simply seeing "personal private ai assistant" and wanting in.

in openclaw groups on facebook (2 are 300k strong; it's painful, it's research, meet your customers where they are) i see hundreds of the same posts per day from folks we would not colloquially put in the category of "software engineers" or even "tinkerers" complaining about token exhaustion (or worse, screenshots of them proudly spending hundreds or thousands on token costs).

these people aren't engineers but they also aren't dumb - they quickly find local models and start to tinker, but run into an extremely difficult setup experience that is only further muddled by unique hardware constraints ("this mac mini you told me to buy can't run 397B!"). the complexity of model, quant, and framework choice appears to be where most folks without real engineering or at least high-level tinkering experience drop off.

in my opinion, this group is the greatest source of local model demand (as well as the largest source of confusion about what agents can do for you) right now, and they are woefully uninformed. so the original intent was to try and better clarify that these two workloads - the execution of openclaw (and its sub agents, ig) and the execution of the model itself - do not need to be, and probably shouldn't be, shared.

i typed too much sorry lmk if that made sense
0 replies · 0 reposts · 1 like · 17 views
Zach Mueller @TheZachMueller
@danielrmay Still not 100% there, but might be too in the weeds. E.g. the bench uses an OpenClaw instance to run the suite
1 reply · 0 reposts · 0 likes · 40 views
Daniel May @danielrmay
@TheZachMueller the purpose was just to disambiguate the work that openclaw does vs. the work that the model that openclaw utilizes does, and more specifically the resourcing requirements of the two.
1 reply · 0 reposts · 0 likes · 20 views
Zach Mueller @TheZachMueller
@danielrmay Sorry for the dumb question, but what else is there? Also it benchmarks more than just functional, e.g. blog writing, email drafting, and other non-direct tooling tasks. pinchbench.com/about
1 reply · 0 reposts · 0 likes · 82 views