Daniel May

415 posts

@danielrmay

🇬🇧 in los angeles 🇺🇸 // prev @VALORANT @riotgames @amazon @bnpparibas etc.

west los angeles · Joined July 2009
663 Following · 464 Followers
Pinned Tweet
Daniel May @danielrmay
obligatory gpu pics :))
Daniel May tweet media
0 replies · 0 reposts · 1 like · 29 views
hugh madden @dangerm00se
so.. i've joined the qwen 3.5 fanclub. i'm running Qwen3.5-122B-A10B-GGUF/UD-Q6_K_XL split over an rtx 6000 300w version and a 5090.. and it's really just killing it. @sudoingX @LottoLabs hermes openclaw and opencode
1 reply · 0 reposts · 10 likes · 352 views
Daniel May @danielrmay
@dangerm00se @LottoLabs @sudoingX i run 122b @ int4 @ 70tps on 2xA6000s, fwiw. i was expecting yours to be much faster but my assumption is there's some tensor/workload parallelization cost you're eating
1 reply · 0 reposts · 1 like · 7 views
hugh madden @dangerm00se
Current numbers for the running 122B (sampled just now, 3 runs on /local122, thinking disabled for clean TTFT):
TTFT (content token): 0.149s avg (runs: 0.150s, 0.150s, 0.146s)
Generation speed: 67.43 tok/s avg (runs: 66.71, 67.15, 68.44 tok/s)
llama.cpp command line (live PID 2117750):
/home/turq/src/llama.cpp/build-dualarch/bin/llama-server -m /home/turq/models/Qwen3.5-122B-A10B-GGUF/UD-Q6_K_XL/Qwen3.5-122B-A10B-UD-Q6_K_XL-00001-of-00004.gguf --jinja --chat-template-file /home/turq/models/qwen3.5_chat_template.jinja -ngl 99 -c 262144 -np 1 -fa auto -ctk q4_0 -ctv q4_0 -sm layer -ts 3,1 --fit off --host 0.0.0.0 --port 18084
1 reply · 0 reposts · 1 like · 40 views
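The averages quoted above are easy to sanity-check. A minimal sketch (run values taken from the tweet) that recomputes the mean TTFT and generation speed over the three sampled runs:

```python
# Sanity-check of the quoted averages: three sampled runs for TTFT
# (time to first content token) and generation speed (tokens/second).
ttft_runs = [0.150, 0.150, 0.146]  # seconds
tps_runs = [66.71, 67.15, 68.44]   # tok/s

ttft_avg = sum(ttft_runs) / len(ttft_runs)
tps_avg = sum(tps_runs) / len(tps_runs)

print(f"TTFT avg: {ttft_avg:.3f}s")      # → TTFT avg: 0.149s
print(f"TPS avg: {tps_avg:.2f} tok/s")   # → TPS avg: 67.43 tok/s
```

Both reported averages match the per-run samples to the printed precision.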
Daniel May reposted
Eli @rats7
might be breaking an NDA by posting this but i got invited to the "ebay kitchen beta" and everyone needs to see this
Eli tweet media
475 replies · 2.1K reposts · 49.5K likes · 2.6M views
am.will @LLMJunky
I had a few people laughing at me because I bought an RTX 6000 Pro on credit without a plan. And you know what? They're right. It was FOMO. But there's a reason why.

Check out this email I got from Newegg when I inquired about getting some DDR5 ECC memory. These shortages are real and not going anywhere, anytime soon.

I was in a position where I could buy now and secure the best price possible, or risk watching GPUs and RAM continue to rise into unaffordability. For me, it felt like a now-or-never type of situation. Thankfully I have normal DDR5, board and CPU I can use. I have abandoned the Threadripper path. Onwards!
am.will tweet media
am.will@LLMJunky

Finally proud to announce that I've joined the GPU Minor Leagues. 2 x RTX 6000 Pro. I have six months to pay off the second GPU lol. You are all TERRIBLE influences.

23 replies · 0 reposts · 46 likes · 9K views
Daniel May @danielrmay
15-year-old constructs an AI courtroom of autonomous agents running llama3.1-8b on a 5070 Ti. Lack of inertia once again proving a lower barrier to entry than many anticipate
Daniel May tweet media
0 replies · 0 reposts · 0 likes · 26 views
Daniel May reposted
Skyler Miao @SkylerMiao7
M2.7 open weights coming in ~2 weeks. still actively iterating; just updated a new version yesterday, noticeably better on OpenClaw.
154 replies · 132 reposts · 2.2K likes · 288.4K views
shaurya @shauseth
seeing 18 yos try to grind startups is heartbreaking to me. when i was that age i would spend all day in the college library finding obscure books nobody will ever read. not a single care about what i will do with that knowledge. easily the most valuable time of my life
shaurya tweet media
50 replies · 130 reposts · 2.6K likes · 135.5K views
Daniel May @danielrmay
This is why enterprise GPUs come with best-in-class dies, strict power ratings, and improved thermal configurations. ECC and memory reliability can matter too. Thankfully for the rest of us, big enterprises cycle out this hardware regularly! (Soon to be less regular.)
Ivan Fioravanti ᯅ@ivanfioravanti

MacBook M5 Max is super powerful but under heavy load fans make a lot of noise and heat is quite high. Clearly not ok for sustained AI load (training, benchmarking).

0 replies · 0 reposts · 0 likes · 47 views
Daniel May @danielrmay
Very good isolation of the risk involved in corporate adoption when provider relationship, maturity, and internal capability aren't carefully considered
Mingta Kaivo 明塔 开沃@MingtaKaivo

Claude Code now runs scheduled tasks on cloud infra. @noahzweben just shipped it tonight.

I currently have 3 cron jobs running on AudioWave's repo:
1. Daily 6am: run pytest on the audio pipeline, post results to Slack
2. Every 4 hours: check Supabase row counts against expected thresholds
3. Weekly Monday 9am: dependency audit + PR with updates

Right now these run on a $5/mo DigitalOcean droplet I set up in January. 14 lines of crontab, a deploy key, and a shell script that breaks every time I change the project structure. If Claude Code's scheduler can point at my GitHub repo and run "check tests, alert on failure" without me maintaining infrastructure — that $5/mo droplet gets deleted tonight.

The real question: what's the token budget per scheduled run? If it's pulling from your Max plan allocation, a daily pytest + Slack notification probably costs ~2K tokens per run. That's ~60K tokens/month on one job. Three jobs = 180K. Fine on Max, brutal on Pro's 45-minute limit. Watching the pricing closely before migrating.

0 replies · 0 reposts · 0 likes · 30 views
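The token-budget arithmetic in the quoted post is easy to reproduce. A quick sketch, using the post's own assumption of ~2K tokens per run and treating each job as roughly one run per day:

```python
# Back-of-envelope token budget for scheduled agent runs, using the
# quoted post's assumption of ~2K tokens per run and ~30 runs/month
# per job (i.e. one run per day).
tokens_per_run = 2_000
runs_per_month = 30
jobs = 3

per_job = tokens_per_run * runs_per_month  # tokens/month for one daily job
total = per_job * jobs                     # tokens/month across all three

print(per_job)  # → 60000
print(total)    # → 180000
```

Note that the every-4-hours job would actually run ~6 times a day (~360K tokens/month on its own under the same per-run assumption), so the post's 180K figure is best read as a floor.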
Romy @Romy_Holland
how is it that no TSA agent has ever seen a breast pump before?
12 replies · 0 reposts · 107 likes · 2.6K views
Daniel May @danielrmay
a bit less facetiously: interacting with uis won't be replaced by agents but the construction of them will, and the barrier to making your own is going to significantly reduce too.
0 replies · 0 reposts · 0 likes · 11 views
Daniel May @danielrmay
there's a lot of confusion right now around what exactly openclaw is responsible for vs. the model. it might seem simple to familiar folks, but there are many excited consumers who have been exposed to openclaw but do not necessarily understand the ins and outs of agents or models or routing beyond simply seeing "personal private ai assistant" and wanting in.

in openclaw groups on facebook (2 are 300k strong; it's painful, it's research, meet your customers where they are) i see hundreds of the same posts per day from folks we would not colloquially put in the category of "software engineers" or even "tinkerers" complaining about token exhaustion (or worse, screenshots of them proudly spending hundreds or thousands on token costs).

these people aren't engineers but they also aren't dumb - they quickly find local models and start to tinker, but run into an extremely difficult setup experience that is only further muddled by unique hardware constraints ("this mac mini you told me to buy can't run 397B!"). the complexity of model, quant, and framework choice appears to be where most folks without real engineering or at least high-level tinkering experience drop off.

in my opinion, this group is the greatest source of local model demand (as well as the largest source of confusion about what agents can do for you) right now, and they are woefully uninformed. so the original intent was to try and better clarify that these two workloads - the execution of openclaw (and its sub agents, ig) and the execution of the model itself - do not need to be, and probably shouldn't be, shared.

i typed too much sorry lmk if that made sense
0 replies · 0 reposts · 1 like · 17 views
Zach Mueller @TheZachMueller
@danielrmay Still not 100% there, but might be too in the weeds. E.g. the bench uses an OpenClaw instance to run the suite
1 reply · 0 reposts · 0 likes · 40 views
Daniel May @danielrmay
@TheZachMueller the purpose was just to disambiguate the work that openclaw does vs. the work that the model that openclaw utilizes does, and more specifically the resourcing requirements of the two.
1 reply · 0 reposts · 0 likes · 20 views
Zach Mueller @TheZachMueller
@danielrmay Sorry for the dumb question, but what else is there? Also it benchmarks more than just functional, e.g. blog writing, email drafting, and other non-direct tooling tasks. pinchbench.com/about
1 reply · 0 reposts · 0 likes · 82 views