Joe Muller
@BosonJoe
10.3K posts
Local AI enthusiast, part time philosopher
Virginia, USA · Joined April 2019
965 Following · 6.2K Followers

Pinned Tweet
Joe Muller@BosonJoe·
Brain Spawn V6!
🍴 Forking
🗺️ Plan Mode
💬 Chat History
🏷️ Labeling
Manage your agent swarm in VS Code ---->

Bryan Johnson@bryan_johnson·
This is it. Everything learned spending millions on longevity.
From: Your Immortal Unc and Auntie.
To: Our Immortal nieces and nephews.
0. Sleep is the world's most powerful drug.
1. Be in your bed for 8 hours
2. Same bedtime every night, any time before midnight
3. Don’t eat right before bed
4. Calm foods for dinner
5. No screens 1 hour before bed
6. Avoid added sugar (be aware it’s in everything)
7. Avoid all things in an American convenience store
8. Avoid fried foods
9. Shoes off at the door
10. Eat whole foods, particularly veggies fruits nuts legumes berries
11. Walk a little after meals or air squats
12. Get your heart rate high routinely
13. Lift heavy things
14. Stretch daily
15. Water pik, floss, brush, tongue scrape, morning and night
16. Make an effort to drink water
17. Get sunlight when you wake up (UV is low)
18. Protect skin in midday sun
19. Stand up straight
20. See at least one friend once a week
21. Avoid plastic where you can (in all things)
22. Circulate air in rooms
23. When stressed, breathe, learn to calm your body
24. Go to the dentist
25. Avoid sitting for long times
26. Protect your hearing, the world is too loud
27. Alcohol is bad for you
28. Finish coffee before noon
29. Avoid bright lights after sunset
30. If obese, look into a GLP
31. Sleep in a cold room
32. Texting while driving is dangerous
33. Turn off all notifications
34. Limit social media use
35. Don’t smoke anything
36. If you struggle to sleep, read a physical book before bed
37. 1 hour before bed have a calm wind down routine: bath, read, light walk, listen to music
38. The body is a clock and loves routine. Have a daily morning and evening schedule.
39. Avoid long distance travel where you can
40. Baby steps first: incorporate new things slowly
41. Do less… most things don’t work.
Bonus points if you get your blood checked. Start here, it will change your life.

ClaudeDevs@ClaudeDevs·
Usage limits are up, effective today we're:
1) Doubling Claude Code's 5-hour limits for Pro, Max, Team and seat-based Enterprise plans
2) Removing peak hours limit reduction on Claude Code for Pro and Max plans
3) Substantially raising our API rate limits for Opus models
Claude@claudeai

We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity. This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code and the Claude API.


Claude@claudeai·
Effective today, we are:
1) Doubling Claude Code’s 5-hour rate limits for Pro, Max, and Team plans;
2) Removing the peak hours limit reduction on Claude Code for Pro and Max plans; and
3) Substantially raising our API rate limits for Opus models.

Claude@claudeai·
We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity. This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code and the Claude API.

Joe Muller@BosonJoe·
@zhijianliu_ Would this work with a quant of the Gemma 4 31B model? E.g. LilaRest/gemma-4-31B-it-NVFP4-turbo. Or does it only work with the full model?

Zhijian Liu@zhijianliu_·
DFlash for Gemma 4: Up to 6x Faster. ⚡⚡ Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash — open source, same quality, more speed!! github.com/z-lab/dflash
Google for Developers@googledevs

Gemma 4: Now up to 3x Faster. ⚡ Same quality, way more speed. Our new MTP drafters allow Gemma 4 to predict multiple tokens at once, effectively tripling your output speed without compromising intelligence.

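The Gemma MTP announcement and DFlash both rest on the same idea: a cheap drafter proposes several tokens per step, and the base model verifies them, keeping only the prefix it agrees with. A toy sketch of that draft-and-verify loop, where `draft` and `verify` are made-up stand-ins for the real models (nothing below is the actual DFlash or Gemma API):

```python
# Toy draft-and-verify decoding loop. draft() and verify() are
# hypothetical stand-ins, not the real DFlash or Gemma APIs.

def draft(prefix, k=4):
    # Cheap drafter: propose the next k tokens (here, a fixed toy rule).
    return [(prefix[-1] + 1 + i) % 100 for i in range(k)]

def verify(prefix, proposed):
    # Base model checks each proposed token; accept the longest prefix
    # that matches what it would have generated itself (greedy check).
    accepted = []
    for tok in proposed:
        expected = (prefix[-1] + 1) % 100 if not accepted else (accepted[-1] + 1) % 100
        if tok != expected:
            break
        accepted.append(tok)
    # Always emit at least one token so decoding makes progress.
    return accepted or [expected]

def generate(prompt, n_tokens):
    out = list(prompt)
    while len(out) < len(prompt) + n_tokens:
        accepted = verify(out, draft(out))
        out.extend(accepted)
    return out[: len(prompt) + n_tokens]

print(generate([7], 8))  # → [7, 8, 9, 10, 11, 12, 13, 14, 15]
```

When the drafter agrees with the verifier, each step emits several tokens at once, which is where multi-x speedup claims come from; when it disagrees, the loop degrades gracefully to one verified token per step.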

Joe Muller@BosonJoe·
@sudoingX How do you give the model access to a browser?

Sudo su@sudoingX·
qwen 3.6-27b dense q4 on a single 3090 just knocked down 10 out of 10 tests at 40 tok/s on the first particle swarm benchmark i wrote for local agentic coding. it built the two files exactly to spec.

then it used browser automation tools to open the page, read the test hud, find the failing tests, iterated through the code, patched tests.js, adjusted hue mapping, boosted mouse attraction force, and landed all 10 green checkmarks. this was actual dev behavior. not just generating code but also debugging its own output.

i spent the next 8 minutes playing with the result. the boids flocking feels alive, the trail-blend is cinematic, three palettes cycle with space, mouse burst fires particles from click, drag paints a line through the swarm.

this is what local ai looks like now. single 3090. hermes agent. no frameworks. no tricks. dropping the video next.
Sudo su@sudoingX

dude! the new qwen 3.6-27b dense is hammering my single 3090 at 100% gpu utilization. the spiky pattern on nvtop is the hermes agent autonomously thinking, calling tools, reading results, thinking again. this model is so cool to talk to. waits for tool outputs, reads them, selfcorrects, keeps going. no stalls, no loops, no hand holding. anyone running a single 3090 or any 24gb tier card should try this. same llama.cpp flags from last sweep, same hermes agent install. three commands and you are watching your own hardware think.


Henry Shi@henrythe9ths·
Tax season is here and a connector is all it takes to make @claudeai way more useful. Check out what we just shipped: connect TurboTax or Aiwyn Tax (formerly Column Tax) to Claude to estimate your refund, see what you may owe, and get a better understanding of the forms before you file.

snoopy jpg@snoopy_dot_jpg·
so, i didn’t end up with an offer from , which in truth is a bit of a gut punch, but two really positive things came out of all this effort and toil:
- i received feedback that i comfortably met the technical bar and passed the interview loop. that was satisfying to hear! the decision came down to not finding a team fit, which is pretty lame tbqh
- the process has been really rewarding. it ruined my life for a few weeks, but it also activated me enough to commit a serious amount of time to upleveling myself. i’ve gotten a much deeper grounding in RL, done a lot of really fun experimentation, and gained a lot of precision in how i work through ideas. this was really cool!
overall 2/10 experience. do not recommend. but fun in a sick way

Joe Muller@BosonJoe·
@sudoingX @stableAPY How do you recommend getting those numbers? I have an RTX 5090 but no experience with real benchmarking. Do you have a go-to suite of tests for new models? Thanks!

Sudo su@sudoingX·
@stableAPY drop your numbers when you test it. curious how mamba performs on metal vs cuda.

Sudo su@sudoingX·
i pointed hermes agent at nvidia's nemotron cascade 2 30B-A3B on a single RTX 3090 24GB. IQ4_XS quant by bartowski, 187 tok/s, 625K context. had it discover its own hardware, create an identity file, then build a full GPU marketplace UI from a single prompt. it one shotted it. first attempt, no iteration.

qwen 3.5 35B-A3B on the same hardware, same 3090 24GB, took an iteration to recover from a blank screen on the same type of build. 24 days between these two models releasing. same active parameters, completely different architectures, and cascade 2 through hermes agent just keeps going. this model goes on and on. feast your eyes. more iterations and tests dropping soon.

nvidia really cooked. no special flags needed. nvidia optimized this mamba MoE so well it just runs. flash attention auto enabled, context auto allocated. the model does the work, not the config. but i compiled llama.cpp from source and i'm not sure how it performs on other engines.

if you ran nemotron on any hardware drop your numbers below. RTX, AMD, Mac, whatever. model, quant, tok/s, engine. i want to see if it holds everywhere or just on llama.cpp.
Sudo su@sudoingX

nvidia's 3B mamba destroyed alibaba's 3B deltanet on the same RTX 3090. only 24 days between releases. same active parameters, same VRAM tier, completely different architectures.
nemotron cascade 2: 187 tok/s. flat from 4K to 625K context. zero speed loss. flags: -ngl 99 -np 1. that's it. no context flags, no KV cache tricks. auto-allocates 625K.
qwen 3.5 35B-A3B: 112 tok/s. flat from 4K to 262K context. zero speed loss. flags: -ngl 99 -np 1 -c 262144 --cache-type-k q8_0 --cache-type-v q8_0. needed KV cache quantization to fit 262K.
both models held a flat line across every context level. both architectures are context-independent. but nvidia's mamba2 is 67% faster at generating tokens on the exact same hardware and needs fewer flags to get there. same node, same GPU, same everything. the only variable is the model.
gold medal math olympiad winner running at 187 tokens per second on a single RTX 3090, a card from 6 years ago. nvidia cooked.

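The quoted numbers are internally consistent, and the q8_0 cache flags have a simple size story: halving bytes per cached element halves the KV footprint at a given context. A quick arithmetic sketch; the layer and head dimensions passed to `kv_cache_bytes` are hypothetical illustration values, not either model's real config:

```python
# Sanity-check the quoted throughput gap, plus the effect of q8_0
# KV-cache quantization. The dims below are hypothetical illustration
# values, not the real model configs.

nemotron_tps, qwen_tps = 187, 112
speedup = nemotron_tps / qwen_tps - 1
print(f"{speedup:.0%} faster")  # → 67% faster, matching the quote

def kv_cache_bytes(ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # K and V each store n_layers * n_kv_heads * head_dim values per token.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * ctx

ctx = 262_144
f16 = kv_cache_bytes(ctx, n_layers=32, n_kv_heads=4, head_dim=128, bytes_per_elem=2)
q8 = kv_cache_bytes(ctx, n_layers=32, n_kv_heads=4, head_dim=128, bytes_per_elem=1)
print(f"f16 KV: {f16 / 2**30:.1f} GiB, q8_0 KV: {q8 / 2**30:.1f} GiB")  # → 16.0 vs 8.0
```

With these illustrative dims, a full-precision cache at 262K context simply wouldn't leave room on a 24GB card next to the model weights, which is why the quoted qwen run needed `--cache-type-k q8_0 --cache-type-v q8_0`.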

Joe Muller@BosonJoe·
@JohnGoldman I've been running every day for the last month and have seen similar improvements. How long are you running each day?

John Goldman ☀️@JohnGoldman·
Running consistently changes the way your heart performs. Age 50. Resting heart rate: 39. 39! Down from the 60s and 70s.

Joel - coffee/acc@JoelDeTeves·
Qwen3.5-27B-GGUF with hermes agent is the way

yutish@star_yutish·
i’m 19. i built an ai that lives in your dynamic island. reads emails. navigates. books Uber. you just talk. try it for free on the app store.

Joe Muller@BosonJoe·
Whatever the anthropic team did to Claude Code is really making my day worse 😓
Can't use ctrl+v
Can't use ctrl+c
Can't use any other keybindings while claude is in focus because it intercepts everything
Pls fix 🙏 @trq212

Joe Muller@BosonJoe·
Bad things happen when you go this fast

Joe Muller@BosonJoe·
@gabriel1 You have too many ideas to be stuck at OpenAI

gabriel@gabriel1·
on top of normal healthcare, for low stakes curiosity, i'd love a cheap clinic with all screening equipment. no doctors and no diagnoses. i can book any screening, and they send files that ai reads. and i add it to my health context so i can continue asking chatgpt questions

Joe Muller@BosonJoe·
@gabriel1 You have too many ideas to be a wage slave at OpenAI

Joe Muller retweeted
Google Research@GoogleResearch·
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
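The post doesn't explain how TurboQuant works, but the standard trick behind KV-cache compression is storing cached keys and values as low-bit integers plus a per-token scale. A minimal int8 absmax round-trip sketch, illustrative only and not TurboQuant's actual algorithm (the blog credits that with higher ratios and zero accuracy loss):

```python
import numpy as np

# Per-row absmax int8 quantization of a KV-cache tensor: ~4x smaller
# than f32 storage. Illustrative only; not TurboQuant's algorithm.

def quantize(kv):
    # One scale per row (cached token), chosen so the row max maps to 127.
    scale = np.abs(kv).max(axis=-1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid divide-by-zero on all-zero rows
    q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
kv = rng.standard_normal((16, 128)).astype(np.float32)  # 16 cached tokens
q, scale = quantize(kv)
recon = dequantize(q, scale)

ratio = kv.nbytes / (q.nbytes + scale.nbytes)
err = np.abs(kv - recon).max()
print(f"compression {ratio:.1f}x, max abs error {err:.4f}")
```

Lower bit widths push the ratio higher at the cost of more reconstruction error, which is where more sophisticated schemes like the one in the blog come in.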

Joe Muller@BosonJoe·
@giladvdn I imagine it was unintentional and it's probably related to the fancy stuff they've been doing with pasting images. Regardless, my hotkeys all no longer work.

Joe Muller@BosonJoe·
Claude Code just broke the basic paste functionality. Now you can't use Cmd+V. You have to right-click and select paste 😑