Joe Muller

10.3K posts

@BosonJoe

Local AI enthusiast, part time philosopher

Virginia, USA · Joined April 2019

965 Following · 6.2K Followers

Pinned Tweet

Joe Muller@BosonJoe·6 Mar

Brain Spawn V6!
🍴 Forking
🗺️ Plan Mode
💬 Chat History
🏷️ Labeling
Manage your agent swarm in VS Code ---->

1.9K

Joe Muller@BosonJoe·1d

@bryan_johnson No sauna, this ain't the real Bryn

100

Bryan Johnson@bryan_johnson·2d

This is it. Everything learned spending millions on longevity.
From: Your Immortal Unc and Auntie.
To: Our Immortal nieces and nephews.
0. Sleep is the world's most powerful drug.
1. Be in your bed for 8 hours
2. Same bedtime every night, any time before midnight
3. Don’t eat right before bed
4. Calm foods for dinner
5. No screens 1 hour before bed
6. Avoid added sugar (be aware it’s in everything)
7. Avoid all things in an American convenience store
8. Avoid fried foods
9. Shoes off at the door
10. Eat whole foods, particularly veggies fruits nuts legumes berries
11. Walk a little after meals or air squats
12. Get your heart rate high routinely
13. Lift heavy things
14. Stretch daily
15. Water pik, floss, brush, tongue scrape, morning and night
16. Make an effort to drink water
17. Get sunlight when you wake up (UV is low)
18. Protect skin in midday sun
19. Stand up straight
20. See at least one friend once a week
21. Avoid plastic where you can (in all things)
22. Circulate air in rooms
23. When stressed, breathe, learn to calm your body
24. Go to the dentist
25. Avoid sitting for long times
26. Protect your hearing, the world is too loud
27. Alcohol is bad for you
28. Finish coffee before noon
29. Avoid bright lights after sunset
30. If obese, look into a GLP
31. Sleep in a cold room
32. Texting while driving is dangerous
33. Turn off all notifications
34. Limit social media use
35. Don’t smoke anything
36. If you struggle to sleep, read a physical book before bed
37. 1 hour before bed have a calm wind down routine: bath, read, light walk, listen to music
38. The body is a clock and loves routine. Have a daily morning and evening schedule.
39. Avoid long distance travel where you can
40. Baby steps first: incorporate new things slowly
41. Do less… most things don’t work.
Bonus points if you get your blood checked. Start here, it will change your life.

4.7K

42.5K

5.5M

Joe Muller@BosonJoe·7 May

@ClaudeDevs Rate limits or session limits?

160

ClaudeDevs@ClaudeDevs·6 May

Usage limits are up, effective today we're: 1) Doubling Claude Code's 5-hour limits for Pro, Max, Team and seat-based Enterprise plans 2) Removing peak hours limit reduction on Claude Code for Pro and Max plans 3) Substantially raising our API rate limits for Opus models

Claude@claudeai

We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity. This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code and the Claude API.

1.5K

3.2K

41.4K

3.9M

Joe Muller@BosonJoe·6 May

@claudeai Rate limits, not session limits bah

Claude@claudeai·6 May

Effective today, we are: 1) Doubling Claude Code’s 5-hour rate limits for Pro, Max, and Team plans; 2) Removing the peak hours limit reduction on Claude Code for Pro and Max plans; and 3) Substantially raising our API rate limits for Opus models.

1.3K

44.6K

Claude@claudeai·6 May

4.8K

12.1K

131K

23.7M

Joe Muller@BosonJoe·6 May

@zhijianliu_ Would this work with a quant of the Gemma 4 31B model? Ex. LilaRest/gemma-4-31B-it-NVFP4-turbo Or does it only work with the full model?

351

Zhijian Liu@zhijianliu_·6 May

DFlash for Gemma 4: Up to 6x Faster. ⚡⚡ Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash — open source, same quality, more speed!! github.com/z-lab/dflash

Google for Developers@googledevs

Gemma 4: Now up to 3x Faster. ⚡ Same quality, way more speed. Our new MTP drafters allow Gemma 4 to predict multiple tokens at once, effectively tripling your output speed without compromising intelligence.

183

1.5K

464K

Joe Muller@BosonJoe·24 Apr

@sudoingX How do you give the model access to a browser?

107

Sudo su@sudoingX·23 Apr

qwen 3.6-27b dense q4 on a single 3090 just knocked down 10 out of 10 tests at 40 tok/s on the first particle swarm benchmark i wrote for local agentic coding. it built the two files exactly to spec. then it used browser automation tools to open the page, read the test hud, find the failing tests, iterated through the code, patched tests.js, adjusted hue mapping, boosted mouse attraction force, and landed all 10 green checkmarks. this was actual dev behavior. not just generating code but also debugging its own output. i spent the next 8 minutes playing with the result. the boids flocking feels alive, the trail-blend is cinematic, three palettes cycle with space, mouse burst fires particles from click, drag paints a line through the swarm. this is what local ai looks like now. single 3090. hermes agent. no frameworks. no tricks. dropping the video next.

Sudo su@sudoingX

dude! the new qwen 3.6-27b dense is hammering my single 3090 at 100% gpu utilization. the spiky pattern on nvtop is the hermes agent autonomously thinking, calling tools, reading results, thinking again. this model is so cool to talk to. waits for tool outputs, reads them, selfcorrects, keeps going. no stalls, no loops, no hand holding. anyone running a single 3090 or any 24gb tier card should try this. same llama.cpp flags from last sweep, same hermes agent install. three commands and you are watching your own hardware think.

127

77.1K

Joe Muller@BosonJoe·12 Apr

@henrythe9ths @claudeai Or don't pay to do your taxes: freetaxusa.com/?friend=LgHn2D

Henry Shi@henrythe9ths·12 Apr

Tax season is here and a connector is all it takes to make @claudeai way more useful. Check out what we just shipped: Connect TurboTax or Aiwyn Tax (formerly Column Tax) to Claude to estimate your refund, see what you may owe, and get a better understanding of the forms before you file.

1.3K

Joe Muller@BosonJoe·12 Apr

@bcherny Now do FreeTaxUSA: freetaxusa.com/?friend=LgHn2D

Boris Cherny@bcherny·12 Apr

brb trying this now

Henry Shi@henrythe9ths

313.1K

Joe Muller@BosonJoe·28 Mar

@snoopy_dot_jpg How did you get started in RL?

snoopy jpg@snoopy_dot_jpg·28 Mar

so, i didn’t end up with an offer from , which in truth is a bit of a gut punch, but two really positive things that came out of all this effort and toil: - i received feedback that i comfortably met the technical bar and passed the interview loop. that was satisfying to hear! the decision came down to not finding a team fit, which is pretty lame tbqh - the process has been really rewarding. it ruined my life for a few weeks, but it also activated me enough to commit a serious amount of time to upleveling myself. i’ve gotten a much deeper grounding in RL, done a lot of really fun experimentation, and gained a lot of precision in how i work through ideas. this was really cool! overall 2/10 experience. do not recommend. but fun in a sick way

644

51K

Joe Muller@BosonJoe·28 Mar

@sudoingX @stableAPY How do you recommend getting those numbers? I have an RTX 5090 but no experience with real benchmarking. Do you have a go-to suite of tests for new models? Thanks!

Sudo su@sudoingX·27 Mar

@stableAPY drop your numbers when you test it. curious how mamba performs on metal vs cuda.

1.4K

Sudo su@sudoingX·27 Mar

i pointed hermes agent at nvidia's nemotron cascade 2 30B-A3B on a single RTX 3090 24GB. IQ4_XS quant by bartowski, 187 tok/s, 625K context. had it discover its own hardware, create an identity file, then build a full GPU marketplace UI from a single prompt. it one shotted it. first attempt no iteration. qwen 3.5 35B-A3B on the same hardware same 3090 24GB took an iteration to recover from a blank screen on the same type of build. 24 days between these two models releasing. same active parameters, completely different architectures and cascade 2 through hermes agent just keeps going. this model goes on and on. feast your eyes. more iterations and tests dropping soon. nvidia really cooked. no special flags needed. nvidia optimized this mamba MoE so well it just runs. flash attention auto enabled, context auto allocated. the model does the work not the config. but i compiled llama.cpp from source and i'm not sure how it performs on other engines. if you ran nemotron on any hardware drop your numbers below. RTX, AMD, Mac, whatever. model, quant, tok/s, engine. i want to see if it holds everywhere or just on llama.cpp.

Sudo su@sudoingX

nvidia's 3B mamba destroyed alibaba's 3B deltanet on the same RTX 3090. only 24 days between releases. same active parameters, same VRAM tier, completely different architectures. nemotron cascade 2: 187 tok/s. flat from 4K to 625K context. zero speed loss. flags: -ngl 99 -np 1. that's it. no context flags, no KV cache tricks. auto-allocates 625K. qwen 3.5 35B-A3B: 112 tok/s. flat from 4K to 262K context. zero speed loss. flags: -ngl 99 -np 1 -c 262144 --cache-type-k q8_0 --cache-type-v q8_0. needed KV cache quantization to fit 262K. both models held a flat line across every context level. both architectures are context-independent. but nvidia's mamba2 is 67% faster at generating tokens on the exact same hardware and needs fewer flags to get there. same node, same GPU, same everything. the only variable is the model. gold medal math olympiad winner running at 187 tokens per second on single RTX 3090 a card from 6 years ago. nvidia cooked.
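The "67% faster" figure in the post above can be checked from the two throughput numbers it quotes. The snippet below is only a quick arithmetic sketch: the function name `speedup_percent` is made up for illustration, and the inputs are the tok/s values as reported in the post, not independently measured.

```python
def speedup_percent(fast_tps: float, slow_tps: float) -> float:
    """Relative generation-speed advantage of the faster model, in percent."""
    return (fast_tps / slow_tps - 1.0) * 100.0


# Throughput figures as quoted in the post (tokens per second on one RTX 3090).
nemotron_tps = 187.0  # nemotron cascade 2
qwen_tps = 112.0      # qwen 3.5 35B-A3B

pct = speedup_percent(nemotron_tps, qwen_tps)
print(f"{pct:.0f}% faster")  # prints "67% faster"
```

The ratio 187 / 112 is about 1.67, so the quoted percentage is consistent with the quoted speeds.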

764

70.4K

Joe Muller@BosonJoe·28 Mar

@sudoingX Beautiful huggingface.co/nvidia/Nemotro…

105

Joe Muller@BosonJoe·27 Mar

@JohnGoldman I've been running every day for the last month and have seen similar improvements. How long are you running each day?

1.1K

John Goldman ☀️@JohnGoldman·27 Mar

Running consistently changes the way your heart performs. Age 50. Resting heart rate: 39. 39! Down from the 60’s and 70’s.

104

492

145.1K

Joe Muller@BosonJoe·27 Mar

@vishwesh_ayyar @JoelDeTeves What kind of tasks? Would you use it as a daily coding driver?

vishwesh@vishwesh_ayyar·27 Mar

@JoelDeTeves yeah agreed running Qwen3.5-27B-(claude-4.6-opus-distilled)-GGUF. Getting solid and useable performance for a number of tasks. huggingface.co/Jackrong/Qwen3…

968

Joel - coffee/acc@JoelDeTeves·27 Mar

Qwen3.5-27B-GGUF with hermes agent is the way

284

14.8K

Joe Muller@BosonJoe·27 Mar

@GazvodaMatjaz @star_yutish Wut

Matjaž Gazvoda@GazvodaMatjaz·26 Mar

@star_yutish how's so that you are putting it for free ?

853

yutish@star_yutish·26 Mar

i’m 19. i built an ai that lives in your dynamic island. reads emails. navigates. books Uber. you just talk. try out for free on the app store.

116

72.8K

Joe Muller@BosonJoe·26 Mar

Whatever the Anthropic team did to Claude Code is really making my day worse 😓
Can't use ctrl+v
Can't use ctrl+c
Can't use any other keybindings while claude is in focus because it intercepts everything
Pls fix 🙏 @trq212

340

Joe Muller@BosonJoe·26 Mar

Bad things happen when you go this fast

200

Joe Muller@BosonJoe·26 Mar

@gabriel1 You have too many ideas to be stuck at OpenAI

gabriel@gabriel1·26 Mar

on top of normal healthcare, for low stakes curiosity, i'd love a cheap clinic with all screening equipment. no doctors and no diagnoses i can book any screening, and they send files that ai reads. and i add it to my health context so i can continue asking chatgpt questions

471

32.5K

Joe Muller@BosonJoe·26 Mar

@gabriel1 You have too many ideas to be a wage slave at OpenAI

251

Joe Muller retweeted

Google Research@GoogleResearch·24 Mar

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

5.8K

39K

19.3M

Joe Muller@BosonJoe·25 Mar

@giladvdn I imagine it was unintentional and it's probably related to the fancy stuff they've been doing with pasting images. Regardless, my hotkeys all no longer work

Gilad Avidan@giladvdn·25 Mar

@BosonJoe Really? maybe a bug? What's the motivation?

Joe Muller@BosonJoe·25 Mar

Claude Code just broke the basic paste functionality. Now you can't use Cmd+V. You have to right-click and select paste 😑

227
