Devon James ☀️ (@DevonRJames) - Twitter profile

Pinned Tweet

Devon James ☀️@DevonRJames·Feb 28

x.com/i/article/2027…

4

3

9

533

Devon James ☀️ retweeted

BuBBliK@k1rallik·5h

Solo dev reverse-engineered Google's billion-dollar algorithm in 7 days
Google published the paper that crashed memory stocks worldwide. Then shipped zero code. Tom Turney read the math, opened his terminal, and built the whole thing with Claude - then made it faster than Google promised.
Day 1-3: Core algorithms, 141 tests, Python prototype
Day 3-5: C port into llama.cpp, Metal GPU kernels
Day 5-7: Speed optimization from 739 to 2747 tok/s
That's a 3.7x speedup through pure engineering:
> fp32 → fp16 WHT
> half4 vectorized butterfly ops
> graph-side rotation
> block-32 storage layout
Then he added his own research on top:
> Sparse V: skip 90% of value decompressions at long context
> Asymmetric K/V: keep keys precise, compress values harder
> Temporal decay: old tokens get lower precision automatically
Result: 35B model running on a MacBook with 4.6x compressed cache. 613 GitHub stars in a week. Google still hasn't released their own code.

BuBBliK@k1rallik

x.com/i/article/2037…

94

353

3.1K

322.4K
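
Editor's sketch for the BuBBliK post above: assuming "WHT" there refers to the Walsh-Hadamard transform, the "butterfly ops" would be add/subtract pairs like the ones below. This is a plain-Python illustration, not code from the project described; the names `fwht` and `speedup` are hypothetical, and the only figures taken from the post are 739 and 2747 tok/s.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    out = list(values)
    h = 1
    while h < n:
        # Each pass combines elements h apart: the "butterfly" step.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


# Applying the unnormalized transform twice scales the input by its length.
round_trip = fwht(fwht([1, 2, 3, 4]))

# Speedup implied by the throughput figures quoted in the post.
speedup = round(2747 / 739, 1)
```

The transform needs only additions and subtractions, which is presumably why precision (fp32 vs fp16) and vector width are the levers the post mentions.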

Devon James ☀️ retweeted

sui ☄️@birdabo·1d

🚨SOMEONE REINVENTED HOW TEXT RENDERS ON THE WEB AND ITS ABSOLUTELY INSANE. the goated dev behind react, reasonML, and midjourney’s frontend, just dropped Pretext. a tiny typescript library that measures and lays out text 500x faster than the DOM. he trained models against real browser rendering for weeks until the output matched safari, chrome, and firefox exactly. the demos are insane!! hundreds of thousands of text boxes at 120fps. magazine layouts and chat bubbles that actually wrap right. engineers from Vercel, Remix, Figma, and shadcn all cosigned. this is the kind of open source that makes you want to be a better dev. here are some cool demos in the past 24hrs👇

Cheng Lou@_chenglou

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

157

1K

11.4K

1.5M
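
Editor's sketch for the Pretext posts above: the idea of measuring and wrapping text in userland, without asking the DOM, can be illustrated with a greedy line breaker. This is not Pretext's API or algorithm; `measure`, `wrap`, and the per-character width table are hypothetical stand-ins for real font metrics.

```python
def measure(text, char_widths, default_width=1.0):
    """Sum per-character advance widths; unknown characters use default_width."""
    return sum(char_widths.get(ch, default_width) for ch in text)


def wrap(text, max_width, char_widths=None):
    """Greedy word wrap: start a new line when the next word would overflow."""
    char_widths = char_widths or {}
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # A word that is too wide on its own still gets a line to itself.
        if not current or measure(candidate, char_widths) <= max_width:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


uniform = wrap("aa bb cc", max_width=5)
wide_w = wrap("WW i", max_width=4, char_widths={"W": 2.0})
```

Because layout here is pure arithmetic over a width table, it never triggers a browser reflow, which is the property the posts emphasize.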

Devon James ☀️@DevonRJames·1d

@Fainden1 @xumas_iq interesting. april 8 was the day my platoon entered sadr city.

0

6

Fainden_Artist@Fainden1·1d

@DevonRJames @xumas_iq April 9 2003 ( 15 mins away from American )

1

0

13

Xumas@xumas_iq·19 Mar

Saddam Hussein's last public appearance before the Fall of Baghdad. This footage was filmed in Baghdad, 15mins away from American troops.

32

143

744

169.6K

Devon James ☀️ retweeted

David Hendrickson@TeksEdge·1d

👀 Could burning the entire LLM (weights, attention layers, and everything else) straight onto a chip and board lower cost and speed up inferencing by hardwiring LLMs? YES ✅ — and it’s already being done. Taalas HC1 is using these ASIC “LLM burners” right now. 17k+ tokens/sec on Llama 3.1 8B, ultra-low power, rumored cost ~$300–400 PCIe card, 100% offline. Medium models (such as Qwen 3.5-27B) dropping to lab for testing Spring ’26. If sold to public could bring local hyper-token AI from sci-fi to your desktop. ⚡🪪🚀

David Hendrickson@TeksEdge

🎗️ "Medium-Sized" LLM Burners Coming Soon! 🔥 This Could Make Local HyperToken Generation a Reality. ⚡️ NVIDIA’s worst nightmare? 😱
⚙️ Application-Specific Hardware
Taalas' new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (already doing it with small models)
Taalas said medium models on ASIC would be available in their lab by Spring '26.
💭 Imagine:
🚫 No more loading weights
🚀 ~10,000 Tokens Per Second locally (Llama 3.1 8B already @ 17,000 tps)
💻 Standard PC slot, ultra-low power (10x less) 🔋
🌍 100% offline with no cloud, no GPU farm
💰 Reddit unit cost rumor $300 to $400
🖥️ Imagine HyperToken generation on your desktop.
🤖 AI agents that think at light speed. ⚡️
Are you ready? 👀

15

11

143

12.8K

Devon James ☀️@DevonRJames·1d

@karpathy rip Chorus. it was awesome at doing this.

0

1

43

Andrej Karpathy@karpathy·2d

- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol
The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

1.6K

2.4K

30.4K

3.1M

Devon James ☀️@DevonRJames·1d

@0xSero you skipped from 24 to 64, what would you recommend for 48 (across 2 4090s, so prolly not a fully usable 48)?

1

0

3

390

0xSero@0xSero·1d

Best models to run on your hardware:
---- 64 GB ----
- Qwen3-coder-next-80B-4bit (coding, Claude code, general agent)
- Qwen3.5-122B-reap: (browser use, multimodal, tool calling, general agent)
---- 96 GB ----
- GLM-4.6V (multimodal and tool calls)
- Hermes-70B (Jailbroken)
- Nemotron-120B-Super: (openclaw)
- Mistral-4-Small (general agent)
---- 192 GB ----
All these are excellent top tier LLMs and approach sonnet in capabilities
- Step-3.5-Flash
- Qwen3.5-397B-REAP
- MiniMax-M2.5 (soon M2.7)
- GLM-4.7-Reap

0xSero@0xSero

Best models to run on your hardware level
I'll be doing this every week, I hope you guys enjoy.
---- 8 GB ----
Autocomplete for coding (like Cursor Tab)
- huggingface.co/NexVeridian/ze…
- huggingface.co/bartowski/zed-…
Tool calling, assistant style
- huggingface.co/nvidia/NVIDIA-…
---- 16 GB ----
Here things get better: Multimodal
- huggingface.co/Qwen/Qwen3.5-9B
- huggingface.co/Tesslate/OmniC…
- huggingface.co/unsloth/Qwen3.…
---- 24 GB ----
- The best model you can get (thanks Qwen) huggingface.co/Qwen/Qwen3.5-2…
- Great model (strong agents) huggingface.co/nvidia/Nemotro…
- Mine hehe huggingface.co/0xSero/Qwen-3.…
I'm doing a weekly series

155

218

3K

411.5K
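
Editor's sketch for the hardware-tier thread above: a rough way to reason about the 48 GB question is to estimate the weight footprint as parameters times bits per weight. The helper name `weight_footprint_gb` is hypothetical, and the estimate covers weights only; KV cache, activations, quantization metadata, and the overhead of splitting a model across two cards all come on top, so it is a lower bound rather than a recommendation.

```python
def weight_footprint_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


# The 80B model listed at 4-bit in the 64 GB tier.
qwen_80b_4bit = weight_footprint_gb(80, 4)

# Two RTX 4090s at 24 GB each.
two_4090s = 2 * 24

# Nominal room left for everything that is not weights.
headroom = two_4090s - qwen_80b_4bit
```

Under these assumptions the weights take about 40 GB, leaving roughly 8 GB of nominal headroom on a 48 GB pair, which matches the reply's caution that the 48 GB is probably not fully usable.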

Devon James ☀️ retweeted

BuBBliK@k1rallik·2d

x.com/i/article/2037…

64

319

2.8K

3.3M

Devon James ☀️@DevonRJames·1d

@0xSero what about 48 across two 4090s?

0

1

110

0xSero@0xSero·2d

Best models to run on your hardware level
I'll be doing this every week, I hope you guys enjoy.
---- 8 GB ----
Autocomplete for coding (like Cursor Tab)
- huggingface.co/NexVeridian/ze…
- huggingface.co/bartowski/zed-…
Tool calling, assistant style
- huggingface.co/nvidia/NVIDIA-…
---- 16 GB ----
Here things get better: Multimodal
- huggingface.co/Qwen/Qwen3.5-9B
- huggingface.co/Tesslate/OmniC…
- huggingface.co/unsloth/Qwen3.…
---- 24 GB ----
- The best model you can get (thanks Qwen) huggingface.co/Qwen/Qwen3.5-2…
- Great model (strong agents) huggingface.co/nvidia/Nemotro…
- Mine hehe huggingface.co/0xSero/Qwen-3.…
I'm doing a weekly series

209

360

3.6K

501.5K

Devon James ☀️@DevonRJames·2d

@monerbilly @JackPosobiec @VDAREJamesK What am I wrong about?

0

30

☧Order of Holy Praxis👑@monerbilly·2d

@DevonRJames @JackPosobiec @VDAREJamesK Allegory..... No.

1

0

34

Kevin DeAnna@VDAREJamesK·3d

Speaking for the pagans, no it's not. It's obviously Catholic.

Human Events@HumanEvents

.@JackPosobiec: Lord of the Rings is overtly pagan.

49

24

1.3K

74.9K

Devon James ☀️@DevonRJames·2d

@JackPosobiec @VDAREJamesK You need to try harder to understand the concept of an "allegory" youtube.com/shorts/ICNVgTE…

YouTube

1

5

47

3.4K

Jack Posobiec@JackPosobiec·2d

@VDAREJamesK With no Christ figure, no religion, not even prayer. Ok lol

517

6

83

252K

Devon James ☀️@DevonRJames·2d

On April 1, 2003, Iraq's information minister went on TV and said "they are nowhere near the airport … They are lost in the desert... they can not read a compass. They are nowhere near Baghdad! This is silly!" We found this pretty funny to hear. The next day the airport was taken. My platoon got there on April 4, I think. We crossed the Diyala river into Sadr City the night of April 7. The statues in Baghdad started getting pulled down on April 8. It's pretty standard for the losing side, when facing overwhelming defeat, to just lie through their teeth to give themselves as much time as possible so they're ready to go into hiding the moment the regime collapses.

0

160

Bret Weinstein@BretWeinstein·3d

Let's hope the balance of power in the actual war is the exact inverse of the meme war, because Iran appears to be wiping the floor with us in Legoland. Did the Six Eyes somehow fail to anticipate having to fight on this front?

Patricia Marins@pati_marins64

Iran doesn't seem intimidated at all and has just released another Lego video mocking the coalition.

282

124

1.5K

138.3K

Devon James ☀️ retweeted

Elon Musk@elonmusk·3d

Over 500 rocket landings now

15.6K

34.3K

420K

68.9M

Devon James ☀️@DevonRJames·3d

😃

结城安穗-YuuKi_AnS@yuuki_ans

Apple Xserve is back ！？！？！ Apple Xserve 2024 ? ? ? 😱😱A3174😱😱 Looks like 4-way Apple M2 Ultra chips. It even has 16GB RAM and 1TB SSD on the BMC. I think the BMC controller might be an Apple M1/M2 CPU? --- 🤔Are they still using the MacOS Server OS???🤔

0

56

Devon James ☀️ retweeted

ComfyUI@ComfyUI·4d

Upgrading your RAM is now unnecessary. Introducing our new ComfyUI Dynamic VRAM optimization. Running local models is now possible on even the most memory constrained hardware. Read more here: blog.comfy.org/p/dynamic-vram…

84

318

2.9K

444.7K

Devon James ☀️ retweeted

am.will@LLMJunky·3d

Two incredible innovations in the local AI space in a span of three days. I am so excited.
ComfyUI just shipped "Dynamic VRAM" and it seems like a big deal for anyone running models locally.
The problem: large AI models can have many GB of weights. If your system lacks the necessary RAM, you'd normally hit memory crashes or grind to a halt on the page file.
Instead of loading the entire model into memory at once, ComfyUI now reads the model file piece by piece directly from your SSD. Only the specific parts needed for the current step get pulled into memory. Everything else stays on disk until it's actually called for.
On the GPU side, they built a smart system that loads weight data at the exact moment it's needed. If your GPU runs out of space, it doesn't crash. It uses a temporary workaround to finish the calculation, then cleans up after itself. It also keeps track of what didn't fit so it doesn't waste time trying to reload things that won't fit again.
The other big improvement is for workflows that use multiple models. Previously, swapping between models would pile everything into system memory and bog your machine down. Now when a model gets swapped out of the GPU, it just goes back to the "read from disk when needed" state instead of sitting in RAM.
The result: a 56GB model can now run on a machine with only 32GB of memory. No crashes, no slowdowns from swap.
Available now for Nvidia GPUs on Windows and Linux, with AMD support on the way.
No idea how fast this is, but this seems incredible. Cannot wait to get my workstation going.

ComfyUI@ComfyUI

Upgrading your RAM is now unnecessary. Introducing our new ComfyUI Dynamic VRAM optimization. Running local models is now possible on even the most memory constrained hardware. Read more here: blog.comfy.org/p/dynamic-vram…

19

35

411

49.6K
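
Editor's sketch for the am.will post above: the "read from disk when needed" idea can be illustrated with a memory-mapped file and an offset index, so only the requested tensor's bytes are decoded. This is not ComfyUI's implementation; the file layout, `LazyWeights`, and `write_demo_file` are hypothetical, and the example uses only the Python standard library.

```python
import mmap
import os
import struct
import tempfile


class LazyWeights:
    """Hypothetical reader that decodes one named tensor at a time from disk."""

    def __init__(self, path, index):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._index = index  # name -> (byte offset, number of float32 values)
        self.reads = []      # records which tensors were actually requested

    def get(self, name):
        offset, count = self._index[name]
        self.reads.append(name)
        # Only this slice of the mapping is touched and decoded.
        return list(struct.unpack_from(f"<{count}f", self._map, offset))

    def close(self):
        self._map.close()
        self._file.close()


def write_demo_file(layers):
    """Write float32 tensors back to back and return (path, offset index)."""
    index = {}
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        offset = 0
        for name, values in layers.items():
            data = struct.pack(f"<{len(values)}f", *values)
            f.write(data)
            index[name] = (offset, len(values))
            offset += len(data)
    return path, index


path, index = write_demo_file({"layer0": [1.0, 2.0], "layer1": [3.0, 4.0, 5.0]})
weights = LazyWeights(path, index)
layer1 = weights.get("layer1")  # layer0 is never decoded
reads = list(weights.reads)
weights.close()
os.remove(path)
```

With a memory map, the operating system pages data in as it is accessed and can drop it again under pressure, which is one plausible way to get the behavior the post describes.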

Devon James ☀️ retweeted

David Hendrickson@TeksEdge·3d

🎗️ "Medium-Sized" LLM Burners Coming Soon! 🔥 This Could Make Local HyperToken Generation a Reality. ⚡️ NVIDIA’s worst nightmare? 😱
⚙️ Application-Specific Hardware
Taalas' new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (already doing it with small models)
Taalas said medium models on ASIC would be available in their lab by Spring '26.
💭 Imagine:
🚫 No more loading weights
🚀 ~10,000 Tokens Per Second locally (Llama 3.1 8B already @ 17,000 tps)
💻 Standard PC slot, ultra-low power (10x less) 🔋
🌍 100% offline with no cloud, no GPU farm
💰 Reddit unit cost rumor $300 to $400
🖥️ Imagine HyperToken generation on your desktop.
🤖 AI agents that think at light speed. ⚡️
Are you ready? 👀

174

421

2.7K

457.3K

Devon James ☀️@DevonRJames·3d

@Didicoy_Tonttu @johnnymaga look at the backs of the chairs. Rubio's is clearly a different style of chair, it’s possible that the seat is higher.

2

0

2

318

Didicoy the Kunt@Didicoy_Tonttu·3d

@johnnymaga Trump - 6'3" Rubio - 5'9" Burgum - 6'1" Why is Trump the shortest person in frame?

11

1

4

6.9K

johnny maga@johnnymaga·3d

Burgum on Venezuela: I literally think they’re going to put up a statue of President Trump Trump: That would be a great honor *2 mins of updates later* Burgum: Their oil now flows to our refineries Trump: Forget that. When are they going to do the statue? 😭

357

2.7K

27.6K

1.5M

Devon James ☀️@DevonRJames·3d

@MichelleWth6 @DataRepublican names that are part of current active investigations were not released 🤷‍♂️

1

0

3

MichelleWTH@MichelleWth6·3d

@DataRepublican Still not in the files. Yall are pathetic.

6

0

195

DataRepublican (small r)@DataRepublican·4d

Kitteh is 🔥.

Bad Kitty Unleashed 🦁 💪🏻@pepesgrandma

Folks, stay with me! I’ve been working hard! I just linked Obama to the 2020 election and January 6th. The same goes for the state dept linked orgs data republican has been talking about. I’m gonna push this out before my censorship work.

132

3.5K

17.9K

368.5K

Devon James ☀️@DevonRJames·3d

@BickleKun @9to5mac lol 😂 x.com/alexocheema/st…

Alex Cheema@alexocheema

Stacking Mac Studios is the new Mac Pro. Apple Silicon is the most advanced silicon you can buy + memory unit economics are better than anything else available today. High end market is now served by Mac Studios with RDMA over Thunderbolt turning all the Macs into one Big Mac.

0

3

146

Bickle bork@BickleKun·3d

@9to5mac it's really great news because finally we're gonna be able to force all of these idiots who insist on using Photoshop on a Mac rather than a PC onto a PC so that they learn how to use a real computer and stop costing IT departments tens of thousands of dollars a year

7

0

3

1.9K

9to5Mac@9to5mac·3d

Apple has confirmed to @9to5Mac that the Mac Pro is being discontinued with no plans for future hardware It's also no longer available on Apple's website as of Thursday afternoon The end of an era 🧀

98

254

2.6K

303.3K

Devon James ☀️ retweeted

Arthur Douillard@Ar_Douillard·4d

Training distributed DiLoCo / SparseLoCo over eduroam wifi, awesome!

Swarnim Jain@swar_ja

I trained models across MacBooks using Apple's AirDrop protocol. grove is a distributed training library for Apple Silicon. Devices discover each other over AWDL, a direct radio link. If there's a shared WiFi network it upgrades to that for speed, otherwise everything goes over the direct link. No router, no cloud, no setup. grove start