Devon James ☀️
@DevonRJames
19.8K posts

Co-Inventor @OpenIndexProto | CTO @Alexandria | formerly sales @Apple, VFX artist @hensoncompany @Sony & @wbpictures, Infantry @USMC & Technical Dir @web3wg

California · Joined June 2007
6K Following · 5K Followers
Devon James ☀️ reposted
Tom Turney @no_stp_on_snek
the original TurboQuant paper tested on A100 with models up to 8B. 6 days later, a bunch of strangers on the internet had it built and running on:
- Apple Silicon M1 through M5
- NVIDIA 3080 Ti through DGX Spark Blackwell
- AMD RX 6800 XT and 9070
- a 10-year-old Tesla P40
- an 8GB MacBook Air
- models from 3.8B to 70B across 6 architecture families
- 30+ independent testers

along the way we found new optimizations the paper didn't cover and failure modes it didn't test. the fact that a loose group of people across the world can read a paper, build implementations from scratch, stress-test across hardware none of us could individually afford, and push the research further in under a week is genuinely one of the best things about this era. the tools and the community make it possible. open source is something else.
33 replies · 241 reposts · 2.3K likes · 60.3K views
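TurboQuant's actual scheme isn't spelled out in the thread, but a minimal sketch of the kind of kernel those 30+ testers were porting across hardware, assuming a generic blockwise 4-bit quantizer (the block size and scale format here are illustrative, not the paper's), looks like this:

# Hypothetical sketch: generic blockwise 4-bit quantization round-trip.
# TurboQuant's real algorithm is not described in the tweet; this only
# illustrates the style of kernel being reimplemented and stress-tested.
import numpy as np

def quantize_block_q4(x: np.ndarray, block: int = 32):
    """Quantize a 1-D fp32 tensor to 4-bit ints with one fp16 scale per block."""
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0       # map into [-7, 7]
    scale[scale == 0] = 1.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)  # 4-bit range
    return q, scale.astype(np.float16)

def dequantize_block_q4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * scale.astype(np.float32)).reshape(-1)

w = np.random.randn(4096).astype(np.float32)
q, s = quantize_block_q4(w)
err = np.abs(w - dequantize_block_q4(q, s)).mean()
print(f"mean abs round-trip error: {err:.4f}")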
Brian Roemmele @BrianRoemmele
Robot on @ShawnRyan762. Faster and faster, we are approaching the “iPhone moment” of the century.
30 replies · 33 reposts · 286 likes · 23.6K views
Devon James ☀️ reposted
BuBBliK @k1rallik
Solo dev reverse-engineered Google's billion-dollar algorithm in 7 days.

Google published the paper that crashed memory stocks worldwide. Then shipped zero code. Tom Turney read the math, opened his terminal, and built the whole thing with Claude - then made it faster than Google promised.

Day 1-3: Core algorithms, 141 tests, Python prototype
Day 3-5: C port into llama.cpp, Metal GPU kernels
Day 5-7: Speed optimization from 739 to 2747 tok/s

That's a 3.7x speedup through pure engineering:
> fp32 → fp16 WHT
> half4 vectorized butterfly ops
> graph-side rotation
> block-32 storage layout

Then he added his own research on top:
> Sparse V: skip 90% of value decompressions at long context
> Asymmetric K/V: keep keys precise, compress values harder
> Temporal decay: old tokens get lower precision automatically

Result: 35B model running on a MacBook with 4.6x compressed cache. 613 GitHub stars in a week. Google still hasn't released their own code.
BuBBliK@k1rallik

x.com/i/article/2037…

143 replies · 797 reposts · 6.4K likes · 924.2K views
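The "fp32 → fp16 WHT" and "vectorized butterfly ops" above refer to a fast Walsh-Hadamard transform. The real work was half4-vectorized Metal kernels; this is only a minimal Python sketch of the O(n log n) butterfly structure, assuming nothing about the actual llama.cpp port:

# Sketch of the fast Walsh-Hadamard transform (FWHT) butterfly mentioned
# in the tweet; the production version lives in Metal GPU kernels.
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform; len(x) must be a power of two."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), h * 2):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sum lane
            x[i + h:i + 2 * h] = a - b  # butterfly: difference lane
        h *= 2
    return x

v = np.random.randn(8).astype(np.float32)
# The WHT is its own inverse up to a 1/n factor:
assert np.allclose(fwht(fwht(v)) / len(v), v, atol=1e-4)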
Devon James ☀️ reposted
sui ☄️ @birdabo
🚨SOMEONE REINVENTED HOW TEXT RENDERS ON THE WEB AND IT'S ABSOLUTELY INSANE. the goated dev behind react, reasonML, and midjourney's frontend just dropped Pretext: a tiny typescript library that measures and lays out text 500x faster than the DOM. he trained models against real browser rendering for weeks until the output matched safari, chrome, and firefox exactly. the demos are insane!! hundreds of thousands of text boxes at 120fps. magazine layouts and chat bubbles that actually wrap right. engineers from Vercel, Remix, Figma, and shadcn all cosigned. this is the kind of open source that makes you want to be a better dev. here are some cool demos from the past 24hrs👇
Cheng Lou@_chenglou

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

166 replies · 1.1K reposts · 12K likes · 1.6M views
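Pretext's real API isn't shown in the thread; this is only a minimal Python sketch of the underlying idea of measuring text from a cached per-font advance-width table and wrapping greedily, with made-up widths standing in for the model Cheng Lou trained against real browsers:

# Hypothetical sketch of DOM-free text layout: measure text with a cached
# advance-width table instead of asking the browser, then greedily wrap.
# The widths below are invented numbers, not any real font's metrics.
ADVANCE = {"default": 7.2, "i": 3.1, "l": 3.3, "m": 11.0, "w": 10.4, " ": 3.6}

def measure(text: str) -> float:
    """Sum cached advance widths: no DOM, no reflow."""
    return sum(ADVANCE.get(ch, ADVANCE["default"]) for ch in text)

def wrap(text: str, max_width: float) -> list[str]:
    lines, line = [], ""
    for word in text.split():
        candidate = word if not line else line + " " + word
        if measure(candidate) <= max_width or not line:
            line = candidate
        else:
            lines.append(line)
            line = word
    if line:
        lines.append(line)
    return lines

for ln in wrap("layout entire pages without touching the DOM", 120.0):
    print(f"{measure(ln):6.1f}px  {ln}")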
Xumas @xumas_iq
Saddam Hussein's last public appearance before the Fall of Baghdad. This footage was filmed in Baghdad, 15 minutes away from American troops.
32 replies · 143 reposts · 742 likes · 169.6K views
Devon James ☀️ reposted
David Hendrickson @TeksEdge
👀 Could burning the entire LLM (weights, attention layers, and everything else) straight onto a chip and board lower cost and speed up inference by hardwiring LLMs? YES ✅, and it's already being done. Taalas HC1 is using these ASIC "LLM burners" right now: 17k+ tokens/sec on Llama 3.1 8B, ultra-low power, rumored cost ~$300-400 for a PCIe card, 100% offline. Medium models (such as Qwen 3.5-27B) are slated for lab testing in Spring '26. If sold to the public, this could bring local hyper-token AI from sci-fi to your desktop. ⚡🪪🚀
David Hendrickson@TeksEdge

🎗️ "Medium-Sized" LLM Burners Coming Soon! 🔥 This Could Make Local HyperToken Generation a Reality. ⚡️ NVIDIA’s worst nightmare? 😱 ⚙️ Application-Specific Hardware Taalas new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (already doing it with small models) Taalos said medium models on ASIC would be available in their lab by Spring '26. 💭Imagine: 🚫 No more loading weights 🚀 ~10,000 Tokens Per Second locally (Llama 3.1 8B already @ 17,000 tps) 💻 Standard PC slot, ultra-low power (10x less) 🔋 🌍 100% offline with no cloud, no GPU farm 💰 Reddit unit cost rumor $300 to $400 🖥️ Imagine HyperToken generation on your desktop. 🤖 AI agents that think at light speed. ⚡️ Are you ready? 👀

15 replies · 11 reposts · 143 likes · 13K views
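A back-of-envelope check makes the weights-in-silicon claim plausible: serving a dense 8B model at 17,000 tok/s means touching every weight 17,000 times per second. This sketch assumes int8-equivalent storage and one full weight read per token (the actual Taalas design is not public):

# Rough arithmetic behind the 17k tok/s figure for Llama 3.1 8B.
params = 8e9
bytes_per_weight = 1        # assumption: int8-equivalent on-die storage
tokens_per_sec = 17_000

bandwidth = params * bytes_per_weight * tokens_per_sec   # bytes/sec
print(f"required effective bandwidth: {bandwidth / 1e12:.0f} TB/s")
# ~136 TB/s, orders of magnitude beyond any PCIe card's DRAM,
# which is why the weights would have to live on the die itself.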
Andrej Karpathy @karpathy
- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it's so convincing!
- Fun idea: let's ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.
1.7K replies · 2.4K reposts · 30.6K likes · 3.2M views
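The both-directions check is easy to script. A minimal sketch, assuming an OpenAI-compatible endpoint; the model name and prompt wording are placeholders, not Karpathy's workflow:

# Ask the same model to argue both directions of a claim, per the tweet's
# advice: form your own opinion from the spread, and watch for sycophancy.
from openai import OpenAI

client = OpenAI()
claim = "My blog post's thesis goes here."  # placeholder

for stance in ("strongest case FOR", "strongest case AGAINST"):
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user",
                   "content": f"Make the {stance} this claim, no hedging:\n{claim}"}],
    )
    print(f"--- {stance} ---")
    print(reply.choices[0].message.content)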
Devon James ☀️ @DevonRJames
@0xSero you skipped from 24 to 64, what would you recommend for 48 (across 2 4090s, so prolly not a fully usable 48)?
1 reply · 0 reposts · 3 likes · 407 views
0xSero @0xSero
Best models to run on your hardware:

—— 64 GB ——
- Qwen3-coder-next-80B-4bit (coding, Claude Code, general agent)
- Qwen3.5-122B-reap (browser use, multimodal, tool calling, general agent)

—— 96 GB ——
- GLM-4.6V (multimodal and tool calls)
- Hermes-70B (jailbroken)
- Nemotron-120B-Super (openclaw)
- Mistral-4-Small (general agent)

—— 192 GB ——
All of these are excellent top-tier LLMs and approach Sonnet in capabilities:
- Step-3.5-Flash
- Qwen3.5-397B-REAP
- MiniMax-M2.5 (soon M2.7)
- GLM-4.7-Reap
0xSero@0xSero

Best models to run on your hardware level. I'll be doing this every week, I hope you guys enjoy.

---- 8 GB ----
Autocomplete for coding (like Cursor Tab)
- huggingface.co/NexVeridian/ze…
- huggingface.co/bartowski/zed-…
Tool calling, assistant style
- huggingface.co/nvidia/NVIDIA-…

---- 16 GB ----
Here things get better:
Multimodal
- huggingface.co/Qwen/Qwen3.5-9B
- huggingface.co/Tesslate/OmniC…
- huggingface.co/unsloth/Qwen3.…

---- 24 GB ----
- The best model you can get (thanks Qwen) huggingface.co/Qwen/Qwen3.5-2…
- Great model (strong agents) huggingface.co/nvidia/Nemotro…
- Mine hehe huggingface.co/0xSero/Qwen-3.…

I'm doing a weekly series

166 replies · 232 reposts · 3.2K likes · 446.6K views
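A rough way to sanity-check these tiers, and Devon's 48 GB question above: size the weights at the quantized bit width, then leave headroom for KV cache and activations. The 15% overhead factor is a rule of thumb of mine, not 0xSero's method, and real usage depends on context length and runtime:

# Rule-of-thumb VRAM estimator for quantized local models.
def fits(params_b: float, bits: float, vram_gb: float, overhead: float = 1.15) -> bool:
    need = params_b * bits / 8 * overhead   # GB, since params are in billions
    print(f"{params_b:>6.1f}B @ {bits}-bit -> ~{need:.0f} GB (budget {vram_gb} GB)")
    return need <= vram_gb

fits(80, 4, 48)   # Devon's 2x4090: ~46 GB, fits on paper but little KV headroom
fits(70, 4, 96)   # Hermes-70B at the 96 GB tier: ~40 GB, plenty of room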
Devon James ☀️ @DevonRJames
On April 1, 2003, Iraq's information minister went on TV and said, "They are nowhere near the airport… they are lost in the desert… they cannot read a compass. They are nowhere near Baghdad! This is silly!" We found this pretty funny to hear. The next day the airport was taken. My platoon got there on April 4, I think. We crossed the Diyala river into Sadr City the night of April 7. The statues in Baghdad started getting pulled down on April 8. It's pretty standard for the losing side, when facing overwhelming defeat, to lie through their teeth to buy as much time as possible, so they're ready to go into hiding the moment the regime collapses.
0 replies · 0 reposts · 0 likes · 162 views
Devon James ☀️ reposted
Elon Musk @elonmusk
Over 500 rocket landings now
15.6K replies · 34.3K reposts · 420.3K likes · 69.3M views
Devon James ☀️ reposted
ComfyUI @ComfyUI
Upgrading your RAM is now unnecessary. Introducing our new ComfyUI Dynamic VRAM optimization. Running local models is now possible on even the most memory-constrained hardware. Read more here: blog.comfy.org/p/dynamic-vram…
84 replies · 318 reposts · 2.9K likes · 445.6K views
Devon James ☀️ reposted
am.will @LLMJunky
Two incredible innovations in the local AI space in a span of three days. I am so excited. ComfyUI just shipped "Dynamic VRAM" and it seems like a big deal for anyone running models locally.

The problem: large AI models can have many GB of weights. If your system lacks the necessary RAM, you'd normally hit memory crashes or grind to a halt on the page file.

Instead of loading the entire model into memory at once, ComfyUI now reads the model file piece by piece directly from your SSD. Only the specific parts needed for the current step get pulled into memory. Everything else stays on disk until it's actually called for.

On the GPU side, they built a smart system that loads weight data at the exact moment it's needed. If your GPU runs out of space, it doesn't crash. It uses a temporary workaround to finish the calculation, then cleans up after itself. It also keeps track of what didn't fit so it doesn't waste time trying to reload things that won't fit again.

The other big improvement is for workflows that use multiple models. Previously, swapping between models would pile everything into system memory and bog your machine down. Now when a model gets swapped out of the GPU, it just goes back to the "read from disk when needed" state instead of sitting in RAM.

The result: a 56GB model can now run on a machine with only 32GB of memory. No crashes, no slowdowns from swap. Available now for Nvidia GPUs on Windows and Linux, with AMD support on the way. No idea how fast this is, but this seems incredible. Cannot wait to get my workstation going.
ComfyUI@ComfyUI

19 replies · 35 reposts · 411 likes · 49.6K views
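ComfyUI's actual implementation isn't shown here, but the "read from disk when needed" behavior am.will describes maps onto memory-mapping. A minimal CPU-side sketch, assuming mmap as a stand-in: pages fault in lazily on first touch, whereas the real system streams weights to VRAM with an OOM fallback:

# Lazy weight access via mmap: nothing is read from the SSD until a layer's
# pages are actually touched. This is an illustration, not ComfyUI's code.
import mmap
import numpy as np

def open_weights(path: str, shape: tuple, dtype=np.float16) -> np.ndarray:
    """Map a raw weight file without loading it; pages fault in on first use."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return np.frombuffer(mm, dtype=dtype).reshape(shape)

# Usage sketch (file and shapes are hypothetical):
# w = open_weights("model.bin", (n_layers, d, d))
# y = x @ w[layer_idx]   # faults in just that layer's pages from disk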
Didicoy the Kunt @Didicoy_Tonttu
@johnnymaga
Trump - 6'3"
Rubio - 5'9"
Burgum - 6'1"
Why is Trump the shortest person in frame?
11 replies · 1 repost · 4 likes · 6.9K views
johnny maga @johnnymaga
Burgum on Venezuela: I literally think they're going to put up a statue of President Trump
Trump: That would be a great honor
*2 mins of updates later*
Burgum: Their oil now flows to our refineries
Trump: Forget that. When are they going to do the statue? 😭
357 replies · 2.7K reposts · 27.6K likes · 1.5M views