David Hendrickson
@TeksEdge
14K posts

CEO & Founder | PhD | Startup Advisor | @Columbia | Author, Generative Software Engineering https://t.co/9oqvHuTX5f | 🔔 Follow for AI & Vibe Coding Tips 👇

PNW · Joined July 2023
531 Following · 6.7K Followers

Pinned Tweet
David Hendrickson @TeksEdge ·
🎗️ "Medium-Sized" LLM Burners Coming Soon! 🔥 This could make local HyperToken generation a reality. ⚡️ NVIDIA's worst nightmare? 😱
⚙️ Application-Specific Hardware: Taalas's new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (it's already doing this with small models). Taalas said medium models on ASIC would be available in its lab by Spring '26.
💭 Imagine:
🚫 No more loading weights
🚀 ~10,000 tokens per second locally (Llama 3.1 8B already @ 17,000 tps)
💻 Standard PC slot, ultra-low power (10x less) 🔋
🌍 100% offline with no cloud, no GPU farm
💰 Reddit unit-cost rumor: $300 to $400
🖥️ Imagine HyperToken generation on your desktop. 🤖 AI agents that think at light speed. ⚡️ Are you ready? 👀
178 replies · 424 reposts · 2.7K likes · 480.9K views
mr-r0b0t @mr_r0b0t ·
10,814 ChatML + @NousResearch Hermes reasoning traces from DeepSeek V4 Pro for LoRA SFT on consumer GPUs.
• 96 parallel workers, staggered 5s → 99.8% success
• 76K tool calls across 20+ tools
• 100% think blocks, JSON-repaired
• 10.7% Hermes-specific
This is the only dataset with all 8 Hermes-specific tools (memory, session_search, skills_list, delegate_task, skill_manage, skill_view, cronjob) used in realistic multi-turn agent conversations — 10.5% of all tool calls exercise capabilities that generic coding datasets don't cover. Every trace has reasoning blocks and ChatML formatting compatible with all 6 target models. HF drop coming soon.
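For readers unfamiliar with the layout such traces use, here is a minimal sketch of a ChatML-formatted agent turn with a think block and a tool call. The tool name `cronjob` comes from the list above; the message content, the `<think>`/`<tool_call>` tag conventions, and the `to_chatml` helper are illustrative assumptions, not samples from the dataset.

```python
def to_chatml(messages):
    """Serialize {role, content} turns into ChatML delimiter format."""
    return "\n".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages
    )

# Illustrative multi-turn trace: system prompt, user request, and an
# assistant turn containing a reasoning block plus one tool invocation.
trace = [
    {"role": "system", "content": "You are a tool-using agent."},
    {"role": "user", "content": "Remind me to update vLLM tomorrow."},
    {"role": "assistant", "content": (
        "<think>A scheduled reminder; the cronjob tool fits.</think>\n"
        '<tool_call>{"name": "cronjob", "arguments": '
        '{"schedule": "tomorrow", "task": "update vLLM"}}</tool_call>'
    )},
]

print(to_chatml(trace))
```

The `<|im_start|>`/`<|im_end|>` delimiters are what makes the data drop-in for ChatML-native chat templates.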
7 replies · 5 reposts · 61 likes · 2.9K views
Markets & Mayhem @Mayhem4Markets ·
The narrative: open-weight models are far behind closed ones. The reality: Kimi-K2.6, an open-weight model, ranks just below the top 4 closed models at #5 on Artificial Analysis, tied with MiMo-V2.5-Pro, another open-weight model.
8 replies · 9 reposts · 45 likes · 16.4K views
AshutoshShrivastava @ai_for_success ·
Google is working on a new design for the Gemini app. This was spotted on iOS by u/TaxOld2989. Looks kind of cool.
9 replies · 7 reposts · 117 likes · 4.7K views
David Hendrickson @TeksEdge ·
🚨 AMD's New "Inference Box" Possible Secret Sauce: A New Processor! Meet Strix Halo GORGON!! 💥
💪 Earlier, I posted that AMD is going to enter the home inferencing arena with its own branded LLM box. BUT we just learned what they might have up their sleeve: a new processor! ⚡️
👀 Look at this 👇 AMD's leaked Ryzen AI Max+ 495 "Gorgon Halo" looks like a serious jump over the Strix Halo 390:
🧠 16C/32T vs 12C/24T (+33%)
⚡ PassMark CPU: 57,525 vs 41,552 (~+38%)
🎮 Radeon 8065S: 40 CUs vs the 8050S's 32 CUs (+25%)
🧮 Memory: 192GB spotted vs 128GB max (+50%)
All of this is based on a leak and speculation; nothing is official yet.
David Hendrickson @TeksEdge

⁉️ So get this: AMD is making a bold move to own the affordable personal inferencing market by launching a Mini PC in June, a 128GB shared-memory inferencing box 🎇 They call it the Halo Box.
🧾 It's a Ryzen AI Max+ 395 (16 Zen 5 cores + 40 RDNA 3.5 CUs + XDNA 2 NPU)
✅ Up to 128GB LPDDR5X-8533 unified memory
✅ Full ROCm support + day-0 AI model optimization
🧪 Built for local AI development (up to ~200B-param models)
📈 A direct shot at NVIDIA's $4,699 DGX Spark; it could cost $2,000–$3,000 (as comparable configs do now)
🤔 Why launch now, during the RAM shortage? While memory makers divert capacity to HBM for AI data centers (spiking LPDDR5X prices and pushing NVIDIA to raise the DGX Spark's price by $700), AMD is moving to own the affordable, high-memory AI mini-PC segment before the crisis worsens.
💡 My speculation: AMD could be using its contracts, relationships, and strategic priority to secure better memory access than many traditional OEMs, giving it an advantage in launching the Halo Box during the shortage. Smart timing or risky bet?
🔥 This is AMD aggressively fighting for the local AI developer market.
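The leaked percentage deltas above are easy to sanity-check, and the unified-memory bandwidth gives a rough ceiling on local decode speed. A quick sketch: the spec numbers are from the posts, but the 256-bit bus width and the ~40 GB model size are my assumptions for illustration, not part of the leak.

```python
def pct_gain(new, old):
    """Percentage improvement of new over old."""
    return (new / old - 1) * 100

# Gorgon Halo (leaked) vs Strix Halo 390 spec deltas
gains = {
    "cores":    pct_gain(16, 12),          # 16C vs 12C
    "passmark": pct_gain(57_525, 41_552),  # PassMark CPU score
    "cus":      pct_gain(40, 32),          # Radeon CUs
    "memory":   pct_gain(192, 128),        # GB spotted vs max
}
for name, g in gains.items():
    print(f"{name}: +{g:.0f}%")

# Unified-memory bandwidth: LPDDR5X-8533 on an assumed 256-bit bus
bus_bytes = 256 // 8                       # bytes moved per transfer
bandwidth_gbs = 8533e6 * bus_bytes / 1e9   # transfers/s * bytes -> GB/s
print(f"bandwidth ≈ {bandwidth_gbs:.0f} GB/s")

# Rough decode ceiling: each generated token streams all weights once,
# so a ~40 GB (4-bit, ~70B-class) model caps out near bandwidth/40 tok/s.
print(f"decode ceiling ≈ {bandwidth_gbs / 40:.1f} tok/s")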

5 replies · 0 reposts · 30 likes · 4K views
David Hendrickson @TeksEdge ·
🚀 Anthropic is in early talks to buy inference chips from UK startup Fractile.
Their tech 👉 DRAM-less SRAM chips with "Memory Compute Fusion": memory and compute physically fused on the same die.
Solves two major problems:
💸 Skyrocketing DRAM costs & supply shortages
🧱 The memory-wall bottleneck that slows down LLM inference
How it compares:
✅ Much more flexible than Taalas (not hardwired to one model)
⚡ Similar to Groq's LPU, but with tighter memory-compute integration
🔥 Claims to be significantly faster & cheaper than Nvidia GPUs
This is the next wave of custom AI silicon.
0 replies · 2 reposts · 10 likes · 741 views
David Hendrickson @TeksEdge ·
RAMageddon is going to be brutal for the average person.
3 replies · 0 reposts · 7 likes · 416 views
David Hendrickson @TeksEdge ·
This is really the best holistic LLM rating!
David Hendrickson @TeksEdge

🏆 LLMStats just dropped a fresh leaderboard update. This is my trusted ranking.
📊 The "TrueSkill" composite score is the real deal: the most conservative, battle-tested "Uber benchmark" in the game (μ − 3σ across GPQA, SWE-Bench, coding arenas & more).
👀 Current Standings
🏆 Overall #1: Claude Mythos Preview (@AnthropicAI) — 70.1. Unreleased monster. 94.6% on GPQA Diamond. This thing is going to be an absolute banger 🚀
🥇 Best Open Weights: Kimi K2.6 (@moonshot) — 58.7. The undisputed leader among open models right now. 90.5% GPQA + only $0.95/M tokens. Insanely good value 💎
Quick Hits
🏆 Gemini 3.1 Pro → dominating coding arenas
👑 Llama 4 Scout → 10M-context king
⚡ Mercury 2 → fastest model at 1,720 tps
🔥 Bottom line: if you care about real capability per dollar, Kimi K2.6 is the one to watch in the open-source world right now. And when Mythos drops… the game changes.
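LLMStats' exact aggregation isn't public, so treat this as a sketch of the plain reading of μ − 3σ with made-up numbers: a model only earns a high conservative score if it is both strong on average and consistent across benchmarks, which is why the composite is hard to game with one spiky result.

```python
from statistics import mean, pstdev

def conservative(scores):
    """mu - 3*sigma over per-benchmark scores: penalize inconsistency."""
    mu, sigma = mean(scores), pstdev(scores)
    return mu - 3 * sigma

# Two hypothetical models with nearly identical means (~80.5)
steady = [80, 82, 81, 79]   # consistent across benchmarks
spiky  = [95, 60, 95, 72]   # brilliant on some, weak on others

print(f"steady: {conservative(steady):.1f}")
print(f"spiky:  {conservative(spiky):.1f}")
```

Despite the same average, the consistent model scores far higher under μ − 3σ, which is the "conservative" behavior the post praises.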

0 replies · 0 reposts · 3 likes · 328 views
David Hendrickson @TeksEdge ·
🦉 Been continuing to test mystery model Owl-Alpha on my benchmark, and it appears to be a good model, maybe as good as Qwen3.6-27B. It does well on structured reasoning but struggles most with tool use (according to my tests).
0 replies · 0 reposts · 12 likes · 1.3K views
David Hendrickson @TeksEdge ·
💡 vLLM just posted an update, v0.20.1, with bug fixes to help run DeepSeek V4. Update your vLLM early and often to get the most inferencing power from your personal compute.
vLLM @vllm_project

Running DeepSeek V4 from @deepseek_ai on @vllm_project? Upgrade to v0.20.1 — 10+ bug fixes and optimizations, fully tested and verified by the open-source community! A huge thank you to @FireworksAI_HQ, @baseten, @novita, @lightseekorg, @daocloud, @nvidia, @redhatai and more for helping report, fix, and verify the stability and speed of vLLM. 🙏
🔧 DeepSeek V4 Productionization Reliability:
• Persistent top-k cooperative deadlock at TopK=1024
• AOT compile cache import error
• Repeated RoPE cache initialization
• Non-streaming tool-call type conversion (DSV3.2/V4)
• torch inductor error on V4
⚡ Optimizations:
• Multi-stream pre-attention GEMM + configurable knob
• BF16 / MXFP8 all-to-all on FlashInfer one-sided comm
• PTX `cvt` for faster FP32 → FP4 conversion
• Integrated `head_compute_mix_kernel` for head computation
📖 Full notes → github.com/vllm-project/v…
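"Early and often" is easy to automate with a pre-flight version gate. A minimal stdlib sketch: the 0.20.1 floor comes from the post above, while the helper names and the naive three-part version parse are my assumptions.

```python
from importlib import metadata

MIN_VERSION = (0, 20, 1)  # the vLLM release called out above

def parse(version):
    """Naive parse of a '0.20.1'-style tag into a comparable tuple."""
    return tuple(int(part) for part in version.split(".")[:3])

def check(installed):
    """Raise if the installed vLLM is older than the required floor."""
    if parse(installed) < MIN_VERSION:
        raise RuntimeError(f"vLLM {installed} < 0.20.1: upgrade first")
    return True

# On a machine with vLLM installed, you would gate startup with:
# check(metadata.version("vllm"))
print(check("0.20.1"))
```

Tuple comparison makes `(0, 19, 5) < (0, 20, 1)` do the right thing without a third-party version library, though it would need extending for pre-release tags.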

2 replies · 5 reposts · 10 likes · 1.3K views
David Hendrickson @TeksEdge ·
🚫🤖 Tired of Copilot, Recall & AI bloat in Windows 11? I found a free tool that lets you rebuild Windows 11 without Copilot or any other AI features.
NTLite (v2026.04) gives you full control:
✅ Strip AI components from 25H2 ISOs/images
✅ Create cleaner, smaller installs
✅ Edit live systems too
Link in ALT
1 reply · 1 repost · 1 like · 325 views
David Hendrickson @TeksEdge ·
Talk about a PC build! This one is big enough to live in but can it run Qwen3.6-27B? 😄 Meet the Superdome, a massive RGB-lit fish tank case that’s basically a gaming tower for humans.
0 replies · 0 reposts · 4 likes · 409 views
David Hendrickson @TeksEdge ·
📉 NVIDIA CEO Jensen Huang: "In China, we have now dropped to zero."
📰 I've been following this thread of stories for years, and this seems to be the final chapter.
📉 US export bans on AI chips took NVIDIA from ~95% market share to 0% in under 2 years. "This policy has already largely backfired," according to an SCSP interview (Apr 30, 2026).
Why it matters 👇
🚀 Huawei built its own Ascend GPUs (910 → 950 series) post-sanctions: a full domestic stack (CANN software + SMIC fabs).
💡 DeepSeek V4 (1.6T-param flagship, Apr 2026) is now optimized for Huawei Ascend 950 supernodes — day-zero support + partial training on Chinese silicon.
Sanctions = innovation catalyst? 🇨🇳💻
0 replies · 2 reposts · 5 likes · 466 views
Ivan Fioravanti ᯅ @ivanfioravanti ·
Ok you convinced me! Anyone will be able to choose whatever default they wanna see. 💪
3 replies · 3 reposts · 34 likes · 3.1K views
Ivan Fioravanti ᯅ @ivanfioravanti ·
In a local AI benchmark, what context length should be the main reference for speed? We are all using very small contexts to showcase model speed on our hardware, but that's unrealistic.
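One possible answer to the question above: report speed at several context lengths rather than a single tiny prompt, since prefill and attention costs grow with context. A minimal harness sketch; `fake_generate` is a stand-in stub so the harness runs anywhere, not a real local runtime.

```python
import time

def benchmark(generate, context_lengths=(1_000, 8_000, 32_000), n_tokens=128):
    """Time token generation at several prompt sizes, not just a tiny one."""
    results = {}
    for ctx in context_lengths:
        prompt = "x " * ctx                  # placeholder prompt of ~ctx tokens
        start = time.perf_counter()
        generate(prompt, n_tokens)
        elapsed = time.perf_counter() - start
        results[ctx] = n_tokens / elapsed    # tokens/s at this context length
    return results

def fake_generate(prompt, n_tokens):
    """Stub model whose latency grows with prompt size, like real prefill."""
    time.sleep(0.001 * (1 + len(prompt) / 100_000))

for ctx, tps in benchmark(fake_generate).items():
    print(f"{ctx:>6} ctx → {tps:,.0f} tok/s")
```

Reporting the whole curve (e.g., 1K / 8K / 32K) instead of one headline number is what makes hardware comparisons realistic.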
43 replies · 1 repost · 20 likes · 6.6K views