Tangled Circuit
3.1K posts

Tangled Circuit
@tangledcircuit
|| LLM Whisperer || Time Traveller || Outsider || Musician || AI Artist || Web Developer || Generalist
Canada Joined August 2023
2.8K Following 439 Followers
Pinned Tweet
Tangled Circuit Retweeted

Before Fable 5 was shut down, it pushed Gemma 4 to 255 tok/s on WebGPU. Some didn't believe it was real.
Today we're releasing the demo and kernels it wrote for you to see yourself. Run it locally in your browser.
Agentic kernel optimization is the future of on-device inference
Xenova@xenovacom
I gave Fable 5 one job: write custom WebGPU kernels for Gemma 4 inference. It climbed to 84 tok/s, then hit a wall, insisting further optimization was impossible. Hours later, Anthropic rolled back invisible LLM development safeguards, and it hit 255 tok/s. The next day, access to Fable 5 was suspended globally.
Tangled Circuit Retweeted

Electric Agents 0.6 is out!
0.6 rounds out what we launched in May:
• Long-lived entities with StreamDB state
• Spawn, fork, send, wake, signal, schedule
• Local and remote agent runners
• Desktop + mobile apps built on core APIs
• PG sync triggers, MCP servers, webhook sources
The ecosystem is converging on our thesis: the agent is the log. Electric Agents paints the picture for what is built on top.
Tangled Circuit Retweeted

This is probably 1 year+ in the making
To be released next week in Deno 2.9
Created by @undefined_void and @crowlKats
Deno@deno_land
`deno desktop` has landed in main. You can try it out by running `deno upgrade canary`
- Mac, Linux, and Windows support
- Can generate pkg and msi installers
- Supports cross-compile (generate .exe from mac)
- Supports chrome (CEF) or native Webview for smaller binaries
Tangled Circuit Retweeted

Just Shipped: Flue 1.0 Beta
Flue is the TypeScript framework for building the next generation of agents, designed around an open agent harness with zero LLM lock-in. It’s like Astro, for agents.
Flue 1.0 has been redesigned around three core primitives:
🔁 Workflows — structured automations designed for background work, where your code drives the agent from start to finish.
🧭 Agents (New!) — autonomous, stateful loops where the model drives itself to complete a given task.
📡 Channels (New!) — connect agents to Slack, GitHub, Linear, Discord, Teams, and more. Flue handles the boilerplate for you.
Everything shares the same durable foundation, powered internally by Pi, Vite, and Durable Streams. Deploy anywhere, use any LLM, and recover running agents across restarts and downtime.
We’ve talked to a lot of teams building agents, and keep hearing the same thing: getting to production is hard work. We built Flue to help change that.
Flue 1.0 Beta is available today. Give it a try and let me know what you think!


Tangled Circuit Retweeted

In partnership with @stripe, Hermes Agent now supports a full suite of Stripe skills.
Your agent can buy things, pay per-call APIs, and provision its own SaaS, with configurable safety limits on every action.

Tangled Circuit Retweeted

You can now run a GPT-5.5-level model on your own computer for free.
NVIDIA released Nemotron 3 Ultra as a full open model.
And if you still need an API, it gets even cheaper.
Atomic's tested it against GPT-5.5 on the same task:
3 HTML5 physics demos from scratch.
GPT-5.5: 11k tokens, $0.57.
Nemotron 3 Ultra: 11.3k tokens, $0.051.
The output of Nemotron was better.
The price was 10x lower.
Coin Shot ☁️@CoinSh0t
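
The cost comparison quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the token counts and dollar figures from the post; the helper name `cost_per_1k_tokens` is illustrative and not from any library.

```python
# Figures quoted in the post; the helper name is illustrative.
def cost_per_1k_tokens(total_cost_usd: float, tokens: int) -> float:
    """Return the cost in USD per 1,000 tokens."""
    return total_cost_usd / tokens * 1000

gpt_cost, gpt_tokens = 0.57, 11_000
nemotron_cost, nemotron_tokens = 0.051, 11_300

# Ratio of the two total bills for the same task.
total_ratio = gpt_cost / nemotron_cost

# Ratio after normalizing for the slightly different token counts.
per_token_ratio = (cost_per_1k_tokens(gpt_cost, gpt_tokens)
                   / cost_per_1k_tokens(nemotron_cost, nemotron_tokens))

print(f"Total cost ratio: {total_ratio:.1f}x")
print(f"Per-token cost ratio: {per_token_ratio:.1f}x")
```

On these numbers both ratios come out slightly above 11x, which is consistent with the post's rounded "10x lower" claim.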
Tangled Circuit Retweeted

oh my god it's happening
@MistralAI has officially confirmed the upcoming release of Le Chaton Fat
- 30T MoE with 256 experts
- 1M context window
- multimodal and multilingual
- outperforms Fable 5 on every benchmark

Sauers@Sauers_
Big if true
Tangled Circuit Retweeted

TWO BOXES THE SIZE OF A MAC MINI JUST RAN A 235 BILLION PARAMETER MODEL ON A DESK
It is two NVIDIA DGX Spark units linked by a single cable.
A year ago a model this size meant renting a GPU cluster by the hour. Now it sits next to your monitor for around $8,000.
Here is the twist most people miss. Linking them does not create one shared 256GB memory pool.
The model is split across both boxes, and that is the only reason a 235B model fits at all.
It answers at roughly 10 tokens per second, and both chips sit at just 74 degrees while sipping around 50 watts.
Every token stays on the desk. Nothing touches a cloud, and nothing leaves the room.
The ceiling for what you can run at home just jumped from 70B to 235B.
Bookmark this & Watch it run ↓
leopardracer@leopardracer
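
The memory claim in the post can be sanity-checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate only: it assumes the 256GB figure means two equal 128GB units, counts weights alone (no KV cache, activations, or OS overhead), and tries several hypothetical precisions because the post does not say how the model was quantized.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * bits_per_weight / 8

PARAMS_B = 235          # parameter count from the post, in billions
PER_BOX_GB = 256 / 2    # assumption: the 256GB total is two equal 128GB units

for bits in (16, 8, 4):  # hypothetical precisions; the post does not state one
    total = weight_memory_gb(PARAMS_B, bits)
    per_box = total / 2  # assumption: weights are split evenly across the boxes
    print(f"{bits}-bit: {total:.1f} GB total, {per_box:.1f} GB per box, "
          f"fits when split: {per_box <= PER_BOX_GB}")
```

Under these assumptions, 8-bit weights (about 235 GB) exceed a single 128GB unit but fit once split in half, while 16-bit weights would not fit even across both.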
Tangled Circuit Retweeted

AN AMD ENGINEER SHIPS A PALM-SIZED MINI PC THAT RUNS 235B MODELS FOR $9/MONTH AND KILLS A $200/MONTH OPENAI OR CLAUDE CODE SUB
at CES 2026 in las vegas, AMD CEO lisa su walked on stage with a small black box behind her, not a server, not a data center render, a mini PC the size of a hardcover book
a few months later in shanghai she walked up to that same device and signed it with her name, the box is the gmktec EVO-X2, $1,700 once, AMD ryzen AI max+ 395 inside
the chip is the first x86 silicon ever built that runs a 200 billion parameter model on one piece of hardware, 128GB unified memory, 110GB usable VRAM on linux, no separate graphics card
it runs qwen 3 235B fully and smoothly, plus deepseek v3 and llama 3.3 70B with no quantization, kills a $440/month claude code, chatgpt, gemini and cursor stack for $9 in electricity
setup takes 3 commands and 15 minutes, ollama loads the model, claude code points to localhost with one environment variable, same interface, zero per token fees, nothing ever leaves the machine
the window is open, follow and bookmark before it closes
starmex@starmexxx
Tangled Circuit Retweeted


Seems that changing the Soul.md in Hermes Agent to github.com/elder-plinius/…, after stripping out all the safety and corporate jargon for a massive token reduction, and then setting @Zai_org Glm-5.2 as the default agent, just changed my game @NousResearch
Tangled Circuit Retweeted

Supertonic just killed ElevenLabs.
A text-to-speech model that runs entirely on your device. No cloud. No API key. No per-character pricing.
2,700 GitHub stars. 100% open source. MIT licensed.
The numbers are wild:
→ 167x faster than real-time on an M4 Pro
→ Only 66M parameters
→ 1,263 chars/sec vs ElevenLabs Flash at 287
→ 1,048 chars/sec vs OpenAI TTS-1 at 55
→ Runs on a Raspberry Pi. Runs on an e-reader in airplane mode.
Reads currency, dates, phone numbers, and technical units correctly without preprocessing. ElevenLabs fails these. OpenAI fails these. Gemini fails these.
Supports 11 platforms and 5 languages. Chrome extension turns any webpage into audio in under a second.
I've watched on-device models lose to cloud APIs for years. This one doesn't lose.
The cloud TTS business just got cooked.
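
The throughput figures quoted above translate into speedup factors as follows. This is a small sketch using only the characters-per-second numbers from the post; the dictionary layout is illustrative, and the figures themselves are the post's claims, not independent measurements.

```python
# Characters per second as quoted in the post: (Supertonic, cloud service).
comparisons = {
    "ElevenLabs Flash": (1263, 287),
    "OpenAI TTS-1": (1048, 55),
}

# Speedup = local throughput divided by cloud throughput.
speedups = {name: local / cloud for name, (local, cloud) in comparisons.items()}

for name, factor in speedups.items():
    print(f"Supertonic vs {name}: {factor:.1f}x the throughput")
```

By these figures the claimed advantage is roughly 4x over ElevenLabs Flash and roughly 19x over OpenAI TTS-1.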

Tangled Circuit Retweeted

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!
🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6.
🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates.
⚡️ 6x High-Speed Mode coming soon!
🔌 Available today via Kimi API and Kimi Code.
🔗 Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai

