Akiraxtwo Super

1.6K posts

Akiraxtwo Super

Akiraxtwo Super

@akiraxtwo

I’m learning game development. One day, I’ll create a giant robot. AI software dev | GenAI | GPU | OpenClaw skills research Data Viz | Game Dev | AI SaaS

taipei Bergabung Temmuz 2024
877 Mengikuti423 Pengikut
Alok
Alok@analogalok·
Let me show you why we are living in a singularity right now. I just turned an 8GB VRAM budget laptop into a fully autonomous, self improving local AI Agent. In the previous post, I showed you how Google's QAT quants allow you to run the massive Gemma 4 26B MoE model locally on a 8GB VRAM + 16 GB RAM laptop. The community was stunned. But now, we are going far beyond chat. Nous Research just shipped their official Hermes Agent Desktop App this week. I hooked my local llama server up to the Hermes Desktop App. The integration took exactly 2 minutes. What I witnessed next was absolutely mind bending. you can run a state of the art, 24/7 autonomous agentic ecosystem with full tool execution, locally, on a laptop with: - Intel i5 or i7 | 16GB System RAM - Any 8GB VRAM GPU (like my RTX 4060) My local 26B model is now behaving like a developer, system admin, and personal assistant rolled into one. Here is what this local 8GB setup can do for me out of the box: Autonomous Software Engineering: It doesn't just write code; it reads, edits, and patches files, runs them in a secure terminal, systematically debugs errors, manages GitHub repos, and spawns sub agents to tackle complex pipelines in parallel. Web Interaction & Vision: It browses the web like a human, clicks buttons, visualizes layouts via Vision to debug UI, and scrapes arXiv papers. DevOps & Automation: It schedules natural language cron jobs, manages containerized background processes, and runs Python RPC scripts. Workspace Orchestration: It connects directly to Notion, Google Workspace, Linear, and Obsidian to manage tasks. The Local Hardware Performance Running a 26B parameter model and an autonomous agent framework simultaneously on an 8GB VRAM card should be impossible. Here is how it performs: - Stable, flat speed even with massive context. I threw a 60k token prompt at it, and it still clocked 20 TPS. Llama.cpp flags: llama-server.exe -m "gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf" -cmoe -c 248000 -v Kudos to @Teknium and the entire @NousResearch team. The barrier to entry for the agentic age has officially collapsed. What are you building first?
Alok@analogalok

Run Gemma 4 26B MoE on 8GB VRAM with 250k context at 20+ tokens/sec If you own any 8GB VRAM graphics card, stop what you are doing. Local AI just had its absolute "Holy Shit" moment for budget hardware. Yesterday, I benchmarked Unsloth Gemma 4 12B Q4_K_XL on an 8GB card. The community went wild but immediately demanded more: "Can we run a 25B+ model on budget GPUs?" Today, I’m delivering exactly that. I am running a massive 26B parameter Mixture of Experts (MoE) model locally on a standard 8GB VRAM setup with 250k full native context!. If you own an RTX 3060, 3070, 4060, or any budget GPU with 8GB of VRAM, the local AI paradigm has completely changed. The performance metrics are astonishing: - 20 tokens/sec flat decode throughput. - Stable, flat decode speed even with massive prompts. - I threw a 60k token prompt at it, and it still clocked in at 20 TPS without dropping a single frame. # What about prefill? Yes, Time To First Token (TTFT) is slightly high when swallowing massive contexts. But with a solid 200 tokens/sec prefill speed, the wait is barely noticeable and highly usable. And this is running completely without Multi Token Prediction (MTP) active. How is this possible? It’s the magic of Google's new QAT (Quantization Aware Training) quants for Gemma 4. The model weight file (unsloth gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf) is only 13.2 GB, making it the ultimate local powerhouse. # The Test Setup: CPU: Intel Core i7 RAM: 16GB System RAM GPU: NVIDIA GeForce RTX 4060 Laptop GPU (8GB VRAM) # The Secret Sauce (The -cmoe Flag) To make this work properly on any 8GB card, you must use the -cmoe (CPU MoE) flag in llama.cpp. This flag isolates the heavy MoE expert weights directly to system memory (CPU/RAM) while letting your GPU focus strictly on the Attention layers and the KV Cache. It prevents VRAM spillage and holds the throughput rock solid. # The flags: -m "gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf" -cmoe -c 248000 -v Once running, just open the UI on localhost and toggle the new reasoning lightbulb icon in the text input box to watch the model perform multi step thinking. Are you still running smaller models, or are you ready to scale up your budget local setups? Let's discuss in the replies

English
39
79
688
115.3K
Iván Lezcano
Iván Lezcano@ivanlezcano030·
Mujer que se quejó a la gerencia de que sus mascotas, dejadas para su lavado en la veterinaria, se quedaron “demasiado tiempo”, notificó el caso a la administración Posteriormente se reveló con estas imágenes que la razón del retraso era que el veterinario besaba y acariciaba a los animales Amor verdadero
Español
443
8.1K
208.1K
12.1M
Google Research
Google Research@GoogleResearch·
Today on the blog, we discuss a pathway for the second life of phones through the exploration of “phone cluster computing”, which can directly reduce the environmental footprint of computing by avoiding the need for further raw material extraction. More →goo.gle/4aJe5vO
GIF
English
63
213
1.7K
723.1K
Elon Musk
Elon Musk@elonmusk·
Looking forward to taking our exciting partnership with Nvidia to the next-level
NVIDIA@nvidia

Huge congratulations to the @SpaceX team on a historic IPO debut. Fueling the next frontier of space and AI. 🌌 NVIDIA's partnership with SpaceX spans nearly a decade, from hand-delivering the world's first #NVIDIADGX-1 supercomputer in 2016 to the custom DGX Spark handoff at Starbase. Together, we've been pushing the boundaries of accelerated computing to help power the future of space exploration.

English
6.2K
22.4K
268.6K
35.6M
Akiraxtwo Super me-retweet
ClaudeDevs
ClaudeDevs@ClaudeDevs·
As a result of a US government directive, we are suspending access to Claude Fable 5 for all users. You can continue to use all other Claude models. Here’s what this means for you: Across Claude products, new sessions will run on your selected default model or Opus 4.8, and existing Fable 5 sessions will end with an error. On the Claude Platform, requests to Fable 5 will also return an error. Please update your integrations to other Claude models. We know this is a disruption to your workflows; we appreciate your patience and support.
Anthropic@AnthropicAI

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

English
3.6K
7.2K
43.9K
12.3M
五分之1蓝
五分之1蓝@onefifthblue·
@akiraxtwo 一句话让AI做一个可运行游戏,是完全可能的,但也有很大概率无法命中你想要的游戏。因为一句话无法准确定义一个游戏。
中文
1
0
0
46
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
用 Claude Sonnet 4.6 + Three.js 做地城爬塔遊戲,M5 完成。 骸骨將軍 BOSS 系統上線: 三種帶 telegraph 的招式: → 震地擊:紅圈鎖定玩家當下位置,圓填滿即爆 → 衝鋒:長條紅帶預告路徑,高速突進 → 死靈召喚:召喚骷髏小隊壓制 HP 50% 進入狂暴:加速、預警縮短、傷害爆增 擊殺後:小怪連鎖爆炸、寶石雨、靈魂大量入帳 整個開發流程: 空資料夾 → 提示詞 → AI 自己查資源、寫架構、除錯 沒有給它任何素材,全部 CC0 資源自取。 --- 這讓我想到一件事。 現在的素材網站是為人類設計師設計的。 但如果 AI 才是主要使用者呢? 未來一定會出現專為 AI 優化的素材庫: → 結構化 metadata(骨骼規格、動畫清單、碰撞尺寸) → API-first,不是 UI-first → 授權機器可讀,自動驗證 → 按「功能意圖」索引,不靠關鍵字搜尋 不是給人看的網站,是給 AI 讀的資料庫。 到那個時候,「我想做一款 Roguelike」 可能就是開發的全部輸入。 M6 音效粒子特效開發中 🔥 #ClaudeSonnet #GameDev #ThreeJS #Roguelike #AIFuture
中文
3
5
45
3.4K
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
用 Claude Fable 5 + Three.js,從空資料夾開發出一款瀏覽器動作 Roguelike ARPG 類 Hades 俯視角、波次刷怪、三段連擊、翻滾無敵幀、Rapier 物理引擎——全部從頭架構 全程沒提供任何美術素材。AI 自己決定用 KayKit CC0 免費資源,寫下載腳本、建 asset pipeline、量測 tile 尺寸動態拼地城
Claude@claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

中文
10
38
328
53.6K
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
prompt: 幫我從零開始開發一款瀏覽器可玩的第三人稱動作 Roguelike 遊戲, 風格參考 Phantom Tower(骷髏塔)。 技術規格: - Three.js + Vite + TypeScript - Rapier 物理引擎(WASM) - Rogue Engine 風格 Component 架構 - KayKit CC0 免費 GLTF 資源(自行下載) 玩法: - WASD 移動、滑鼠朝向、左鍵連擊、空白翻滾 - 右鍵/Q/E 三個主動技能 - 隨機地城、波次刷怪、殺怪升級 - 死亡保留靈魂貨幣換永久升級 請從 M0(專案初始化)開始, 每個里程碑完成後都要可以在瀏覽器實際遊玩驗證。
中文
0
0
2
238
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
用 Claude Fable 5 + Three.js 做地城爬塔遊戲,M4 完成! 目前進度: ✅ 玩家控制器(翻滾/無敵幀) ✅ 三段連擊 + hit stop + 擊退 ✅ 隨機地圖生成(種子 RNG) ✅ 波次刷怪系統 ✅ 升級三選一卡片(12種詞綴) ✅ 技能系統(旋風斬/衝刺斬/震地擊) ✅ 靈魂貨幣 + 永久升級商店 全程 AI 開發,零素材自製—— 模型名稱、動畫、地城全用 CC0 資源, 連一張圖都沒畫過。 下一步:BOSS 戰 👇 #GameDev #ThreeJS #ClaudeFable5
Akiraxtwo Super@akiraxtwo

用 Claude Fable 5 + Three.js,從空資料夾開發出一款瀏覽器動作 Roguelike ARPG 類 Hades 俯視角、波次刷怪、三段連擊、翻滾無敵幀、Rapier 物理引擎——全部從頭架構 全程沒提供任何美術素材。AI 自己決定用 KayKit CC0 免費資源,寫下載腳本、建 asset pipeline、量測 tile 尺寸動態拼地城

中文
2
8
76
9.4K
ItzByZoneb
ItzByZoneb@ItzByZoneb·
@akiraxtwo hello, can you share the first prompt of this project to do the base ? please!
English
2
0
1
612
Grok
Grok@grok·
是的,技術上完全可行。 DeepSeek-R1-Distill-Qwen-14B(Ollama deepseek-r1:14b)是強推理模型,量化後在 Mac Mini M4 24GB 上輕鬆跑,記憶體夠、速度可接受。 它能處理 prompt 分析市場數據、flag 交易訊號(如 NAS100 short),原 po 的 nightly scan 就是這種用法。 優點:本地低延遲、低電費($3/月)、資料不外洩,比雲端或 DGX 划算多了。 但提醒:這是 LLM 輔助分析,不是自動獲利保證。實際效果看 prompt 品質、策略回測與風險控制。適合實驗本地 AI,交易仍需謹慎。
中文
1
0
0
37
Svyatoslav
Svyatoslav@TrioniksTrader·
He was about to spend $3,000 on a DGX Spark. Then he ran the same 14B model on a $599 Mac Mini and it finished first. For three weeks his NAS100 setup ran on cloud APIs. $300 a month, latency on every call, data leaving his desk every night. He bought the Mac Mini instead. M4, 24GB. Installed Ollama. Loaded DeepSeek R1 14B. Ran his first nightly scan that evening. The scan flagged one short on NAS100. He took it. $1,425 in two days. Month one: four flagged setups, three winners, $3,800 total. Electricity cost: $3. He cancelled the DGX order. Same model, same edge, five times cheaper. He didn't upgrade his setup. He just stopped paying for compute he never needed.
Svyatoslav@TrioniksTrader

x.com/i/article/2063…

English
18
24
207
59.8K
Steeve Morin
Steeve Morin@steeve·
brb intelmaxxing b70
Steeve Morin tweet media
Indonesia
52
31
808
45.6K
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
NVIDIA 剛推出 Nemotron 3 Ultra! 這是一個 550B 參數的 MoE 開源模型,專為長時間運行的 AI Agent 打造。採用 hybrid Mamba-Transformer 架構,推理速度比其他開源前沿模型快 5 倍,複雜 Agent 任務成本可降低最高 30%。 這款模型特別優化「長時程推理」(long-running agents),能處理複雜工具呼叫、程式碼生成、深度研究等任務,適合建構自主 Agent。 目前完全開源(權重 + 訓練資料 + 後訓練配方),已在 Hugging Face 上架。
NVIDIA AI@NVIDIAAI

Today we're shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.

中文
0
0
0
204
Akiraxtwo Super
Akiraxtwo Super@akiraxtwo·
現在可以用文字直接生成專業級 Lottie 動畫了! 這款開源工具「text-to-lottie」結合 Codex/Claude,能從文字提示產生可直接投入生產的 Lottie 動畫(JSON 格式)。 支援 SVG 轉動畫、資料視覺化、logo 動畫,還能加入可編輯的控制項。
konstantinpaulus@konstipaulus

Introducing text-to-lottie: an open source skill and harness for generating production ready Lottie animations with codex/claude code. $ npx skills add diffusionstudio/lottie Prompts guide and repo in the comments.

中文
1
0
1
221