13.2K posts

✦

@indes_yo

This is just where I record my thoughts in the moment ✶

Taipei · Joined January 2023
1.8K Following · 1.7K Followers
Pinned Tweet
✦
@indes_yo·
✦ Retweeted
Wildminder @wildmindai·
RotorQuant - upgraded TurboQuant.
> 10x KV cache compression
> 28% faster decoding
> 5x faster prefill
> 44x fewer parameters
Same quality as full attention. 1/10th the memory. Ok, another massive VRAM discount for local LLMs. github.com/scrya-com/roto…
✦
@indes_yo·
@ethanhuang13 Maybe it's because I only use my M1 to run agents in isolation, plus some light work.
13 @ethanhuang13·
@indes_yo For me, it's most noticeable in the browser.
13 @ethanhuang13·
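I haven't had a chance to try an M5 yet. I have neither the need nor the desire to replace my M3 Max, but ever since I bought a Mac mini I keep noticing how much more responsive the M4 feels than the M3 Max in everyday use (because of the difference in single-core performance). So I'd better not touch one. #週間廢推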
✦
@indes_yo·
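Some years ago a teacher told me that when he was young he wanted to learn 3D graphics, so he spent a lot of money on what was then a very powerful floating-point accelerator card. I'm not sure which card it was, but it's a bit like today's local AI craze: this cutting-edge hardware is still very expensive, yet what it can handle is very limited.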
✦ Retweeted
Sandro @pupposandro·
Excited to release a Megakernel that makes a 6-year-old RTX 3090 run local LLMs faster than Apple's latest M5 Max chip. Not a benchmark trick: same model, same weights, one kernel change. The full breakdown is in the article below. Open-source, MIT licensed, and you can reproduce it in one command.
Sandro @pupposandro

x.com/i/article/2041…

Master | 最強打野(穢土轉生)
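Seriously, if you found out that the analysts in your community never went to college, or went to a diploma mill, would you still trust them? That's the current state of Taiwan's crypto scene. A degree isn't the only way forward; it's just a filter. But 95% of them never studied.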
✦
@indes_yo·
Looking back at my computer next year, I'll probably find I installed a lot of junk.
✦ Retweeted
Geek Lite @QingQ77·
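A HUD built for Hermes: an open-source TUI display
→ Reads memory, corrections, tool usage, and other state in real time from the Hermes Agent data directory and renders it as an interactive TUI
→ 9 tabs covering the dashboard, growth comparison, Cron jobs, project tracking, health checks, Prompt patterns, and more
→ 4 cyberpunk-style themes (Neural Awakening / Blade Runner / fsociety / Digital Soul)
→ A snapshot comparison feature shows at a glance what the Agent has remembered and which mistakes it makes less often, from yesterday to today
github.com/joeynyc/hermes…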
✦ Retweeted
ハカセ アイ(Ai-Hakase)🐾最新トレンドAIのためのX 🐾
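[LLM inference up to 6x faster! The new technique "DFlash" is revolutionary]
A new method that dramatically improves inference speed has arrived. While maintaining accuracy, it has recorded over 400 tokens/s on recent models such as Qwen3.5. 🚀
Key points:
・Predicts multiple words in parallel using a block diffusion model
・Up to 2.5x faster than the top-tier method "EAGLE-3"
・Can both reduce server costs and improve UX
It also makes parallel processing by AI agents smoother; truly a next-generation acceleration technique! ✨ #LLM #DFlash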
✦
@indes_yo·
I can't give anyone advice. I can only quietly get in the game myself.
✦
@indes_yo·
@Prince_Canuma Any new ideas for speeding up prefill time?
Prince Canuma @Prince_Canuma·
Just implemented TriAttention in MLX and the results are wild! You can get up to 81% KV compression at 60K tokens for Gemma-4-31B-IT in BF16 🔥
Unlike TurboQuant, which quantizes KV cache values, TriAttention prunes low-importance tokens entirely by scoring keys using trigonometric series from pre-RoPE Q/K concentration and keeping only the top-B most important ones.
The best part? Decode speed for BF16 stays locked at ~10 t/s while the baseline drops to 8.7 t/s at long contexts. These results scale well with the quantized version as well.
Benchmarked on Gemma4-31B-it with MM-NIAH on M5 Ultra:
~1K → 3% saved
~7K → 34% saved
~15K → 52% saved
~30K → 69% saved
~60K → 81% saved
KV cache capped at 0.82 GB regardless of context length. One-time calibration (~30s), then it just works during generation.
One caveat: TriAttention is by design best suited to generative tasks (reasoning/code), not retrieval tasks.
PR will follow soon on MLX-VLM.
Yukang Chen @yukangchen_

We’re thrilled to open-source TriAttention! 🚀
🦞 Deploy OpenClaw (32B LLM) on a single 24GB RTX 4090 locally
💻 Full code open-source & vLLM-ready for one-click deployment
⚡️ 2.5× faster inference speed & 10.7× less KV cache memory usage
TriAttention is a novel KV cache compression method built on rigorous trigonometric analysis in the Pre‑RoPE space for efficient LLM long reasoning.
Github Repo: github.com/WeianMao/triat…
Paper Link: huggingface.co/papers/2604.04…
Homepage: weianmao.github.io/tri-attention-…
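
A minimal sketch of the "keep only the top-B most important tokens" step described in Prince Canuma's post above, assuming the per-token importance scores are already available. The trigonometric scoring itself is not reproduced here, and `prune_kv_cache`, `scores`, and `budget` are hypothetical names used only for illustration; this is not the MLX, MLX-VLM, or TriAttention API.

```python
def prune_kv_cache(keys, values, scores, budget):
    """Keep the `budget` highest-scoring cache entries, in original order.

    All names here are hypothetical; `scores` stands in for whatever
    importance score a real method would compute per cached token.
    """
    if not (len(keys) == len(values) == len(scores)):
        raise ValueError("keys, values and scores must have the same length")
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Rank token positions by score, highest first; ties go to earlier tokens.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    # Keep the top `budget` positions, restored to their original order.
    kept = sorted(ranked[:budget])
    return [keys[i] for i in kept], [values[i] for i in kept], kept


keys = ["k0", "k1", "k2", "k3", "k4"]
values = ["v0", "v1", "v2", "v3", "v4"]
scores = [0.1, 0.9, 0.3, 0.8, 0.2]  # made-up importance scores
pruned_keys, pruned_values, kept = prune_kv_cache(keys, values, scores, budget=2)
print(kept)  # [1, 3]
```

With a fixed budget, the pruned cache stops growing once the context is longer than the budget, which is consistent with the post's note that the KV cache stays capped regardless of context length.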

✦
@indes_yo·
@TD_CCK I threw all of those out and switched everything to semi-solid-state or fully solid-state batteries.
NK Chen @TD_CCK·
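Let me seriously advise you all: whatever you do, don't buy this kind of power bank with a built-in plug. (I bought this one before it made the news, so there's nothing I can do.)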
✦
@indes_yo·
This reminds me that the iPads everyone keeps in the drawer finally have a use.
✦
@indes_yo·
@SpatiallyMe This reminds me that the iPads everyone keeps in the drawer finally have a use.
Phil Traut ᯅ @SpatiallyMe·
I just launched a free app that lets you open Mac apps… from your iPhone. And close them with a swipe. A much faster drag and drop. And a ton more features in the pipeline. It’s called choclift and you really should try it out. Available for free in the App Store for iOS and macOS.