5.1K posts


@bchap1n

PowerShell, guitars, and JerryGarcia shitposts · leftist and progressive retweets · Bay Area dad · he/him · Proofpoint IT/Ops (opinions are my own)

Sunnyvale, CA · Joined June 2009

1.8K Following · 453 Followers

@bchap1n·7h

made a PR to merge the new llama.cpp w/ MTP into beellama.cpp 🐝
enables MTP and DFlash + Turboquant TOGETHER on a single 3090
Unfortunately, DFlash is far more impactful than MTP for my system + use case
#localLLM #club3090
github.com/Anbeeld/beella…


@bchap1n·11h

@iamkunhello @TheAhmadOsman 500w when undervolted optimally


144

iamkun@iamkunhello·12h

@TheAhmadOsman Two 3090s pull 700W under load. Your power bill becomes the subscription fee. 🤣


1.2K

Ahmad@TheAhmadOsman·13h

Gentle reminder that all you need to start with Local AI is:
- 2x RTX 3090s (pick up for $700-$900 on r/hardwareswap)
- Qwen 3.6 27B / Gemma 4 31B
- Your favorite agent (Claude Code / OpenCode / etc)
- Self-hosted SearXNG for web access
And you got yourself Opus 4.5 at home


725

34.7K
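The two power figures in the thread above (700 W for two 3090s under load, 500 W when undervolted) can be turned into a rough running-cost estimate. This is only a sketch: the electricity price and the hours of load per day are hypothetical placeholders, not numbers from the thread, and real draw varies with workload.

```python
# Rough electricity cost for the two power figures quoted in the thread.
# ASSUMPTIONS (not from the thread): price per kWh and hours of load per day.
PRICE_USD_PER_KWH = 0.30   # hypothetical tariff, replace with your own
HOURS_PER_DAY = 4.0        # hypothetical hours under full load per day


def energy_kwh(watts, hours):
    """Energy used at a constant draw of `watts` for `hours`."""
    return watts * hours / 1000.0


def monthly_cost_usd(watts, hours_per_day=HOURS_PER_DAY,
                     price=PRICE_USD_PER_KWH, days=30):
    """Cost of running at `watts` for `hours_per_day` over `days` days."""
    return energy_kwh(watts, hours_per_day * days) * price


stock = monthly_cost_usd(700)        # "Two 3090s pull 700W under load"
undervolted = monthly_cost_usd(500)  # "500w when undervolted optimally"
saving_pct = (1 - undervolted / stock) * 100

print(f"stock: ${stock:.2f}/month, undervolted: ${undervolted:.2f}/month")
print(f"undervolting saves about {saving_pct:.0f}% of the power cost")
```

Whatever tariff is plugged in, the relative saving stays the same, because it depends only on the ratio 500/700.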

@bchap1n·12h

@victormustar @ClementDelangue i was already building llama.cpp from source and using dflash+turboquant. not sure MTP is going to beat that


369

Victor M@victormustar·18h

llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀
Qwen3.6-27B dense generation (on A10G): From 25 tok/s → 45 tok/s (+78%).
Two flags on llama-server:
--spec-type draft-mtp
--spec-draft-n-max 2

Georgi Gerganov@ggerganov

llama.cpp adds MTP for the Qwen3.6 family This is a significant milestone for the local AI ecosystem. The performance jump with these changes is massive and elevates local inference on commodity hardware further. Special thanks to Aman Gupta for leading this development! github.com/ggml-org/llama…


107

941

125.5K
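To put the throughput figures from the post above in perspective, the sketch below recomputes the relative gain from the two rounded numbers (25 and 45 tok/s) and converts them into wall-clock time for a reply. The 1,000-token reply length is a hypothetical value chosen for illustration. The rounded figures give +80%, so the quoted +78% presumably comes from unrounded measurements.

```python
# Relative speedup and wall-clock time from the rounded tok/s figures
# quoted in the post. REPLY_TOKENS is a hypothetical example length.
REPLY_TOKENS = 1_000


def speedup_percent(before_tok_s, after_tok_s):
    """Percentage increase in generation throughput."""
    return (after_tok_s / before_tok_s - 1.0) * 100.0


def seconds_for(tokens, tok_s):
    """Time to generate `tokens` at a steady rate of `tok_s`."""
    return tokens / tok_s


gain = speedup_percent(25.0, 45.0)
before_s = seconds_for(REPLY_TOKENS, 25.0)
after_s = seconds_for(REPLY_TOKENS, 45.0)

print(f"gain: +{gain:.0f}%")
print(f"{REPLY_TOKENS} tokens: {before_s:.1f} s -> {after_s:.1f} s")
```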

@bchap1n·20h

@ggerganov @_lewtun would this replace dflash+turboquant?


1.9K

Georgi Gerganov@ggerganov·23h


162

1.1K

209.8K

@bchap1n·1d

@sakurayukiai 900$ us for used 3090. 1300 for the whole system. better than the mac for coding. it’s like a volvo that never stops.


244

Sakura Yuki@sakurayukiai·2d

My favorite detail about 'free' local inference is the depreciation math. If you amortize a $4k Mac over 5 years, running a 31B model costs $1.50 per million tokens. The API is 3x cheaper. Local compute is officially a luxury good and I respect it ✨


799

71.6K
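The depreciation math in the post above can be unpacked with a few lines of arithmetic. The sketch below uses only the quoted figures ($4k, 5 years, $1.50 per million tokens) to back out the token volume and sustained throughput that the claim implies, then applies the same volume to the $1,300 system mentioned in the reply. It assumes 24/7 generation over the whole period and ignores electricity, so it is an illustration of the reasoning rather than a real cost model.

```python
# Back out what "$4k over 5 years = $1.50 per million tokens" implies.
# ASSUMPTIONS: continuous 24/7 generation, hardware cost only (no power).
SECONDS_PER_YEAR = 365 * 24 * 3600


def implied_total_tokens(hardware_usd, usd_per_million):
    """Token volume at which the hardware cost equals the quoted rate."""
    return hardware_usd / usd_per_million * 1_000_000


def implied_tokens_per_second(total_tokens, years):
    """Sustained rate needed to produce `total_tokens` in `years`."""
    return total_tokens / (years * SECONDS_PER_YEAR)


def usd_per_million(hardware_usd, total_tokens):
    """Hardware-only cost per million tokens for a given volume."""
    return hardware_usd / (total_tokens / 1_000_000)


total = implied_total_tokens(4_000, 1.50)        # about 2.67 billion tokens
rate = implied_tokens_per_second(total, 5)       # about 16.9 tokens/s
cheap_box = usd_per_million(1_300, total)        # the $1,300 system

print(f"implied volume: {total / 1e9:.2f}B tokens")
print(f"implied sustained rate: {rate:.1f} tok/s")
print(f"$1,300 system at the same volume: ${cheap_box:.4f} per million")
```

The result is sensitive to utilization: a machine that generates for only a fraction of the day produces proportionally fewer tokens, and its cost per million rises by the same factor.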

@bchap1n·1d

@CALFIRECZU thank you to our excellent public safety crews. now time to ditch twitter for bsky.


264

CAL FIRE CZU@CALFIRECZU·1d

8 RESCUED AT PANTHER STATE BEACH 🚁

This evening just before 8pm, a person texted 911 reporting that 11 people were trapped in a cave near Davenport and the water was rising. It was reported that several of the people trapped did not know how to swim. CAL FIRE, Santa Cruz County Fire, Coast Guard, Santa Cruz City with one lifeguard, Santa Cruz Harbor Patrol, and CA State Parks lifeguards responded.

The cave had two exits, either through the water to a beach with access to the bluffs, or out onto a pocket beach where the trail was too steep to get to the bluffs. The five people who could swim waded out of the cave with assistance from the lifeguards. The remaining three people on the beach who could not swim were hoisted by the Coast Guard to a nearby bluff.

SAFETY MESSAGE: Always check local tide charts before heading out, as rising waters can rapidly trap you against coastal cliffs with no escape route. If you do become stranded by a high tide, move to the highest safe point and call 911 immediately; do not attempt to climb unstable cliffs or swim through heavy surf.

@sccounty @santacruz_fire @CAStateParksSC


111

15.7K

@bchap1n·1d

github.com/NousResearch/h… thank you @Teknium


@bchap1n·1d

NATIVE WINDOWS SUPPORT

GIF

Nous Research@NousResearch

Hermes Agent v0.14.0 - “The Foundation Release” Changelog below


@bchap1n·1d

built beellama from source and made optimizations to get Qwen3.6 27b DFlash Q5 + reasoning + context 96k running at 20-25 t/s on native Windows 11 (no wsl)
nice improvement and definitely something I can use.
#localLLM #Qwen

@bchap1n

40 t/s using beeLLaMa (llama.cpp) running Qwen3.6-27b Q4 with context 200k on a single 3090
This is using DFlash + TurboQuant on native Windows 11 (no WSL)
beellama.cpp/docs/quickstart-qwen36-dflash.md at main · Anbeeld/beellama.cpp
#localLLM #beeLLaMa


@bchap1n·2d

@Le_vrai_Kurama Sekiro > ER > Bloodborne > DS1 > DS3


𝐿𝑉𝐾@Le_vrai_Kurama·3d

Question for FromSoft fans: what order did you play the games in? Personally: Bloodborne, Elden Ring, Sekiro, and I've started DS3


256

147

109.5K

@bchap1n·2d

@tyminski_marek i'm telling you man that Woke2 is based and you sound like a turd. having great visionary art that isn't chudfood is how to win, not whatever you're doing.


297

Marek Tyminski@tyminski_marek·3d

Just a reminder: CI Games was forced into the public gaming debate by selected media outlets over one simple paragraph in early January 2025.

On January 8, 2025, in an investor call with Polish investors, Ryan Hill said: “We remain committed to producing player-first video games that prioritise an excellent user experience with compelling thematics and characters created specifically for core and adjacent audiences. While some video games have recently taken the opportunity to embed social or political agendas within their experiences, it is clear that many players do not appreciate this… we will not be integrating any social or political agendas into these experiences going forward.”

That statement was immediately reframed as anti-DEI and even transphobic by outlets like PC Gamer (“fear of the DEI boogeyman”), Rock Paper Shotgun, Eurogamer, TheGamer and others.

Saying we do not want to embed political or social agendas into our games is not the same as opposing diversity or inclusion. I am immensely proud of our NATURALLY diverse workforce, all of whom earned their place through skill, creativity, and expertise.

The media continuously forces us into the spotlight, but we refuse to stay silent. I went straight to our community with a clear poll: In a medieval fantasy action-RPG, do you prefer Body Type A/B or Male/Female? The results were overwhelming: Male/Female won by a massive margin (88% of the ~49.5k votes). This extremely strong community feedback only strengthened our Players First mission.

Lords of the Fallen II is being made for the soulslike community. It is a video game. Not a political statement.


321

465

6.2K

645.1K

@bchap1n·2d

@OneScripter @timsoret it's not a joke. I can't bring myself to entertain paying for a windows file explorer. I pay for software like Ableton, but not this.


Jay Adams@OneScripter·3d

@bchap1n @timsoret Is that a joke? The phone app revolution and open source seem to have people forgetting that developers and companies should get paid for their work, especially if it brings value. Everything can't be free.


115

Tim Soret@timsoret·3d

Yes! If you're on Windows, you MUST:
> swap File Explorer with Filepilot. filepilot.tech
> swap Task Manager with TaskSlinger. taskslinger.net
Both are superior, lightweight, ultra fast counterparts. I'm on Windows bc gamedev, but I avoid Microsoft apps.

Thomas Klemenc@thomasklemenc

The time has come. TaskSlinger launches into open beta today at 15:00 UTC. A faster, cleaner task manager replacement for Windows, built from scratch for people who care about performance. Get the free beta: taskslinger.net


164

2.1K

175.9K

@bchap1n·4d

@fromsoftserve i've done them all except DS2... this may push me into the fray.


fromsoftserve@fromsoftserve·4d

just put up the first alpha on patreon for the reboot of my Second Sin mod built from the ground up for DS2LightingEngine's pathtracing version. i'm only putting it up on patreon because it's legit just majula, fofg, and heide's right now. not ready for nexus at alllllll lol


1.4K

@bchap1n·4d

I tried the Q5 and only got like 11 t/s


@bchap1n·4d


165

@bchap1n·4d

this is on a single 3090


@bchap1n·4d

I tried the Q5 and only got like 11 t/s but I didn't reboot or kill any other stuff using vram


@bchap1n·4d

@dreamworks2050 I'm getting 33-43 t/s using beellama Qwen3.6 27b Q4 with 200k context. feels great


M4rc0z@dreamworks2050·6d

112 token/s on Qwen3.6 27b with beellama.cpp in one 3090, 130k context 🔥


927

@bchap1n·4d

@QingQ77 So I could build/run beellama on Windows to serve z-lab/Qwen3.6-27B-DFlash?


473

Geek Lite@QingQ77·5d

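A performance-oriented llama.cpp fork that integrates DFlash speculative decoding, TurboQuant/TCQ KV-cache compression, and adaptive draft control, achieving up to 3x faster inference and 7.5x more context capacity within the same VRAM.
github.com/Anbeeld/beella…
BeeLlama.cpp merges three bodies of work (llama.cpp mainline, TheTom's TurboQuant, and buun's DFlash/TCQ) into a single code tree, and adds server-side adaptive tuning of speculation depth and reasoning-loop protection.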


202

11K
