Taz
@taz_ca
388 posts
wiggling parameters
Joined November 2025
231 Following · 12 Followers
Taz@taz_ca·
[image]
ZXX
0
0
1
40
Ron Paul@RonPualS·
@thsottiaux Just publish your off-peak hours and 10x the Codex quota then. I’ll move my entire circadian rhythm to match your server capacity. You don’t even need to fix the limits — just tell me when to sleep.
11
6
361
13.4K
Tibo@thsottiaux·
With Codex there is quite a gulf in load between peak and off-peak times, and we would like to achieve a smoother traffic pattern, as that would be a more optimal use of our compute. We have ideas, but we're curious what you all think we should do. Would more usage during off-peak and a surge multiplier during peak times make sense?
792
42
1.7K
195.4K
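
For context on Tibo's question above, one way a surge-multiplier scheme could work is sketched below. The hour window, multiplier values, and function name are purely illustrative assumptions, not anything OpenAI has announced: peak-hour usage would be debited against the quota at a premium and off-peak usage at a discount.

```python
from datetime import datetime, timezone

# Illustrative sketch only: hypothetical peak window and multipliers,
# not an announced Codex policy. Usage is debited against the quota at a
# premium during peak hours and at a discount off-peak.
PEAK_HOURS_UTC = range(14, 23)   # assumed peak window
PEAK_MULTIPLIER = 1.5            # peak usage burns quota 1.5x faster
OFF_PEAK_MULTIPLIER = 0.5        # off-peak usage burns quota at half rate

def quota_cost(tokens: int, when: datetime) -> float:
    """How much of the quota a request of `tokens` consumes at time `when`."""
    hour = when.astimezone(timezone.utc).hour
    multiplier = PEAK_MULTIPLIER if hour in PEAK_HOURS_UTC else OFF_PEAK_MULTIPLIER
    return tokens * multiplier

# The same 10k-token task costs 3x more quota at 18:00 UTC than at 03:00 UTC.
print(quota_cost(10_000, datetime(2025, 11, 20, 18, tzinfo=timezone.utc)))  # 15000.0
print(quota_cost(10_000, datetime(2025, 11, 20, 3, tzinfo=timezone.utc)))   # 5000.0
```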
Taz@taz_ca·
@latkins dont jinx it shh
0
0
0
69
Rohan Paul@rohanpaul_ai·
Anthropic top researcher Nicolas Carlini (67.2k citations on Google Scholar) says Claude is a better security researcher than him, made $3.7 mn from exploiting smart contracts, and found vulnerabilities in Ghost (a 52K+ Github star project).
53
193
2.5K
360.3K
Taz@taz_ca·
@jxnlco you're the funniest mf at oai Jason
0
0
0
15
Taz@taz_ca·
@HessianFree congrats! did you guys specifically train it for some robotics tasks or is it more that the fast inference on edge enables it to be deployed for it?
0
0
0
192
Omead Pooladzandi@HessianFree·
your spotify cache is bigger than our largest AI model. Bonsai: 1-bit weights. 1.7B to 8B params. 14x compression vs bf16. 8x faster on edge. 256 MB to 1.2GB. Based on Qwen 3. we just came out of stealth. intelligence belongs at the edge and we're going to put it there. Apache 2.0. we compressed intelligence. more coming. @PrismML
[image]
PrismML@PrismML

Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence. At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in intelligence density, not just sheer parameter count. Our first proof point is the 1-bit Bonsai 8B, a 1-bit weight model that fits into 1.15 GB of memory and delivers over 10x the intelligence density of its full-precision counterparts. It is 14x smaller, 8x faster, and 5x more energy efficient on edge hardware while remaining competitive with other models in its parameter class. We are open-sourcing the model under the Apache 2.0 license, along with Bonsai 4B and 1.7B models. When advanced models become small, fast, and efficient enough to run locally, the design space for AI changes immediately. We believe in a future of on-device agents, real-time robotics, offline intelligence, and entirely new products that were previously impossible. We are excited to share our vision with you and to keep pushing the frontier of intelligence to the edge.

88
158
2K
179.4K
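
The compression numbers in the Bonsai announcement above can be sanity-checked from first principles; here is a back-of-the-envelope sketch. Parameter counts come from the tweet, everything else is simple arithmetic.

```python
# Back-of-the-envelope check of the Bonsai claims above: weights-only memory
# for bf16 vs idealized 1-bit storage. Parameter counts come from the tweet;
# activations, KV cache, and mixed-precision layers are ignored here.
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (1.7, 8.0):
    bf16 = weight_memory_gb(params, 16)
    one_bit = weight_memory_gb(params, 1)
    print(f"{params}B params: bf16 ≈ {bf16:.1f} GB, 1-bit ≈ {one_bit:.2f} GB, "
          f"ratio ≈ {bf16 / one_bit:.0f}x")

# 8B: 16 GB vs 1 GB, i.e. ~16x in the ideal case; the quoted 14x and 1.15 GB
# are consistent once embeddings, norms, and quantization scales stay in
# higher precision. 1.7B works out to ~0.2 GB, in line with the 256 MB figure.
```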
Cailyn Y.@cailynyongyong·
Aaas
3
0
4
685
Yuchen Jin@Yuchenj_UW·
hey friends! 👋 Only cool people are allowed to reply to this tweet obviously.
264
3
481
49.6K
Google Research@GoogleResearch·
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
[GIF]
1K
5.8K
39K
19.1M
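
The TurboQuant post above is about KV-cache compression. The snippet below is not TurboQuant itself (the linked blog has the actual algorithm), just a minimal illustration of the general idea: keep keys and values as low-bit integers with per-channel scales instead of fp16. Plain int4 vs fp16 already gives ~4x; the quoted 6x+ presumably comes from the additional techniques the blog describes.

```python
import numpy as np

# Not TurboQuant (see the linked blog for the real algorithm), just the
# generic idea of KV-cache quantization: store K/V as low-bit integers
# plus per-channel scales instead of fp16.
def quantize_kv(x: np.ndarray, bits: int = 4):
    """Symmetric per-channel quantization of a [tokens, head_dim] slice."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max(axis=0, keepdims=True) / qmax + 1e-12
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

kv = np.random.randn(1024, 128).astype(np.float32)   # fake cache slice
q, scale = quantize_kv(kv, bits=4)
rel_err = np.abs(dequantize_kv(q, scale) - kv).mean() / np.abs(kv).mean()
print(f"fp16: {kv.size * 2} bytes, packed int4: ~{kv.size // 2} bytes, "
      f"mean relative error ≈ {rel_err:.3f}")
```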
ErvistheGreat@ErvistheGreat·
@taz_ca @gnukeith Fuk yeah. Rockin out with the full weights of your kock out.. or something like that
1
0
1
97
Keith@gnukeith·
I don't quite get it, is the 27B model smarter than the 35-A3B model?
[image]
70
7
359
61.1K
Taz@taz_ca·
@bbarski @gnukeith it's different on a case-by-case basis, since some labs know how to train MoE well enough that they can still push the performance. I'm pretty sure nvidia doesn't do dense models anymore though; they're pushing that efficiency frontier, and MoEs are the way to go for that
0
0
1
98
RM@bbarski·
@taz_ca @gnukeith Does the same apply to nvidia nemotron models?
1
0
1
148
Taz@taz_ca·
@bnjmn_marie Yuppp thanks Ben, noticed that 27B is dense so it's got the full force of that running on each pass
0
0
1
49
Benjamin Marie@bnjmn_marie·
@taz_ca The 35B is an MoE. Faster but fewer active parameters.
2
0
2
384
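
The dense-vs-MoE back-and-forth in this thread boils down to active parameters per token. A rough comparison, assuming the "A3B" suffix means roughly 3B active parameters out of 35B total (the usual reading of the suffix; the model card has the exact split):

```python
# Rough numbers behind the dense-vs-MoE discussion above. Assumes "35B-A3B"
# means ~35B total / ~3B active parameters per token; check the model card
# for the exact figures.
models = {
    "27B dense":   {"total_b": 27.0, "active_b": 27.0},
    "35B-A3B MoE": {"total_b": 35.0, "active_b": 3.0},
}

for name, m in models.items():
    # Per-token compute scales roughly with active params (~2 FLOPs per
    # parameter per token); memory to hold the weights scales with total params.
    gflops_per_token = 2 * m["active_b"]
    print(f"{name}: {m['total_b']}B weights in memory, "
          f"{m['active_b']}B active per token, ~{gflops_per_token:.0f} GFLOPs/token")

# The dense 27B spends roughly 9x more compute per token than the A3B MoE,
# which is the "full force on each pass" point above; the MoE trades that
# for much faster decoding at the cost of fewer active parameters.
```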
Benjamin Marie@bnjmn_marie·
For OpenClaw, just use Qwen3.5 27B! Q4 GGUFs match the original model's accuracy. You don't need expensive hardware or models.
[images]
48
46
590
91.6K
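
For anyone wanting to try what Benjamin suggests, here is a minimal local-inference sketch using the llama-cpp-python bindings. The GGUF file name is a hypothetical placeholder, and whether the Q4 quant really matches full precision is his claim, not something this snippet verifies.

```python
# Minimal sketch of running a Q4 GGUF locally with llama-cpp-python
# (pip install llama-cpp-python). The model path is a hypothetical
# placeholder; point it at whatever Q4_K_M GGUF you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3.5-27b.Q4_K_M.gguf",  # hypothetical local file name
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a Q4 GGUF quant is."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```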
Taz@taz_ca·
@PuterOnX @TheAhmadOsman internal sources went silent for almost 2 minutes during a presentation 2 days ago when someone asked for more during a Q&A. Though when I say silent, I mean literally mic-muted silent
0
0
1
26
Puter@PuterOnX·
@taz_ca @TheAhmadOsman Yeah, very dodgy this time around, literally no interactions around it, which is all the more irritating. But I guess ahmad is confident? I think it could be an internal source
1
0
1
26
Taz@taz_ca·
@Hangsiin How much API equivalent usage does the pro sub get U in a week? More than Claude?
0
0
2
848
NomoreID@Hangsiin·
To be honest, if you're a power user, I'd say you absolutely shouldn't do it. Unless you're planning to subscribe to multiple Pro accounts. Even without using Fast, GPT-5.4 starts eating into your usage pretty quickly. Part of it is probably that I've gotten more and more used to running things in parallel. If I had used Fast, I probably would have burned through it all in a single day.
Sherwin Wu@sherwinwu

Set Codex to this and don't ever look back

17
1
151
27.9K
Taz@taz_ca·
@fynnso what did U start bro 😭
0
0
0
196
Taz@taz_ca·
@MillionInt still can't wait to see what u do next Jerry, u gotta give us a teaser soon
0
0
0
339
Jerry Tworek@MillionInt·
Every team has its own project Hail Mary. I have one very close to my heart
6
0
140
9.4K