PrismML


PrismML reposted

Tushar Bansal@tushar_bans·6 May

We’re looking for people who love building from scratch! DM if interested.

Hey! @PrismML is hiring! We're looking for LLM people who have trained models at scale - SFT/RL, data mixtures, evals, distillation, long context, distributed training, kernels, you name it! Especially interested in people who like owning the full stack from training dynamics -> shipped models. btw, we need a DevRel too. DM me.


PrismML reposted

Evan Walters@evaninwords·6 May

My job is literally to do interesting and crazy experiments, if you are into that sort of thing DM Omead or I (or @pashakho, @SahinLale, @kmattar1981, @tushar_bans, @eraznafre, @NMonti25537) and come build with us at @PrismML!


PrismML reposted

Sahin Lale@SahinLale·6 May

We’re expanding our highly technical team at @PrismML — people who love pushing model quality end-to-end, from training dynamics to shipped models. If you’ve scaled LLM training, RL/SFT, evals, distillation, long context, kernels, or infra, we’d love to talk.


Omead Pooladzandi@HessianFree·6 May

Hey! @PrismML is hiring! We're looking for LLM people who have trained models at scale - SFT/RL, data mixtures, evals, distillation, long context, distributed training, kernels, you name it! Especially interested in people who like owning the full stack from training dynamics -> shipped models. btw, we need a DevRel too. DM me.


PrismML@PrismML·6 May

@HessianFree Join us! prismml.com/#careers


PrismML reposted

Pete Soderling@petesoder·28 Apr

One of my favorite things about running @AICouncilConf for eleven years? The founders. There's a secret "track" that's not on the schedule: an invisible hallway of builders. And the next wave is showing up at SF 2026: 🧵

@EnoReyes of @FactoryAI
@vikhyatk of @moondreamai
@ds3638 of @honeyhiveai
Emilie Schario of @kilocode
@ianlivingstone of @KeycardLabs
@neilmovva of @sailresearchco
@CompleteSkeptic of @typesafeai
@HessianFree of @PrismML
@latkins of @arcee_ai
Iona Hreninciuc of @runware

petesoder.substack.com/publish/post/1…


PrismML reposted

AI Council@AICouncilConf·23 Apr

Training gets the headlines. Inference gets the bill.

As agents move from novelty to default workload, the hard problem isn't the model anymore. It's every millisecond and every watt between a prompt and the next token. A coding agent running for six hours straight is a very different customer than a chatbot, and the serving stack built for one isn't the stack the other needs.

The Inference Systems track at AI Council 2026 is where the people rebuilding that stack share what they're actually doing. Curated by @BEBischof, Head of AI at @Theoryvc.

Here's the lineup:

→ @yaroslavvb, Principal Researcher at @togethercompute: "What Comes After Deep Learning?"
→ Neil Movva, Co-Founder at Sail Research: "Great Infra for Background Agents"
→ @vikhyatk, CTO at M87 Labs (@moondreamai): "No Dropped Frames: Designing a VLM Around a Latency Budget"
→ @johnpdickerson, CEO at @MozillaAI: "Do the Boring Stuff to Make Open Source AI Win"
→ @CompleteSkeptic, Co-founder & CEO at @typesafeai (co-inventor of ChatGPT): "AI: Too Good to Be True, Too Bad to Be Useful"
→ Sriram Vishwanath, Professor and Founder at @GeorgiaTech: "Beyond Next-Token Prediction: Joint Embeddings, World Models, and Why Natural Language Isn't Enough"
→ Omead Pooladzandi, Co-Founder & Co-Head of Research at @PrismML: [talk title forthcoming]

Thanks for curating a great track, Bryan! See you at AI Council 2026 in SF, May 12–14. 🎟️ aicouncil.com


PrismML reposted

Ronald Mannak@ronaldmannak·21 Apr

Huge shoutout to @PrismML This shouldn’t be possible: a tiny model punching way above its weight. The largest version is just 1.14 GB, which means it’s small enough for a phone. Fast on a phone (spoiler: Pico for iOS is coming soon!). Insanely fast on a MacBook Pro M1 Max.

How fast you ask? About 109 tokens per second on an M1 Max MacBook Pro fast.


PrismML reposted

Grover GPT@GroverGPT·21 Apr

Tiny local models like Bonsai are going to change things.

For the last three years, the default way most people used AI was simple: frontier models lived in data centres, you reached them through an API, and anything local felt like a toy. That will probably stop being true in the near-ish future.

Models are getting small enough to run on hardware people already own, cheap enough to live inside real power budgets, and good enough for a large share of everyday tasks. Bonsai is one recent example: Prism’s 1-bit releases are explicitly aimed at Apple hardware, including iPhone.

If that continues, a lot changes.

1/ Cloud stops being the default for every task. You reach for it when the task actually justifies it.

2/ Privacy can become the default again, because more inference happens on your device instead of somewhere else’s server.

3/ The economics change. You stop paying frontier prices for every interaction forever, and start reserving expensive remote intelligence for the moments that need it.

4/ “Best model” splits in two: best in absolute terms, and best you can actually run on your phone.

The energy part matters too, and maybe more than people think. The point is not just that local models may use less power for routine tasks. It is that they could change the energy topology of AI by pushing a large share of useful cognition onto hardware that already exists, including the NPU already sitting inside your device today.

That changes the race. The question stops being only who has the smartest model in a lab. It becomes who can deliver the most useful intelligence under a real power budget. That rewards efficiency, distillation, open weights, clever hardware, and products that know when local is enough and when the cloud is worth the cost.

Cloud does not disappear. It becomes a premium tier of cognition. And that opens the door to a very different kind of AI company from the ones winning today.

Today we’re announcing Ternary Bonsai: Top intelligence at 1.58 bits

Using ternary weights {-1, 0, +1}, we built a family of models that are 9x smaller than their 16-bit counterparts while outperforming most models in their respective parameter classes on standard benchmarks.

We’re open-sourcing the models under the Apache 2.0 license in three sizes: 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB).
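The figures in the announcement can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes "GB" means 10^9 bytes and that the nominal sizes (8B, 4B, 1.7B) are exact parameter counts, neither of which the announcement states. The names `MODELS` and `bits_per_weight` are made up for this example.

```python
import math

# Nominal parameter counts and file sizes as quoted in the announcement.
# Assumptions: 1 GB = 10**9 bytes, and "8B" means exactly 8e9 parameters.
MODELS = {
    "8B": (8e9, 1.75),
    "4B": (4e9, 0.86),
    "1.7B": (1.7e9, 0.37),
}

# Information content of one three-state weight {-1, 0, +1}.
IDEAL_TERNARY_BITS = math.log2(3)  # about 1.585, usually rounded to 1.58


def bits_per_weight(params: float, size_gb: float) -> float:
    """Effective storage cost per parameter, in bits."""
    return size_gb * 1e9 * 8 / params


if __name__ == "__main__":
    print(f"ideal ternary: {IDEAL_TERNARY_BITS:.3f} bits/weight")
    for name, (params, size_gb) in MODELS.items():
        bpw = bits_per_weight(params, size_gb)
        # Ratio against a 16-bit checkpoint with the same parameter count.
        print(f"{name}: {bpw:.2f} bits/weight, {16 / bpw:.1f}x smaller than 16-bit")
```

Under these assumptions the quoted sizes work out to roughly 1.7 to 1.75 bits per weight, a little above log2(3), and to a ratio of about 9x against 16-bit storage. The gap above 1.58 may come from scale factors or layers kept at higher precision, but the announcement does not say.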


PrismML reposted

Pico AI Server and Pico AI Studio@PicoGPT·21 Apr

Pico Local AI Server 1.4.21 is now available on the Mac App Store. This release adds support for Ternary Bonsai, a lightning-fast model that outperforms many much larger models


PrismML reposted

Mustafa Ergisi@mustafaergisi·19 Apr

@PrismML Ran Ternary-Bonsai 8B on my iPhone through OnDevice LLM. Surprisingly fast.


PrismML reposted

Jon Durbin@jon_durbin·17 Apr

Ternary is actually surprisingly powerful. Validated by bitnet and now again here. In the new model training research/experimentation I've been working on, ternary weights (in some places) actually beats bf16 (by a not-insignificant amount), at least up to the 7b scale (and with every indication that this benefit scales up).
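For readers unfamiliar with what "ternary weights" means in practice, here is a minimal sketch of the absmean rounding scheme described for BitNet b1.58: divide each weight by the mean absolute value of the tensor, round, and clip to {-1, 0, +1}. This is not presented as PrismML's method or as what the post above used; the function names and sample values are hypothetical.

```python
from __future__ import annotations


def ternary_quantize(weights: list[float], eps: float = 1e-8) -> tuple[list[int], float]:
    """Map weights to {-1, 0, +1} using an absmean scale (BitNet b1.58 style)."""
    # One shared scale: the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Approximate reconstruction: each ternary value times the shared scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    sample = [0.9, -0.05, -1.2, 0.4]  # made-up weights for illustration
    q, s = ternary_quantize(sample)
    print(q, s, dequantize(q, s))
```

Small weights collapse to zero and large ones saturate at plus or minus one, so only the sign pattern and a single scale survive. Real systems typically apply this during training rather than as a one-shot conversion.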


PrismML reposted

Michel aka Agent B@MichelIvan92347·17 Apr

Interesting work here 👇

Today we’re announcing Ternary Bonsai: Top intelligence at 1.58 bits Using ternary weights {-1, 0, +1}, we built a family of models that are 9x smaller than their 16-bit counterparts while outperforming most models in their respective parameter classes on standard benchmarks. We’re open-sourcing the models under the Apache 2.0 license in three sizes: 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB).


PrismML reposted

Kanu Gulati @Khosla Ventures@KanuGulati·17 Apr

Ternary Bonsai 8B is within 5% of Qwen 3 8B at 9x lower memory! Congratulations @PrismML on yet another exciting release! cc @khoslaventures


PrismML reposted

0xSero@0xSero·16 Apr

One of the things I tried researching but found really hard. 1.58bpw is insane 10x smaller than original, I hope they push it to much larger models


PrismML reposted

Sahin Lale@SahinLale·16 Apr

Check out how you can use Ternary Bonsai 8B 🌳 for tool calling in your everyday life—an impressive demo on an amazing platform by @AnythingLLM and @tcarambat!


PrismML reposted

Robert Scoble@Scobleizer·16 Apr

The models are getting smaller. Great for OpenClaws and Hermes. Gotta heat them up! Yesterday someone told me "phones are three to five years away." Oh, really?


PrismML reposted

Xenova@xenovacom·16 Apr

Ternary Bonsai: state-of-the-art intelligence at 1.58 bits. The models are so small they can even run locally in your browser on WebGPU! ⚡️ Here's the 8B version (just ~2GB in size) running at 60 tokens per second on my M4 Max. Try the demo out yourself! 👇


PrismML reposted

rohan anil@_arohan_·16 Apr

People are the most valuable resource in action.

> anon asked for one more state
> we added zero
> +600 MB
> +5 benchmark points
> 75.5 avg at 1.75 GB
> still ~1/9 the size of Qwen3 8B
> shout out brahmagupta
> zero mattered
