waterloo intern

95 posts

@waterloo_intern

dominate them so thoroughly that the comparison looks embarrassing | model perf @baseten | eng @uwaterloo | https://t.co/dhufL7FLIq

San Fran · Joined October 2024

104 Following · 1.5K Followers

Pinned Tweet

waterloo intern@waterloo_intern·7 Mar

- 230 training runs - 1,623 GPU hours (67 B200 days) - 76 TB of training data - a 2x faster model Every paper said it can't be done. Quantization Aware Distillation made it possible.

waterloo intern@waterloo_intern

x.com/i/article/2029…

107

1.2K

146.9K

waterloo intern@waterloo_intern·6h

infra work is interesting

241

waterloo intern@waterloo_intern·15h

@alphatozeta8148 🙈

Dhruv Singal@alphatozeta8148·17h

@waterloo_intern 📸

121

waterloo intern@waterloo_intern·17h

committing a federal crime made me more money than my entire net worth from working my day job. this is insane.

1.1K

waterloo intern@waterloo_intern·15h

@aishwarya_08 caught 📸

Aishwarya Goel (AG)@aishwarya_08·16h

@waterloo_intern lol what?

128

waterloo intern@waterloo_intern·15h

to be clear i am a law-abiding citizen, this is in reference to below

155

waterloo intern@waterloo_intern·1d

@amiruci ``` dominate them so thoroughly that the comparison looks embarrassing ``` should be our new logo

Amir Haghighat@amiruci·1d

@waterloo_intern this is what you made me do

263

Amir Haghighat@amiruci·1d

We now have a product specifically created for AI labs and their closed-weight models: we'll take care of not just inference, but auth, rate limits, metering, and billing integrations. We'll take care of providing both shared and dedicated inference, compliance needs, and matching end customers' geo requirements (us, ca, eu, uk, aus, jp, etc). It's called Baseten Frontier Gateway and is already battle-tested by multiple AI labs, like Poolside and their impressive Laguna M.1 agentic coding model.

8.4K

waterloo intern@waterloo_intern·3d

@modal this is a sick read...hats off to you guys

267

Modal@modal·3d

x.com/i/article/2051…

121

12.3K

waterloo intern@waterloo_intern·6d

x.com/i/article/2050…

364

waterloo intern@waterloo_intern·1 May

@philipkiely HAHAHAHAHA

Philip Kiely@philipkiely·1 May

Developing empathy for LLMs by doing benchmark problems by hand.

2.9K

waterloo intern@waterloo_intern·28 Apr

this was so fun to work on, i hope you find it useful tried @baseten for GPU access?

Jino Rohit@jino_rohit

im making a decision to switch to blackwell than hopper since the 5090s are more affordable. i was learning WGMMA and renting h100 was getting too expensive :( what are some affordable options to rent among @vast_ai @modal etc

696

waterloo intern@waterloo_intern·24 Apr

@edenchan solid to the power of solid squared

313

Eden Chan@edenchan·24 Apr

Think fast! This is the voice that powers customer support and sales for Starlink

xAI@xai

Introducing Grok Voice Think Fast 1.0 A state-of-the-art voice model built for complex, multi-step workflows with snappy responses and high accuracy. It takes the top spot on the Tau Voice Bench and handles real-world messiness like noise, accents, and interruptions better than any other model in the world. x.ai/news/grok-voic…

645

45.8K

waterloo intern@waterloo_intern·10 Apr

@Millanphilipose @kanjun your own WHAT... flex 🙈

Millan Philipose@Millanphilipose·10 Apr

@AliesTaha @kanjun We're using our own datacenter for now

Kanjun 🐙@kanjun·9 Apr

Twitter’s algorithm is optimized for addiction, not for us. We deserve better. We’re releasing Bouncer today so you can take back control of your feed. Describe what you don't want, and Bouncer removes it. It’s free, doesn’t collect your data, and will be open source soon.

213

295

3.2K

586K

waterloo intern@waterloo_intern·4 Apr

@anandcpatelmdms @part_harry_ x.com/Goosewin/statu…

goosewin@Goosewin

guys you're never gonna believe this

103

Anand C. Patel, MD MS@anandcpatelmdms·4 Apr

@AliesTaha @part_harry_ Smelly?

100

waterloo intern@waterloo_intern·3 Apr

we dug into 1-bit bonsai with @part_harry_ the grand canyon of a gap they showed... is just THREE (3) points away from normal PTQ but they already knew that here's the graph (fixed)

PrismML@PrismML

This scatter plot shows the Pareto frontier of intelligence vs. size, defined by models like Qwen3 0.6B, 1.7B, 4B, 8B, and Ministral3 3B. The 1-bit Bonsai family shifts that frontier dramatically to the left. This changes the tradeoff itself: models no longer have to be large to be capable.

100

17.2K

waterloo intern@waterloo_intern·4 Apr

@nisten @part_harry_ we used their axis to plot on their chart their benchmarks to get the intelligence scores the x-axis is the weight file size this is what PrismML used

333

nisten🇨🇦e/acc@nisten·4 Apr

@AliesTaha @part_harry_ The graph compares model sizes not total memory use dumbass. You're comparing total kv of 1bit float16 vs finetuned fp4PTQ /fp8 kv at 4k context benchmark or like... what are you even comparing? x.com/nisten/status/…

nisten🇨🇦e/acc@nisten

Got 1bit @PrismML Bonsai-8B llm working 4bit-kv turboquant. uses just 2596 Megabytes of ram to run at 64k context. github.com/nisten/prism-m…

2.1K

waterloo intern@waterloo_intern·4 Apr

@HenkPoley @part_harry_ fair, the point is more that the graph was designed to make 3 points look like a generational leap

266

Henk Poley@HenkPoley·4 Apr

@AliesTaha @part_harry_ 3 percentage point better is still quite a bit better. 🤷‍♂️ 73.8 to 76.8, about 11% less errors on these tests. Given that most of these tests have errors, so a perfect score cannot be achieved, probably even a bit better.

426

waterloo intern@waterloo_intern·4 Apr

@JoshPurtell @part_harry_ more eyes on benchmarks is only a good thing

300