ryan mathieu @gapDEEPry
kernel guy. || Fast vs Slow Thinking
Inside Kernels · Joined January 2018
526 Following · 148 Followers · 258 posts
Dejounte Murray @DejounteMurray:
I Feel Like Giving A Couple Of My FOLLOWERS That FW Me $5,000 Each!!! 🖤 I Really Love The REAL AND GENUINE ONES!!!!!!!
[3.4K replies · 1K reposts · 19.5K likes · 836.9K views]
kalomaze @kalomaze:
@teortaxesTex i will eat crow the second reliable estimates of in-context learning scaling well with diffusion LMs crop up here
[2 replies · 0 reposts · 31 likes · 1.7K views]
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex:
> MDM-Prime-v2 is 21.8× more compute-efficient than autoregressive models

I may be humiliated extremely hard with my diffusionLM skepticism.
You Jiacheng @YouJiacheng:

HUGE if true. If true, this is probably a larger efficiency gain than ALL publicly available techniques since DeepSeekMoE(Jan 2024) COMBINED. And it can just win modded-nanogpt speedrun. (1e18 is 250s@50%MFU, but the loss is significantly lower than 3.28) cc @classiclarryd

[10 replies · 8 reposts · 118 likes · 19K views]
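You Jiacheng's "1e18 is 250s@50%MFU" aside can be sanity-checked with back-of-envelope arithmetic. A minimal sketch, assuming the usual modded-nanogpt speedrun hardware of 8× H100 at roughly 989 TFLOPS dense BF16 each (an assumption — the tweet does not state the hardware):

```python
# Sanity-check: how long does 1e18 FLOPs take at 50% MFU?
# Assumed hardware (not stated in the tweet): 8x H100,
# ~989 TFLOPS dense BF16 peak per GPU.
FLOPS_PER_H100_BF16 = 989e12
N_GPUS = 8
MFU = 0.50

peak_flops = FLOPS_PER_H100_BF16 * N_GPUS   # aggregate peak FLOP/s
achieved_flops = peak_flops * MFU           # sustained FLOP/s at 50% MFU
seconds = 1e18 / achieved_flops

print(f"{seconds:.0f} s")                   # ~253 s, consistent with the quoted 250 s
```

Under those assumptions the figure works out to roughly 253 seconds, which matches the tweet's "250s" to within rounding.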
You Jiacheng @YouJiacheng:
HUGE if true. If true, this is probably a larger efficiency gain than ALL publicly available techniques since DeepSeekMoE(Jan 2024) COMBINED. And it can just win modded-nanogpt speedrun. (1e18 is 250s@50%MFU, but the loss is significantly lower than 3.28) cc @classiclarryd
Chen-Hao (Lance) Chao @chenhao_chao:

(2/7) 💵 With training costs exceeding $100M for GPT-4, efficient alternatives matter. We show that diffusion LMs unlock a new paradigm for compute-optimal language pre-training.

[6 replies · 12 reposts · 223 likes · 45.7K views]
ryan mathieu @gapDEEPry:
@xeophon Idk if it’s the prime effect but you are either tweeting way more or you are now all over my feed (no complaints)
[0 replies · 0 reposts · 0 likes · 6 views]
Xeophon @xeophon:
it’s so weird how smart the models are when you never bother to look at what they’re doing vs. how dumb they are when you observe them
[16 replies · 4 reposts · 271 likes · 12.2K views]
Hyena @hy3na_xyz:
Imagine a group of ex OAI researchers being afraid lil old Silares is gonna reverse engineer their entire IP in 2 weeks. Imagine.
[2 replies · 0 reposts · 21 likes · 3.5K views]
ryan mathieu @gapDEEPry:
You guys think Honey Bunnie actually cares about the distribution of SVG training data in frontier LLMs?
[1 reply · 0 reposts · 6 likes · 71 views]
Ben Clavié @bclavie:
I'm so excited to introduce this! We've worked on a million different moving parts to produce this. I'm fairly confident it's the best multimodal model that exists, period -- and it's not too shabby at pushing back the LIMITs of retrieval either...
Mixedbread @mixedbreadai:

Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.

[37 replies · 41 reposts · 410 likes · 138.4K views]
will brown @willccbb:
@fkasummer pricing power on things like compute / hiring / acquisitions. investors largely don’t care about realized profit at this stage, they care about theoretical margins + growth. if you’re profitable as a pre-IPO AI SaaS company in 2026, you’re probably not spending enough on R&D
[6 replies · 1 repost · 134 likes · 13.5K views]
himanshu @himanshustwts:
how to become an awesome ai researcher in 2026:
[16 replies · 17 reposts · 506 likes · 21.6K views]
will brown @willccbb:
you can create complex agentic environments and launch RL training runs with a single prompt. deploy trained inference endpoints with a single click. no GPUs, no SSH, no vLLM. just `prime`. guide: docs.primeintellect.ai/guides/rl-trai…
[58 replies · 64 reposts · 919 likes · 135.8K views]
Iain Dunning @iaindunning:
How do you talk to Claude? Say, when using Claude Code, at the end of a session you asked it to write its learnings in a text file. In a new session:
A: "a previous claude wrote this summary, read it"
B: "you previously wrote this summary, read it"
[15 replies · 0 reposts · 39 likes · 8.7K views]
MrBeast @MrBeast:
@Noxlcs I haven’t done the math but I feel like it’s prob higher tbh
[1.1K replies · 276 reposts · 30.7K likes · 1.1M views]
Noxic @Noxlcs:
MrBeast has now given away an estimated $200,000,000 since his first giveaway 9 years ago
[231 replies · 269 reposts · 19.5K likes · 1.2M views]
Tibo @thsottiaux:
On codex, which speed do you use?
[139 replies · 8 reposts · 162 likes · 34K views]