Cody Blakeney
@code_star
20.2K posts

Data Dawg @datologyai | Formerly Data Research Lead @DbrxMosaicAI | Visiting Researcher @ Facebook | Ph.D | #TXSTFOOTBALL fan | https://t.co/4G6Jf3at5w

Redwood City, CA · Joined August 2011
1.7K Following · 6.4K Followers

Pinned Tweet
Cody Blakeney @code_star
Excited to announce the return of American OSS with Arcee Trinity Large. This model wouldn't have been possible without the awesome collaboration of Modeling @arcee_ai, Infra @PrimeIntellect, and Data @datologyai. I can't say enough about how talented the whole team at Arcee is, scaling from their first MoE to a big boy like this in such a short time.

Since the last data mix we have been in the lab pushing our midtraining and synthetic data to the limits. For Trinity Large we generated over 800B tokens of high-quality synthetic code and 6.5T(!!!) tokens overall. We also added multilingual curation.

This was a massive effort from the whole Datology family: scaling up the rephrasing workflows to support heterogeneous clusters efficiently (@isabelle226ku, @JackUrbs, @parthjdoshi, @haakonmongstad, @alvind319), pushing out midtraining and mixing (@_BrettLarsen), innovating on new code synthetic data (@amrokamal1997) and new math synthetic data (David Schwab), multilingual curation (@KaleighMentzer, @agcrnz, @RicardoMonti9), and of course building on our great foundation of synthetic data (@pratyushmaini, Vineeth Dorna).
Arcee.ai@arcee_ai

Today, we’re releasing the first weights from Trinity Large, our first frontier-scale model in the Trinity MoE family.

9 replies · 19 reposts · 123 likes · 12.4K views
Cody Blakeney @code_star
I was daydreaming about building a 4x RTX Pro 6000 workstation for my home and then realized I'd probably blow a fuse or start a fire.
12 replies · 1 repost · 29 likes · 2.1K views
Cody Blakeney retweeted
Arcee.ai @arcee_ai
Here are a few of our favorite shots from our recent out-of-home campaign. Loving how the Arcee teal cuts right through the noise of downtown SF and the traffic on the 101 + a bonus shot from the DC metro.
1 reply · 3 reposts · 7 likes · 217 views
Cody Blakeney @code_star
Model adaptation is coming. It works, and learning how to do it well is going to be a big differentiator for people going forward. Even if you have ambitions to train from scratch, starting from great models helps you understand your problems better, build evals and RL environments, and adapt to scale. I'm excited to see how this evolves.
clem 🤗@ClementDelangue

Looks like it's confirmed Cursor's new model is based on Kimi! It reinforces a couple of things:
- open-source keeps being the greatest competition enabler
- another validation for Chinese open-source, which is now the biggest force shaping the global AI stack
- the frontier is no longer just about who trains from scratch, but who adapts, fine-tunes, and productizes fastest (seeing the same thing with OpenClaw, for example).

6 replies · 2 reposts · 47 likes · 4.3K views
Cody Blakeney @code_star
@birdabo Well … H100 demand and prices are actually going up right now
2 replies · 0 reposts · 9 likes · 1.6K views
sui ☄️ @birdabo
🚨 Goldman Sachs just confirmed something insane and nobody's talking about it: companies spent $450 billion on AI and contributed zero to US economic growth. Not "minimal." Actually zero. These companies fired a total of 30,000+ human workers to bet on AI infrastructure. It failed. So where did the money go?
> Nvidia took $130B. Jensen got rich selling GPUs - not from AI working, from everyone buying hardware.
> Corporate buybacks got the rest. Cut jobs, slash costs, pump stock, buy it back. Money went to shareholders, not productivity.
> The bubble: H100 clusters sitting underutilized. Enterprise contracts collecting dust. Same hype cycle.
The damage is done. Meta fired 21K. Amazon 16K. Atlassian 1.6K. Those jobs aren't coming back. They bet the economy on AI productivity. Goldman confirmed there is no productivity. lmao either this pays off in 18 months or we just witnessed the largest capital misallocation in tech history and we can't undo it 💀
unusual_whales@unusual_whales

"Massive investment in AI contributed basically zero to US economic growth last year," per Goldman Sachs

162 replies · 2K reposts · 14K likes · 719.5K views
Cody Blakeney @code_star
Found another great midtraining paper. I haven't seen it on my TL, so I thought I would share. Super excited to dig into it later, but it looks really promising. (ty @lukemerrick_ ) I love seeing more work unifying understanding of midtraining -> RL
3 replies · 28 reposts · 251 likes · 15.8K views
Cody Blakeney retweeted
Cody Blakeney @code_star
I'm not going to say frontier models with good harnesses can't solve incredibly difficult problems. I will say many people who tried fine-tuning circa 2022-2024 were just way too early. The base models at the time just weren't good enough to be meaningfully improved. The methods for generating training data were terrible, using models as judges was neither economical nor reliable, and paying for human annotations was too far out of reach.

Fast forward to 2026 and many people have developed sophisticated and realistic evaluation environments which, if you squint, look a lot like what you want for RL. Cost of tokens is way down and the intelligence per token is much higher. Training infrastructure is much simpler to set up, and algorithms and data are better. It's more possible than it has ever been to adapt models to solve real-world problems with a small team.
Michel Levy Provençal @mikiane

Mistral launches Forge for fine-tuning. 95% of companies don't need it. A frontier model + RAG + tools beats custom fine-tuning. Every time. Fine-tuning = 2023. Orchestration = 2026. No? mistral.ai/news/forge

2 replies · 6 reposts · 61 likes · 7.9K views
Cody Blakeney retweeted
Dimitris Papailiopoulos @DimitrisPapail
The entire NSF research budget is ~$9B/year. That literally funds every awarded PI in every field at every institution. But we've decided that all of basic science is a rounding error in comparison to venture bets. Please consider funding basic science more.
10 replies · 49 reposts · 476 likes · 33.9K views
Cody Blakeney retweeted
Samip @industriaalist
Announcing 10x data efficiency on NanoGPT Slowrun! There are two macro trends worth highlighting:
- pretraining is nowhere close to done, and
- 100x looks feasible.
Writeup on all the core ideas: qlabs.sh/10x
6 replies · 36 reposts · 235 likes · 18.2K views
Ariel @redtachyon
For the next few months I find myself temporarily unemployed. The good part is that I can do anything OSS and not worry about IP issues. The bad part is that my GPU resources are very limited. Anyways, I'm planning to expand gyllm a bit - I think the core API is solid, but it could use some more features, and there's only so much I can do on a single Spark. Sooooo anyone got any GPU credits for some good OSS RL?
6 replies · 0 reposts · 57 likes · 4.2K views