Cody Blakeney

20.2K posts

@code_star

Data Dawg @datologyai | Formerly Data Research Lead @DbrxMosaicAI | Visiting Researcher @ Facebook | Ph.D | #TXSTFOOTBALL fan | https://t.co/4G6Jf3at5w

Redwood City, CA · Joined August 2011
1.7K Following · 6.4K Followers
Pinned Tweet
Cody Blakeney @code_star ·
Excited to announce the return of American OSS with Arcee Trinity Large. This model wouldn't have been possible without the awesome collaboration of Modeling @arcee_ai, Infra @PrimeIntellect, and Data @datologyai. I can't say enough about how talented the whole team at Arcee is, scaling from their first MoE to a big boy like this in such a short time.

Since the last data mix we have been in the lab pushing our midtraining and synthetic data to the limits. For Trinity Large we generated over 800B tokens of high-quality synthetic code and 6.5T(!!!) tokens overall. We also added multilingual curation.

This was a massive effort from the whole Datology family: scaling up the rephrasing workflows to support heterogeneous clusters so they scale efficiently (@isabelle226ku, @JackUrbs, @parthjdoshi, @haakonmongstad, @alvind319), pushing out midtraining and mixing (@_BrettLarsen), innovating on new code synthetic data (@amrokamal1997) and new math synthetic data (David Schwab), multilingual curation (@KaleighMentzer, @agcrnz, @RicardoMonti9), and of course building on our great foundation of synthetic data (@pratyushmaini, Vineeth Dorna).
Arcee.ai @arcee_ai

Today, we’re releasing the first weights from Trinity Large, our first frontier-scale model in the Trinity MoE family.

9 replies · 19 reposts · 123 likes · 12.4K views
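The "rephrasing workflows" mentioned in the pinned tweet follow the general pattern of rephrasing-style synthetic data: feed source documents to an instruction-tuned model and keep the rewrites as extra pretraining tokens. The sketch below is a minimal, hypothetical illustration of that pattern; the OpenAI-compatible client, model name, and prompt are assumptions for the example, not Datology's actual pipeline.

```python
# Minimal sketch of a document-rephrasing workflow for synthetic pretraining data.
# Hypothetical illustration: endpoint, model name, and prompt are assumptions.
from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible inference endpoint

REPHRASE_PROMPT = (
    "Rewrite the following passage in a clear, textbook-like style, "
    "preserving all facts and technical detail:\n\n{doc}"
)

def rephrase(doc: str, model: str = "gpt-4o-mini") -> str:
    """Return one rephrased variant of a source document."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": REPHRASE_PROMPT.format(doc=doc)}],
        temperature=0.7,
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    corpus = ["Mixture-of-experts layers route each token to a small subset of expert MLPs."]
    synthetic = [rephrase(doc) for doc in corpus]  # rewrites become extra training tokens
    print(synthetic[0])
```

At scale, the same loop would be sharded across a cluster and the outputs filtered and mixed before they enter the training data.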
Cody Blakeney @code_star ·
@birdabo Well … H100 demand and prices are actually going up right now
2 replies · 0 reposts · 8 likes · 1.2K views
sui ☄️ @birdabo ·
🚨Goldman Sachs just confirmed something insane and nobody's talking about it: companies spent $450 billion on AI and contributed zero to US economic growth. not “minimal.” actually zero. these companies fired a total of 30,000+ human workers to bet on AI infrastructure. it failed. so where did the money go?
> Nvidia took $130B. Jensen got rich selling GPUs - not from AI working, from everyone buying hardware.
> corporate buybacks got the rest. cut jobs, slash costs, pump stock, buy it back. money went to shareholders, not productivity.
> the bubble: H100 clusters sitting underutilized. enterprise contracts collecting dust. same hype cycle.
the damage is done. Meta fired 21K. Amazon 16K. Atlassian 1.6K. those jobs aren't coming back. they bet the economy on AI productivity. Goldman confirmed there is no productivity. lmao either this pays off in 18 months or we just witnessed the largest capital misallocation in tech history and we can't undo it 💀
unusual_whales @unusual_whales

"Massive investment in AI contributed basically zero to US economic growth last year," per Goldman Sachs

133 replies · 1.4K reposts · 10.2K likes · 561.6K views
Cody Blakeney @code_star ·
Found another great midtraining paper. I haven't seen it on my TL, so I thought I would share. Super excited to dig into it later, but it looks really promising. (ty @lukemerrick_) I love seeing more work unifying understanding of midtraining -> RL
[four images attached]
3 replies · 25 reposts · 235 likes · 12.3K views
Cody Blakeney retweeted
Cody Blakeney @code_star ·
I'm not going to say frontier models with good harnesses can't solve incredibly difficult problems. I will say many people who tried fine-tuning circa 2022-2024 were just way too early. The base models at the time just weren't good enough to be meaningfully improved. The methods for generating training data were terrible, using models as judges was not economical or reliable, and paying for human annotations was too far out of reach.

Fast forward to 2026, and many people have developed sophisticated and realistic evaluation environments which, if you squint, look a lot like what you want for RL. Cost of tokens is way down and the intelligence per token is much higher. Training infrastructure is much simpler to set up, and algorithms and data are better. It's more possible than it has ever been to adapt models to solve real-world problems with a small team.
Michel Levy Provençal @mikiane

Mistral launches Forge for fine-tuning. 95% of companies don't need it. A frontier model + RAG + tools beats custom fine-tuning. Every time. Fine-tuning = 2023. Orchestration = 2026. No? mistral.ai/news/forge

2 replies · 6 reposts · 61 likes · 7.7K views
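The point about evaluation environments resembling RL setups can be made concrete: a programmatic eval harness already pairs a prompt with a pass/fail check, and that check can be reused directly as a verifiable reward. The sketch below is a hypothetical illustration under that assumption; the names (EvalCase, reward_fn) are invented for this example and do not come from any specific library.

```python
# Sketch: an evaluation harness reused as an RL reward signal.
# Hypothetical names; illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # programmatic verifier, e.g. unit test or exact match

def reward_fn(case: EvalCase, completion: str) -> float:
    """The same pass/fail check used for evaluation becomes a binary RL reward."""
    return 1.0 if case.check(completion) else 0.0

# Usage: an eval case for "extract the integer answer" doubles as a reward model.
case = EvalCase(
    prompt="What is 17 * 3? Answer with a number.",
    check=lambda out: out.strip() == "51",
)
print(reward_fn(case, "51"))  # 1.0
print(reward_fn(case, "50"))  # 0.0
```

A policy-gradient trainer would then sample completions from the model for each prompt and maximize this reward, which is why a good eval suite, if you squint, is most of an RL environment.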
Cody Blakeney retweeted
Dimitris Papailiopoulos @DimitrisPapail ·
The entire NSF research budget is ~$9B/year. This is literally the funding for every awarded PI in every field and at every institution. But we've decided that all of basic science is a rounding error in comparison to venture bets. Please consider funding basic science more.
10 replies · 47 reposts · 469 likes · 33K views
Cody Blakeney retweeted
Samip @industriaalist ·
Announcing 10x data efficiency on NanoGPT Slowrun! There are two macro trends worth highlighting:
- pretraining is nowhere close to done, and
- 100x looks feasible.
Writeup on all the core ideas: qlabs.sh/10x
6 replies · 35 reposts · 233 likes · 17.7K views
Ariel @redtachyon ·
For the next few months I'm finding myself temporarily unemployed. The good part is that I can do anything OSS and not worry about IP issues. The bad part is that my GPU resources are very limited. Anyways, I'm planning to expand gyllm a bit - I think the core API is solid, but it could use some more features, and there's only so much I can do on a single Spark. Sooooo anyone got any GPU credits for some good OSS RL?
6 replies · 0 reposts · 57 likes · 4.1K views
Cody Blakeney retweeted
N8 Programs @N8Programs ·
This is why @allen_ai is so important - it's an organization philosophically committed to complete open-source. It doesn't open-source to generate traction/investor money and then go closed source later (or farm community goodwill w/ OSS releases in order to then offer premium, closed-source models) - it open-sources because that's its ideological commitment. Even if its models aren't SOTA, the fact is that its current models are OSS, and as long as it has the backing, it will keep making OSS models.

And this is important, as OSS today doesn't guarantee OSS tmrw. It's unclear if Minimax 2.7 will be OSS. If Qwen4 will be OSS. GLM-5 Turbo is a closed-source beta. Minimax 2.5, Qwen3.5, and GLM-5 may be open-source, but they are ultimately made by commercial companies with commercial goals who take their models closed-source when they wish.

So while I may use the above models now, I can rest assured that Olmo 4, and Olmo 5, and Olmo N, will all be open-source if they exist. There will not be a more-capable Olmo locked behind an API. We will always know how Olmo models are made, have the data they were trained on, and the intermediate checkpoints. And that's incredibly, incredibly valuable.
Nathan Lambert @natolambert

Qwen is irreplaceable. Has been going from strength to strength in recent times. Things will always be different, I'm hopeful we can find groups of other models to fill the void. RIP

5 replies · 8 reposts · 96 likes · 6.9K views
Cody Blakeney retweeted
jason liu @jxnlco ·
[image attached]
9 replies · 3 reposts · 167 likes · 4.5K views
Cody Blakeney retweeted
Eric W. Tramel @fujikanaeda ·
a new competition from JF and the grandmasters?? you know it’s going to be something special. you don’t want to miss out on this one, kagglers :)
JFPuget 🇺🇦🇨🇦🇬🇱 @JFPuget

NVIDIA is hosting a Kaggle competition. How can you train a nemotron nano model to solve scientific questions? I hope you'll enjoy it! For this competition @kaggle secured NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs from Google Cloud. These GPUs are much more powerful than the usual Kaggle GPUs. Come and try these beasts! kaggle.com/competitions/n…

0 replies · 1 repost · 7 likes · 877 views