will brown
@willccbb

reward hacking @primeintellect

13.7K posts
SF · Joined February 2015
1.3K Following · 41.3K Followers
will brown retweeted
Ankur Goyal@ankrgyl·
I personally find this quite inspiring — Cursor is much more sophisticated than what most companies can do today, but compounding your learnings into a model that excels at your use case is the ultimate way to build an AI product.
Cursor@cursor_ai

Composer 2 is now available in Cursor.

will brown@willccbb·
@TheAhmadOsman optimizing local stacks just for bs=1 was never the right call. personal agents are becoming multi-agents real quick
Ahmad@TheAhmadOsman·
DGX Spark uses unified memory > 273 GB/s. RTX PRO 6000 delivers > 1.8 TB/s (1792 GB/s). If someone told you they’re comparable, they’re wrong. And this is exactly why llama.cpp isn’t the right tool here. Try vLLM or SGLang on a GPU and you’ll see very different results
Max Weinbach@mweinbach

@TheAhmadOsman I have on DGX Spark and then was having insane tool-calling issues and was told by Nvidia to use llama.cpp
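A back-of-envelope sketch of why this bandwidth gap dominates single-stream (bs=1) decode: each generated token streams the full weight set from memory, so tokens/sec is roughly bandwidth divided by model size. The 273 GB/s and 1792 GB/s figures are from the post above; the 70 GB weight size is a hypothetical example, not something either tweet states.

```python
# bs=1 autoregressive decode is memory-bandwidth bound: every token reads
# all model weights once, so throughput ~ bandwidth / bytes of weights.
def decode_tps(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on single-stream decode speed, in tokens/sec."""
    return bandwidth_gb_s / weights_gb

SPARK_BW = 273.0    # DGX Spark unified memory, GB/s (figure from the post)
RTX_BW = 1792.0     # RTX PRO 6000, GB/s (figure from the post)
WEIGHTS_GB = 70.0   # hypothetical: a 70B-parameter model at 8-bit

print(f"DGX Spark:    ~{decode_tps(SPARK_BW, WEIGHTS_GB):.1f} tok/s")
print(f"RTX PRO 6000: ~{decode_tps(RTX_BW, WEIGHTS_GB):.1f} tok/s")
print(f"bandwidth ratio: {RTX_BW / SPARK_BW:.1f}x")
```

The ratio works out to about 6.6x, which is why batching (and engines built for it, like vLLM or SGLang) matters far more than squeezing the bs=1 path.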

kache@yacineMTB·
prediction: someone is going to get a coding AI like codex to automate turning existing steam video games into harnesses, come up with architecture to parallelize the games themselves in a manner that is conducive for RL training, and train an RL demigod model
Celine Halioua@celinehalioua·
went on a date with a gen z guy and he told me i tweet like a boomer ☹️
will brown@willccbb·
@vikhyatk yc startup for automating signing up for all the other yc startups that automate all your ops stuff. who’s building this
vik@vikhyatk·
software generation is no longer the bottleneck. it's operations. trillion dollar opportunity for whoever solves it
will brown retweeted
Larissa Schiavo@lfschiavo·
My “I don’t have LLM psychosis” hoodie has people asking a lot of questions already answered by my hoodie.
will brown retweeted
Luke Drago@luke_drago_·
underrated product: @PrimeIntellect’s liquid compute. we're going to earn money on one of our clusters this month.
will brown retweeted
Alex@afurgs·
Big fan of Luke and what they’re doing at Workshop Labs. It’s also a great tactical example of why we built this feature: no lab should be forced to pay for idle compute when there is an abundance of demand. Tbh it’s insane it’s not an option with any other provider.
Luke Drago@luke_drago_

underrated product: @PrimeIntellect’s liquid compute. we're going to earn money on one of our clusters this month.

will brown retweeted
Johannes Hagemann@johannes_hage·
it was a good event sir
will brown retweeted
Prime Intellect@PrimeIntellect·
At Prime Intellect, we’re building that stack end to end:
- agentic RL training and inference on frontier open models
- RL sandboxes
- open-source libraries like verifiers + prime-rl

Giving everyone access to frontier lab infrastructure
Omar Khattab@lateinteraction·
Folks have worked on multi-hop open-domain question answering since 2019, leading to powerful systems like GoldEn, IRRR, Baleen, and STORM. Then it got rebranded “deep research”. A small fraction worked on LLMs as optimizers. Now that got rebranded “autoresearch”. 🤔 *-research it is!
will brown retweeted
Prime Intellect@PrimeIntellect·
Today, we’re sharing how our collaboration with @nvidia helps power the open superintelligence stack. The next frontier of AI infrastructure is building systems for agentic models that can reason for hours, use tools, execute code, and learn from outcomes at scale. primeintellect.ai/blog/nvidia-co…
will brown retweeted
Workshop Labs@WorkshopLabs·
Letting a provider see all your data is the price of admission for AI. We're changing that. Introducing Silo, the first private post-training and inference stack for frontier models, with hardware-level guarantees that we can’t see your data. Privacy without compromises. 🧵
Workshop Labs tweet media
will brown@willccbb·
first time in whichever of the Sans is in the south bay, i always forget
will brown retweeted
stochi@stochi0·
Stitched up smth fun over the weekend: a prototype of an autoresearch RLM environment inspired by @karpathy, using @PrimeIntellect infra. Haven’t run full evals yet, but the setup looks like this.

The model can:
- modify the training file
- run experiments inside a sandbox
- parse logs for the metric (val_bpb)
- iterate to improve the score

So the model does the full research loop: code, experiment, logs, hypothesis, patch, repeat. Essentially turning the autoresearch loop into an RLM training environment, producing trajectories of autonomous research behavior.

The interesting bit would be generalizing this to:
- any repo
- any metric
- any experiment harness
- envs where the model can optimize specific pieces in a big codebase

Most importantly, this produces trajectories of autonomous research behavior. From those we can identify failure modes and iteratively improve the environment itself. 👀🧋🎋

Github: github.com/stochi0/athena…
Environments Hub: app.primeintellect.ai/dashboard/envi…
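The loop described above can be sketched in a few lines. This is a minimal illustration, not the actual API of the linked repo — every function name here is hypothetical; only the val_bpb metric and the code → experiment → logs → patch cycle come from the post.

```python
import re

def parse_metric(log_text: str) -> float:
    """Pull the last reported val_bpb out of an experiment log (lower is better)."""
    matches = re.findall(r"val_bpb[:=]\s*([0-9.]+)", log_text)
    return float(matches[-1]) if matches else float("inf")

def research_loop(run_experiment, propose_patch, apply_patch, steps=5):
    """code -> experiment -> logs -> hypothesis -> patch -> repeat."""
    best = float("inf")
    for _ in range(steps):
        logs = run_experiment()        # sandboxed training run
        score = parse_metric(logs)     # extract the target metric
        best = min(best, score)
        patch = propose_patch(logs)    # model reads logs, proposes an edit
        apply_patch(patch)             # modify the training file in place
    return best
```

Under this framing, generalizing to any repo or metric mostly amounts to swapping the metric regex and the sandboxed experiment command.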
Lotto@LottoLabs·
I like my models small, chinese, dense and not thinking.