Nick

232 posts

@nickwal

just emergent behavior

Joined June 2014

902 Following · 178 Followers

Nick@nickwal·3d

@leonardtang_ sick

189

Leonard Tang@leonardtang_·3d

Hello MJ1: The World's TASTIEST Judge Model. Agent verification is the bottleneck to AI's progress. The field's ability to verify visual output lags far behind that of text, especially in matters of ~taste~. So we built the world's tastiest multimodal judge model, MJ1.

9.4K

Nick retweeted

Bartosz Naskręcki@nasqret·5 Mar

It finally happened-my personal move 37 or more. I am deeply impressed. The solution is very nice, clean, and feels almost human. While testing new models in the last few weeks, I felt this coming, but it's an eerie feeling to see an algorithm solve a task one has curated for about 20 years. But at least I have gained a tool that understands my idea on par with the top experts in the field. And I am now working on a completely new level. My singularity has just happened… and there is life on the other side, off to infinity!

Epoch AI@EpochAIResearch

We ran GPT-5.4 (xhigh) an additional ten times on Tier 4 to get a pass @10 score. This was 38%. In one of these runs, it solved another problem no model had solved before. This problem was by @nasqret.

105

452

3.6K

1.1M

Nick@nickwal·5 Mar

@xeophon @PrimeIntellect congrats!

Xeophon@xeophon·5 Mar

Some personal news: - Finished another trip around the sun today 🫡 - Decided to join @PrimeIntellect to work on evals!! There’s a lot to be build and do couldn’t imagine a better team to do just that 🙌 - I will be in SF the next two weeks :) Just to look around, of course 👀

195

905

102.3K

Nick@nickwal·24 Feb

@si_pbc sick

Standard Intelligence@si_pbc·23 Feb

Computer use models shouldn't learn from screenshots. We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.

186

404

3.9K

1.1M

Nick@nickwal·24 Feb

@jxnlco @OpenAI @romainhuet @OpenAIDevs let’s go!!! congrats

jason liu@jxnlco·23 Feb

I’ve recently joined @openai to work with @romainhuet on @OpenAIDevs Now is the year of dogged pursuits But Back in 2021 i thought my technical career was over. I had chronic hand pain in both my hands and could barely tie my shoes let alone use my phone or write code. I spent a few years not thinking about what it mean for the value of my labor to go zero but to not being able to produce any labor at all… I gave up bjj. Pottery. Tech. Etc. Then, that one company that solved dota and hide and seek released chatgpt and whisper and all of a sudden with dictation and some determination I could write essays, build things, and make a living from twitter meeting great people like @eugeneyalt @dmdohan @humford @GEVS94 for my reintegration into the tech world after so many years away. From Canada advised companies for free until I had to ask them to pay me. I charged companies until I figured out pricing and asked for enough that I became an investor as well. I started a consulting business and a course business. Learning alongside @HamelHusain and @vig_xyz But through that time I learned a lot about running a business and felt like I’d stopping learning about everything else. I realized that last summer that I wanted to wrap things up and go somewhere and just get involved and be at the center of it all.

124

595

75.4K

Nick@nickwal·19 Feb

@ProximalHQ Congrats!

Proximal@ProximalHQ·18 Feb

Today, we are announcing Proximal. Proximal is a research lab for data. Our core belief is that data which is complex enough to teach today’s frontier models is not bottlenecked by domain experts, but by great ideas and excellent software. We are excited about a world in which coding agents can autonomously run for multiple weeks, solve the hardest technical problems and discover novel ideas that advance progress in various domains of science and engineering. We believe that we are not far from this future, but that the biggest bottleneck preventing us from achieving it is training data. Many companies work on data, but most of them are approaching it the wrong way. Historical capability breakthroughs are the result of creative engineers discovering scalable data collection methods, not thousands of contractors manually writing task demonstrations. Inevitably, the potential impact of human data will become smaller and smaller as model capabilities increase: agents are already outperforming most humans in many domains - the number of experts that are capable of judging model outputs shrinks with every new model release. Proximal is a new data company. We are not a recruiting firm or a talent marketplace, but a research and engineering organization that treats data as a problem which deserves the same level of rigor as work on training algorithms and model architectures. We think that this is the most impactful work towards agents that can autonomously solve complex technical problems, and intend to share our research and progress in the open.

317

105.9K

Nick retweeted

David@DavidSHolz·18 Feb

5 million humanoid robots working 24/7 can build Manhattan in ~6 months. now just imagine what the world looks like when we have 10 billion of them by 2045. now imagine the year 2100.

589

338

4.6K

535.5K

Nick@nickwal·11 Feb

@GoodfireAI this is so cool

675

Goodfire@GoodfireAI·11 Feb

We used interpretability to scale RL against open-ended tasks, cutting Gemma 12B’s hallucination rate in half by teaching it to self-correct in tandem with our probing harness.

341

69.5K

Nick@nickwal·11 Feb

very inspiring vision for the future of research. the hosted training has been incredible to iterate with

Prime Intellect@PrimeIntellect

Introducing Lab: A full-stack platform for training your own agentic models Build, evaluate and train on your own environments at scale without managing the underlying infrastructure. Giving everyone their own frontier AI lab.

1.4K

Nick retweeted

Prime Intellect@PrimeIntellect·11 Feb

133

291

2.5K

746.9K

Nick@nickwal·6 Feb

@thdxr providing*

Nick@nickwal·6 Feb

@thdxr I believe this is regarding intra-turn prefill where you precondition the response by proving the first few tokens for that turn of the assistant response I believe this is unrelated to prior turns in the chat format

1.7K

dax@thdxr·6 Feb

are we misunderstanding this? the implication is you can't insert any content that anthropic didn't know to have generated this breaks things like switching models mid session and a dozen other things harnesses rely on i switch between claude and gpt all the time :(

650

89.8K

Nick@nickwal·28 Jan

@latkins yooooo amazing work!

Lucas Atkins@latkins·28 Jan

Today, we are releasing our first weights from Trinity-Large, our first frontier-scale model in the Trinity MoE family. American Made. - Trinity-Large-Preview (instruct) - Trinity-Large-Base (pretrain checkpoint) - Trinity-Large-TrueBase (10T pre Instruct data/anneal)

111

858

296.7K

Nick@nickwal·23 Jan

@msfeldstein fetch tool / web search is very brittle (constantly hangs for me) and could use some improvements regarding what it’s actually searching

Michael Feldstein@msfeldstein·22 Jan

What are your biggest paper cuts and quality issues you'd like to see us invest in fixing in the Cursor IDE?

11.8K

Nick@nickwal·16 Jan

@aye_aye_kaplan @MikeCarbone @leerob I find that the chevron dropdown isn't visible regularly on nightly

Jon Kaplan@aye_aye_kaplan·16 Jan

@MikeCarbone @leerob Agent Review does have the option, it looks like you're on a really old client. Just update your client and you'll see a chevron dropdown that lets you change the base branch.

383

Mike Carbone 🇺🇸@MikeCarbone·16 Jan

Yo @leerob would love the ability to find issues vs. custom branch (the parent), not main. Working in a graphite stack 🙏 Right now relying on bugbot in PRs but would like to do a pass locally

509

Nick@nickwal·13 Jan

@willccbb nemotron would be sick if it didn't mean a trip to mamba hell

will brown@willccbb·13 Jan

what models do you guys wanna train? maybe some bigger ones?

120

8.5K

Nick@nickwal·9 Jan

@aye_aye_kaplan this has got me a few times, but when @-ing cursor on GitHub, the prompt has to go after the @ or else the agent won't get triggered. I often find myself writing a comment in review and then also mentioning cursor as a second opinion along with team members

Jon Kaplan@aye_aye_kaplan·6 Jan

Anyone have any bugs or quality of life improvements on their Cursor wishlist? Tell me and I'll fix it this week! * Please make sure it repros on the latest version of Cursor (2.3) so I don't waste time chasing a bug that's already fixed * Caveat that some bugs may not be fixable in such a short time

9.8K

Nick retweeted

Crémieux@cremieuxrecueil·7 Jan

I've never understood this claim. The fascist leadership was non-STEM and absolutely *obsessed* with art. They considered it more important than economic affairs! And of course they were! The allure of STEM is mastery over matter—the opposite of the drive to political power.

Variety@Variety

Guillermo del Toro tells young directors they should not listen when “people tell you art is not important,” because that is “always a prelude to fascism.” “Be kind, be involved, believe in your art. At a time when people tell you art is not important, that is always a prelude to fascism. They think they can debase everything that makes us a little better, a little more human. And that, in my book, and in my life, includes monsters," @RealGDT said. Read more here: variety.com/2026/film/news…

155

2.2K

80.1K

Nick@nickwal·6 Jan

@aye_aye_kaplan oh amazing guess I didn’t click around enough thanks!

Jon Kaplan@aye_aye_kaplan·6 Jan

@nickwal You can already do this! Click the little chevron to open the dropdown!!
