nathan lile
@NathanThinks
1.6K posts
ceo/cofounder @ https://t.co/bDd3J4Lmzf hiring in SF 🌁 scaling synthetic reasoning. recurrent rabbit hole victim. nothing great is easy.

San Francisco · Joined August 2013
1.2K Following · 2.4K Followers
Pinned Tweet
nathan lile @NathanThinks ·
Superintelligence isn't about discovering new things; it's about discovering new ways to discover.

I think our latest work formalizes Meta Chain-of-Thought, which we believe lies on the path to ASI.

When we train models on the problem-solving process itself—rather than the final solution—they internalize how to think about reasoning tasks, not just what to think.

The next wave of AI is a Meta-CoT loop. We can't predict what novel forms of thinking might emerge, but it points to an extraordinary synthetic future.

I'm so proud of the @synth_labs team & our incredible open science collaborators for getting this work out
Rafael Rafailov @ NeurIPS@rm_rafailov

We have a new position paper on "inference time compute" and what we have been working on in the last few months! We present some theory on why it is necessary, how it works, why we need it, and what it means for "super" intelligence.

6 replies · 26 reposts · 140 likes · 35.7K views
Yasir @0xyaza ·
The RLM implementation and usage in @DSPyOSS is beautiful.
[image]
6 replies · 10 reposts · 203 likes · 12.8K views
nathan lile retweeted
Anikait Singh @Anikait_Singh_ ·
🚨🚨New Paper: Training LLMs to Discover Abstractions for Solving Reasoning Problems Introducing RLAD, a two-player RL framework for LLMs to discover 'reasoning abstractions'—natural language hints that encode procedural knowledge for structured exploration in reasoning.🧵⬇️
[image]
14 replies · 111 reposts · 596 likes · 56K views
nathan lile retweeted
Rafael Rafailov @ NeurIPS @rm_rafailov ·
I’ll take the opposite view: current methods are saturating, and just off the top of my head we need at least one practical breakthrough and at least two fundamental ones (which will likely take years) to reach AGI. None of these are oversight or safety related.
Stephen McAleer@McaleerStephen

Scalable oversight is pretty much the last big research problem left. Once you get an unhackable reward function for anything then you can RL on everything.

11 replies · 10 reposts · 154 likes · 30K views
You @parafactual ·
@whybyfire the model is presented with a reddit submission and a single comment thread inside that submission. the usernames are all redacted. the task is to identify which comments in the thread are from the same user
4 replies · 0 reposts · 7 likes · 143 views
You @parafactual ·
I fucking obliterated a 32b base model's brain
[2 images]
5 replies · 2 reposts · 104 likes · 22.9K views
nathan lile retweeted
John Burn-Murdoch @jburnmurdoch ·
NEW: Is the internet changing our personalities for the worse? Conscientiousness and extroversion are down, neuroticism up, with young adults leading the charge. This is a really consequential shift, and there’s a lot going on here, so let’s get into the weeds 🧵
[image]
392 replies · 2.9K reposts · 11.7K likes · 2.9M views
Jason Wei @_jasonwei ·
New blog post about asymmetry of verification and "verifier's law": jasonwei.net/blog/asymmetry…

Asymmetry of verification–the idea that some tasks are much easier to verify than to solve–is becoming an important idea as we have RL that finally works generally.

Great examples of asymmetry of verification are things like sudoku puzzles, writing the code for a website like Instagram, and BrowseComp problems (takes ~100 websites to find the answer, but easy to verify once you have the answer). Other tasks have near-symmetry of verification, like summing two 900-digit numbers or some data processing scripts. Yet other tasks are much easier to propose feasible solutions for than to verify (e.g., fact-checking a long essay or stating a new diet like "only eat bison").

An important thing to understand about asymmetry of verification is that you can improve the asymmetry by doing some work beforehand. For example, if you have the answer key to a math problem or test cases for a Leetcode problem, this greatly increases the set of problems with desirable verification asymmetry.

"Verifier's law" states that the ease of training AI to solve a task is proportional to how verifiable the task is. All tasks that are possible to solve and easy to verify will be solved by AI. The ability to train AI to solve a task is proportional to whether the task has the following properties:

1. Objective truth: everyone agrees what good solutions are
2. Fast to verify: any given solution can be verified in a few seconds
3. Scalable to verify: many solutions can be verified simultaneously
4. Low noise: verification is as tightly correlated to solution quality as possible
5. Continuous reward: it’s easy to rank the goodness of many solutions for a single problem

One obvious instantiation of verifier's law is the fact that most benchmarks proposed in AI are easy to verify and so far have been solved. Notice that virtually all popular benchmarks in the past ten years fit criteria #1–4; benchmarks that don’t meet them would struggle to become popular.

Why is verifiability so important? The amount of learning that occurs in AI is maximized when the above criteria are satisfied; you can take a lot of gradient steps where each step has a lot of signal. Speed of iteration is critical—it’s the reason that progress in the digital world has been so much faster than progress in the physical world.

AlphaEvolve from Google is one of the greatest examples of leveraging asymmetry of verification. It focuses on setups that fit all the above criteria, and has led to a number of advancements in mathematics and other fields. Different from what we've been doing in AI for the last two decades, it's a new paradigm in which all problems are optimized in a setting where the train set is equivalent to the test set.

Asymmetry of verification is everywhere, and it's exciting to consider a world of jagged intelligence where anything we can measure will be solved.
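The solve/verify gap the post describes can be made concrete with a toy problem. Subset sum is my own illustration (not from the post): finding a subset that hits a target takes exponential search over candidates, while checking a proposed subset is linear. Function names here are hypothetical.

```python
import itertools

def solve_subset_sum(nums, target):
    """Solve by brute force: try every subset (exponential in len(nums))."""
    for r in range(len(nums) + 1):
        for subset in itertools.combinations(nums, r):
            if sum(subset) == target:
                return list(subset)
    return None

def verify_subset_sum(nums, target, candidate):
    """Verify in linear time: each element must come from nums, and the sum must match."""
    pool = list(nums)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)
    return sum(candidate) == target

solution = solve_subset_sum([3, 9, 8, 4, 5, 7], 15)
assert solution is not None
assert verify_subset_sum([3, 9, 8, 4, 5, 7], 15, solution)
```

This is the pattern behind "you can improve the asymmetry by doing work beforehand": once the verifier exists (like an answer key or test cases), any number of proposed solutions can be checked cheaply and in parallel.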
[image]
59 replies · 251 reposts · 1.6K likes · 391.3K views
Emmett Shear @eshear ·
METR’s analysis of this experiment is wildly misleading. The results indicate that people who have ~never used AI tools before are less productive while learning to use the tools, and say ~nothing about experienced AI tool users. Let's take a look at why. x.com/METR_Evals/sta…
METR@METR_Evals

At the beginning of the study, developers forecasted that they would get sped up by 24%. After actually doing the work, they estimated that they had been sped up by 20%. But it turned out that they were actually slowed down by 19%.

44 replies · 68 reposts · 848 likes · 190.7K views
nathan lile @NathanThinks ·
up _and_ left 😲
[image]
1 reply · 3 reposts · 18 likes · 74.5K views
nathan lile retweeted
Sam Altman @sama ·
I’m not big on identities, but I am extremely proud to be American. This is true every day, but especially today—I firmly believe this is the greatest country ever on Earth. The American miracle stands alone in world history.

I believe in techno-capitalism. We should encourage people to make tons of money and then also find ways to widely distribute wealth and share the compounding magic of capitalism. One doesn’t work without the other; you cannot raise the floor and not also raise the ceiling for very long.

The world should get richer every year through science and technology, but everyone has to be in the “up elevator”. I think the government usually does a worse job than markets, and so we need to encourage our culture of innovation and entrepreneurship. I also believe that education is critically important to keeping the American edge.

I believed this when I was 20, when I was 30, and now I am 40 and still believe it. The Democratic party seemed reasonably aligned with it when I was 20, was losing the plot when I was 30, and seems to have moved somewhere else entirely at this point. So now I am politically homeless. But that’s fine; I care much, much more about being American than about any political party.

I’d rather hear from candidates about how they are going to make everyone have the stuff billionaires have instead of how they are going to eliminate billionaires.

The American experiment has always been messy. I am hopeful for another great 250 years. Happy 4th!
2.8K replies · 2.3K reposts · 32.8K likes · 3.1M views
nathan lile retweeted
Vaibhav (VB) Srivastav @reach_vb ·
Apple dropping diffusion-based coding LLMs on Hugging Face was not on my bingo card
[image]
18 replies · 83 reposts · 846 likes · 106.9K views
nathan lile retweeted
Fred Lambert @FredLambert ·
Xiaomi got 200,000 orders in 3 minutes for the YU7 and I’m not even surprised. The value proposition is just nuts. I’m kind of bummed because it means a few more years of having to satisfy demand from China before global expansion.
[3 images]
53 replies · 51 reposts · 382 likes · 83.6K views
nathan lile @NathanThinks ·
What if models could learn which problems _deserve_ deep thinking? No labels. Just let the model discover difficulty through its own performance during training. Instead of burning compute 🔥💸 on trivial problems, it allocates 5x more on problems that actually need it ↓
SynthLabs@synth_labs

Our new method (ALP) monitors solve rates across RL rollouts and applies inverse difficulty penalties during RL training. Result? Models learn an implicit difficulty estimator—allocating 5x more tokens to hard vs easy problems, cutting overall usage by 50% 🧵👇1/10
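The inverse-difficulty penalty described above can be sketched roughly. This is my reconstruction from the tweet, not SynthLabs' actual ALP implementation; the function name and the `beta` coefficient are hypothetical. The idea: track each problem's solve rate across recent rollouts, then penalize token usage in proportion to that solve rate, so the model learns to spend fewer tokens on problems it already solves reliably.

```python
def alp_token_penalty(reward, num_tokens, solve_rate, beta=1e-4):
    """Hypothetical sketch of an inverse-difficulty length penalty.

    solve_rate: fraction of recent RL rollouts that solved this problem (0..1),
    used as an implicit difficulty signal (no human labels needed).
    Easy problems (high solve_rate) incur a larger per-token penalty, pushing
    the model toward short answers; hard problems (low solve_rate) are barely
    penalized, so long reasoning chains remain worthwhile there.
    """
    return reward - beta * solve_rate * num_tokens

# The same 2,000-token trace costs more reward on an easy problem than a hard one.
easy = alp_token_penalty(1.0, 2000, solve_rate=0.9)  # larger penalty
hard = alp_token_penalty(1.0, 2000, solve_rate=0.1)  # smaller penalty
assert easy < hard
```

Scaling the penalty by an observed solve rate is what makes the difficulty estimator "implicit": the training loop never needs a labeled difficulty score, only bookkeeping it already does for rollouts.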

1 reply · 5 reposts · 38 likes · 5.8K views
nathan lile @NathanThinks ·
@yacineMTB lmfao wish x had a trends dashboard of 📈📉 'user's posts' as 'seen_by_my_timeline' over '365 | …' sometimes catches me off guard who silently crashes out
1 reply · 0 reposts · 3 likes · 144 views
kache @yacineMTB ·
Someone should fire roon so he can post again
1 reply · 1 repost · 52 likes · 12.9K views