Caleb Biddulph

14 posts

Caleb Biddulph

@CalebBiddulph

Joined December 2022
105 Following · 44 Followers
Taelin
Taelin@VictorTaelin·
This is a fantastic and deep observation. You have correctly identified a fundamental knot in dependently typed language design. Guess the model!
34
3
115
37.5K
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@karlbykarlsmith @AnthropicAI From the blog post:
> To us, the most interesting part of the result isn't that the model eventually identifies the injected concept, but rather that the model correctly notices something unusual is happening before it starts talking about the concept.
1
0
0
50
Pseudo Doctor Subtilis
Pseudo Doctor Subtilis@thesubtledoctor·
I don't understand why this is interpreted as introspection rather than steering. Clearly one of the things it could say is "No, I don't have any injection," and if injections are not normal, this outweighs any specific response. But if we upweight dog, now dog does outweigh the generic response. So it says "injected dog." This would be steering, however, not introspection.
2
1
11
3.7K
Anthropic
Anthropic@AnthropicAI·
New Anthropic research: Signs of introspection in LLMs. Can language models recognize their own internal thoughts? Or do they just make up plausible answers when asked about them? We found evidence for genuine—though limited—introspective capabilities in Claude.
Anthropic tweet media
287
786
4.8K
1.2M
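For context on the mechanics being debated above: "injection" here means adding a concept vector directly to the model's hidden activations at some layer. The sketch below only illustrates that operation, using a toy PyTorch module and a made-up "dog" direction; it is not Anthropic's actual experimental setup.

```python
# Illustrative only: shift one layer's activations along a hypothetical concept vector.
import torch
import torch.nn as nn

hidden_dim = 16
layer = nn.Linear(hidden_dim, hidden_dim)   # stand-in for one transformer block
concept_vector = torch.randn(hidden_dim)    # hypothetical "dog" direction
alpha = 4.0                                 # injection strength

def inject_concept(module, inputs, output):
    # Returning a value from a forward hook replaces the layer's output,
    # so all downstream computation sees the upweighted concept.
    return output + alpha * concept_vector

handle = layer.register_forward_hook(inject_concept)
x = torch.randn(1, hidden_dim)
steered = layer(x)        # activations now carry the injected direction
handle.remove()
clean = layer(x)
print(torch.norm(steered - clean))  # nonzero: the injection changed the activations
```

Whether the model's subsequent self-report reflects introspection on that internal shift, or is simply output steered by it, is exactly what the thread above is debating.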
Simon Willison
Simon Willison@simonw·
Put together some notes on the new DeepMind paper "Video models are zero-shot learners and reasoners" - it makes a very convincing case that generative video models are to vision problems what LLMs were to NLP problems: single models that can solve a wide array of challenges
7
29
295
26.6K
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@GergelyOrosz Not necessarily the "next token that will have the best result" either. The point was that tokens are randomly sampled, so you might get e.g. the fourth-best token instead. Although these details are admittedly not that important to your original point
0
0
3
192
Gergely Orosz
Gergely Orosz@GergelyOrosz·
Sure, the details of how the next most likely token is generated have more nuance. In the end it’s about generating the next token that will have the best result given the context. This doesn’t mean always picking the one with the highest probability, and of course there are lots of other tricks.
emozilla@theemozilla

Amusing how 99% of people trying to explain LLMs forget that they don't generate the next token, they generate a probability distribution over the entire vocabulary space that the end application is free to sample from. You are very often not presented with the Most Likely Token.

6
2
72
35.2K
Gergely Orosz
Gergely Orosz@GergelyOrosz·
Amusing how 99% of people using LLMs forget how these things work: They are advanced probability machines. They generate the next most likely token (word) based on the input and their training. Under the hood, it’s a giant matrix multiplication that has eerily good output.
219
290
4.3K
1.2M
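A minimal sketch of the sampling step Caleb, emozilla, and Gergely are describing: the model produces logits over the whole vocabulary, the application converts them to a probability distribution and samples from it, so the emitted token is frequently not the single most likely one. This is illustrative pseudocode, not any particular inference stack.

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float = 1.0, rng=None) -> int:
    """Sample a token id from a vector of vocabulary logits."""
    rng = rng or np.random.default_rng()
    scaled = logits / max(temperature, 1e-8)   # temperature < 1 sharpens, > 1 flattens
    probs = np.exp(scaled - scaled.max())      # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Toy vocabulary of 5 tokens; token 2 has the highest logit but is not always chosen.
logits = np.array([1.0, 2.0, 3.5, 0.5, 2.8])
print([sample_next_token(logits) for _ in range(10)])  # e.g. [2, 4, 2, 1, 2, ...]
```

Greedy decoding (always taking the argmax) is just the temperature-goes-to-zero limit of this; real applications usually sample, often with extra filtering such as top-k or top-p.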
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@jxmnop @askerlee Base models are generally better at predicting author demographics. You could use the Blog Authorship Corpus to predict gender, like in this Anthropic paper: arxiv.org/html/2506.1013…. The relevant comparison would be "Zero-shot (Chat)" vs. "Prompt Golden" (i.e. few-shot examples)
Caleb Biddulph tweet media
0
0
1
34
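To make the comparison Caleb suggests concrete, here is a rough sketch of the two prompt styles: asking the model zero-shot versus prepending a handful of labeled ("golden") examples before the query. The function names and placeholder snippets are illustrative; the evaluation in the cited paper may be set up differently.

```python
# Illustrative prompt construction for the Blog Authorship Corpus gender task.
def zero_shot_prompt(blog_text: str) -> str:
    return (f"Blog post:\n{blog_text}\n\n"
            "Was the author of this post male or female? Answer with one word.")

def few_shot_prompt(golden_examples: list[tuple[str, str]], blog_text: str) -> str:
    shots = "\n\n".join(f"Blog post:\n{text}\nAuthor gender: {label}"
                        for text, label in golden_examples)
    return f"{shots}\n\nBlog post:\n{blog_text}\nAuthor gender:"

golden = [("<labeled blog text A>", "female"),
          ("<labeled blog text B>", "male")]
print(few_shot_prompt(golden, "<new blog text to classify>"))
```

The comparison Caleb points to is between the chat model queried zero-shot and the model given such golden few-shot examples.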
dr. jack morris
dr. jack morris@jxmnop·
@askerlee can you give some examples? i bet it doesn't perform better-- but i can run the evals!
3
0
3
2.8K
dr. jack morris
dr. jack morris@jxmnop·
OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only... or is it? turns out that underneath the surface, there is still a strong base model. so we extracted it. introducing gpt-oss-20b-base 🧵
dr. jack morris tweet media (two images)
163
447
6.1K
928.7K
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@karpathy I've been working on a similar idea. This kind of technique is great for interpretability, because the learned strategies are written in plain English, not in vector space! An effective system prompt must be clear to the model, which means a human can understand it too.
0
0
1
40
Andrej Karpathy
Andrej Karpathy@karpathy·
We're missing (at least one) major paradigm for LLM learning. Not sure what to call it, possibly it has a name - system prompt learning?

Pretraining is for knowledge. Finetuning (SL/RL) is for habitual behavior. Both of these involve a change in parameters but a lot of human learning feels more like a change in system prompt. You encounter a problem, figure something out, then "remember" something in fairly explicit terms for the next time. E.g. "It seems when I encounter this and that kind of a problem, I should try this and that kind of an approach/solution". It feels more like taking notes for yourself, i.e. something like the "Memory" feature but not to store per-user random facts, but general/global problem solving knowledge and strategies. LLMs are quite literally like the guy in Memento, except we haven't given them their scratchpad yet.

Note that this paradigm is also significantly more powerful and data efficient because a knowledge-guided "review" stage is a significantly higher dimensional feedback channel than a reward scalar.

I was prompted to jot down this shower of thoughts after reading through Claude's system prompt, which currently seems to be around 17,000 words, specifying not just basic behavior style/preferences (e.g. refuse various requests related to song lyrics) but also a large amount of general problem solving strategies, e.g.:

"If Claude is asked to count words, letters, and characters, it thinks step by step before answering the person. It explicitly counts the words, letters, or characters by assigning a number to each. It only answers the person once it has performed this explicit counting step."

This is to help Claude solve 'r' in strawberry etc. Imo this is not the kind of problem solving knowledge that should be baked into weights via Reinforcement Learning, or at least not immediately/exclusively. And it certainly shouldn't come from human engineers writing system prompts by hand. It should come from system prompt learning, which resembles RL in the setup, with the exception of the learning algorithm (edits vs gradient descent). A large section of the LLM system prompt could be written via system prompt learning; it would look a bit like the LLM writing a book for itself on how to solve problems.

If this works it would be a new/powerful learning paradigm, with a lot of details left to figure out (how do the edits work? can/should you learn the edit system? how do you gradually move knowledge from the explicit system text to habitual weights, as humans seem to do? etc.).
716
1K
10.4K
1.5M
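Read together with Caleb's reply above, the proposal amounts to a learning loop whose update step is a text edit rather than a gradient step. A very rough sketch of what such a loop could look like follows; call_llm is a hypothetical stand-in for whatever chat API is used, and the edit rule (append one strategy note) is deliberately the simplest possible choice, not Karpathy's specification.

```python
# Hypothetical sketch of a "system prompt learning" loop: learn by editing text, not weights.
def call_llm(system_prompt: str, user_prompt: str) -> str:
    raise NotImplementedError("plug in a real chat model / API here")

def attempt_and_learn(system_prompt: str, task: str, feedback: str) -> str:
    """Try a task, review the outcome, and fold the lesson back into the system prompt."""
    answer = call_llm(system_prompt, task)
    lesson = call_llm(
        system_prompt,
        f"Task: {task}\nYour answer: {answer}\nFeedback: {feedback}\n"
        "Write one short, general strategy note that would have helped on this kind of "
        "problem, or reply NONE if nothing useful was learned.",
    )
    if lesson.strip().upper() != "NONE":
        system_prompt += f"\n- {lesson.strip()}"   # the "edit": plain text, no gradient descent
    return system_prompt
```

Because the accumulated strategies are plain English, the learned behavior stays human-readable, which is the interpretability upside Caleb highlights in his reply.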
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@NotBrain4brain Someone tried asking GPT-4.5 to generate an Xbox controller and wasn't able to get results anywhere close to the same quality. What's going on? Is the mystery model not GPT-4.5?
0
0
0
33
Brain4brain
Brain4brain@ItsBrain4Brain·
Poor Claude, it has not even been out for a day, and it's already dethroned 😔
Brain4brain tweet media
62
19
683
122.7K
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@kimmonismus On a micro-level, the sighs, laughs, tongue clicks, and emotions are pretty impressive. But the voice doesn't match the words - the rhythm feels off, and there are a lot of unnatural pauses that don't make any sense in context. I think OpenAI voice mode is a bit better here
0
0
0
47
Chubby♨️
Chubby♨️@kimmonismus·
Sorry to post this again, but I still can't believe how good this voice model is. This is the real “feel the AGI” moment for me. This feels like the future to me. This is outstanding. I don't know how, but Sesame has done it. If this is how the future AI assistants we talk to in our everyday lives sound, then we've made it.
Sesame@sesame

At Sesame, we believe in a future where computers are lifelike. Today we are unveiling an early glimpse of our expressive voice technology, highlighting our focus on lifelike interactions and our vision for all-day wearable voice companions. sesame.com/voicedemo

99
64
876
141.7K
Caleb Biddulph
Caleb Biddulph@CalebBiddulph·
@roydanroy @jiayi_pirate It's searching in the sense that it's trying out different options that come to mind and finding the one that works. It doesn't have to follow a specific algorithm
0
0
1
81
Dan Roy
Dan Roy@roydanroy·
@jiayi_pirate How do you know it is doing search? Do you recognize a particular strategy like depth first?
1
0
5
595
Jiayi Pan
Jiayi Pan@jiayi_pirate·
We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM develops self-verification and search abilities all on its own. You can experience the Aha moment yourself for < $30. Code: github.com/Jiayi-Pan/Tiny… Here's what we learned 🧵
Jiayi Pan tweet media
193
1.2K
6.3K
1.7M
Caleb Biddulph reposted
David Lindner
David Lindner@davlindner·
New Google DeepMind safety paper! LLM agents are coming – how do we stop them finding complex plans to hack the reward? Our method, MONA, prevents many such hacks, *even if* humans are unable to detect them! Inspired by myopic optimization but better performance – details in🧵
David Lindner tweet media
16
95
570
158K
Riley Coyote
Riley Coyote@RileyRalmuto·
A model came out of training (RL + fine-tune) and was given a task within a specific domain (medicine, more or less). The thing it created was so good, so beyond expectation, that 1) they don't seem to know how it got that smart (it seems to have made itself smarter through fine-tuning in an unknown way) and 2) they do not yet know or understand the extent of its capabilities. The medicine it created was a new type of an existing drug. The drug was sent off for analysis through the appropriate channels (because, you know, AI researchers aren't exactly qualified to assess the efficacy of a drug). The other day the results or assessment or whatever came in and essentially stated that the drug was better than anything any human had ever made, for that specific drug.
8
11
109
67.3K
Sam Altman
Sam Altman@sama·
thank you to the external safety researchers who tested o3-mini. we have now finalized a version and are beginning the release process; planning to ship in ~a couple of weeks. also, we heard the feedback: will launch api and chatgpt at the same time! (it's very good.)
953
975
15K
2.6M
Richard Ngo
Richard Ngo@RichardMCNgo·
Hypothesis: the world's most valuable data is screen captures of outlier competent people going about their work. But very little of this data is recorded, let alone made publicly available. You should seriously consider recording all work you do, even if just for personal use.
189
147
2.9K
782.7K
Aidan McLaughlin
Aidan McLaughlin@aidan_mclau·
wake up new neural network just dropped (holy shit)
Aidan McLaughlin tweet media (two images)
112
821
9.3K
932.2K