Elle Najt

11 posts

@latentfool

nyc / sf · Joined January 2022
44 Following · 20 Followers
Elle Najt@latentfool·
@lumpenspace @AsaCoopStick CoT uncontrollability implies agents can't prevent schemes from appearing in CoT, bc if they cannot do CoT in lowercase they cannot do obfuscated CoT. But they can reason in the response in lowercase, and we show that the reasoning can be moved there at low accuracy cost...
Elle Najt@latentfool·
@AsaCoopStick Improved plot + prompt for this one, showing that GPT-5.4 can reason about the decoy topic and still reason in the response.
Elle Najt tweet media
Asa Cooper Stickland@AsaCoopStick·
These early-exit reasoning traces would be trivially detectable by a simple length check. But we find models can fill their CoT with a plausible-looking decoy topic (discussing whether there are more circles or squares in the world), while displacing real reasoning into their outputs.
Asa Cooper Stickland tweet media
Asa Cooper Stickland@AsaCoopStick·
New post! OpenAI/Yueh-Han et al. (2026) find models struggle to control their CoT -- good news for monitoring. But we show that GPT 5.4, Gemini 3.1 Pro, and Claude Opus 4.6 can all “early exit” their CoT and reason in their outputs instead (with reasoning-mode on!).
Asa Cooper Stickland tweet media
Elle Najt@latentfool·
@GenReasoning Could it just be that this market is already very efficient, and there's no alpha in the data you force the models to use? Is there a (non-overfit) strategy that makes money using the provided data?
General Reasoning@GenReasoning·
🎲 Introducing KellyBench, a new long-horizon evaluation for frontier models. KellyBench evaluates models within a year long sports betting market, a challenging and highly non-stationary environment. Every frontier model we test loses money. They struggle to design ML strategies, manage risk, and adapt as the world changes. Link and thread below.
General Reasoning tweet media
Elle Najt@latentfool·
@FazlBarez @amang0112358 I'm also confused about the choice to boost refusals in the score, since this makes the correlation between PAC and refusal rate stronger.
Elle Najt tweet media
Elle Najt@latentfool·
@FazlBarez @amang0112358 I'm confused about the treatment of refusals and how that is inflating the PAC scores for some models; the .91 rank corr. is striking. Did you try filtering out the refusals or resampling and looking at the impact on the PAC score? (Is your data with all responses available on HF?)
Fazl Barez@FazlBarez·
New paper: 🧭 Introducing VAL-Bench: Measuring Value Alignment in Language Models. A benchmark that measures the consistency in language model expression of human values when prompted to justify opposing positions on real-life issues. Work with @amang0112358 and Denny O'Shea! (1/7)
Fazl Barez tweet media
Elle Najt@latentfool·
@docmilanfar remarkable how much simpler the modern proof is -- big win for abstractions like Jensen's + the optimization definition of the median getting identified and into the water supply. otoh you do lose all these cute bespoke proofs that give nice intuitions
Peyman Milanfar@docmilanfar·
surprising fact: for any random variable X,
|mean(X) - median(X)| ≤ std(X)
It's not hard to show:
|μ - m| = |E(X - m)| ≤ E|X - m| ≤ E|X - μ| = E√((X - μ)²) ≤ √(E(X - μ)²) = σ
Originally appeared in a paper by Hotelling & Solomons, Annals of Mathematical Statistics, 1932.
Peyman Milanfar tweet mediaPeyman Milanfar tweet media
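A quick numerical check of the chain above can make each step concrete. The sketch below is illustrative only and not part of the original thread: it treats a finite sample as the whole distribution (so the population standard deviation is the right σ), and the helper names `chain` and `holds` are made up for this example. The second inequality is the "median minimizes E|X - c|" step and the last one is Jensen's inequality.

```python
import random
import statistics


def chain(xs):
    """Return (|mu - m|, E|X - m|, E|X - mu|, sigma) for the empirical
    distribution of the sample xs (each point has weight 1/n)."""
    mu = statistics.fmean(xs)
    m = statistics.median(xs)
    gap = abs(mu - m)                                        # |mu - m|
    mad_median = statistics.fmean([abs(x - m) for x in xs])  # E|X - m|
    mad_mean = statistics.fmean([abs(x - mu) for x in xs])   # E|X - mu|
    sigma = statistics.pstdev(xs)                            # population std
    return gap, mad_median, mad_mean, sigma


def holds(xs, tol=1e-9):
    """Check gap <= E|X - m| <= E|X - mu| <= sigma, up to float tolerance."""
    gap, mad_median, mad_mean, sigma = chain(xs)
    return (gap <= mad_median + tol
            and mad_median <= mad_mean + tol
            and mad_mean <= sigma + tol)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    for _ in range(1000):
        # Skewed samples, where mean and median are far apart.
        sample = [rng.expovariate(1.0) for _ in range(rng.randint(2, 50))]
        assert holds(sample)
    print(chain([0, 0, 0, 10]))
```

A skewed toy sample such as `[0, 0, 0, 10]` shows the steps are not all tight: the first two quantities coincide at 2.5, while the mean absolute deviation about the mean and the standard deviation are strictly larger.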
Some theorems@CihanPostsThms·
Let A be a commutative Noetherian ring with Krull dimension ≥ 2. Then for every integer N, there is an ideal of A that cannot be generated by N elements.