Max Niederman

354 posts

@MaxNiederman

Head of Quality @ Mechanize | 19M

San Francisco, CA · Joined October 2019
192 Following · 117 Followers
Max Niederman
Max Niederman@MaxNiederman·
The AI labs are really just a front for Big Em Dash.
0
0
5
93
Max Niederman
Max Niederman@MaxNiederman·
@GuiveAssadi @dfrsrchtwts You’re definitely not supposed to spoof this, but it’s a very frequently violated norm. Most web scrapers are at least this dishonest.
1
0
3
53
Daniel Filan
Daniel Filan@dfrsrchtwts·
My biggest gripe with persona selection stuff (or maybe just the pop version) is that it feels like it doesn't grapple with the way LLMs are actually pretty unlike any human persona.
4
1
40
2.9K
Tom Kelly ケリー・トム
@MaxNiederman @tomieinlove Banknotes were originally much the same. They weren’t standardised by region so there was no way of knowing if it was legitimate. They basically just promised you’d get gold of equivalent value if you went to their bank.
1
0
2
218
tomie
tomie@tomieinlove·
Astonishing that we could ever have a system like checks. We gave everyone the ability to write themselves arbitrary-denomination banknotes on the hope—with no way of checking—they’ll have enough funds to cover it later on. How did this not immediately collapse under fraud?
164
8
1.9K
175.9K
Pete Cawley
Pete Cawley@corsix·
Speaking as someone who studied joint mathematics and computer science, this right here is how you tell apart the mathematicians from the computer scientists.
[Image attached]
86
23
859
259.5K
Max Niederman
Max Niederman@MaxNiederman·
More people should know just how bad all public SWE evals are. METR Time Horizons was better but now it’s saturated too. I’m not aware of any truly good public SWE evals atm.
Joel Becker@joel_bkr

new @METR_Evals research note from @whitfill_parker, @cherylwoooo, nate rush, and me. (chiefly parker!) we find that *half* of SWE-bench Verified solutions from Sonnet 3.5-to-4.5 generation AIs *which are graded as passing* are rejected by project maintainers.

2
2
38
4.8K
Max Niederman
Max Niederman@MaxNiederman·
@dfrsrchtwts @1a3orn I think the problem is that “emergent” is vague enough that you can’t replace “misalignment” with other properties that generalize across training domains in this way. “misalignment from cross-domain generalization” would be better, but idk how to make it catchy.
0
0
1
36
1a3orn
1a3orn@1a3orn·
has anyone tried to justify "emergent misalignment" as a specific term rather than just "personal selection model" or some other less morally loaded frame? it seems to me either entirely historically accidental, or part of AI safety community's preference for frightening terms
14
1
38
3K
Guive Assadi
Guive Assadi@GuiveAssadi·
@MaxNiederman I think this is an idea that makes sense, but it is also not a good descriptive definition of how this term is actually used.
2
0
2
151
Guive Assadi
Guive Assadi@GuiveAssadi·
The idea of "luxury beliefs" makes no sense. My opinions about policing don't affect anything. Neither do yours. Poor people can absorb the costs of irrational political beliefs just as easily as rich people, because those costs are zero.
6
1
44
2.4K
ish.exe
ish.exe@ishtwts·
Devs where do you buy your domains?
- GoDaddy
- Hostinger
- Dynadot
- Namecheap
31
1
24
9.2K
Max Niederman retweeted
typedfemale
typedfemale@typedfemale·
what the military had access to was a claude with the us constitution prepended to any prompt they asked - you can see why this pissed them off so much
7
44
1.4K
60K
Max Niederman
Max Niederman@MaxNiederman·
@sichuan_mala I don’t think so, but that bet seems better to me than a simple acquisition play.
0
0
3
360
四川麻辣燙
四川麻辣燙@sichuan_mala·
A friend is considering SPVing into Cursor — what’s the play here, aren’t they just fully deprecated by Codex / CC? Is it purely an acquisition bet?
8
1
20
24.8K
Max Niederman
Max Niederman@MaxNiederman·
@yacineMTB large companies are often dysfunctional, but it’s not generally because no smart people work there. it’s just really really hard to coordinate large groups of people
0
0
20
1.4K
kache
kache@yacineMTB·
literally everything that you see done by "large" companies for their products is actually laughably easy. like, yes, they have PhDs and what not. but here's what they don't tell you. their PhD engineers are retarded. and you're smarter
39
24
964
35.3K
Max Niederman retweeted
“paula”
“paula”@paularambles·
sf is an extremely walkable city and i will die on this hill
497
937
22.4K
911.8K
EigenGender 🔸
EigenGender 🔸@EigenGender·
you get a black box that can find a proof to any lean theorem you formalize (or state that no such proof exists). can you do anything useful with it?
14
0
22
3.3K
Max Niederman
Max Niederman@MaxNiederman·
@EigenGender You can use this to compute the result of any computation in time proportional to the number of bits in the output.
0
0
1
126
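
A rough sketch of the idea in the reply above, assuming the black box can be asked one yes/no question per output bit ("is bit i of the result 1?"). Everything here is hypothetical: `mock_oracle` just evaluates a Python predicate and stands in for a real proof search, and `compute_via_oracle` is an illustrative name, not anything from the thread.

```python
from typing import Callable

# A "statement" is modeled as a zero-argument predicate. A real oracle would
# take a formalized Lean proposition instead; this is only a stand-in.
Statement = Callable[[], bool]


def mock_oracle(statement: Statement) -> bool:
    """Hypothetical black box: True if the statement is provable.

    Here it cheats by evaluating the predicate directly.
    """
    return statement()


def compute_via_oracle(
    oracle: Callable[[Statement], bool],
    computation: Callable[[], int],
    n_bits: int,
) -> int:
    """Recover an n_bits-wide non-negative result with one query per bit."""
    result = 0
    for i in range(n_bits):
        # Statement i: "bit i of the computation's output is 1".
        # The default argument pins the current value of i in the closure.
        if oracle(lambda i=i: (computation() >> i) & 1 == 1):
            result |= 1 << i
    return result


# Usage: recover 3 ** 5, which fits in 8 bits, using 8 oracle queries.
value = compute_via_oracle(mock_oracle, lambda: 3 ** 5, 8)
```

The number of queries depends only on the width of the output, not on how long the computation itself would take to run, which is the point of the claim. The sketch assumes an upper bound on the output width is known in advance.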