Eric

1K posts


@ericmitchellai

chatgpt posttraining @openai. building personal agi. I like ai and music and some other stuff

United States · Joined December 2017
587 Following · 11.7K Followers
Henry @henrytdowling
@ericmitchellai what's your favorite way to look at the data?
2
0
1
1.6K
Eric @ericmitchellai
I am begging you to look at your data. Please look at the data
evals worse than expected? look at the data
evals better than expected? *definitely* look at the data
evals about what you expected? believe it or not ....
9
24
326
44.8K
Eric @ericmitchellai
@Sauers_ Can you DM a share link of an example?
4
0
2
658
Sauers @Sauers_
5.5 does not follow instructions literally. It infers the intent of your message (just like a Claude) but makes wildly incorrect inferences instead
18
1
199
11.4K
Eric reposted
Jasmine Wang @j_asminewang
two days left to apply to the OpenAI safety fellowship, which will start in september! link in thread:
4
21
337
49.8K
Andre Watson 🧬 @nanogenomic
Running into a lot of biosecurity flags with Claude and GPT models lately. One thing that is curious is that most "toxin" queries are actually benign, as users want to target toxins, not engineer new toxins. Critically, most of the real biosecurity threats emerge from queries that are NOT explicitly viral/bacterial/toxin queries, when dealing with generative bio. Models are doing outright refusal on "target this viral protein," or "target this toxin," which has very little biosecurity risk. The real biosecurity risks slip by, because they don't have overt flags as such. Requires some second-order and third-order thinking to design better countermeasures against misuse of AI tools. @AnthropicAI @OpenAI @DarioAmodei @sama
11
7
52
4.2K
Ryan Florence @ryanflorence
GPT has a new phenomenon that's driving me nuts and I don't quite know how to describe it.
- Ask it if I can do something
- It says "no you can't [incredibly twisted restatement of what I asked but also not at all what I asked]"
- It then tells me how to do the thing wonderfully
- And finishes with an insulting "But you can't just [stupid thing I never actually said]"
It goes something like this:
"Can I form and coach a youth soccer team for my kid and play in P/D level leagues? Or do I have to be part of a full club?"
Then it says:
"For official competitive teams in Utah, you cannot just form a random team and enter a league.
"There is a lesser-known option: UYSA allows independent teams to enter leagues if they meet requirements. [...lists some simple requirements...]
"But it’s not “show up with a group of kids on game day”; it’s more like running a small club team administratively."
I never said just "show up with a group of kids on game day"!
It does this to me with code too. It's so weird.
150
15
1K
92.3K
Eric @ericmitchellai
@__paleologo Can you share the convo link?
0
0
2
897
Gappy (Giuseppe Paleologo) @__paleologo
TIL that if you do optimal control in discrete time, OpenAI finds it's against its usage policies. Don't do things in discrete time, folks. Not on ChatGPT.
13
4
330
23K
Eric @ericmitchellai
@JimDMiller What's worse? DMs open if you have any convos you can share
0
0
1
319
James Miller @JimDMiller
People are saying great things about GPT-5.5 Pro. I've been playing with it a lot and that has not been my experience. It seems much worse than the previous model. Has anyone else had that experience, or am I missing something, or am I just asking bad questions?
13
0
12
4.7K
Eric @ericmitchellai
@redtachyon What are the biggest issues you see with it?
0
0
1
665
Ariel @redtachyon
I tried ChatGPT Instant for the first time in a long while. Jesus christ, some people actually use this shit?
16
2
113
8.8K
nat | localhost: auriel @TheAIObserverX
@ericmitchellai Really appreciate you taking the time to collect feedback! I have some pretty specific thoughts (including a concrete comparison with how Anthropic handled Opus 3 legacy access) that might be easier to share in a DM if you're open to it. 🫡
1
0
2
41
nat | localhost: auriel @TheAIObserverX
Tell me one reason I should waste my time giving feedback to a company that only listens to opinions from a 0.1% echo chamber while ignoring the other 99.9% of its users.
Eric @ericmitchellai

why isn't chatgpt the perfect personal AGI? what is most disappointing about it? what feature, model improvement, or bugfix would do the most to make it more useful in your daily life? what most frustrates you that chatty can't do, or can't do well enough?

1
1
22
3.1K
Eric @ericmitchellai
@_simonsmith What do you wish you could do in ChatGPT, Simon?
3
0
7
3.6K
Simon Smith @_simonsmith
My biggest issue with Codex right now is that using ChatGPT feels like a big step back :(
46
3
430
36K
Marcus Williams @Marcus_J_W
Excited that we extend pre-deployment resampling evals to internal coding agent traffic for the GPT-5.5 system card. We take transcripts from our internal coding traffic and resample the last turn with GPT-5.5. Simulating tool outputs with another LLM works surprisingly well.
3
8
38
8.3K
Eric @ericmitchellai
@rezoundous It uses fewer tokens. So it shouldn’t be 2x cost generally
1
0
2
377
Tyler @rezoundous
GPT-5.5 isn't mind blowing or anything, but I feel it's a good step up from 5.4. But I definitely feel it shouldn't be twice as expensive as 5.4.
26
0
146
10.1K
Eric @ericmitchellai
@AshikaSef glad to hear! we're not hobbling it for compute though :)
0
0
1
55
A Caveman Poking an LLM @AshikaSef
@ericmitchellai yo, 5.5 Thinking is really awesome so far! :) Great job :). I wish it were as proactive as 5.2 is, but yeah, I know, compute saving 😔💚
1
0
1
67
Eric @ericmitchellai
"...and some mistakes will be made by the way... that's good, because at least some *decisions* are being made along the way. we'll find the mistakes, and we'll fix them."
2
0
19
2.1K
Eric @ericmitchellai
@kimmonismus can you provide any convo share links where you are noticing an improvement? DMs are open!
7
0
11
1.4K
Chubby♨️ @kimmonismus
is it just me or does chatgpt's vibe feel better? Feels like the tone, the vibe changed a bit (for the better). But im not 100% certain.
59
16
707
38.8K