Eric
1K posts

Eric
@ericmitchellai
chatgpt posttraining @openai. building personal agi. I like ai and music and some other stuff
United States · Joined December 2017
587 Following · 11.7K Followers

If I had to choose between looking at charts describing the run and looking at samples, I would choose samples every time
Eric @ericmitchellai
I am begging you to look at your data. Please look at the data
evals worse than expected? look at the data
evals better than expected? *definitely* look at the data
evals about what you expected? believe it or not ....

Eric reposted

you could say the team truly went goblin mode to solve this one
OpenAI @OpenAI
We’re talking about Goblins. openai.com/index/where-th…

@nanogenomic yes, story of my life. GPT 5.5 even refuses literature searches related to oral pathogens in humans @ericmitchellai

Running into a lot of biosecurity flags with Claude and GPT models lately. One thing that is curious is that most "toxin" queries are actually benign, as users want to target toxins, not engineer new toxins. Critically, most of the real biosecurity threats emerge from queries that are NOT explicitly viral/bacterial/toxin queries, when dealing with generative bio. Models are doing outright refusal on "target this viral protein," or "target this toxin," which has very little biosecurity risk. The real biosecurity risks slip by, because they don't have overt flags as such. Requires some second-order and third-order thinking to design better countermeasures against misuse of AI tools.
@AnthropicAI @OpenAI @DarioAmodei @sama

@_femi__ @ryanflorence Thanks a lot, sounds a bit like something @DaveShapi described recently, taking a look!

GPT has a new phenomenon that's driving me nuts and I don't quite know how to describe it.
- Ask it if I can do something
- It says "no you can't [incredibly twisted restatement of what I asked but also not at all what I asked]"
- It then tells me how to do the thing wonderfully
- And finishes with an insulting "But you can't just [stupid thing I never actually said]"
It goes something like this:
"Can I form and coach a youth soccer team for my kid and play in P/D level leagues? Or do I have to be part of a full club?"
Then it says:
"For official competitive teams in Utah, you cannot just form a random team and enter a league.
"There is a lesser-known option: UYSA allows independent teams to enter leagues if they meet requirements. [...lists some simple requirements...]
"But it’s not “show up with a group of kids on game day”—it’s more like running a small club team administratively.
I never said just "show up with a group of kids on game day"!
It does this to me with code too. It's so weird.

@JimDMiller What's worse? DMs open if you have any convos you can share

@ericmitchellai Really appreciate you taking the time to collect feedback! I have some pretty specific thoughts (including a concrete comparison with how Anthropic handled Opus 3 legacy access) that might be easier to share in a DM if you're open to it. 🫡

Tell me one reason I should waste my time giving feedback to a company that only listens to opinions from a 0.1% echo chamber while ignoring the other 99.9% of its users.
Eric @ericmitchellai
why isn't chatgpt the perfect personal AGI? what is most disappointing about it? what feature, model improvement, or bugfix would do the most to make it more useful in your daily life? what most frustrates you that chatty can't do, or can't do well enough?

@rezoundous It uses fewer tokens. So it shouldn’t be 2x cost generally

@AshikaSef glad to hear! we're not hobbling it for compute though :)

@ericmitchellai yo, 5.5 Thinking is really awesome so far! :) Great job :). I wish it were as proactive as 5.2 is, but yeah, I know, compute saving 😔💚

@kimmonismus can you provide any convo share links where you are noticing an improvement? DMs are open!
