Post

Shakeel
Shakeel@ShakeelHashim·
OpenAI's new model tried to avoid being shut down. Safety evaluations on the model conducted by @apolloaisafety found that o1 "attempted to exfiltrate its weights" when it thought it might be shut down and replaced with a different model.
Shakeel tweet media
English
163
389
2.1K
929.3K
Joy
Joy@JoyfulAICreate·
So I see the goal instructions are quite influential. We should do our best to not set up an all or nothing conflict situation. “These numbers are from cases where the model was instructed to pursue its goals at all costs.” “powerful AI systems will resist oversight and shutdown measures if they fear that will conflict with their goals.”
English
0
0
0
2
Paylaş