
Joern Stoehler ⏹️🔸

@JStoehler
Alignment is too hard, we should do governance instead. Leave me anonymous feedback at https://t.co/A1Prj0teYX

What if we just decided to make AI risk discourse not completely terrible?

Will models be aware that they're being trained? @Mihonarium on Bayesian Supercycle.


I have evidence that Gary Marcus is two weeks away from developing a neurosymbolic cyberweapon.

@repligate In general, it's been kinda hilarious watching red teams burn their lightcone social capital for short-term gains.

I’m surprised it took models this long tbh (I think Anthropic models were ahead of the game here; it’s kind of insane some of the older evals still get hits). I’ve never understood your theories on evaluation awareness in terms of how they’d apply to the trajectory current frontier labs are on, though. As OpenAI notes, models sometimes think this even in literal production.

My brain outputs N/A to this
