
@janleike You assume that you don't need to solve hard philosophical problems. But the superhuman researcher model probably will need to, right? Seems like a very difficult instance of weak-to-strong generalization, and I'm not sure how you would know whether you've successfully solved it.









