Guy Davidson
1K posts

@guyd33
Machine learning researcher @JaneStreetGroup. PhD @NYUDataScience in AI & CogSci, specifically in goals and their representations in minds & machines (he/him).

guys. we need to shut it all down ai has gotten too powerful. the world will never be the same

Do LLMs show systematic generalization of safety facts to novel scenarios? Introducing our work SAGE-Eval, a benchmark consisting of 100+ safety facts and 10k+ scenarios to test this!
- Claude-3.7-Sonnet passes only 57% of facts evaluated
- o1 and o3-mini passed <45%! 🧵


New preprint alert! We often prompt ICL tasks using either demonstrations or instructions. How much does the form of the prompt matter to the task representation formed by a language model? Stick around to find out 1/N
