
Log analysis is necessary for credible evaluation of AI agents
Peter Kirgis, Sayash Kapoor (@sayashk), Stephan Rabanser (@steverab), Nitya Nadgir, Cozmin Ududec (@CUdudec), Magda Dubois (@DubMagda), JJ Allaire (@fly_upside_down), Marius Hobbhahn (@MariusHobbhahn), Jacob Steinhardt (@JacobSteinhar2), Arvind Narayanan (@random_walker)








