
@SamuelHsiang @DevinAI i have no idea what you're talking about!
kylie chang 𓇢𓆸
32 posts

@kylifec
upenn m&t / 🦦@cognition / prev. ai @figma, @kp_fellows



Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is the first to measure: would you actually merge this code?




