
LLMs can do complex math, but struggle with sorting numbers? We announce LLMThinkBench to evaluate how LLMs reason (or overthink) on basic math tasks.
Huge thanks to Dr. Xuan Wang (@xwang174), Dr. Tu Vu (@tuvllms), Dr. Christine Julien (@drcjulien), Aafiya, and Sriram!

