
Shawn Sullivan
1.8K posts

@shawntsullivan
CTO @ https://t.co/Ejx8na6sMZ: GenAI for Edu. Benchmarking AI in Edu. Early reading with Reading Critters. Co-founder/ex-CTO @ Phase Genomics: AI+Genomics. MIT CS, Ex-MSFT

AP exam season is looming! 😰 Feeling the pressure? We tested how well popular AI models can help you get a 5. The results might surprise you 🧵👇 edubench.com/edublog/which-… #APTests #AI #EduTech 1/8

Now live on edubench.com: Quiz Composition benchmarks! Unlike with MCQ Generation, frontier models struggle with most aspects of creating good quizzes. More details soon; for now, head over to edubench.com and see for yourself. What stands out to you?

We're benchmarking AI for education. What does that mean? Let's dive into an example. Suppose you're an AP World History teacher who wants to use an LLM to create a quiz question for your students... 🧵 1/7

We analyzed Claude 3.5 vs 3.7 for creating educational content. The results are surprising... edubench.com/edublog/claude… 1/


🚨 AI in education is at a crossroads. Too many AI tools look helpful but actually mislead students, fail to align with curricula, and give a false sense of competence. We’re launching EduBench to demand real educational impact from AI. 🧵👇

Bringing pixel perfect editing to your @lovable apps (feature preview)

Past five years of OpenAI models vs. the ARC-AGI benchmark
