Arun Bahl (e/reason)
216 posts

@arunbahl
Partner, friend, son, brother, dog dad. CEO at @AloeInc. Cognitive science + AI. Advocate for reason in both humans and machines. Specialization is for insects.






A study finds that, on a standardized critical thinking test, the LLM-integrated group improved more overall, with a notable gain in inductive reasoning. Adding AI to an established pedagogy did not erode critical thinking.
Researchers ran a randomized controlled trial with 100 first-year nursing students, split 50/50 between traditional problem-based learning and an LLM-assisted version. Neither group had prior exposure to problem-based learning or LLMs. Everyone took the California Critical Thinking Skills Test before and after an 8-week, 16-hour course.
After adjusting for starting scores, total critical thinking rose in both groups, but the LLM group improved a bit more, roughly 0.60 points versus 0.50 points, with a p value under 0.01.
The clear standout was inductive reasoning. The LLM group showed a marked jump on questions that ask students to generalize from cases, while other subskills like analysis, inference, evaluation, and deduction were similar between groups.
Course grades did not differ meaningfully, about 77.6 versus 74.3, which suggests the benefit targeted thinking skills rather than test performance.
The likely explanation is straightforward. The assistant can summarize readings quickly, break a messy case into smaller questions, surface overlooked details, and propose alternative solutions that students can compare to their own, which trains pattern recognition.
There is a tradeoff. When the assistant helps structure problems, students may do less slow, step-by-step analysis, which fits the flat results on the deductive and evaluative subscales.
Overall, pairing an LLM with problem-based learning nudged critical thinking up, and the biggest lift was in pattern-building skills.
---
journals. lww. com/nurseeducatoronline/fulltext/2025/07000/randomized_controlled_study_on_the_impact_of.15.aspx


Aloe builds itself. We are now state-of-the-art on the GAIA benchmark of generalist AIs, beating OpenAI, Manus, and Genspark by a wide margin. How? Like other AIs, Aloe uses tools. Unlike other AIs, if Aloe doesn't have the right tool for the job, it creates a new one first. This is composable program synthesis, and it's a new species of AI (we nicknamed it 𝘈𝘭𝘰𝘦 𝘩𝘢𝘣𝘪𝘭𝘪𝘴): when one Aloe increases its capability, they all get more capable together as they share tools and use them to create even better tools. Aloe’s lead over other systems is highest on the most difficult scenarios - there’s plenty of headroom to expand. We are just getting started – this is the floor of what we can do.

This is the post you want to read about Zuck’s Superintelligence Memo. @om breaks it down and teaches us a thing or two about corporate communications.

























