Rohan Paul@rohanpaul_ai
New @Microsoft paper teaches LLMs to organize reasoning into concurrent subtasks for faster, more accurate answers.
It shows 28% lower wait time than typical parallel thinking while also boosting math accuracy.
The big deal is simple: it turns coordination into a skill the model learns, so it decides when to split work, when to wait, and when to merge.
The usual single chain wastes time because each step blocks the next.
Fixed parallel plans also waste time because they cannot adapt to each query.
The fix is an organizer that writes simple Fork and Join tags to start and merge worker thoughts.
Workers chase sub-queries in parallel while the organizer keeps thinking and only pauses to Join.
All control lives in plain text, so the base model stays unchanged.
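Since the control signal is just tags in the text stream, the runtime only has to parse them and dispatch workers. A minimal sketch of that loop, assuming a hypothetical tag syntax and a stub worker call (the paper's exact format may differ):

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tag syntax; the paper's actual Fork/Join format may differ.
FORK = re.compile(r"<Fork id=(\w+)>(.*?)</Fork>", re.S)

def run_worker(subquery: str) -> str:
    # Placeholder for a worker LLM call on one sub-query.
    return f"answer({subquery})"

def step(organizer_text: str) -> dict:
    """Launch every forked sub-query concurrently; a later Join tag
    would block only until the matching worker result is ready."""
    forks = FORK.findall(organizer_text)
    with ThreadPoolExecutor() as pool:
        futures = {fid: pool.submit(run_worker, q) for fid, q in forks}
        return {fid: f.result() for fid, f in futures.items()}

print(step("<Fork id=a>factor 91</Fork> <Fork id=b>sum digits</Fork>"))
```

Because the organizer keeps generating between Fork and Join, workers overlap with its own thinking instead of blocking it.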
Training happens in 2 stages: first, supervised traces teach the tag format.
Then reinforcement learning rewards correct final answers, clean format, and real concurrency.
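The three reward terms can be pictured as a simple shaped sum. This is a toy sketch under assumed weights, not the paper's actual reward function:

```python
def reward(correct: bool, well_formed: bool,
           seq_time: float, par_time: float) -> float:
    """Toy reward combining the three terms described above:
    correct final answer, clean tag format, and real concurrency.
    Weights and functional form are assumptions for illustration."""
    r = 1.0 if correct else 0.0          # correct final answer dominates
    r += 0.1 if well_formed else 0.0     # clean Fork/Join format
    # Concurrency bonus: fraction of time saved vs. a serial trace.
    r += 0.2 * (1 - par_time / seq_time)
    return r

print(round(reward(True, True, seq_time=10, par_time=7), 2))  # 1.16
```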
Speed is measured by the critical path through the Fork-Join graph, which matches true waiting.
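The critical-path metric is just the longest dependency chain through the Fork-Join DAG: parallel branches cost only as much as the slowest one. A small sketch with made-up node durations:

```python
def critical_path(duration: dict, deps: dict) -> float:
    """Longest path (total wait) through a Fork-Join DAG.
    duration: node -> time cost; deps: node -> prerequisite nodes."""
    memo = {}
    def finish(n):
        if n not in memo:
            memo[n] = duration[n] + max(
                (finish(d) for d in deps.get(n, [])), default=0)
        return memo[n]
    return max(finish(n) for n in duration)

# Organizer forks w1 and w2, then joins: the wait is the slower
# branch, not the sum, which is why parallel traces finish sooner.
dur = {"org": 1, "w1": 5, "w2": 3, "join": 1}
dep = {"w1": ["org"], "w2": ["org"], "join": ["w1", "w2"]}
print(critical_path(dur, dep))  # 7, vs 10 if run sequentially
```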
Across countdown puzzles, math questions, and Sudoku, the learned policy runs faster and fails less.
The big idea is to learn organization itself rather than hard-code a script.
----
Paper – arxiv.org/abs/2510.26658
Paper Title: "The Era of Agentic Organization: Learning to Organize with Language Models"