Victor Isaac Oshimua retweetledi

Your LLM can reason better without any fine-tuning!
optillm is an OpenAI API-compatible proxy that implements 20+ optimization techniques to improve LLM accuracy on reasoning tasks without training or fine-tuning.
The concept: Instead of one API call, optillm makes multiple calls using different techniques and combines the results. You're trading compute for accuracy - more API calls, higher cost, slower response, but better results.
How it works: optillm sits between your OpenAI client and the LLM API. You control which technique by prepending a slug to the model name. With Mixture of Agents, optillm makes 3 parallel API calls with different approaches, synthesizes them, and returns the best answer.
The tradeoff: A query that takes 1 API call and 2 seconds now takes 4 calls and 5 seconds. Token cost goes up 4x. But accuracy jumps significantly on reasoning tasks.
Results show the gains. Mixture of Agents using gpt-4o-mini matches GPT-4 on Arena-Hard-Auto. PlanSearch achieves 20% higher pass@5 on LiveCodeBench.
Available techniques:
• Mixture of Agents: Multiple models critique each other
• Monte Carlo Tree Search: Explores decision trees
• PlanSearch: Searches candidate plans before executing
• Best of N: Generates multiple responses, picks best
• Chain-of-Thought with Reflection: Structured thinking and output
• Self-Consistency: Multiple reasoning paths
Works with 100+ models via LiteLLM. You can combine techniques in pipelines or run them parallel.
The insight: Spend more computation at query time to get better results without training. Works for benchmarks, offline tasks, critical queries. Not for real-time production.
I've shared the link to the Github Repo in the comments!

English


























