Gab
444 posts

Knowledge graph agents might not be ready for prime time, but they are promising. This paper introduces ARK-V1, a lightweight agent that helps LLMs answer questions by actively walking through a knowledge graph instead of relying only on memorized text. Here are my notes:

Trust your AI, but can it trust itself? 🤔

Introducing an online reinforcement learning framework, RISE (Reinforcing Reasoning with Self-Verification), enabling LLMs to simultaneously level-up BOTH their problem-solving AND self-checking skills! 🧐

Problems tackled:
✅ "Superficial self-reflection": models failing to verify their own reasoning robustly.
✅ Separation between reasoning and self-verification training.

🚀 RISE empowers models to critique their OWN reasoning via on-the-fly feedback and verifiable rewards, promoting stronger, more dynamic reasoning loops and effective self-assessment skills.

📊 Key results:
📈 Up to 2.8× better self-verification accuracy on challenging math tasks.
📈 Outperforms instruction-tuned models (Qwen2.5): +3.7% in reasoning, +33.4% in verification accuracy.
📈 Better internal reasoning: frequent, more accurate verification behaviors.

🧑‍💻 Code: github.com/xyliu-cs/RISE
📃 Paper: arxiv.org/abs/2505.13445
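
A toy way to picture "verifiable rewards" for both skills at once. This is only a sketch and not the RISE code: the function names, the exact-match check, and the 0/1 scoring are all assumptions made for illustration. The idea is that the solution is scored against a reference answer, and the model's self-check is scored by whether it agrees with that same outcome check.

```python
def outcome_reward(predicted_answer: str, reference_answer: str) -> float:
    """Hypothetical verifier: 1.0 if the final answer matches the reference, else 0.0."""
    return 1.0 if predicted_answer.strip() == reference_answer.strip() else 0.0


def verification_reward(self_assessed_score: float, verifier_score: float) -> float:
    """Hypothetical self-check reward: 1.0 if the model's own verdict
    agrees with the outcome verifier, else 0.0."""
    return 1.0 if self_assessed_score == verifier_score else 0.0


# Toy example: the model answers wrongly but correctly flags its own answer as wrong.
reference = "42"
model_answer = "41"
model_self_check = 0.0  # the model judges its own answer to be incorrect

r_solve = outcome_reward(model_answer, reference)
r_verify = verification_reward(model_self_check, r_solve)
print(r_solve, r_verify)
```

Here the solving reward is zero while the self-check reward is one, which is the kind of signal that could train the two skills together.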

"veRL is the best RL framework it's super efficient" really. are you sure about that. are you sure that you need 16 GPUs to tune a 7B model at 8k context. do you think that it's reasonable each step takes 19 minutes for this
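
For scale, here is the back-of-the-envelope cost those numbers imply. The GPU count and step time come from the post; the step count is a made-up figure for illustration only.

```python
num_gpus = 16            # from the post
minutes_per_step = 19    # from the post
num_steps = 100          # hypothetical run length, not from the post

# GPU time consumed by a single optimizer step across all devices.
gpu_minutes_per_step = num_gpus * minutes_per_step  # 304
gpu_hours_per_step = gpu_minutes_per_step / 60

# Total GPU time for the assumed run length.
total_gpu_hours = gpu_hours_per_step * num_steps

print(f"{gpu_hours_per_step:.2f} GPU-hours per step")
print(f"{total_gpu_hours:.0f} GPU-hours for {num_steps} steps")
```

That is roughly five GPU-hours per step, before counting any restarts or evaluation runs.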
