




Isaac Fino
84 posts

@GhxIsaac
Good for you's just like you very much






🍫 CocoaBench is calling for contributions from the community! Join us and help shape how next-generation agents are evaluated and built 🚀✨ #LLM #AI #Agent #CocoaBench More details in the thread 👇








🚀 Introducing OpenResearcher: a fully offline pipeline for synthesizing 100+ turn deep-research trajectories: no search/scrape APIs, no rate limits, no nondeterminism.
💡 We use GPT-OSS-120B + a local retriever + a 10T-token corpus to generate long-horizon tool-use traces (search → open → find) that look like real browsing, but are free + reproducible.
📈 The payoff: SFT on these trajectories turns Nemotron-3-Nano-30B-A3B from 20.8% → 54.8% accuracy on BrowseComp-Plus (+34.0).
🧩 What makes it work?
🔎 Offline corpus = 15M FineWeb docs + 10K “gold” passages (bootstrapped once)
🧰 Explicit browsing primitives = better evidence-finding than “retrieve-and-read”
🎯 Rejection sampling = keep only successful long-horizon traces
🧵 And we’re releasing everything:
✅ code + search engine + corpus recipe
✅ 96K-ish trajectories + eval logs
✅ trained models + live demo
👨‍💻 GitHub: github.com/TIGER-AI-Lab/O…
🤗 Models & data: huggingface.co/collections/TI…
🚀 Demo: huggingface.co/spaces/OpenRes…
🔎 Eval logs: huggingface.co/datasets/OpenR…
#llms #agentic #deepresearch #tooluse #opensource #retrieval #SFT
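For readers who want the "keep only successful long-horizon traces" step spelled out, here is a minimal sketch of a rejection-sampling filter. It is not the OpenResearcher code: the `Trajectory` class, its fields, and the exact-match success check are all assumptions made for illustration, since the post does not say how success is judged.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Hypothetical record of one synthesized browsing trace."""
    question: str
    gold_answer: str
    final_answer: str
    tool_calls: list = field(default_factory=list)  # e.g. ["search", "open", "find"]


def is_successful(traj: Trajectory) -> bool:
    # Assumption: success means a case-insensitive exact match with the gold answer.
    return traj.final_answer.strip().lower() == traj.gold_answer.strip().lower()


def rejection_sample(trajectories):
    # Keep only traces whose final answer passes the success check.
    return [t for t in trajectories if is_successful(t)]


candidates = [
    Trajectory("Capital of France?", "Paris", "paris ", ["search", "open", "find"]),
    Trajectory("Capital of France?", "Paris", "Lyon", ["search", "open"]),
]
kept = rejection_sample(candidates)
```

Only the first candidate survives the filter; the surviving traces are what would then be used as SFT data.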



What if small models with the right tools could beat large models without them?
AgentFlow assembles a team of agents (Planner, Executor, Verifier, Generator), each learning to coordinate and call tools in the flow of a task via end-to-end RL.
3B/7B models trained on 8× NVIDIA A100 GPUs outperform GPT-4o on reasoning benchmarks.
+ ICLR 2026 Oral (Top 1.1%)
+ Best Paper Nomination @ NeurIPS 2025 Efficient Reasoning Workshop
+ One of 12 papers Lambda co-authored at ICLR 2026
Project page: agentflow.stanford.edu
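To make the four roles concrete, here is a toy sketch of how a Planner, Executor, Verifier, and Generator might hand off work in a loop. Everything in it is hypothetical: the function names, the `add_numbers` tool, and the stopping rule are illustrative assumptions, not AgentFlow's actual interfaces, and the end-to-end RL training is not shown.

```python
def planner(task, memory):
    # Hypothetical planner: always routes the task to the calculator tool.
    return {"tool": "calculator", "input": task}


def executor(action, tools):
    # Runs the tool the planner selected.
    return tools[action["tool"]](action["input"])


def verifier(memory):
    # Assumption: the task is done once the latest step produced a result.
    return len(memory) > 0 and memory[-1]["result"] is not None


def generator(task, memory):
    # Writes the final answer from the last recorded result.
    return f"{task} = {memory[-1]['result']}"


def add_numbers(expr):
    # Toy tool: sums integers separated by "+".
    return sum(int(part) for part in expr.split("+"))


def run(task, tools, max_turns=3):
    memory = []
    for _ in range(max_turns):
        action = planner(task, memory)
        result = executor(action, tools)
        memory.append({"action": action, "result": result})
        if verifier(memory):
            break
    return generator(task, memory)


answer = run("2+3", {"calculator": add_numbers})
```

In this toy run the verifier accepts after one turn. In a trained system each role would presumably be a model call rather than a fixed rule.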





Some CAPTCHAs are so hard that I have to use AI to help me solve them. What kind of world is this? Am I actually the bot?






