Emily Jones
2.7K posts

@Emilyixg8a
No matter how fierce the storms you encounter, I hope you always possess the ability to find happiness within yourself

At the National Palace, we welcomed Ben Horowitz, cofounder and general partner of the financial firm Andreessen Horowitz. We agreed that Mexico is an example of economic confidence and certainty.

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.

For medical information, general AI frontier models (Google, OpenAI, Anthropic) outperformed the specialized tools @EvidenceOpen and @UpToDate, as assessed by 12 US clinicians who were randomized and blinded to which model produced each response, and by extensive testing/benchmarks. This was not anticipated. @NatureMedicine nature.com/articles/s4159…

Out of coverage no longer means out of touch 📱🛰️ Kyivstar customers in Ukraine can now access satellite-ready apps, even where there is no terrestrial coverage or during prolonged power outages - helping people stay connected when it matters most. Automatically switching to satellite data when the terrestrial network is unavailable, compatible smartphones can now use Viber, WhatsApp and Google Maps with data connectivity from Starlink Mobile. Kyivstar plans to add support for more apps as they are adapted, helping to keep people connected and businesses operating, and boosting the country's resilience. $VEON @TwiyKyivstar @Starlink veon.com/newsroom/press…

Supersonic. Mach 1.21. Quarterhorse Mk 2.1 is now the world’s first privately developed, unmanned supersonic jet and the fastest unmanned aircraft flying today. This flight makes Hermeus the fastest company in aviation history to go from founding to supersonic flight - exactly 364 days after the maiden flight of our first aircraft. Now, we fly faster. A special thanks to @DIU_x, Director @OwenWest91, Maj. Gen. Joe "Solo" Kunkel, and Deputy Director Kyle Norman.

Introducing our new work: “Learning to Orchestrate Agents in Natural Language with the Conductor” accepted at #ICLR2026 arxiv.org/abs/2512.04388
What if we trained an AI not to solve problems directly, but to act as a manager that delegates tasks to a diverse team of other AIs?
To solve complex tasks, humans rarely work alone; we form teams, delegate, and communicate. Yet, multi-agent AI systems currently rely heavily on rigid, human-designed workflows or simple routers that just pick a single model. We wanted an AI that could dynamically build its own team.
We trained a 7B Conductor model using Reinforcement Learning to orchestrate a pool of frontier models (including GPT-5, Gemini, Claude, and open-source models available during the period leading up to ICLR 2026). Instead of executing code, the Conductor outputs a collaborative workflow in natural language. For any given question, the Conductor specifies:
1/ Which agent to call
2/ What specific subtask to give them (acting as an expert prompt engineer)
3/ What previous messages they can see in their context window
Through pure end-to-end reward maximization, amazing behaviors emerged. The Conductor learned to adapt to task difficulty: it 1-shots simple factual questions, but autonomously spins up complex planner-executor-verifier pipelines for hard coding problems.
The results are very promising: The 7B Conductor surpasses the performance of every individual worker model in its pool, setting new records on LiveCodeBench (83.9%) and GPQA-Diamond (87.5%) at the time of publication. It also significantly outperforms expensive multi-agent baselines like Mixture-of-Agents at a fraction of the cost.
One of our favorite features: Recursive Test-Time Scaling! By allowing the Conductor to select itself as a worker, it reads its own team's prior output, realizes if it failed, and spins up a corrective workflow on the fly. This opens a new axis for scaling compute during inference.
This research proves that language models can become elite meta-prompt engineers, dynamically harnessing collective intelligence. Alongside our TRINITY research which we announced a few days earlier, this foundational research powers our new multi-agent system: Sakana Fugu! (sakana.ai/fugu-beta) 🐡
OpenReview: openreview.net/forum?id=U23A2… (ICLR 2026)
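To make the Conductor post's three-part specification concrete (which agent to call, what subtask to give it, which previous messages it can see), here is a minimal Python sketch of how such a workflow could be represented and executed. This is not the paper's implementation: `Step`, `run_workflow`, and the toy `echo_worker` stand-ins are hypothetical names invented for illustration, and real workers would be calls to language models rather than string formatters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One hypothetical workflow step, mirroring the three fields in the post."""
    agent: str          # 1/ which agent to call
    subtask: str        # 2/ the instruction written for that agent
    visible: List[int]  # 3/ indices of earlier step outputs the agent may read


# A worker takes (subtask, visible_context) and returns its message.
Worker = Callable[[str, List[str]], str]


def run_workflow(steps: List[Step], workers: Dict[str, Worker]) -> List[str]:
    """Run steps in order, showing each worker only the outputs it is allowed to see."""
    outputs: List[str] = []
    for i, step in enumerate(steps):
        # A step can only reference outputs produced strictly before it.
        if any(j < 0 or j >= i for j in step.visible):
            raise ValueError(f"step {i} may only see earlier steps")
        context = [outputs[j] for j in step.visible]
        outputs.append(workers[step.agent](step.subtask, context))
    return outputs


def echo_worker(name: str) -> Worker:
    """Toy stand-in for a model call (assumption: real workers would query an LLM)."""
    def worker(subtask: str, context: List[str]) -> str:
        return f"{name}({subtask}; saw {len(context)} message(s))"
    return worker


workers = {name: echo_worker(name) for name in ("planner", "coder", "verifier")}

# A planner-executor-verifier pipeline of the kind the post mentions.
steps = [
    Step("planner", "outline a solution", []),
    Step("coder", "implement the outline", [0]),
    Step("verifier", "check the implementation", [0, 1]),
]
outputs = run_workflow(steps, workers)
```

In this sketch the visibility list is what distinguishes the approach from a simple router: each worker receives only the earlier messages the specification grants it, so the same pool of workers can be wired into different pipelines per question.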
