Ben Rycroft
@BenRyc
AI that continuously learns and adapts @AdaptionLabs | GTM x AI
San Francisco · Joined January 2012
312 posts · 399 Following · 216 Followers
Ben Rycroft@BenRyc·
Every week we talk to companies battling AI API bills. Uber already blew their annual AI budget. They all want the same thing.

🔷 A model that's an expert in their domain.
🔷 Runs 10x cheaper than a frontier API.
🔷 Carries their brand voice.
🔷 Fully private.

Fine-tuning is the path. But most projects stall before training even starts. The data isn't ready. Too thin. Too noisy. Too narrow to move the model. And the process is cumbersome.

Today we're making it a lot easier to fix. Together AI fine-tuning is now live inside the Adaption platform. Shape your data. Train your model. One workflow. Training cycles drop from weeks to days.

This is a big step toward making fine-tuning accessible to every team, not just the ones with frontier-lab resources. More announcements coming soon.
adaption@adaption_ai

We believe that intelligence should not arrive preconfigured. @togethercompute is now available directly inside the Adaption platform, connecting Adaptive Data with large-scale training in a single workflow. One platform, end to end. Stop inheriting intelligence. Shape it.

Ben Rycroft retweeted
Sakana AI@SakanaAILabs·
What if instead of building one giant AI, we evolved a coordinator to orchestrate a diverse team of specialized AIs? 🐟

Excited to share our new paper: “TRINITY: An Evolved LLM Coordinator”, published as a conference paper at #ICLR2026!

Paper: arxiv.org/abs/2512.04695

In nature, complex problems are rarely solved by a single monolithic entity, but rather by the coordinated efforts of specialized individuals working together. Yet modern AI development is heavily focused on endlessly scaling up single, massive monolithic models, yielding diminishing returns. Model merging offers a way to combine different skills, but it is often impractical due to mismatched neural architectures and the closed-source nature of top-performing models.

To address this, we took a macro-level approach: test-time model composition. We introduce TRINITY, a system that fuses the complementary strengths of diverse, state-of-the-art models without modifying their underlying weights.

TRINITY processes queries over multiple turns. At each step, a lightweight coordinator assigns one of three distinct roles to an LLM from its available pool:

1/ Thinker: Devises high-level strategies and analyzes the current state.
2/ Worker: Executes concrete problem-solving steps.
3/ Verifier: Evaluates whether the current solution is complete and correct.

By dynamically assigning these roles, the coordinator effectively offloads complex reasoning and skill execution onto the external models.

What makes TRINITY unique is its extreme efficiency. The coordinator relies on the hidden states of a compact language model and a small routing head. In total, it has fewer than 20K learnable parameters.

Training this system presented a massive challenge. Traditional reinforcement learning (REINFORCE) failed because the gradients had a low signal-to-noise ratio due to binary rewards and weak parameter coupling. Imitation learning (supervised fine-tuning) was ruled out because generating multi-turn labels is prohibitively expensive.

Our solution? We turned to nature-inspired algorithms. We optimized the coordinator using a derivative-free evolutionary algorithm. We found that evolution is uniquely suited to this tight, high-dimensional coordination problem where traditional gradient-based methods fail.

The results are very promising. In our experiments, TRINITY consistently outperforms existing multi-agent methods and individual models across various benchmarks. At the time of publication, it set a new state-of-the-art record on LiveCodeBench, achieving an 86.2% pass@1 score.

More importantly, it demonstrated incredible generalization. Without any retraining, TRINITY transferred zero-shot to four unseen tasks (AIME, BigCodeBench, MT-Bench, and GPQA). On average, the evolved coordinator surpassed every individual constituent model in its pool, including GPT-5, Gemini 2.5-Pro, and Claude-4-Sonnet (the top frontier models available at the time of our #ICLR2026 submission last year).

This work is central to Sakana AI's vision. We believe the future of AI isn't just about scaling monolithic models, but about engineering collaborative, diverse AI ecosystems that can adapt and combine their strengths. We invite the community to read the paper and explore these ideas!

Paper: arxiv.org/abs/2512.04695
OpenReview: openreview.net/forum?id=5HaRj…

This foundational research is part of the core engine powering our multi-agent product: Sakana Fugu 🐡👇
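The loop the thread describes (a tiny routing head assigning Thinker/Worker/Verifier roles each turn, trained by derivative-free evolution because gradient signal was too noisy) can be sketched in miniature. Everything below is a hypothetical toy: the scalar task state, the pool names, and the episode dynamics are invented stand-ins, not the paper's implementation.

```python
import random

ROLES = ["thinker", "worker", "verifier"]
POOL = ["model_a", "model_b", "model_c"]  # stand-ins for the frontier-model pool

def coordinator(weights, state):
    # Tiny linear routing head: one slope and one bias per (role, model)
    # pair, scored against a scalar "state". The real coordinator reads a
    # compact LM's hidden states; a scalar keeps this toy small.
    k = len(ROLES) * len(POOL)
    best_idx, best_score = 0, float("-inf")
    for idx in range(k):
        score = weights[idx] * state + weights[k + idx]
        if score > best_score:
            best_idx, best_score = idx, score
    return ROLES[best_idx // len(POOL)], POOL[best_idx % len(POOL)]

def run_episode(weights, difficulty, max_turns=8):
    # Multi-turn loop with toy dynamics: a worker makes concrete progress,
    # a thinker a little, and a verifier ends the episode only when the
    # task is actually done. Reward is binary, as in the paper.
    state = difficulty
    for _ in range(max_turns):
        role, _model = coordinator(weights, state)
        if role == "worker":
            state -= 0.3
        elif role == "thinker":
            state -= 0.1
        elif role == "verifier" and state <= 0:
            return 1.0
    return 0.0

def evolve(generations=30, pop=16, sigma=0.3, seed=0):
    # Derivative-free (1+lambda) evolution over the routing weights:
    # mutate, keep any child at least as fit as the parent.
    rng = random.Random(seed)
    dim = 2 * len(ROLES) * len(POOL)
    tasks = [rng.uniform(0.5, 1.5) for _ in range(20)]
    fitness = lambda w: sum(run_episode(w, t) for t in tasks) / len(tasks)
    parent = [rng.gauss(0, 1) for _ in range(dim)]
    best_fit = fitness(parent)
    for _ in range(generations):
        for _ in range(pop):
            child = [w + rng.gauss(0, sigma) for w in parent]
            f = fitness(child)
            if f >= best_fit:
                parent, best_fit = child, f
    return parent, best_fit
```

A good policy here is expressible as a linear threshold: route to a worker while state is positive, then to a verifier, which is exactly the kind of tight, low-dimensional search where simple evolution does well.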
Sakana AI@SakanaAILabs

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system!

Blog: sakana.ai/fugu-beta

Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes.

🐟 Fugu Mini: High-speed orchestration optimized for latency
🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning

Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…

Ben Rycroft retweeted
Sara Hooker@sarahookr·
What's crazy is that we adapt data so fast you can still join the competition today, pick a meaningful long-tail problem to work on, and be in a position to win by this Friday. So join us. :) Big shoutout to @sudip_r0y for building incredibly fast infrastructure.
adaption@adaption_ai

The Uncharted Data Challenge closes Friday. $20,000 prize pool for open-source datasets that fill the gaps mainstream AI keeps missing. Evolve your data. Shape your AI.

Ben Rycroft retweeted
adaption@adaption_ai·
Only 10% of your data speaks AI. The other 90% is unstructured — invisible to the models making decisions about your business. All the progress in AI so far? Built on a fraction of what's actually out there. Today we're launching Forge, a feature of Adaptive Data, to extend adaptive intelligence to the 90%.
Ben Rycroft@BenRyc·
We're watching an entire discipline emerge (Context Engineering) because we haven't figured out how to make AI learn continuously. Kudos to Malika Aubakirova for the comparison to the Christopher Nolan classic Memento in this @a16z piece. We're duct-taping context windows and RAG pipelines together to keep our models from forgetting what happened one prompt ago. If context engineering has you doing backflips and you're looking to train models that become native experts at your use case, get in touch.
Sudip Roy@sudip_r0y

Static models won’t win against dynamic environments. Adaptable AI will. Systems that learn during use, not after the fact, are the ones that scale. Everything else falls behind. @a16z mapped who's building on that. @adaption is in it.

Ben Rycroft@BenRyc·
Demis Hassabis spoke at City Arts & Lectures in SF Monday night. And it won't make your news feed. Zero hot takes.

In contrast, AI has been racking up negative headlines in the US.

"It's coming for your job"
"This is a terminator scenario"
"The end of software engineers"

And it's sticking. NBC ran a widely cited poll that found AI has a net favorability of -20. For context, that's below ICE and just above the Iranian government.

People have real reasons to be wary:

🔷 Data centers increasing electricity prices
🔷 Job displacement
🔷 Deepfakes

These concerns are legitimate. But caution shouldn't mean inaction.

In China, 83% of people believe AI offers more benefits than drawbacks. In the US, it's 39%. In India and China, more than 80% of workers use AI regularly at work. In the US, it's closer to 50%. (Stanford HAI)

These countries are building AI fluency into the next generation by default. China made AI a compulsory school subject in 2025. Kids as young as six are learning algorithmic thinking and robotics. India launched AI modules for grades 6-12, with the goal of building the world's largest AI-ready student population. Singapore, South Korea, and the UK are moving in the same direction. In the US, only half of middle and high schools have any AI policy at all.

The US still leads on building with AI. But the risk is that the median person enters the job market without knowing how to use it well, while their peers abroad treat it like electricity.

AI literacy isn't just about writing prompts. It's knowing when to trust it. When to push back. When to reach for it. When to close the app. We're not teaching this well. And the negative narrative is putting people off.

This is why we need more voices like Demis that can hold both things at once. As he says: The risks are real. There is a non-zero chance AI goes wrong. The upside is also real. A society with infinite resources. No diseases. The genie is not going back in the bottle.

This generation will either shape how AI gets used or get shaped by how others use it. The difference is whether anyone brings them in and shows them how.
Ben Rycroft@BenRyc·
You find a great dataset on Hugging Face. Now you can make it yours. Maybe you need it in 12 languages. Reshape it for a different task. 10x the volume and improve the quality. Add reasoning traces.

This is what sits between your dataset and a production-ready model. Hugging Face is now available in Adaptive Data. Pull any dataset directly into a platform built to close that gap.

🔹 Improve data quality across the full set
🔹 Reshape for your specific task
🔹 Expand into 242 languages

Go from raw data to training-ready. Without the manual grind. Link below.
adaption@adaption_ai

The open-source AI community just got a new home for their data workflows. 🤗 @huggingface is now available in Adaptive Data. Pull datasets directly into a platform that evolves with the problems you're solving.

Ben Rycroft@BenRyc·
Last week we jumped on a call with a team scaling into 6 markets. Their AI worked great. In English. But their users speak Brazilian Portuguese. German. Arabic.

And they don't want a translated experience. They want a native one. Where the model doesn't fall back to English. Or output low-quality results.

Translation slaps new words on English-trained thinking. The grammar feels off. The tone is wrong. The output reads like it was written by picking words out of a dictionary. Not by a native speaker. Users notice. And they leave.

This is a training data problem. If your model only learned from English-heavy data, it thinks in English. Every other language gets a second-class output.

Today we're launching Expand Your World in Adaptive Data. Adapt your training data into 242 languages. No translation wrapper on English output. Your model thinks natively in your users' language. Your AI feels local everywhere it ships. Full announcement below.

Are you a startup building models that need to work across languages? Adaption for Startups gives you a sponsored Plus Plan and credits to make your model a native polyglot from day one: adaptionlabs.ai/adaption-for-s…
adaption@adaption_ai

Most datasets reflect the world as it was convenient to capture, not as it actually exists. Introducing Expand Your World, a new feature in Adaptive Data. 242 languages and localizations. The fastest way to global coverage.

Ben Rycroft@BenRyc·
@cursor_ai's Composer 2 is delivering performance close to frontier models. But it's 3-10x cheaper when you factor in:

🔹 Base cost
🔹 Token efficiency (trained to edit instead of rewriting entire files)
🔹 Subscription discounts

They got there by excelling where general-purpose models fall short:

1. Deeply optimizing the model. Closed APIs restrict you to prompts and light fine-tuning. Cursor used full-parameter training on an open base model, updating it on a massive code dataset. That transformed a general AI into a specialized coding system.

2. Optimizing for edits, not rewrites. General models tend to rewrite entire files. Cursor is optimized for surgical diffs. That means fewer tokens, lower latency, and lower cost.

3. Architecture and inference improvements. You can’t alter the structure of closed models. Cursor altered Kimi K2.5, adding things like Multi-Token Prediction to improve speed.

4. Stripping scope, not just cost. Frontier models are expensive because they do everything. Cursor specialized heavily for coding workflows, making it lighter, faster, and cheaper to run.

For everyday software engineering tasks, this is a big win. You don't need a frontier model at 10x the cost. You need a model that's fast and inexpensive.
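The edits-vs-rewrites point is easy to quantify: emitting a unified diff for a small change costs a small fraction of the tokens of re-emitting the whole file. A minimal sketch with Python's difflib, using a whitespace word count as a crude token proxy (not Cursor's actual tokenizer or diff format):

```python
import difflib

# A 200-line file with a single one-line change.
original = "\n".join(f"line {i}: unchanged" for i in range(200))
edited = original.replace("line 100: unchanged", "line 100: patched")

def approx_tokens(text):
    # Crude whitespace proxy for tokens; real tokenizers differ,
    # but the ratio is what matters here.
    return len(text.split())

full_rewrite = approx_tokens(edited)  # cost of re-emitting the whole file
diff = "\n".join(difflib.unified_diff(
    original.splitlines(), edited.splitlines(),
    fromfile="a.py", tofile="b.py", lineterm=""))
patch_only = approx_tokens(diff)  # cost of emitting just the hunk

print(full_rewrite, patch_only)  # the diff is a tiny fraction of the rewrite
```

The diff carries only the hunk header plus a few context lines around the change, so the saving grows with file size while the patch stays roughly constant.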
Ben Rycroft retweeted
adaption@adaption_ai·
The Uncharted Data Challenge has been live for less than a week. Builders around the world are already creating. AI for local supply chains and economic realities. Crisis response for the scenarios where AI needs to get it right. There is still time to be part of it. $20,000 prize pool. Closes May 1.
Ben Rycroft@BenRyc·
There's a general assumption you should plug into Anthropic or OpenAI. Databricks analyzed how 10,000+ organizations (including 300+ Fortune 500s) are using AI. The data tells a different story:

🔹 76% of enterprises have now moved open-source LLMs into their production mix
🔹 77% of those are choosing smaller models (13 billion parameters or fewer)

@databricks calls this "one of the biggest shifts in the state of AI." It makes a lot of sense when you look at the unit economics. Renting a massive closed API is fine for a pilot. But at scale, it's just a really expensive way to get answers.

Smaller models run faster and need far less compute. Karl JG Lowenbjer at Nira post-trained a fleet of small models: one expert for each task in the workflow. And teams are realizing that if they take a smaller open model and post-train it on their own specific data, it matches (or exceeds) massive frontier models at ±20% of the cost. Plus:

🔹 Lower latency
🔹 Complete control and data security

You end up with a specialized asset that you actually own, instead of a monthly rental. Sometimes bigger is just a bigger API bill.
Ben Rycroft retweeted
adaption@adaption_ai·
The most valuable datasets in the world don’t exist yet. We're here to change that. The Uncharted Data Challenge is live on @kaggle. $20,000 in prizes.
Ben Rycroft@BenRyc·
A 15-year-old SaaS company built an AI model that beats GPT-5.4 and Sonnet 4.6. Here's how @intercom's Fin Apex 1.0 outperformed:

🔹 Customer service resolutions (73.1% vs 71.1%)
🔹 0.6 seconds faster response time
🔹 65% fewer hallucinations

And the big one:

🔴 Costs 5x less than frontier models

How did they pull it off? They focused entirely on post-training. Intercom took an open-weights base model and refined it using years of hyper-specific, proprietary customer data. They didn't just dump raw chat transcripts into the model. They used outcome-based reinforcement learning, training the model on actual resolutions and teaching it nuance, tone, when to make judgment calls, and what a genuinely solved problem looks like.

@eoghan McCabe (CEO, @intercom): "The frontier, if you will, is actually in post-training. Post-training is the hard part. You need proprietary data. You need proprietary sources of truth."

Companies are catching on. Cursor's Composer 2 was built on Kimi K2.5. You can rent a closed API. Or you can take an open model and post-train it on your own high-quality, domain-specific data. One is an expensive monthly rental. The other is a highly specialized, proprietary asset.
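The outcome-based idea (reward the whole conversation by whether it actually resolved, not message by message) can be sketched as a toy REINFORCE-style update. The field names (`resolved`, `escalated`, `features`) and the linear policy are hypothetical illustrations, not Intercom's implementation:

```python
def outcome_reward(transcript):
    # Binary, outcome-based reward: credit only conversations that ended
    # in a genuine resolution without escalating to a human.
    return 1.0 if transcript["resolved"] and not transcript["escalated"] else 0.0

def reinforce_update(weights, transcript, lr=0.1, baseline=0.5):
    # REINFORCE-style update on a toy linear policy: every turn in a
    # resolved conversation pushes its action features up; turns from
    # unresolved conversations push theirs down. The baseline keeps a
    # binary reward from only ever increasing weights.
    advantage = outcome_reward(transcript) - baseline
    for turn in transcript["turns"]:
        for feature in turn["features"]:
            weights[feature] = weights.get(feature, 0.0) + lr * advantage
    return weights
```

The key property is that the training signal is the resolution outcome, so behaviors that merely sound helpful but never close the ticket get pushed down over many transcripts.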
Ben Rycroft retweeted
adaption@adaption_ai·
Acceptances are going out for the Uncharted Data Challenge. Did you get yours? The next wave goes out on Monday. Be part of it. 🔥 Some of the most important knowledge in the world has never made it into a dataset. Apply to change that.
Ben Rycroft retweeted
adaption@adaption_ai·
Everything intelligent adapts. Now startups can too. Introducing Adaption for Startups. For early-stage teams working on complex, real-world problems.
Ben Rycroft retweeted
adaption@adaption_ai·
We asked our team to explain their jobs to a 5 year old. The answers? Better than any job description.