Tony
306 posts

Tony
@halluton
Self-Improving Agents @KaybaAI



Apple Research just published something really interesting about post-training of coding models. You don't need a better teacher. You don't need a verifier. You don't need RL. A model can just… train on its own outputs. And get dramatically better.
Simple Self-Distillation (SSD): sample solutions from your model, don't filter them for correctness at all, fine-tune on the raw outputs. That's it.
Qwen3-30B-Instruct: 42.4% → 55.3% pass@1 on LiveCodeBench, about +30% relative. On hard problems specifically, pass@5 goes from 31.1% → 54.1%. Works across Qwen and Llama, at 4B, 8B, and 30B. One sample per prompt is enough. No execution environment. No reward model. No labels.
SSD sidesteps this by reshaping distributions in a context-dependent way: suppressing distractors at locks while keeping diversity alive at forks. The capability was already in the model. Fixed decoding just couldn't access it.
The implication: a lot of coding models are underperforming their own weights. Post-training on self-generated data isn't just a cheap trick; it's recovering latent capacity that greedy decoding leaves on the table.
paper: arxiv.org/abs/2604.01193
code: github.com/apple/ml-ssd
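
A rough sketch of the data-collection half of the recipe described in the post above: sample from the model, keep everything, and hand the raw pairs to ordinary supervised fine-tuning. This is not the paper's code. `build_ssd_dataset` and `toy_sampler` are hypothetical names made up for illustration, and the sampler is a stand-in for whatever generation call your model stack provides. The fine-tuning step itself is left out.

```python
from typing import Callable, Dict, List


def build_ssd_dataset(
    prompts: List[str],
    sample_fn: Callable[[str], str],
    samples_per_prompt: int = 1,
) -> List[Dict[str, str]]:
    """Collect raw (prompt, completion) pairs from the model's own samples.

    Hypothetical helper: no tests are run, no verifier or reward model is
    consulted, and nothing is filtered for correctness.
    """
    dataset: List[Dict[str, str]] = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            completion = sample_fn(prompt)
            # Keep the sample as-is, whether or not it is a correct solution.
            dataset.append({"prompt": prompt, "completion": completion})
    return dataset


def toy_sampler(prompt: str) -> str:
    """Stand-in for sampling a solution from the base model (assumption)."""
    return f"def solve():\n    pass  # draft for: {prompt}"


# One sample per prompt, matching the post's "one sample per prompt is enough".
pairs = build_ssd_dataset(["reverse a list", "sum two ints"], toy_sampler)
```

The resulting `pairs` would then go into a standard supervised fine-tuning job on the same model. Sampling settings such as temperature are deliberately left to `sample_fn`, since the post does not specify them.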





Based on the leaked Claude Code source code, your CLAUDE.md file is re-injected on every single **turn** of the conversation.




.@theresidency is going zurich! with @ArvindAGI22, @chrisbrolin123 and @_sethmorton, working on
> neuromorphic compute
> spiking neural nets
> thermo compute
> self referential neural nets
who's in zurich??

Claude Code being closed source is the biggest bag fumble in the AI era. If CC was on GitHub, these things would be trivial to identify and fix. Instead we're stuck reverse engineering their incompetence.


POV: you accidentally said “hello” to claude and it costs you 2% of your session limit.


BREAKING 🚨: Anthropic is working on a new Operon agent for Claude Desktop, built for scientific research in biology! Operon will have a "private environment" to work alongside you. Users will be able to create different sessions within Operon projects, manage generated artefacts, and work with Skills. Cowork but for scientists 👀









