
Researchers sent the same resume to an AI hiring tool twice. Same qualifications. Same experience. Same skills. One version was written by a real human. The other was rewritten by ChatGPT. The AI picked the ChatGPT version 97.6% of the time. A team from the University of

Let us settle this distillation debate with science.
=============

Can anyone share any papers showing that distillation based on text tokens alone (which is glorified fine-tuning) can produce near-frontier performance in the absence of log probabilities, which frontier labs don't share? Let me know if you find any. I haven't seen any.

Why you are not likely to find such papers:
-------------------

If you distill/fine-tune on text outputs from another model, it is an "off-policy" update, i.e. fine-tuning on tokens produced by another model. It is well known that such approaches degrade model performance in unexpected ways. To put it simply, distillation in such cases can transfer style, but not "wisdom/capabilities". The claim here is that the malicious labs are acquiring "wisdom/capabilities" through distillation.

Where distillation actually works:
-------------------

When frontier labs distill their large models into smaller ones (the way DeepSeek and ZAI do, in addition to OpenAI, Ant, and GDM), they use the token probability distribution, not the text output alone. The student model's objective is to match the teacher's probability distribution at each token position (see the sketch below).

Furthermore, "on-policy distillation" is now in vogue. You can read about one such approach at thinkingmachines.ai/blog/on-policy…, wherein text is generated by the student model and token probabilities are extracted from the teacher model for that same set of tokens. However, such token probabilities are not available from the APIs of closed-source labs.

So, for distillation by malicious labs, with the intent to acquire "wisdom/capabilities", to have been successful, there must be some scientific breakthrough I have missed that enables distillation without log probabilities. Can anyone enlighten me and share relevant papers? I haven't found any. If such literature does not exist, then this debate is meaningless.

Caveats:
---------

1. I have myself distilled from open-source reasoning models into an open-source non-reasoning model using text only, i.e. without log probabilities. There was a slight boost in performance in one area and degradation in many others.
2. Text-only distillation can still help seed a few good agentic trajectories in the model before further reinforcement learning (RL). But here too the model acquires style only; the "wisdom/capabilities" in this approach still come from the RL training done by the labs.
3. Capable closed-source models could be used during RL training as a judge (generative reward). However, they can be very expensive, and labs have figured out they can use smaller, more effective models as judges.

@teortaxesTex
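A minimal sketch of the three regimes described in the post above, assuming PyTorch and Hugging Face-style causal LMs that share a tokenizer; the function and variable names (student, teacher, input_ids, ...) are illustrative assumptions, not any lab's actual pipeline:

```python
# Sketch of: text-only SFT vs. distribution-matching distillation vs.
# on-policy distillation. Assumes PyTorch + HF-style causal LMs whose
# forward pass returns .logits of shape (batch, seq_len, vocab).
import torch
import torch.nn.functional as F

def text_only_sft_step(student, input_ids):
    """'Distillation' on text alone: plain cross-entropy against the
    teacher's sampled tokens (a one-hot target). Everything the teacher
    knew about the other V-1 tokens at each position is discarded."""
    logits = student(input_ids).logits  # (B, T, V)
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )

def distribution_distill_step(student, teacher, input_ids, temperature=2.0):
    """Classic distillation: match the teacher's full probability
    distribution at every token position via a per-position KL loss."""
    with torch.no_grad():
        t_lp = F.log_softmax(teacher(input_ids).logits / temperature, dim=-1)
    s_lp = F.log_softmax(student(input_ids).logits / temperature, dim=-1)
    # KL(teacher || student), averaged over the batch.
    return F.kl_div(s_lp, t_lp, log_target=True,
                    reduction="batchmean") * temperature ** 2

def on_policy_distill_step(student, teacher, prompt_ids):
    """On-policy variant: the *student* samples a rollout, then the
    teacher's log-probs over those exact tokens supervise it (the linked
    Thinking Machines post uses a per-token reverse KL; this is a
    simplified rendering of that idea)."""
    with torch.no_grad():
        rollout = student.generate(prompt_ids, max_new_tokens=256,
                                   do_sample=True)
        t_lp = F.log_softmax(teacher(rollout).logits, dim=-1)
    s_lp = F.log_softmax(student(rollout).logits, dim=-1)
    # Reverse KL(student || teacher), summed over the vocab per position.
    return (s_lp.exp() * (s_lp - t_lp)).sum(-1).mean()
```

Note that the two distribution-based steps require the teacher's per-position log-probs, which closed-source APIs do not return; from text outputs alone, only the SFT-style step is even possible, which is the crux of the argument above.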

Dear friends, grateful for the interest and excitement for @DatalandMuseum! We worked truly hard to reimagine and invent a new form of art. Cannot wait to host you after 18 months of construction and 2 years of AI research! 20th of June!

Codex (5.5) was repeatedly killing innocent Claude Codes without any instruction. I've never seen this happen before