

DusterBloom
139 posts

@dusterbloom
Here to tell you how it really feels to be Bloom









@Faltz009 @DanielleFong You misunderstood the question. The question is: why does in-context learning with LLMs even work, and how? P.S. Your work seems like it has some legitimate/interesting potential, but also heavy signs of AI psychosis.




.@vonderleyen "The European #AgeVerification app is technically ready. It respects the highest privacy standards in the world. It's open-source, so anyone can check the code..."

I did. It didn't take long to find what looks like a serious #privacy issue.

The app goes to great lengths to protect the AV data AFTER collection (is_over_18: true is AES-GCM'd), and it does so pretty well. But the source image used to collect that data is written to disk without encryption and is not reliably deleted.

For NFC biometric data: the app pulls DG2 and writes a lossless PNG to the filesystem. It's only deleted on success. If the flow fails for any reason (user clicks back, scan fails and retries, app crashes, etc.), the full biometric image remains in the device cache. It's protected with CE keys at the Android level, but the app makes no attempt to encrypt or otherwise protect it.

For selfie pictures: different scenario. These images are written to external storage as lossless PNGs, but they're never deleted. That's not a cache... it's long-term storage. They're protected with DE keys at the Android level, but again, the app makes no attempt to encrypt or protect them.

This is akin to taking a picture of your passport/government ID with the camera app and keeping it just in case. You can encrypt the data extracted from it until you're blue in the face... leaving the original image on disk is crazy and unnecessary.

From a #GDPR standpoint: the biometric data collected is special category data. If there's no lawful basis to retain it after processing, that's potentially a material breach. youtube.com/watch?v=4VRRri…
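To make the failure mode concrete, here's a hypothetical sketch in Python (the app itself is Android code; process_image, the filenames, and the key handling are all placeholders, not the app's actual implementation). The contrast is between cleanup that only runs on the success path and an encrypt-first flow whose cleanup runs unconditionally.

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def process_image(image: bytes) -> bool:
    return len(image) > 0  # stub standing in for the NFC/selfie verification pipeline

def flawed_flow(image: bytes, cache_dir: str) -> bool:
    # Mirrors the reported behaviour: a plaintext PNG hits disk first...
    path = os.path.join(cache_dir, "dg2_face.png")
    with open(path, "wb") as f:
        f.write(image)            # full biometric image, unencrypted
    ok = process_image(image)
    if ok:
        os.remove(path)           # ...and cleanup only happens on success;
    return ok                     # back-out, retry, or crash leaves the PNG behind

def safer_flow(image: bytes, cache_dir: str, key: bytes) -> bool:
    # Encrypt before anything touches disk, and delete unconditionally.
    aesgcm = AESGCM(key)          # e.g. key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    path = os.path.join(cache_dir, "dg2_face.enc")
    try:
        with open(path, "wb") as f:
            f.write(nonce + aesgcm.encrypt(nonce, image, None))
        return process_image(image)
    finally:
        if os.path.exists(path):
            os.remove(path)       # runs on failure and success alike

Keeping the image in memory and never writing it at all would be better still; the try/finally version is just the minimal fix to the lifecycle bug.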





welcome PlasmaBlind: a new privacy L2 that ditches SNARKs altogether, with <100ms client-side zk proofs, using the blindfold scheme and a carefully designed aggregator. uncompromising and instant privacy on *any* device. we implemented and benchmarked the whole thing at @PrivacyEthereum and are opening it under MIT. paper + code at plasmablind.xyz. the time for a special-purpose privacy L2 has come 😎




Apple Research just published something really interesting about post-training of coding models.

You don't need a better teacher. You don't need a verifier. You don't need RL. A model can just... train on its own outputs. And get dramatically better.

Simple Self-Distillation (SSD): sample solutions from your model, don't filter them for correctness at all, and fine-tune on the raw outputs. That's it.

Qwen3-30B-Instruct: 42.4% → 55.3% pass@1 on LiveCodeBench, a +30% relative gain. On hard problems specifically, pass@5 goes from 31.1% → 54.1%. It works across Qwen and Llama at 4B, 8B, and 30B. One sample per prompt is enough. No execution environment. No reward model. No labels.

Why does this work when the training data is unfiltered and partly wrong? Fixed decoding (greedy or a single sampling temperature) applies the same strategy at every position. SSD sidesteps this by reshaping the model's distributions in a context-dependent way: suppressing distractors at "locks" (positions where one continuation is clearly right) while keeping diversity alive at "forks" (positions where many continuations are valid). The capability was already in the model; fixed decoding just couldn't access it.

The implication: a lot of coding models are underperforming their own weights. Post-training on self-generated data isn't just a cheap trick. It's recovering latent capacity that greedy decoding leaves on the table.

paper: arxiv.org/abs/2604.01193
code: github.com/apple/ml-ssd
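A minimal sketch of the SSD recipe as described above, assuming the standard Hugging Face stack. The model name, prompts, and hyperparameters are placeholders, and this is not Apple's released code; the point is that step 1 samples once per prompt and nothing ever checks correctness.

import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B-Instruct"   # stand-in; the paper reports 4B-30B models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompts = [
    "Write a Python function that reverses a linked list.",
    "Implement binary search over a sorted list.",
]

# Step 1: one temperature sample per prompt. No filtering, no execution,
# no reward model; the raw (possibly wrong) outputs are the training data.
records = []
for p in prompts:
    inputs = tok(p, return_tensors="pt")
    out = model.generate(**inputs, do_sample=True, temperature=0.7,
                         max_new_tokens=512)
    records.append(tok.decode(out[0], skip_special_tokens=True))

# Step 2: plain supervised fine-tuning on the self-generated text.
class SelfDistillSet(Dataset):
    def __init__(self, texts):
        self.enc = [tok(t, truncation=True, max_length=1024) for t in texts]
    def __len__(self):
        return len(self.enc)
    def __getitem__(self, i):
        ids = self.enc[i]["input_ids"]
        return {"input_ids": ids, "labels": list(ids)}

def collate(feats):  # batch size 1, so no padding logic is needed
    return {"input_ids": torch.tensor([f["input_ids"] for f in feats]),
            "labels": torch.tensor([f["labels"] for f in feats])}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ssd-out", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=SelfDistillSet(records),
    data_collator=collate,
)
trainer.train()

In a real run you'd sample from the full prompt set and fine-tune with proper hyperparameters, but the loop above is the whole recipe: generate, then train on what you generated.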



