

Joe
659 posts

@joemkwon
Trying to nudge toward good futures! Astra Fellow with @forethought_org. Previously @GovAIOrg Fall Fellow, @LG_AI_Research, @MITCoCoSci







Clawdbot creator @steipete describes his mind-blown moment: it responded to a voice memo, even though he hadn't set it up for audio or voice. "I sent it a voice message. But there was no support for voice messages. After 10 seconds, [Moltbot] replied as if nothing happened." "I'm like 'How the F did you do that?'" "It replied, 'You sent me a message, but there was only a link to a file with no file ending. So I looked at the file header, I found out it was Opus, and I used FFmpeg on your Mac to convert it to a .wav. Then I wanted to use Whisper, but you didn't have it installed. I looked around and found the OpenAI key in your environment, so I sent it via curl to OpenAI, got the translation back, and then I responded.'"




I would not be at all surprised if this finding were not the result of malicious intent. The model predicts the next token*, and given everything on the internet about US/China AI rivalry and Chinese sleeper bugs in US critical infra, what next token would *you* predict?



@drishanarora "the best open-weight LLM by a US company" is a DeepSeek fine-tune.

Today, we are releasing the best open-weight LLM by a US company: Cogito v2.1 671B. On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models, while being ahead of any US open model (such as the best versions of OpenAI’s GPT-OSS, Nvidia’s Nemotron and Meta’s Llama). We also built an interface where you can try the model (it’s free and we don’t store any chats): chat.deepcogito.com Additionally, you can download the model on @huggingface, or try it out on @openrouter, @togethercompute, @FireworksAI_HQ, @ollama cloud, @runpod, @baseten, or run it locally using @ollama or @UnslothAI. This model uses significantly fewer tokens than other models of similar capability, because it has better reasoning capabilities. You will also notice improvements across instruction following, coding, longer queries, multi-turn and creativity. 📌 Model Weights: huggingface.co/collections/de… 📌 Openrouter: openrouter.ai/deepcogito/cog… 📌 HF Blog: huggingface.co/blog/deepcogit… Some notes on our approach + design choices below 👇







If you give Sonnet 4.5 this post, along with other research on LLM introspection, it gets better at guessing a secret string from its previous hidden chain-of-thought!


