
Humanitarian AI
@HumanitarianAI
https://t.co/NOZKn7LD20 community advancing humanitarian AI with local groups in fifteen cities + podcast series: Humanitarian AI Today


Excited to finally share that I left Google DeepMind to create Kyutai w/ talented colleagues, a non-profit lab based in Paris dedicated to open science, with ~300M€ of funding so far. We are starting with multimodal LLMs, for everyone, for free. No distraction, just science.


ChatGPT "Advanced Data Analysis" (which doesn't really have anything to do with data specifically) is an awesome tool for creating diagrams. I could probably code these diagrams myself, but it's so much better to just sit back and iterate in English.

In this example, I was experimenting with a possible diagram to explain Supervised Finetuning in LLMs. The "document" at the origin (0,0) is the empty document, and emanating outwards are token streams. Highlighted in black are the high-probability token streams of the base model. In red are the token streams corresponding to the conversational finetuning data. When we finetune, we are increasing the probabilities of the red paths and suppressing the black paths. I like this view because it emphasizes LLMs as "token simulators", with their own kind of statistical physics backed by datasets, bouncing around in the discrete token space.

The conversation where we built it in a few minutes: chat.openai.com/share/d48fddff… (Sadly I just remembered that ChatGPT sharing doesn't support images, but at least the text is there, of me iterating on the diagram in plain language and needing to touch no code. Such a vibe of the future.)

I had a similar experience yesterday, trying to create a plot that shows smoothing in n-gram language models. Again, I could have just coded this manually, but this was 10X faster and so easy. Conversation: chat.openai.com/share/9e7fd404…

Posting because during these chats I was struck again by that feeling of what must be the future, where you just sit back and say stuff and the computer does the hard work. And in some narrow pockets of tasks, you can already get that feeling today.
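The "raise the red paths, suppress the black paths" intuition can be sketched as a toy numerical example (my own illustration, not code from the thread; the vocabulary and logits below are made up): a cross-entropy gradient step on a finetuning target token increases that token's probability and, because the softmax must sum to 1, pushes the base model's favored continuations down.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a dict of token -> logit."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def sft_step(logits, target, lr=1.0):
    """One gradient-descent step on cross-entropy loss -log p(target).

    d(-log p_target)/d(logit_i) = p_i - 1[i == target], so the update is
    logit_i += lr * (1[i == target] - p_i).
    """
    probs = softmax(logits)
    return {t: v + lr * ((1.0 if t == target else 0.0) - probs[t])
            for t, v in logits.items()}

# Hypothetical base-model preferences for the first token after the
# empty document: plain web-text continuations dominate.
logits = {"the": 2.0, "a": 1.5, "Hello": 0.2, "<assistant>": -1.0}

before = softmax(logits)
after = softmax(sft_step(logits, target="<assistant>"))

# The finetuning target (a "red path") gains probability mass, and the
# base model's high-probability continuations ("black paths") lose it.
assert after["<assistant>"] > before["<assistant>"]
assert after["the"] < before["the"]
```

Repeating this step over a conversational dataset is, in this toy view, exactly what reshapes the high-probability token streams from the black paths toward the red ones.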



While demand for generative model training soars 📈, I think a new field is coalescing that’s focused on trying to make sense of generative models _once they’re already trained_: characterizing their behaviors, differences, and underlying mechanisms…so we wrote a paper about it!

