Ron K Jeffries @[email protected]
8.8K posts

Ron K Jeffries @[email protected]
@RonKJeffries
A curious guy. Becoming a better human? QUOTE: Tell me about despair, yours, and I will tell you mine. Meanwhile the world goes on. --Mary Oliver, Wild Geese

Big updates to the @GeminiApp today:
- Deep Research is now available to all users for free
- Deep Research is now powered by 2.0 Flash Thinking (our reasoning model)
- New Gemini personalization using your search history
- Gems are available to all users
And more in 🧵

New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT"

This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their "psychology", and how to get the best use of them in practical applications.

We cover all the major stages:
1. pretraining: data, tokenization, Transformer neural network I/O and internals, inference, GPT-2 training example, Llama 3.1 base inference examples
2. supervised finetuning: conversations data, "LLM Psychology": hallucinations, tool use, knowledge/working memory, knowledge of self, models need tokens to think, spelling, jagged intelligence
3. reinforcement learning: practice makes perfect, DeepSeek-R1, AlphaGo, RLHF

I designed this video for the "general audience" track of my videos, which I believe are accessible to most people, even without a technical background. It should give you an intuitive understanding of the full training pipeline of LLMs like ChatGPT, with many examples along the way, and maybe some ways of thinking about current capabilities, where we are, and what's coming.

(Also, I have one "Intro to LLMs" video already from ~a year ago, but that is just a re-recording of a random talk, so I wanted to loop around and do a much more comprehensive version of this topic. They can still be combined, as the talk goes a lot deeper into other topics, e.g. LLM OS and LLM Security)

Hope it's fun & useful! youtube.com/watch?v=7xTGNN…

I found that the reasoning engine in DeepSeek R1 can shift among multiple languages while it is thinking. This artifact occurs more often when I press it on a very complex task: I prompt in English, and the reasoning can shift to German or Japanese, then back to English.
