Jay Alammar

2.1K posts

@JayAlammar

Machine Learning Researcher and writer https://t.co/5GlbofAHs0. O'Reilly Author https://t.co/Fl3uPAZHLg. LLM Builder @Cohere.

Joined April 2020
1.4K Following · 48.8K Followers
Pinned Tweet
Jay Alammar @JayAlammar ·
We're ecstatic to bring you "How Transformer LLMs Work" -- a free course with ~90 minutes of video, code, and crisp visuals and animations that explain the modern Transformer architecture, tokenizers, embeddings, and mixture-of-experts models. @MaartenGr and I have developed a lot of the visual language over the last several years (tens of thousands of iterations for hundreds of figures) for the book. But with the opportunity to collaborate with the legendary @AndrewYNg, we took them to the next level, with animations and a concise narrative meant to enable technical learners to pick up an ML paper and understand the architecture description. Link in comments
Andrew Ng @AndrewYNg

Announcing How Transformer LLMs Work, created with @JayAlammar and @MaartenGr, co-authors of the beautifully illustrated book, "Hands-On Large Language Models." This course offers a deep dive into the inner workings of the transformer architecture that powers large language models (LLMs).

The transformer architecture revolutionized generative AI; in fact, the "GPT" in ChatGPT stands for "Generative Pre-trained Transformer." Originally introduced in the Google Brain team's groundbreaking 2017 paper "Attention Is All You Need," by Vaswani and others, the transformer was a highly scalable model for machine translation tasks. Variants of this architecture now power today's LLMs such as those from OpenAI, Google, Meta, Cohere, Anthropic, and DeepSeek.

In this course, you'll learn in detail how LLMs process text, and you'll work through code examples that illustrate the transformer's individual components. You'll learn:

- How the representation of language has evolved, from Bag-of-Words to Word2Vec embeddings to the transformer architecture, which captures a word's meaning by taking into account the context of the other words in the input.
- How inputs are broken down into tokens before they are sent to the language model.
- The details of a transformer's main stages: tokenization and embedding, the stack of transformer blocks, and the language model head.
- The inner workings of the transformer block, including attention, which calculates relevance scores, and the feedforward layer, which incorporates stored information learned in training.
- How cached calculations make transformers faster.
- Some of the most recent ideas in the latest models, such as Mixture-of-Experts (MoE), which uses multiple sub-models and a router on each layer to improve the quality of LLMs.

By the end of this course, you'll have a deep understanding of how LLMs actually process text and be able to read through papers describing the latest models and understand the details. Gaining this intuition will improve your approach to building LLM applications.

Please sign up here: deeplearning.ai/short-courses/…
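(An aside not from the course: to make the attention step above concrete, here is a minimal NumPy sketch of single-head causal self-attention, i.e. relevance scores between token pairs followed by a weighted mix of value vectors. All sizes and weights are toy stand-ins, not anything from a real model or the course's notebooks.)

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention over a (seq_len, d_model) input."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv              # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])       # relevance score for every token pair
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[future] = -np.inf                      # causal mask: ignore future tokens
    return softmax(scores) @ v                    # mix value vectors by relevance

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16                          # toy sizes
x = rng.normal(size=(seq_len, d_model))           # stand-in for token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)                                  # (5, 16)
```

A real transformer block wraps this core with multiple heads, a feedforward layer, residual connections, and normalization, which is what the course walks through.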

Jay Alammar retweeted
mrdoob @mrdoob ·
[image]
Jay Alammar retweeted
Leland McInnes @leland_mcinnes ·
EVoC is a library designed specifically for fast clustering of high dimensional embedding vectors. It can produce high quality clusters extremely efficiently, and requires little to no hyperparameter tuning. Better clustering than UMAP + HDBSCAN; faster clustering than KMeans.
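(A usage sketch, not code from the tweet: this assumes EVoC exposes a scikit-learn-style fit_predict interface, as the project's README suggests; the class name and the convention that -1 marks noise points are assumptions to verify against the installed version. The embeddings are random stand-ins for real embedding vectors.)

```python
import numpy as np
import evoc  # github.com/TutteInstitute/evoc

# Random stand-ins for real embedding vectors (e.g. text embeddings).
embeddings = np.random.default_rng(0).normal(size=(10_000, 384)).astype(np.float32)

# Defaults are the point: EVoC is built to need little to no tuning.
# (scikit-learn-style interface assumed from the project README.)
clusterer = evoc.EVoC()
labels = clusterer.fit_predict(embeddings)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"found {n_clusters} clusters plus {int((labels == -1).sum())} noise points")
```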
Jay Alammar retweeted
Daniel San @dani_avila7 ·
Cohere released a new Transcribe model, so I built a Chrome extension to test it. It works two ways: through the API or with a local server.

API mode: grab a free key from your Cohere account at dashboard.cohere.com/api-keys (sign up and you get free-tier access).

Local mode: download the model from Hugging Face, huggingface.co/CohereLabs/coh…, spin up the local server, and add your HF access token from here: huggingface.co/settings/tokens

Pick either mode and you're good to go. I left it open source under MIT: github.com/davila7/cohere…

Good weekend project; going to keep exploring where else this can be applied. Thanks @nickfrosst and the @cohere team!
Nick Frosst @nickfrosst

@cohere transcribe Sota open source transcription model running in the browser :) Weights on @huggingface link below
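(A rough sketch of the "local mode" idea, not the extension's actual code: an open-weights ASR checkpoint can typically be run with Hugging Face's generic automatic-speech-recognition pipeline. The model id and audio filename below are placeholders, since the tweet's Hugging Face link is truncated, and the real checkpoint may need model-specific loading or the dedicated local server the extension expects.)

```python
from transformers import pipeline

# Placeholder model id: the HF link in the tweet is truncated, so this is
# NOT the real checkpoint name. Swap in the actual Cohere Transcribe repo id.
asr = pipeline(
    "automatic-speech-recognition",
    model="CohereLabs/placeholder-transcribe-model",
)

# Transcribe a local audio file (hypothetical filename).
result = asr("meeting_recording.wav")
print(result["text"])
```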

hashim alsharif @nothashem ·
@wballaa @JayAlammar Very much so, especially since I'm certain it's possible, given the hadith of the Prophet ﷺ: "By Him in whose hand is my soul, the Hour will not come until the beasts speak to humans" (narrated by al-Tirmidhi). That is truthful evidence of the prophethood of the noblest of creation.
hashim alsharif @nothashem ·
One of the things I keep thinking about is this: we're probably 5 years away from understanding animals, maybe even communicating with them. I don't really care about AGI compared to this. Humans have lived alongside animals forever without ever understanding them.

My view comes down to two things:

a) Do animals actually have language systems? Not as a whole, but per species: structured signals, patterns, intent, not just random noise. If that's true, then this becomes a pattern recognition problem, and that's very solvable with current AI trends: better models, more data, more compute.

b) How do you build this at scale?

So the real question is who's actually building this.
Jay Alammar @JayAlammar ·
@imM0hannad @NajwaGhamdi It supports Arabic, and we're keen to develop it further and to hear what people think after trying it.
Jay Alammar retweeted
vLLM @vllm_project ·
🎉 Congrats to @Cohere on releasing Cohere Transcribe, a 2B speech recognition model (Apache 2.0, 14 languages). Day-0 support in vLLM. Cohere contributed encoder-decoder serving optimizations to vLLM: variable-length encoder batching and packed attention for the decoder. Up to 2x throughput improvement for speech workloads, and these gains carry over to all encoder-decoder models on vLLM. Thanks to the @Cohere team for the contribution! PR 🔗 github.com/vllm-project/v… Blog 🔗 huggingface.co/blog/CohereLab…
[image]
Cohere @cohere

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition.
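(A hedged sketch rather than vLLM's documented recipe for this model: recent vLLM versions expose an OpenAI-compatible audio transcription endpoint for supported ASR models, so serving and querying might look roughly like this. The model id and filename are placeholders; check the PR and blog linked above for the actual instructions.)

```python
# First, serve the model (placeholder id; the tweet's HF link is truncated):
#
#   vllm serve CohereLabs/placeholder-transcribe-model
#
# Then call the OpenAI-compatible transcription endpoint the server exposes:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
with open("meeting_recording.wav", "rb") as f:  # hypothetical audio file
    transcript = client.audio.transcriptions.create(
        model="CohereLabs/placeholder-transcribe-model",
        file=f,
    )
print(transcript.text)
```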

Jay Alammar retweeted
Pierre Richemond 🇪🇺 @TheOneKloud ·
Excited and proud to introduce our latest: Cohere Transcribe, the best dedicated ASR model in the world. #1 EN HF leaderboard, SotA human evals, ahead of ElevenLabs, Qwen3, Mistral, Kyutai, and OpenAI. 14 supported languages. Apache 2.0, on HF for you to try. Our first audio model and a key step in powering North experiences. huggingface.co/CohereLabs/coh…
[image]
Jay Alammar @JayAlammar ·
@NajwaGhamdi Good names, too! Easy to type, too! I can never find any of my "muhammed/mohamed/muhammad/etc" friends on linkedin.
نجوى مسفر @NajwaGhamdi ·
@JayAlammar I also gave names to my agents and I constantly refer to them as my team... another future note to the anthropologists :)
[image]
Jay Alammar @JayAlammar ·
Blink and you may miss it. Multiple times this week I've heard people (in the industry and out) refer to their LLM with a human pronoun: "ask him", "I told him". Didn't register it as often before this year. It's not even a decade since the Transformer. A note for a future anthropologist
Jay Alammar retweeted
Cohere @cohere ·
We’re honored to be named one of @FastCompany's Most Innovative Companies of 2026! This recognition reflects our commitment to building secure, sovereign AI for enterprises and governments. Over the past year, we’ve deepened our focus on serving the unique needs of highly regulated industries—expanding what organizations can do with their protected data through North, our agentic platform for getting more work done. Learn more: fastcompany.com/91495412/artif…
[image]
Jay Alammar retweeted
Ivan Zhang @1vnzh ·
I'm excited to announce we're working with Saab to bring North onboard to Command the sky! saab.com/newsroom/press…
Jay Alammar retweeted
Bharat @bharatrunwal2 ·
Introducing PRISM: Demystifying Retention and Interaction in Mid-Training

The modern LLM training pipeline has evolved beyond just pre-training + alignment. State-of-the-art models now insert a critical middle stage, "mid-training," where targeted, high-quality data mixtures build reasoning foundations before RL. Yet despite its growing adoption, the field lacks a principled understanding of what actually drives its effectiveness.

- What data should you use?
- When in the pipeline should you mid-train?
- How does it interact with downstream RL?
- Does it generalize across architectures and scales?
- And beyond benchmarks: what do these stages actually do to the model at the weight and representation level?

These questions don't have clear answers in the literature at scale, and the cost of getting them wrong is significant. PRISM is our systematic attempt to answer all of them.

Using ~27B high-quality tokens, we run controlled experiments across 7 models · 4 families · 3B–24B parameters, spanning both dense Transformers and attention-Mamba hybrids, measuring what mid-training actually does: to performance, to weights, to representations, and to downstream RL.

🧵 Key findings below.
🌐 bharat-runwal.github.io/PRISM/
📄 arxiv.org/abs/2603.17074
🤗 Models and Datasets: huggingface.co/PRISM-Midtrain… (coming soon)
[GIF]