Joyce Chen

19 posts

@joycech3n

mots @_inception_ai | prev @stanfordailab, @DbrxMosaicAI, @BrownInstitute

Joined June 2023
199 Following · 197 Followers
Joyce Chen reposted
Inception @_inception_ai
We're hiring. Inception builds diffusion-based LLMs that generate tokens in parallel, not one at a time. Our founders helped invent diffusion models, flash attention, decision transformers, and DPO. Team from DeepMind, OpenAI, Meta AI, Microsoft AI, AWS, Scale, and Stripe. Open roles: inceptionlabs.ai/careers
5 replies · 22 reposts · 345 likes · 18.7K views
Joyce Chen reposted
Stefano Ermon @StefanoErmon
The research journey to create diffusion LLMs has been 10+ years in the making. My cofounders and I have been at the forefront of this work, from score-based generative modeling to SEDD to Mercury 2. @amplifypartners put together an excellent deep dive: amplifypartners.com/blog-posts/eve…
5 replies · 27 reposts · 154 likes · 33.9K views
Joyce Chen reposted
Inception @_inception_ai
Introducing Mercury Edit 2: highest quality next-edit model at the lowest latency. 75.6% quality at 221ms. Beats Zeta 2 (64.4%), Claude 4.5 Haiku (71.4%), GPT-5.4 Nano (73.5%). +48% accept rate. -27% shown rate.
11 replies · 41 reposts · 425 likes · 45.3K views
Joyce Chen reposted
Inception @_inception_ai
Listen to @samar_a_khanna explain why parallel generation, rather than sequential, raises the performance ceiling for language models. Learn more about diffusion LLMs. → We're hiring: inceptionlabs.ai/careers
2 replies · 7 reposts · 50 likes · 18.2K views
Joyce Chen reposted
Qinqing Zheng @qqyuzu
Diffusion is the future. Mercury 2 hits ~1200 tokens/sec + agentic performance on par with Claude Haiku 4.5 - verified by AA. Something I believed in years ago is finally here. Looking forward to what's ahead with the team :) @_inception_ai
Volodymyr Kuleshov 🇺🇦@volokuleshov

@StefanoErmon 🚨Hot off the presses: the official Artificial Analysis benchmarking results are in! 🚀Mercury models set a new frontier of speed and agentic quality

0 replies · 3 reposts · 33 likes · 2.7K views
Joyce Chen reposted
Sasha Rush @srush_nlp
Text diffusion seems like it’s really happening.
Volodymyr Kuleshov 🇺🇦@volokuleshov

@StefanoErmon 🚨Hot off the presses: the official Artificial Analysis benchmarking results are in! 🚀Mercury models set a new frontier of speed and agentic quality

9 replies · 18 reposts · 258 likes · 37.6K views
Joyce Chen reposted
Inception @_inception_ai
Inception pairs academic rigor with real speed. Fast feedback. Tight collaboration. Research in production. Hear from @joycech3n on our company culture. #AIResearch #ModelDevelopment
1 reply · 5 reposts · 30 likes · 6.6K views
Joyce Chen @joycech3n
I’m helping host the SAIL research podcast booth — sign up here! calendly.com/contact-readsa…
Nathan Lambert@natolambert

Excited to share another NeurIPS event I'm helping with. We're hosting a dedicated booth to record researchers talking about their work, share that audio & video content on our socials, and start great conversations.

What is it?
- 10-minute researcher interviews recorded live at NeurIPS - come talk about your poster, your research, or your company
- we'll edit and publish your interview as a short & long form video
- Location: Hilton Hotel, across the street from the convention

Hosted by @readsail and @SemiAnalysis_ with much gratitude to our sponsor @LambdaAPI. Lambda has been a great ally of all my crazy ideas to help out the research community over the last year, the fastest to pick up the phone and help. I plan on spinning more of this into the @interconnectsai feed, so please reach out (email best, DM okay, but contact @readsail first) with questions. I've got more planned in this space soon, such as Interconnects Paper Awards. Come hang, signup link below!

0 replies · 1 repost · 5 likes · 1.3K views
Dan McAteer @daniel_mac8
@karpathy Google released a preview of Gemini Diffusion in May of this year and it was really cool. Super fast! Much faster than AR LLMs. They said they'd be improving it and do a release later so I wouldn't be surprised if they come with a prod version of Gemini Diffusion soon.
1 reply · 0 reposts · 7 likes · 2.2K views
Andrej Karpathy @karpathy
Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising, top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right, bottom) is the dominant paradigm in text. For audio I've seen a bit of both.

A lot of diffusion papers look a bit dense, but if you strip the mathematical formalism, you end up with simple baseline algorithms, e.g. something a lot closer to flow matching in continuous, or something like this in discrete. It's your vanilla transformer but with bi-directional attention, where you iteratively re-sample and re-mask all tokens in your "tokens canvas" based on a noise schedule until you get the final sample at the last step. (Bi-directional attention is a lot more powerful, and you get a lot stronger autoregressive language models if you train with it; unfortunately it makes training a lot more expensive because now you can't parallelize across the sequence dim.)

So autoregression is doing an `.append(token)` to the tokens canvas while only attending backwards, while diffusion is refreshing the entire tokens canvas with a `.setitem(idx, token)` while attending bidirectionally.

Human thought naively feels a bit more like autoregression, but it's hard to say that there aren't more diffusion-like components in some latent space of thought. It feels quite possible that you can further interpolate between them, or generalize them further. And it's a component of the LLM stack that still feels a bit fungible. Now I must resist the urge to side quest into training nanochat with diffusion.
Nathan Barry@nathanrs

BERT is just a Single Text Diffusion Step! (1/n) When I first read about language diffusion models, I was surprised to find that their training objective was just a generalization of masked language modeling (MLM), something we’ve been doing since BERT from 2018. The first thought I had was, “can we finetune a BERT-like model to do text generation?”

270 replies · 536 reposts · 5.2K likes · 864.3K views
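The loop Karpathy describes can be sketched in a few lines. This is a toy illustration, not any real model: `toy_denoiser` is a hypothetical stand-in (random logits) for a bidirectional transformer, and the confidence-based keep/re-mask rule is just one simple choice of noise schedule.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, MASK = 10, 10      # vocabulary ids 0..9; id 10 is the [MASK] token
SEQ_LEN, STEPS = 8, 4

def toy_denoiser(canvas):
    # Stand-in for a bidirectional transformer: per-position logits over
    # the vocabulary. A real model would attend over the whole canvas.
    return rng.normal(size=(len(canvas), VOCAB))

def sample(seq_len=SEQ_LEN, steps=STEPS):
    canvas = np.full(seq_len, MASK)           # start fully masked
    for t in range(steps):
        logits = toy_denoiser(canvas)
        probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
        tokens = probs.argmax(-1)             # proposed token per slot
        conf = probs.max(-1)                  # model "confidence" per slot
        # Noise schedule: keep the k most confident positions this step,
        # re-mask everything else; k grows until nothing is masked.
        k = int(np.ceil(seq_len * (t + 1) / steps))
        keep = np.argsort(-conf)[:k]
        canvas = np.full(seq_len, MASK)       # setitem on the whole canvas,
        canvas[keep] = tokens[keep]           # not append: any slot can change
    return canvas

print(sample())                               # all 8 slots filled, no MASK left
```

Note the contrast with autoregression: an AR sampler only ever appends one token to the end of the sequence per model call, while here every position is re-proposed in parallel at each of the few denoising steps.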
Joyce Chen reposted
Inception @_inception_ai
At the #microsoftbuild keynote, Satya Nadella unveiled NLWeb – an open project that enables websites to easily create AI-powered natural language interfaces. We are incredibly excited to be one of NLWeb’s founding partners, using our ultra-fast Mercury Small diffusion Large Language Model (dLLM) to enable lightning-fast natural conversations.
2 replies · 11 reposts · 47 likes · 27.8K views
Joyce Chen reposted
Inception @_inception_ai
We are launching our API in open beta! Visit the Inception Platform to create your account and get started using the first commercial-scale diffusion large language models (dLLMs). platform.inceptionlabs.ai
8 replies · 30 reposts · 136 likes · 64.4K views
Joyce Chen reposted
Inception @_inception_ai
We will be at #ICLR2025 this week! DM us if you're interested in connecting and learning more about our dLLMs. (We're hiring!)
2 replies · 6 reposts · 43 likes · 10.3K views
Joyce Chen reposted
Inception @_inception_ai
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.
225 replies · 952 reposts · 5.3K likes · 1.9M views
Joyce Chen reposted
Jeremy Irvin @jeremy_irvin16
🎉 Thrilled to announce that our work on TEOChat was accepted to #ICLR2025! We present the first vision-language assistant for temporal earth observation data ⏰🌍, capable of tasks like building damage assessment and identifying urban changes over time. More details below 👇
Jeremy Irvin@jeremy_irvin16

Vision-language models (VLMs) are revolutionizing how we use Earth observation (EO) data, but none could reason over time—a critical need for applications like disaster relief—until now. Introducing TEOChat 🌍🤖, the first VLM for temporal EO data! arxiv.org/abs/2410.06234 1/8

1 reply · 4 reposts · 26 likes · 5.5K views
Joyce Chen reposted
Jeremy Irvin @jeremy_irvin16
Vision-language models (VLMs) are revolutionizing how we use Earth observation (EO) data, but none could reason over time—a critical need for applications like disaster relief—until now. Introducing TEOChat 🌍🤖, the first VLM for temporal EO data! arxiv.org/abs/2410.06234 1/8
4 replies · 18 reposts · 68 likes · 15.8K views