Joyce Chen

19 posts

@joycech3n

mots @_inception_ai | prev @stanfordailab, @DbrxMosaicAI, @BrownInstitute

Joined June 2023
199 Following · 197 Followers
Joyce Chen reposted
Inception @_inception_ai
We're hiring. Inception builds diffusion-based LLMs that generate tokens in parallel, not one at a time. Our founders helped invent diffusion models, flash attention, decision transformers, and DPO. Team from DeepMind, OpenAI, Meta AI, Microsoft AI, AWS, Scale, and Stripe. Open roles: inceptionlabs.ai/careers
5 replies · 22 reposts · 345 likes · 18.7K views
Joyce Chen reposted
Stefano Ermon @StefanoErmon
The research journey to create diffusion LLMs has been 10+ years in the making. My cofounders and I have been at the forefront of this work, from score-based generative modeling to SEDD to Mercury 2. @amplifypartners put together an excellent deep dive: amplifypartners.com/blog-posts/eve…
5 replies · 27 reposts · 154 likes · 33.9K views
Joyce Chen reposted
Inception @_inception_ai
Introducing Mercury Edit 2: highest quality next-edit model at the lowest latency. 75.6% quality at 221ms. Beats Zeta 2 (64.4%), Claude 4.5 Haiku (71.4%), GPT-5.4 Nano (73.5%). +48% accept rate. -27% shown rate.
11 replies · 41 reposts · 425 likes · 45.3K views
Joyce Chen reposted
Inception @_inception_ai
Listen to @samar_a_khanna explain why parallel generation, rather than sequential, raises the performance ceiling for language models. Learn more about diffusion LLMs. → We're hiring: inceptionlabs.ai/careers
2 replies · 7 reposts · 50 likes · 18.2K views
Joyce Chen reposted
Qinqing Zheng @qqyuzu
Diffusion is the future. Mercury 2 hits ~1200 tokens/sec + agentic performance on par with Claude Haiku 4.5 - verified by AA. Something I believed in years ago is finally here. Looking forward to what's ahead with the team :) @_inception_ai
Volodymyr Kuleshov 🇺🇦@volokuleshov

@StefanoErmon 🚨Hot off the presses: the official Artificial Analysis benchmarking results are in! 🚀Mercury models set a new frontier of speed and agentic quality

0 replies · 3 reposts · 33 likes · 2.7K views
Joyce Chen reposted
Sasha Rush @srush_nlp
Text diffusion seems like it’s really happening.
Volodymyr Kuleshov 🇺🇦@volokuleshov

@StefanoErmon 🚨Hot off the presses: the official Artificial Analysis benchmarking results are in! 🚀Mercury models set a new frontier of speed and agentic quality

9 replies · 18 reposts · 258 likes · 37.6K views
Joyce Chen reposted
Inception @_inception_ai
Inception pairs academic rigor with real speed. Fast feedback. Tight collaboration. Research in production. Hear from @joycech3n on our company culture. #AIResearch #ModelDevelopment
1 reply · 5 reposts · 30 likes · 6.6K views
Joyce Chen @joycech3n
I’m helping host the SAIL research podcast booth — sign up here! calendly.com/contact-readsa…
Nathan Lambert@natolambert

Excited to share another NeurIPS event I'm helping with. We're hosting a dedicated booth to record researchers talking about their work, share that audio & video content on our socials, and start great conversations.

What is it?
- 10-minute researcher interviews recorded live at NeurIPS - come talk about your poster, your research, or your company
- we'll edit and publish your interview as a short & long form video
- Location: Hilton Hotel, across the street from the convention

Hosted by @readsail and @SemiAnalysis_ with much gratitude to our sponsor @LambdaAPI. Lambda has been a great ally of all my crazy ideas to help out the research community over the last year, the fastest to pick up the phone and help. I plan on spinning more of this into the @interconnectsai feed, so please reach out (email best, DM okay, but contact @readsail first) with questions. I've got more planned in this space soon, such as Interconnects Paper Awards. Come hang, signup link below!

0 replies · 1 repost · 5 likes · 1.3K views
Dan McAteer @daniel_mac8
@karpathy Google released a preview of Gemini Diffusion in May of this year and it was really cool. Super fast! Much faster than AR LLMs. They said they'd be improving it and do a release later so I wouldn't be surprised if they come with a prod version of Gemini Diffusion soon.
1 reply · 0 reposts · 7 likes · 2.2K views
Andrej Karpathy @karpathy
Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising, top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right, bottom) is the dominant paradigm in text. For audio I've seen a bit of both.

A lot of diffusion papers look a bit dense, but if you strip the mathematical formalism, you end up with simple baseline algorithms, e.g. something a lot closer to flow matching in continuous, or something like this in discrete. It's your vanilla transformer but with bi-directional attention, where you iteratively re-sample and re-mask all tokens in your "tokens canvas" based on a noise schedule until you get the final sample at the last step. (Bi-directional attention is a lot more powerful, and you get a lot stronger autoregressive language models if you train with it; unfortunately it makes training a lot more expensive because now you can't parallelize across the sequence dim.)

So autoregression is doing an `.append(token)` to the tokens canvas while only attending backwards, while diffusion is refreshing the entire tokens canvas with a `.setitem(idx, token)` while attending bidirectionally.

Human thought naively feels a bit more like autoregression, but it's hard to say that there aren't more diffusion-like components in some latent space of thought. It feels quite possible that you can further interpolate between them, or generalize them further. And it's a component of the LLM stack that still feels a bit fungible. Now I must resist the urge to side quest into training nanochat with diffusion.
Nathan Barry@nathanrs

BERT is just a Single Text Diffusion Step! (1/n) When I first read about language diffusion models, I was surprised to find that their training objective was just a generalization of masked language modeling (MLM), something we’ve been doing since BERT from 2018. The first thought I had was, “can we finetune a BERT-like model to do text generation?”

270 replies · 536 reposts · 5.2K likes · 864.3K views
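The loop Karpathy describes can be sketched in a few lines. This is a toy illustration, not any real model: `toy_denoiser` is a hypothetical stand-in (random logits) for a bidirectional transformer, and the confidence-based keep/re-mask rule is just one simple choice of noise schedule.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, MASK = 10, 10      # vocabulary ids 0..9; id 10 is the [MASK] token
SEQ_LEN, STEPS = 8, 4

def toy_denoiser(canvas):
    # Stand-in for a bidirectional transformer: per-position logits over
    # the vocabulary. A real model would attend over the whole canvas.
    return rng.normal(size=(len(canvas), VOCAB))

def sample(seq_len=SEQ_LEN, steps=STEPS):
    canvas = np.full(seq_len, MASK)           # start fully masked
    for t in range(steps):
        logits = toy_denoiser(canvas)
        probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
        tokens = probs.argmax(-1)             # proposed token per slot
        conf = probs.max(-1)                  # model "confidence" per slot
        # Noise schedule: keep the k most confident positions this step,
        # re-mask everything else; k grows until nothing is masked.
        k = int(np.ceil(seq_len * (t + 1) / steps))
        keep = np.argsort(-conf)[:k]
        canvas = np.full(seq_len, MASK)       # setitem on the whole canvas,
        canvas[keep] = tokens[keep]           # not append: any slot can change
    return canvas

print(sample())                               # all 8 slots filled, no MASK left
```

Note the contrast with autoregression: an AR sampler only ever appends one token to the end of the sequence per model call, while here every position is re-proposed in parallel at each of the few denoising steps.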
Joyce Chen reposted
Inception @_inception_ai
At the #microsoftbuild keynote, Satya Nadella unveiled NLWeb – an open project that enables websites to easily create AI-powered natural language interfaces. We are incredibly excited to be one of NLWeb’s founding partners, using our ultra-fast Mercury Small diffusion Large Language Model (dLLM) to enable lightning-fast natural conversations.
2 replies · 11 reposts · 47 likes · 27.8K views
Joyce Chen reposted
Inception @_inception_ai
We are launching our API in open beta! Visit the Inception Platform to create your account and get started using the first commercial-scale diffusion large language models (dLLMs). platform.inceptionlabs.ai
8 replies · 30 reposts · 136 likes · 64.4K views
Joyce Chen reposted
Inception @_inception_ai
We will be at #ICLR2025 this week! DM us if you're interested in connecting and learning more about our dLLMs. (We're hiring!)
2 replies · 6 reposts · 43 likes · 10.3K views
Joyce Chen reposted
Inception @_inception_ai
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation.
225 replies · 952 reposts · 5.3K likes · 1.9M views
Joyce Chen reposted
Jeremy Irvin @jeremy_irvin16
🎉 Thrilled to announce that our work on TEOChat was accepted to #ICLR2025! We present the first vision-language assistant for temporal earth observation data ⏰🌍, capable of tasks like building damage assessment and identifying urban changes over time. More details below 👇
Jeremy Irvin@jeremy_irvin16

Vision-language models (VLMs) are revolutionizing how we use Earth observation (EO) data, but none could reason over time—a critical need for applications like disaster relief—until now. Introducing TEOChat 🌍🤖, the first VLM for temporal EO data! arxiv.org/abs/2410.06234 1/8

1 reply · 4 reposts · 26 likes · 5.5K views
Joyce Chen reposted
Jeremy Irvin @jeremy_irvin16
Vision-language models (VLMs) are revolutionizing how we use Earth observation (EO) data, but none could reason over time—a critical need for applications like disaster relief—until now. Introducing TEOChat 🌍🤖, the first VLM for temporal EO data! arxiv.org/abs/2410.06234 1/8
4 replies · 18 reposts · 68 likes · 15.8K views