Ishaan Gulrajani
@__ishaan
Hi! I’m a machine learning researcher @openai. Previously @stanford @facebook @google @mila_quebec

Never forget @karpathy training a recurrent neural net (precursor to transformers) to imitate @paulg in 2015—a thing of syntactic and semantic beauty:
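For readers who want the idea in code: below is a minimal char-level RNN language-model sketch in PyTorch, in the spirit of that 2015 experiment. It is not Karpathy's original char-rnn; the corpus path, model size, and training loop are illustrative assumptions.

```python
# Minimal char-level RNN LM sketch (illustrative, not the original char-rnn).
import torch
import torch.nn as nn

text = open("pg_essays.txt").read()   # hypothetical local dump of the essays
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)
    def forward(self, x, state=None):
        h, state = self.rnn(self.embed(x), state)
        return self.head(h), state

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
seq_len, batch = 128, 32

for step in range(1000):                       # toy training loop
    ix = torch.randint(0, len(data) - seq_len - 1, (batch,)).tolist()
    x = torch.stack([data[i:i + seq_len] for i in ix])
    y = torch.stack([data[i + 1:i + seq_len + 1] for i in ix])   # next-char targets
    logits, _ = model(x)
    loss = nn.functional.cross_entropy(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sample: feed the model its own predictions one character at a time.
x, state, out = torch.tensor([[stoi[chars[0]]]]), None, []
for _ in range(500):
    logits, state = model(x, state)
    p = torch.softmax(logits[0, -1], dim=-1)
    x = torch.multinomial(p, 1).view(1, 1)
    out.append(itos[x.item()])
print("".join(out))
```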

NEW! Part two of a #KempnerInstitute blog series: @blake__bordelon, @ABAtanasov & @CPehlevan propose a simple, solvable model in which many aspects of #LLMs are already present. Read more: bit.ly/3RqYMhX #neuralnetworks #AI

What does it mean for an image, video, or text to be 𝑟𝑒𝑎𝑙𝑖𝑠𝑡𝑖𝑐? Despite how far we've come in 𝑔𝑒𝑛𝑒𝑟𝑎𝑡𝑖𝑛𝑔 realistic data, 𝑞𝑢𝑎𝑛𝑡𝑖𝑓𝑦𝑖𝑛𝑔 realism is still a poorly understood problem. I've shared my thoughts on how to correctly quantify realism here: arxiv.org/abs/2403.04493 #icml2024 #genai #compression

Introducing Pika 1.0, the idea-to-video platform that brings your creativity to life. Create and edit your videos with AI. Rolling out to new users on web and Discord, starting today. Sign up at pika.art

6 years ago today, "Attention Is All You Need" went up on arXiv! Happy birthday, Transformer! 🎂 Fun facts:
- Transformer did not invent attention, but pushed it to the extreme. The first attention paper was published 3 years prior (2014) under an unassuming title: "Neural Machine Translation by Jointly Learning to Align and Translate", from Yoshua Bengio's lab. It combines an RNN with "context vectors" (i.e. attention; a minimal sketch follows below). Many of you likely haven't heard of this paper, but it's one of the greatest milestones in NLP and has been cited 29K times (compared to the Transformer's 77K).
- Neither the Transformer paper nor the original attention paper framed the architecture as a general-purpose sequence computer. Instead, both were conceived as solutions to one narrow, specific problem: machine translation. It's remarkable that AGI (some day soon) can trace its origin to the humble Google Translate. 😅
- The Transformer was published at NeurIPS 2017, one of the top AI conferences worldwide. Yet it didn't even get an oral presentation, let alone awards. There were 3 best papers at NeurIPS that year; combined, they have 529 citations as of today.
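To make the "context vectors" concrete: below is a minimal NumPy sketch of the additive attention from Bahdanau et al. (2014), which computes a context vector as a softmax-weighted sum of encoder states. The weight matrices and toy data are illustrative assumptions; the surrounding encoder/decoder RNNs of the full translation model are omitted.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def context_vector(decoder_state, encoder_states, W_dec, W_enc, v):
    """Additive ("Bahdanau") attention sketch.

    decoder_state:  (d,)    current decoder hidden state s_{t-1}
    encoder_states: (T, d)  encoder hidden states h_1..h_T
    Returns c_t = sum_j alpha_{tj} * h_j and the weights alpha.
    """
    # score(s, h_j) = v^T tanh(W_dec s + W_enc h_j)
    scores = np.array([v @ np.tanh(W_dec @ decoder_state + W_enc @ h)
                       for h in encoder_states])
    alpha = softmax(scores)                  # attention weights, sum to 1
    return alpha @ encoder_states, alpha     # weighted sum of encoder states

# Toy example: 5 source positions, hidden size 8 (numbers are arbitrary).
rng = np.random.default_rng(0)
d, T = 8, 5
H = rng.normal(size=(T, d))                  # encoder states
s = rng.normal(size=d)                       # decoder state
W_dec, W_enc, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
c, alpha = context_vector(s, H, W_dec, W_enc, v)
print(alpha.round(3), c.shape)               # weights over source positions, (8,)
```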

New paper with @tatsu_hashimoto! Likelihood-Based Diffusion Language Models: arxiv.org/abs/2305.18619 Likelihood-based training is a key ingredient of current LLMs. Despite this, diffusion LMs haven't shown any nontrivial likelihoods on standard LM benchmarks. We fix this! 🧵
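For context on what "likelihood-based training" means here: below is a minimal PyTorch sketch of the standard autoregressive maximum-likelihood objective (next-token cross-entropy), the quantity such benchmarks report. It is generic illustration code, not the paper's diffusion objective; the tiny model and random tokens are placeholders.

```python
import torch
import torch.nn.functional as F

def lm_loss(model, tokens):
    """tokens: (batch, seq) integer ids. Returns mean NLL in nats per token."""
    inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict token t+1 from tokens <= t
    logits = model(inputs)                            # (batch, seq-1, vocab)
    nll = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                          targets.reshape(-1))
    return nll                                        # perplexity = exp(nll)

# Usage with any module mapping ids -> logits; a tiny placeholder model here.
vocab, dim = 100, 32
model = torch.nn.Sequential(torch.nn.Embedding(vocab, dim), torch.nn.Linear(dim, vocab))
tokens = torch.randint(0, vocab, (4, 16))
print(lm_loss(model, tokens).item())
```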