Ari Seff

134 posts

@ari_seff

Robotics @OpenAI | Previously @Waymo, PhD @Princeton

Joined March 2017
166 Following, 3.2K Followers
Pinned Tweet
Ari Seff @ari_seff
Can we model driving similarly to how we model language? Excited to introduce MotionLM, where we represent driving scenarios as conversations among road users. It establishes a new state-of-the-art in multi-agent motion forecasting: arxiv.org/abs/2309.16534
13
108
874
176.3K
Pengchuan Zhang @PengchuanZ
I’m joining OpenAI to work on World Simulation and Robotics, after 3.75 years at FAIR working on SAM and Llama. I’m thrilled to explore how visual perception, world model and robotics can come together to build physical intelligence.
77
51
1.3K
224.5K
Ari Seff retweeted
OpenAI @OpenAI
GPT-5.2 Thinking evals
342
484
3.8K
2M
Ari Seff @ari_seff
I’m at ICCV this week in Hawaii. Excited to talk with folks about robotics, agents, and multimodal reasoning. Reach out to chat!
0
0
11
820
Ari Seff retweeted
Sam Altman @sama
watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.
1.1K
790
12.6K
4.2M
Ari Seff retweeted
Nathan Lambert @natolambert
One of the most striking, non-text AI plots I've seen since ChatGPT launched. Scaling keeps working, this time for Waymo's tooling.
4
22
184
15.5K
Ari Seff retweeted
Riley Goodside @goodside
o3-pro: How many of each letter are there in the SHA1 of the answer to this question?
30
35
879
150.3K
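A proposed answer to the self-referential prompt in the post above can be checked mechanically: hash the answer text and count the characters in the digest. The following is a minimal sketch, assuming the answer is a plain UTF-8 string and that "letters" means the characters of the hexadecimal digest. The helper name `sha1_char_counts` is hypothetical, and the sketch says nothing about how o3-pro produced its answer.

```python
import hashlib
from collections import Counter


def sha1_char_counts(answer: str) -> Counter:
    """Count each character in the hex SHA-1 digest of `answer`."""
    digest = hashlib.sha1(answer.encode("utf-8")).hexdigest()
    return Counter(digest)


if __name__ == "__main__":
    # Hypothetical candidate answer; replace with the text being checked.
    candidate = "a: 3, b: 2"
    counts = sha1_char_counts(candidate)
    # Print only the alphabetic hex characters (a-f) with their counts.
    for ch in "abcdef":
        print(ch, counts.get(ch, 0))
```

An answer is self-consistent only if the counts it states match the counts computed from its own digest.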
Ari Seff @ari_seff
@ptievgaleks @yuntiandeng If given the same time budget? I'm not so sure. Actually I'd be surprised if the accuracy were anywhere close. In any event, lengthy multiplication is not really what these models are designed for, which is why we generally give them access to tools :)
0
0
7
113
Yuntian Deng @yuntiandeng
For those curious about how o3-mini performs on multi-digit multiplication, here's the result. It does much better than o1 but still struggles past 13×13. (Same evaluation setup as before, but with 40 test examples per cell.)
Yuntian Deng @yuntiandeng

Is OpenAI's o1 a good calculator? We tested it on up to 20x20 multiplication—o1 solves up to 9x9 multiplication with decent accuracy, while gpt-4o struggles beyond 4x4. For context, this task is solvable by a small LM using implicit CoT with stepwise internalization. 1/4

72
96
745
1M
Ari Seff @ari_seff
Language models are not designed specifically for lengthy arithmetic. That's why we generally give them access to tools (eg, the old computer you mention). In this case OP is testing the model on lengthy multiplications "by hand", which I think most humans would struggle with if given the same inference-time budget. As reasoning/inference-time compute increases, these mistakes reduce.
1
0
1
103
Ari Seff retweeted
Sam Altman @sama
fun watching people react to operator. reminds me of the chatgpt launch!
1.1K
334
10.4K
2M
Ari Seff retweeted
OpenAI @OpenAI
Announcing The Stargate Project
The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world. This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies.
The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman.
Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements.
As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle.
This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services.
All of us look forward to continuing to build and develop AI—and in particular AGI—for the benefit of all of humanity. We believe that this new step is critical on the path, and will enable creative people to figure out how to use AI to elevate humanity.
5.4K
10.4K
59.4K
33.9M
Ari Seff retweeted
Sulin Liu @su_lin_liu
Discrete generative models use denoisers for generation, but they can slip up. What if generation *isn’t only* about denoising?🤔 Introducing DDPD: Discrete Diffusion with Planned Denoising🤗🧵(1/11) w/ @junonam_ @AndrewC_ML @HannesStaerk @xuyilun2 Tommi Jaakkola @RGBLabMIT
6
55
234
40.3K
Ari Seff @ari_seff
Evaluating autonomous vehicles in simulation is only useful if the other agents exhibit realistic reactions. We introduce a simple scheme for running RL on top of pre-trained behavior models to improve agent realism. (new paper with my former colleagues at @Waymo):
1
0
10
1.4K
Ari Seff @ari_seff
@wilson1yan @matei_zaharia @VladMnih @pabbeel @AleksandraFaust @haoliuhl Very nice work! Qq: since it’s always the rightmost latents that are masked, if one was using a CNN-based VQ-VAE (as opposed to ViT-based), I assume this would result in the reconstruction degradation being explicitly biased towards specific image regions - is that accurate?
1
0
0
88
Wilson Yan @wilson1yan
We are excited to announce ElasticTok, a simple yet scalable visual tokenizer that can adaptively encode image and video to variable-length sequences! Our model enables more efficient token usage over different images and long dynamic videos.
5
27
181
27.3K
Ari Seff @ari_seff
Very cool postdoc opportunity at Princeton. Had the great pleasure of working with Ryan during my PhD. If you're an AI researcher considering a postdoc, highly recommend taking a look:
Ryan Adams @ryan_p_adams

We are looking for AI Postdoctoral Fellows! Be a part of AI for Accelerating Invention, a new research initiative out of the Princeton AI Lab. You can learn about AI for Accelerating Invention here: invent.ai.princeton.edu and apply for the postdoc here: puwebp.princeton.edu/AcadHire/apply….

0
0
2
931
Daniel Litt @littmath
Flip 100 coins, marked 1-100. Each second, Alice and Bob simultaneously check one coin. Alice goes in order (1, 2, 3, …); Bob checks the odd coins, then the even (so 1, 3, 5, …, 99, 2, 4, 6, …). Who is more likely to see 26 total heads *first*?
41
19
156
84.4K
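The coin puzzle in Daniel Litt's post above can be explored empirically with a Monte Carlo simulation. The following is a rough sketch, assuming fair coins and that "first" means reaching 26 heads at a strictly earlier second (equal times count as ties). The function names and the trial count are illustrative choices, not part of the original post, and the sketch estimates frequencies rather than proving an answer.

```python
import random
from collections import Counter

N_COINS = 100
TARGET = 26

# Alice checks coins 1, 2, 3, ...; Bob checks the odd coins, then the even ones.
ALICE_ORDER = list(range(1, N_COINS + 1))
BOB_ORDER = list(range(1, N_COINS + 1, 2)) + list(range(2, N_COINS + 1, 2))


def time_to_target(coins, order, target=TARGET):
    """Return the 1-based second at which `target` heads have been seen, else None."""
    heads = 0
    for second, label in enumerate(order, start=1):
        heads += coins[label - 1]  # coins[i] is 1 for heads, 0 for tails
        if heads == target:
            return second
    return None


def compare(coins):
    """Return 'alice', 'bob', 'tie', or 'neither' for one fixed set of flips."""
    a = time_to_target(coins, ALICE_ORDER)
    b = time_to_target(coins, BOB_ORDER)
    # Both players eventually see all coins, so either both reach the target or neither does.
    if a is None or b is None:
        return "neither"
    if a < b:
        return "alice"
    if b < a:
        return "bob"
    return "tie"


def simulate(trials, seed=None):
    """Tally outcomes over `trials` independent sets of fair coin flips."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        coins = [rng.randint(0, 1) for _ in range(N_COINS)]
        tally[compare(coins)] += 1
    return tally


if __name__ == "__main__":
    print(simulate(100_000, seed=0))
```

Comparing the `alice` and `bob` tallies over many trials gives an estimate of who tends to reach 26 heads first. The gap may be small, so a large trial count helps separate it from sampling noise.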