Ari Seff

134 posts

@ari_seff

Robotics @OpenAI | Previously @Waymo, PhD @Princeton

Joined March 2017

166 Following · 3.2K Followers

Pinned Tweet

Ari Seff @ari_seff · Oct 2

Can we model driving similarly to how we model language? Excited to introduce MotionLM, where we represent driving scenarios as conversations among road users. It establishes a new state-of-the-art in multi-agent motion forecasting: arxiv.org/abs/2309.16534

Ari Seff @ari_seff · Feb 24

@PengchuanZ Looking forward to working together!

Pengchuan Zhang @PengchuanZ · Feb 24

I’m joining OpenAI to work on World Simulation and Robotics, after 3.75 years at FAIR working on SAM and Llama. I’m thrilled to explore how visual perception, world model and robotics can come together to build physical intelligence.

Ari Seff retweeted

OpenAI @OpenAI · Dec 11

GPT-5.2 Thinking evals

Ari Seff @ari_seff · Oct 20

I’m at ICCV this week in Hawaii. Excited to talk with folks about robotics, agents, and multimodal reasoning. Reach out to chat!

Ari Seff retweeted

Sam Altman @sama · Jul 17

watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.

Ari Seff retweeted

Nathan Lambert @natolambert · Jun 14

One of the most striking, non-text AI plots I've seen since ChatGPT launched. Scaling keeps working, this time for Waymo's tooling.

Ari Seff retweeted

Riley Goodside @goodside · Jun 14

o3-pro: How many of each letter are there in the SHA1 of the answer to this question?

Ari Seff retweeted

Greg Brockman @gdb · May 23

Operator is now powered by o3, improving overall task success rate. Also results in clearer, more thorough, and better-structured responses.

OpenAI @OpenAI

Operator 🤝 OpenAI o3
Operator in ChatGPT has been updated with our latest reasoning model. operator.chatgpt.com

Ari Seff @ari_seff · May 23

we’ve trained our reasoning models to get better at using computers, and now operator is powered by o3. super fun effort with the team! computer use remains a tough frontier problem, but this update represents a substantial increase in capability: x.com/OpenAI/status/…

OpenAI @OpenAI

Operator 🤝 OpenAI o3
Operator in ChatGPT has been updated with our latest reasoning model. operator.chatgpt.com

Ari Seff @ari_seff · Feb 15

@ptievgaleks @yuntiandeng If given the same time budget? I'm not so sure. Actually I'd be surprised if the accuracy were anywhere close. In any event, lengthy multiplication is not really what these models are designed for, which is why we generally give them access to tools :)

Zhenya Ptichnikov @ptievgaleks · Feb 14

@ari_seff @yuntiandeng At least people eventually will get a correct answer😉

Yuntian Deng @yuntiandeng · Feb 12

For those curious about how o3-mini performs on multi-digit multiplication, here's the result. It does much better than o1 but still struggles past 13×13. (Same evaluation setup as before, but with 40 test examples per cell.)

Yuntian Deng @yuntiandeng

Is OpenAI's o1 a good calculator? We tested it on up to 20x20 multiplication—o1 solves up to 9x9 multiplication with decent accuracy, while gpt-4o struggles beyond 4x4. For context, this task is solvable by a small LM using implicit CoT with stepwise internalization. 1/4

Ari Seff @ari_seff · Feb 15

Language models are not designed specifically for lengthy arithmetic. That's why we generally give them access to tools (eg, the old computer you mention). In this case OP is testing the model on lengthy multiplications "by hand", which I think most humans would struggle with if given the same inference-time budget. As reasoning/inference-time compute increases, these mistakes reduce.

Ari Seff retweeted

Sam Altman @sama · Jan 26

fun watching people react to operator. reminds me of the chatgpt launch!

Ari Seff retweeted

OpenAI @OpenAI · Jan 22

Announcing The Stargate Project

The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world. This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies.

The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman.

Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements.

As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle. This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services.

All of us look forward to continuing to build and develop AI—and in particular AGI—for the benefit of all of humanity. We believe that this new step is critical on the path, and will enable creative people to figure out how to use AI to elevate humanity.

Ari Seff retweeted

Sulin Liu @su_lin_liu · Oct 16

Discrete generative models use denoisers for generation, but they can slip up. What if generation *isn’t only* about denoising?🤔 Introducing DDPD: Discrete Diffusion with Planned Denoising🤗🧵(1/11) w/ @junonam_ @AndrewC_ML @HannesStaerk @xuyilun2 Tommi Jaakkola @RGBLabMIT

Ari Seff @ari_seff · Oct 16

@anbarbazan @pengzh97 @luyirenmax @cole_gulino @JustinFu769512 For sure, planning is also a natural use case of this type of agent simulation. For scope reasons, the paper focuses on agent realism and offboard eval as case studies

Ari Seff @ari_seff · Oct 12

Evaluating autonomous vehicles in simulation is only useful if the other agents exhibit realistic reactions. We introduce a simple scheme for running RL on top of pre-trained behavior models to improve agent realism. (new paper with my former colleagues at @Waymo):

Ari Seff @ari_seff · Oct 16

@wilson1yan @matei_zaharia @VladMnih @pabbeel @AleksandraFaust @haoliuhl Very nice work! Qq: since it’s always the rightmost latents that are masked, if one was using a CNN-based VQ-VAE (as opposed to ViT-based), I assume this would result in the reconstruction degradation being explicitly biased towards specific image regions - is that accurate?

Wilson Yan @wilson1yan · Oct 14

@matei_zaharia @VladMnih @pabbeel @AleksandraFaust @haoliuhl We believe that ElasticTok opens up a promising direction towards adaptive tokenizers that can enable us to more efficiently process long-context data. Paper: arxiv.org/abs/2410.08368 Code: github.com/LargeWorldMode… Website: largeworldmodel.github.io/elastictok/

Wilson Yan @wilson1yan · Oct 14

We are excited to announce ElasticTok, a simple yet scalable visual tokenizer that can adaptively encode image and video to variable-length sequences! Our model enables more efficient token usage over different images and long dynamic videos.

Ari Seff @ari_seff · Oct 10

Very cool postdoc opportunity at Princeton. Had the great pleasure of working with Ryan during my PhD. If you're an AI researcher considering a postdoc, highly recommend taking a look:

Ryan Adams @ryan_p_adams

We are looking for AI Postdoctoral Fellows! Be a part of AI for Accelerating Invention, a new research initiative out of the Princeton AI Lab. You can learn about AI for Accelerating Invention here: invent.ai.princeton.edu and apply for the postdoc here: puwebp.princeton.edu/AcadHire/apply….

Ari Seff @ari_seff · Sep 16

@littmath Oh nw!

Daniel Litt @littmath · Sep 16

@ari_seff Oh nice, sorry I missed this!

Daniel Litt @littmath · Sep 16

Flip 100 coins, marked 1-100. Each second, Alice and Bob simultaneously check one coin. Alice goes in order (1, 2, 3, …); Bob checks the odd coins, then the even (so 1, 3, 5, …, 99, 2, 4, 6, …). Who is more likely to see 26 total heads *first*?

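The coin puzzle above can be explored empirically. What follows is a minimal Monte Carlo sketch in Python and is not part of the original thread. The function names `steps_to_target` and `simulate`, the trial count, and the seed are illustrative assumptions. It assumes fair, independent coins and records a tie when both players reach 26 heads on the same second.

```python
import random


def steps_to_target(flips, order, target):
    """Return the 1-based second at which `target` heads have been seen
    when checking coins in `order`, or None if the target is never reached."""
    heads = 0
    for step, coin in enumerate(order, start=1):
        heads += flips[coin]
        if heads >= target:
            return step
    return None


def simulate(trials=10_000, n=100, target=26, seed=0):
    """Estimate how often Alice or Bob sees `target` heads first (assumed fair coins)."""
    rng = random.Random(seed)
    # Coins are numbered 1..n in the puzzle; here they are 0-based indices.
    alice_order = list(range(n))                              # 1, 2, 3, ...
    bob_order = list(range(0, n, 2)) + list(range(1, n, 2))   # odd coins, then even coins
    results = {"alice": 0, "bob": 0, "tie": 0, "neither": 0}
    for _ in range(trials):
        flips = [rng.randint(0, 1) for _ in range(n)]
        a = steps_to_target(flips, alice_order, target)
        b = steps_to_target(flips, bob_order, target)
        if a is None:
            # Both players eventually see every coin, so if one never
            # reaches the target, neither does the other.
            results["neither"] += 1
        elif a < b:
            results["alice"] += 1
        elif b < a:
            results["bob"] += 1
        else:
            results["tie"] += 1
    return results


if __name__ == "__main__":
    print(simulate())
```

Running the script prints the four tallies for the chosen seed. Increasing `trials` narrows the sampling error, though a simulation only suggests the answer and does not explain it.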
