Kevin Stone

38 posts

@kevinleestone

Research @ OpenAI, previously at FAIR, TRI, and Google working on LLMs, RL, and Robotics.

California, USA · Joined April 2008
490 Following · 2.3K Followers
Pinned Tweet
Kevin Stone @kevinleestone
This effort should be very interesting as we push the reasoning performance of 🍓 even further. The role is a good fit for someone with a strong engineering background and good ML intuitions.
Noam Brown@polynoamial

.@OpenAI is hiring ML engineers for a new multi-agent research team! We view multi-agent as a path to even better AI reasoning. Prior multi-agent experience isn't needed. If you'd like to research this area with @kevinleestone and me fill out this form: jobs.ashbyhq.com/openai/form/oa…

3 replies · 1 repost · 51 likes · 10.9K views
Kevin Stone retweeted
Derya Unutmaz, MD @DeryaTR_
Contrary to the point below, I really believe AI is going to dramatically expand our cognitive abilities. Let me share a personal experience to show you what I mean: last night I spent over four hours, way past my usual bedtime, brainstorming with GPT-4o and especially o1 about some specialized immune models for cancer and aging therapies. It was such an enjoyable experience, especially when the GPT started offering genuinely insightful ideas. Some were simple, maybe already known, but the kind that make you think “why didn’t I think of that!”, and that triggers another idea, and so on. And when I push back on certain concepts, it patiently suggests alternative after alternative. I haven’t felt this mentally engaged in a long time!

Honestly, it felt more stimulating than most scientific interactions I’ve had with grad students or postdocs I’ve trained in immunology and biomedical science over the years. That’s really mind-blowing to think about! Now it’s become routine for me to check in with the LLMs whenever I have a new thought or idea. I ask them for opposing or supporting viewpoints; it’s like stress-testing my own thinking.

It’s not that I’m short on ideas. If anything, I have way too many at any given moment (thanks to severe ADD☺️), which makes it tough to focus or dig deep into just one. That’s why brainstorming is so crucial, especially in fields like science and engineering. You never know what you’re missing, and we all have our biases or dogmas. It’s hard to challenge your own thoughts, which usually leads to uncomfortable cognitive dissonance. But with AI, it feels like I now have this super thoughtful, endlessly patient “friend” who helps me think more clearly and deeply. It’s like AI enhances my cognitive abilities and helps me push past those mental roadblocks. Sometimes it even helps me understand my own thoughts better!
For anyone who’s willing to really go through this rabbit hole, I think AI can help us reach entirely new levels of thinking and greatly boost our cognitive abilities. Thus, AI is the accelerator of human brain intelligence evolution!
Nick Dobos@NickADobos

AI will significantly halt the evolution of the human brain

105 replies · 164 reposts · 1.2K likes · 197.1K views
Kevin Stone retweeted
Derya Unutmaz, MD @DeryaTR_
I couldn’t agree more with this! I just had OpenAI’s o1 work with me to write a major cancer treatment project; in less than an hour it was phenomenal and saved me many days of work! That’s worth a lot of Big Mac meals, though I would strongly advise you NOT to eat those☺️
roon@tszzl

‘The average price of a Big Mac meal, which includes fries and a drink, is $9.29.’ For two Big Mac meals a month you get access to ridiculously powerful machine intelligence, capable of high-tier programming and PhD-level knowledge. People don’t talk about this absurdity enough.

4 replies · 18 reposts · 186 likes · 31.3K views
Kevin Stone retweeted
Kyle Kabasares @kylekabasares
@emollick Thank you for highlighting this! I think it shows the enormous potential these models have as research assistants. I wish I had had it for that 10-month span; I could have done a lot more actual research!
1 reply · 1 repost · 25 likes · 3.6K views
Kevin Stone @kevinleestone
@johnjhorton I sampled three times from o1-preview and got 2 different answers.
1 reply · 0 reposts · 4 likes · 83 views
John Horton @johnjhorton
Ok! "There is a job market with N job-seekers and M jobs. A job application costs c to send. Getting a job has value v to a risk-neutral worker. They send applications (allow it to be continuous) at random until marginal benefit equals marginal cost. Firms hire at random among applicants they receive. Workers can only have one job and choose randomly if multiple offers. How many applications does each worker send and what is the per-application win-rate? Make reasonable assumptions as necessary for tractability."
1 reply · 0 reposts · 0 likes · 98 views
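For context, a back-of-the-envelope symmetric-equilibrium treatment of the model in the prompt (a sketch under simplifying assumptions: identical workers each sending $a$ applications at random, and all $M$ jobs filled; this reconstruction is illustrative, not necessarily the answer o1 produced):

```latex
% Each firm receives Na/M applications on average and hires one at
% random, so the M hires are spread over Na total applications:
q \;\approx\; \frac{M}{N a}
\qquad \text{(per-application offer probability, assuming } Na \ge M\text{)}

% A worker sending a applications is hired with probability
% 1-(1-q)^a. The marginal benefit of the a-th application must
% equal its cost c:
v \, q \, (1-q)^{a-1} \;=\; c

% Substituting q = M/(Na) gives one equation pinning down a;
% the per-application win rate is then simply M/(Na).
```

This also makes Kevin’s observation plausible: the answer depends on which tractability assumptions the model chooses, so repeated samples can legitimately differ.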
Kevin Stone @kevinleestone
Proud to release o1-preview to the world. Now that we have started to crack the challenge of getting models to “think”, we are able to get large improvements on complex tasks just by letting them think harder.
7 replies · 13 reposts · 123 likes · 20.5K views
Nat Friedman @natfriedman
Instead of leaf blowers, I want a quiet little robot that picks leaves up one at a time and puts them in a bag, at night while I'm sleeping.
243 replies · 183 reposts · 3.7K likes · 1.3M views
Kevin Stone retweeted
AI at Meta @AIatMeta
To better enable the community to build on our work — and contribute to the responsible development of LLMs — we've published further details about the architecture, training compute, approach to fine-tuning & more for Llama 2 in a new paper. Full paper➡️ bit.ly/44JAELQ
34 replies · 532 reposts · 2K likes · 340.9K views
Kevin Stone @kevinleestone
Thrilled to release Llama 2 today (ai.meta.com/llama), our next-gen open-source LLM. Eager to see how the community will use and extend it. So grateful for the chance to work with such an amazing team and for Meta's resources and support to pull this off.
0 replies · 4 reposts · 22 likes · 3.2K views
Kevin Stone retweeted
Jim Fan @DrJimFan
You'll soon see lots of "Llama just dethroned ChatGPT" or "OpenAI is so done" posts on Twitter. Before your timeline gets flooded, I'll share my notes:

▸ Llama-2 likely cost $20M+ to train. Meta has done an incredible service to the community by releasing the model with a commercially friendly license. AI researchers from big companies were wary of Llama-1 due to licensing issues, but now I think many of them will jump on the ship and contribute their firepower.

▸ Meta's team did a human study on 4K prompts to evaluate Llama-2's helpfulness. They use "win rate" as a metric to compare models, in a similar spirit to the Vicuna benchmark. The 70B model roughly ties with GPT-3.5-0301 and performs noticeably stronger than Falcon, MPT, and Vicuna. I trust these real human ratings more than academic benchmarks, because they typically capture the "in-the-wild vibe" better.

▸ Llama-2 is NOT yet at GPT-3.5 level, mainly because of its weak coding abilities. On HumanEval (a standard coding benchmark), it isn't nearly as good as StarCoder or many other models specifically designed for coding. That being said, I have little doubt that Llama-2 will improve significantly thanks to its open weights.

▸ Meta's team goes above and beyond on AI safety: almost half of the paper covers safety guardrails, red-teaming, and evaluations. A round of applause for such responsible efforts! In prior work there's a thorny tradeoff between helpfulness and safety; Meta mitigates this by training two separate reward models. They aren't open-source yet, but they would be extremely valuable to the community.

▸ I think Llama-2 will dramatically boost multimodal AI and robotics research. These fields need more than just blackbox access to an API. So far, we have had to convert complex sensory signals (video, audio, 3D perception) to text descriptions and then feed them to an LLM, which is awkward and leads to huge information loss. It'd be much more effective to graft sensory modules directly onto a strong LLM backbone.

▸ The paper itself is a masterpiece. Unlike GPT-4's paper, which shared very little info, Llama-2 spells out the entire recipe: model details, training stages, hardware, data pipeline, and annotation process. For example, there's a systematic analysis of the effect of RLHF with nice visualizations. Quote, sec 5.1: "We posit that the superior writing abilities of LLMs, as manifested in surpassing human annotators in certain tasks, are fundamentally driven by RLHF."

Congrats to the team again 🥂! Today is another delightful day in OSS AI.
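The "win rate" metric mentioned in the notes above is straightforward to compute from pairwise human judgments. A minimal sketch (the function name and the half-credit tie convention here are illustrative, not Meta's exact protocol):

```python
def win_rate(outcomes):
    """Win rate of model A from pairwise human judgments.

    outcomes: list of "A", "B", or "tie" verdicts, one per prompt.
    Ties are counted as half a win for each side, one common convention.
    """
    wins = sum(1 for o in outcomes if o == "A")
    ties = sum(1 for o in outcomes if o == "tie")
    return (wins + 0.5 * ties) / len(outcomes)

# Example: 6 wins, 2 ties, 2 losses over 10 prompts
print(win_rate(["A"] * 6 + ["tie"] * 2 + ["B"] * 2))  # prints 0.7
```

A win rate of 0.5 then means the two models are statistically indistinguishable to the raters, which is roughly what "roughly ties with GPT-3.5-0301" conveys.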
161 replies · 1.1K reposts · 5.4K likes · 1.4M views
Kevin Stone retweeted
Andrej Karpathy @karpathy
Huge day indeed for AI and LLMs, congrats to Meta 👏 This is now the most capable LLM available directly as weights to anyone from researchers to companies. The models look quite strong, e.g. Table 4 in the paper: MMLU is good to look at, the 70B model is just below GPT-3.5. But HumanEval (bad misnomer) shows coding capability is quite a bit lower (48.1 vs 29.9).
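HumanEval numbers like the ones Karpathy cites are typically pass@1 scores. The unbiased pass@k estimator introduced with the benchmark (Chen et al., 2021) can be sketched as follows; the sample counts in the example are illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c correct.

    pass@k = 1 - C(n-c, k) / C(n, k): the probability that at least
    one of k samples drawn at random from the n is correct.
    """
    if n - c < k:
        return 1.0  # fewer than k failures, so any k picks include a pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 samples per problem and 3 correct, pass@1 reduces to c/n ≈ 0.3
print(pass_at_k(10, 3, 1))
```

The benchmark score for a model is this quantity averaged over all 164 HumanEval problems.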
Yann LeCun@ylecun

This is huge: Llama-v2 is open source, with a license that authorizes commercial use! This is going to change the landscape of the LLM market. Llama-v2 is available on Microsoft Azure and will be available on AWS, Hugging Face, and other providers. Pretrained and fine-tuned models are available with 7B, 13B, and 70B parameters. Llama-2 website: ai.meta.com/llama/ Llama-2 paper: ai.meta.com/research/publi… A number of personalities from industry and academia have endorsed our open-source approach: about.fb.com/news/2023/07/l…

61 replies · 496 reposts · 3.8K likes · 1M views
Kevin Stone retweeted
Soumith Chintala @soumithchintala
LLaMa-2 from @MetaAI is here! Open weights, free for research and commercial use. Pre-trained on 2T tokens. Fine-tuned too (unlike v1). 🔥🔥🔥 Let's gooo.... ai.meta.com/llama/ The paper lists the amazing authors who worked night and day to make this happen. Be sure to thank them for their tireless pursuit of open science and true democratization! ai.meta.com/research/publi…
26 replies · 176 reposts · 1.1K likes · 182.3K views
Kevin Stone @kevinleestone
@realmarcraibert Yes please! I would love to inspire young minds with Spot, beginning with an upcoming robotics demonstration at my daughter's school. Watching Spot has been a favorite pastime of ours. Even the ability to lease/borrow one would be incredible.
0 replies · 1 repost · 3 likes
Kevin Stone @kevinleestone
@soumithchintala We haven't tried running it on any embedded GPUs yet. More details are in the paper, but we get a throughput of about 151 megapixels/sec on a Titan RTX. The source code is available at sites.google.com/view/stereofor…, but we haven't released the TensorRT version or weights at this point.
2 replies · 0 reposts · 20 likes
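For a rough sense of scale, the quoted 151 megapixels/sec converts directly into frame rates. The resolution below is illustrative, since the tweet doesn't state the network's input size:

```python
def fps_at(throughput_mp_per_s: float, width: int, height: int) -> float:
    """Frames per second implied by a megapixel/sec throughput figure."""
    megapixels = width * height / 1e6
    return throughput_mp_per_s / megapixels

# 151 MP/s on 1280x720 frames (0.92 MP each) works out to ~164 fps
print(round(fps_at(151, 1280, 720)))  # prints 164
```

In practice stereo inference processes a pair of images per depth frame, so the effective depth-map rate would be roughly half the single-image figure.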
Soumith Chintala @soumithchintala
@kevinleestone this is really cool. Would love to use this in our own work. I have two questions: 1. Have you tried running the model on Jetson Nano or Jetson Xavier, if so, what was the perf? 2. Do you plan to open-source the models and/or code?
2 replies · 1 repost · 13 likes
Kevin Stone @kevinleestone
We published more details on the learned stereo system we use on our robots. We have found it to be more useful than existing active/passive depth sensors, especially on shiny surfaces, which are common in the home environment.
7 replies · 41 reposts · 370 likes