
Steve Omohundro
@steveom
Beneficial AI Research, 2024 Future of Life Award For Pioneering Scholarship in Computer Ethics and AI Safety


Happy to introduce Kimina-Prover-72B! It reaches 92.2% on miniF2F using test-time RL, and it can solve IMO problems with more than 500 lines of Lean 4 code! Check out our blog post: huggingface.co/blog/AI-MO/kim… And play with our demo: demo.projectnumina.ai
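For context, miniF2F problems are competition-style math statements formalized in Lean 4, and a prover model must emit a complete tactic proof that the Lean checker accepts. Here is a minimal toy instance of my own (not a benchmark problem; assumes Mathlib) showing the kind of goal such a prover has to close:

```lean
import Mathlib.Tactic

-- Toy competition-style statement: a quadratic's roots.
-- A prover model must produce a script like this that Lean verifies.
theorem toy_quadratic (x : ℝ) (h : x ^ 2 - 5 * x + 6 = 0) :
    x = 2 ∨ x = 3 := by
  -- Factor the quadratic; the identity is checked by `ring` arithmetic.
  have h' : (x - 2) * (x - 3) = 0 := by linear_combination h
  -- A product over ℝ is zero iff one factor is zero.
  rcases mul_eq_zero.mp h' with h2 | h3
  · left; linarith
  · right; linarith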

1/ 🔥 AI agents are reaching a breakthrough moment in cybersecurity. In our latest work:
🔓 CyberGym: AI agents discovered 15 zero-days in major open-source projects
💰 BountyBench: AI agents solved real-world bug-bounty tasks worth tens of thousands of dollars
🤖 Autonomously
A pivotal shift is underway: AI agents can now do autonomously what only elite human hackers could before.

1/🧵 Introducing VERINA: a high-quality benchmark for verifiable code generation. As LLMs are increasingly used to generate software, we need more than just working code; we need formal guarantees of correctness. VERINA offers a rigorous, modular framework for evaluating LLMs on code, specification, and proof generation, as well as their compositions, paving the way toward trustworthy AI-generated software. 🔗 verina.io
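To make the code/specification/proof split concrete, here is a minimal illustrative triple in Lean 4 (a toy example of my own, not taken from the VERINA benchmark; assumes Mathlib): an implementation, a formal spec of its correctness, and a machine-checked proof connecting the two.

```lean
import Mathlib.Tactic

-- Code: a candidate implementation an LLM might generate.
def myMax (a b : ℕ) : ℕ := if a ≤ b then b else a

-- Specification: what a correct output must satisfy.
def maxSpec (a b r : ℕ) : Prop := a ≤ r ∧ b ≤ r ∧ (r = a ∨ r = b)

-- Proof: the implementation meets the specification.
theorem myMax_meets_spec (a b : ℕ) : maxSpec a b (myMax a b) := by
  unfold myMax maxSpec
  -- Case-split on the `if`, then discharge each branch by
  -- linear arithmetic over ℕ.
  split <;> omega
```

Evaluating a model on all three artifacts, and on their compositions (e.g. proving a generated implementation against a generated spec), is what distinguishes this setting from plain code-generation benchmarks.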



We believe formal math is the future. 🔥 Introducing Kimina-Prover Preview, a Numina & @Kimi_Moonshot collaboration: the first large formal reasoning model for Lean 4, achieving 80.78% on miniF2F. github.com/MoonshotAI/Kim…




Wow. I cannot believe it. Just asked Claude to make the dogfight ultra-realistic! ✅ hit impacts ✅ smoke when damaged ✅ explosion on death ✅ free-fall with smoke. It feels so good to fly! Plus awesome planes and controls, 100% in Cursor with zero code edits from me. LOOK AT THIS!


Whether you like it or not, the future of AI will not be canned genies controlled by a "safety panel". The future of AI is democratization. Every internet rando will run not just o1, but o8, o9 on their toaster laptop. It's the tide of history that we should surf on, not swim against. Might as well start preparing now.

DeepSeek just topped Chatbot Arena, my go-to vibe checker in the wild, and two other independent benchmarks that couldn't be hacked in advance (Artificial Analysis, HLE).

Last year, there were serious discussions about limiting OSS models by some compute threshold. Turns out it was nothing but our Silicon Valley hubris. It's a humbling wake-up call to us all that open science has no boundary. We need to embrace it, one way or another.

Many tech folks are panicking about how much DeepSeek is able to show with so little compute budget. I see it differently - with a huge smile on my face. Why are we not happy to see *improvements* in the scaling law? DeepSeek is unequivocal proof that one can produce unit intelligence gain at 10x less cost, which means we shall get 10x more powerful AI with the compute we have today and are building tomorrow. Simple math! The AI timeline just got compressed.

Here's my 2025 New Year resolution for the community: No more AGI/ASI urban myth spreading. No more fearmongering. Put our heads down and grind on code. Open source, as much as you can. Acceleration is the only way forward.
