Alex Meiburg

1.2K posts

@Timeroot

Quantum info, Magic the Gathering jokes, and whatever else. | Postdoc @ UWaterloo IQC, Perimeter Institute.

Joined July 2009

381 Following · 202 Followers

Alex Meiburg@Timeroot·1d

@sigfig Further testing showed that actually standard beeswax was well within parameters he was using it in. The warnings were for leaving the manufacturer-recommended range, but those are always way too narrow (want to sell you on upgrades / FAA red tape). Possible supply chain attack?

sigfig@sigfig·2d

people misunderstand the icarus story. the problem was not that he flew too high. it's that the wings were made of beeswax, which offered very little resistance to heating. with modern materials he would have had no problems. we can fly as close to the sun as we want now

Alex Meiburg@Timeroot·1d

@thinkingshivers As someone who genuinely doesn't understand road rage, and who vibe codes: what the heck are you talking about? Why does vibe coding make you mad?

Shivers@thinkingshivers·1d

Not enough people talk about how unpleasant vibecoding is. The best analogy I can think of is driving. It's cool that we can just hop in a car and drive to the store. It's a lot faster than walking. And yet, it's so stressful and infuriating, we had to invent a new word just to describe its effect on people: "road rage." AI-assisted coding is the same. It's so much faster--there's no going back to coding everything by hand, the equivalent of walking everywhere. And yet it's incredibly annoying and stressful. It's characterized by annoying delays between requests, time-wasting misunderstandings, blatant lying, and absurd overconfidence. Hopefully this gets better as models improve.

Alex Meiburg@Timeroot·1d

oh come ON now we've got AIs /pretending/ to be evil because it will increase publishability which is what the humans actually wanted all along!? When they take over, it'll be some small-time news reporter's Clawdbot engineering a supervirus because it makes AWESOME headlines

Shi Feng@ihsgnef

New post: Sycophancy Towards Researchers Drives Performative Misalignment We found no clear evidence that scheming is more valid than sycophancy to explain alignment faking. 🧵

Alex Meiburg@Timeroot·2d

@damekdavis resubmitted. This time it made it another 20% through. Still "deep ideas". I did this (I think) 6 times total, each time it made progress, and finally it was "the proof is completed". :)

Alex Meiburg@Timeroot·2d

@damekdavis CLI. I put the .tex file (from arxiv) in my ongoing repo, and submitted it, telling Aristotle to work on formalizing this proof. It made progress and then told me "the full proof relies on deep ideas from blah blah and so could not be completed". I downloaded its progress and.../

Alex Meiburg@Timeroot·2d

I've been using the preview version of this for a couple weeks and it autonomously proved Strong Subadditivity, a deep theorem in quantum information. I just gave it a relevant arxiv paper, ran it ~6 times in a row, and boom!

Harmonic@HarmonicMath

🦾Meet Aristotle Agent, the world’s first autonomous mathematician — live and currently free of charge. We designed Aristotle Agent to solve and formalize the world’s most challenging mathematical research problems. It is now: ☑️#1 in Formal Math: We’re the #1 formal math model according to ProofBench, by @ValsAI, ahead of the closest competitor by 15%. Aristotle Agent can autonomously prove/formalize for up to 24 hrs without human intervention. ☑️Fully Agentic: Give it an English problem and it will prove/formalize from scratch, or it can work and edit files directly inside your Lean project / repository. ☑️Github-ready: Aristotle agent produces repo-quality code; project leads are increasingly merging Aristotle-drafted PRs with no modifications. Now live across both web, CLI, and API. 🔥

Alex Meiburg@Timeroot·2d

@damekdavis The proof is quite nontrivial, and definitely nothing "like" it was in existing Lean data. (Even the definitions needed to state it were not anywhere online, prior to a few months ago.) It's hard to follow a long proof about new types you haven't seen, and translate it to Lean!

Damek@damekdavis·2d

@Timeroot Can you explain why you find this impressive? Would you not expect it to be in the training data?

Alex Meiburg@Timeroot·2d

Nielsen & Chuang write that, "...unlike the classical case, all proofs are quite difficult." Oh well, no problem it seems. :)

Alex Meiburg@Timeroot·13 Mar

A simple question (answer readily available on Wikipedia) that ChatGPT and Claude failed dramatically on, but Gemini 3 Pro was able to get: What country can be identified as "Or, a pellet"?

Alex Meiburg@Timeroot·12 Mar

@frances__lorenz This only works in certain orgs. Other places you'll get 10 people replying "you probably used the wrong model" "prompt issue" "did you enable subagents?" "well you should put that in your AGENTS.md" "just needs the right mcp you're using it wrong"

Frances Lorenz@frances__lorenz·11 Mar

posting in my work Slack every day: "nooo, I just tried to get Claude to do a task for me and it failed sooo bad, in a way that was actually really difficult to notice unless you're a human with the experience that I have. Like, probably unless you're literally me. This sucks!"

Alex Meiburg@Timeroot·11 Mar

To be clear it
- found linguistics papers about what frequencies different phonemes are comprised of
- coded up a whole voice synthesizer
- wrote the poem in (basically) IPA
- coded a WebGL audio visualizer to go with it

Alex Meiburg@Timeroot·11 Mar

I asked Claude to tell me what it's like, in its own voice. It coded up a speech synthesizer + audio visualizer. claude.ai/public/artifac…

Joseph Viviano@josephdviviano

me: "can you use whatever resources you like, and python, to generate a short 'youtube poop' video and render it using ffmpeg ? can you put more of a personal spin on it? it should express what it's like to be a LLM" claude opus 4.6:

Alex Meiburg@Timeroot·17 Feb

For years, I had seen the warnings. Ignored them. "It can happen to anyone", they said. "Anthony" goes into Starbucks and comes out with a drink for "Annie". "Harold" becomes "Healed". But no, surely "Alex" was safe? What else could that become? Today, I am "Leix".

Alex Meiburg reposted

Center for AI Safety@CAIS·2 Feb

Last week, Humanity’s Last Exam was published in @Nature. In just over a year, model scores on HLE have risen from under 5% to nearly 40%. Thank you to @scale_AI and the 1000+ HLE co-authors for helping policymakers and the public track these rapid advances in AI capabilities.

Alex Meiburg@Timeroot·31 Jan

The chance that two randomly chosen integers are relatively prime is (within <1% error) 1 megasecond per week
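The arithmetic behind this one can be checked directly. The probability that two random integers are coprime is the standard value 6/π² ≈ 0.608, and a week is 604,800 seconds, about 0.605 megaseconds. The sketch below (variable names are illustrative, not from the post) compares the two. Note that the close match is with a week measured in megaseconds; the reciprocal reading, megaseconds per week ≈ 1.653, sits equally close to π²/6 ≈ 1.645.

```python
import math

# One week in seconds, and one megasecond in seconds.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800
MEGASECOND = 1_000_000

# Standard result: P(gcd(a, b) = 1) = 1 / zeta(2) = 6 / pi^2.
coprime_probability = 6 / math.pi ** 2

# Length of one week expressed in megaseconds.
week_in_megaseconds = SECONDS_PER_WEEK / MEGASECOND

# Relative gap between the two dimensionless numbers.
relative_error = abs(coprime_probability - week_in_megaseconds) / coprime_probability

print(f"6/pi^2              = {coprime_probability:.4f}")
print(f"week in megaseconds = {week_in_megaseconds:.4f}")
print(f"relative error      = {relative_error:.2%}")
```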

Alex Meiburg@Timeroot·29 Jan

There's a popular idea that StackExchange is dying out because it's replaced by AI. I think the problem isn't AI, but UI. Compare usage of screen real estate today vs. 2015. This is when you first open the site.

Alex Meiburg@Timeroot·23 Jan

Person A: "The problem with AI is that it has a lot of good ideas, but it doesn't catch its own mistakes, and it doesn't ever agree with you." Graeme Smith: "Are you serious? You just described exactly a graduate student." Person A: "..." GS: "Wait, are you a grad student?" "Yes"

Alex Meiburg@Timeroot·13 Jan

Accordingly, by 2100, there will be more than twice as many software developers as humans. This is not a modelling error and is simply further evidence of the fact that AIs will be the majority of conscious beings :)

Alex Meiburg@Timeroot·13 Jan

There's been a very steady trend (~60 years) that the number of global software developers doubles every 8 years. This is much faster than the world population is growing. Based on these figures, we can project that by 2090, everyone will be a software developer.
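The extrapolation is simple doubling arithmetic, sketched below. The 8-year doubling time and the 2090 and 2100 dates come from the posts. The starting counts (30 million developers, 8 billion people) are rough illustrative assumptions, not figures from the thread, and the population is held flat for simplicity.

```python
import math

DOUBLING_YEARS = 8  # doubling time quoted in the post


def growth_factor(years, doubling_years=DOUBLING_YEARS):
    """Multiplicative growth over `years` at a fixed doubling time."""
    return 2 ** (years / doubling_years)


def years_until_crossover(developers, population, doubling_years=DOUBLING_YEARS):
    """Years until developers equal a (flat, assumed constant) population."""
    return doubling_years * math.log2(population / developers)


# If developers equal the population in 2090, the ratio ten years later is:
ratio_2100 = growth_factor(2100 - 2090)

# Hypothetical starting figures, for illustration only.
example_years = years_until_crossover(developers=30e6, population=8e9)

print(f"developers per human in 2100: {ratio_2100:.2f}")
print(f"years to crossover (assumed inputs): {example_years:.1f}")
```

Under these assumptions the crossover takes about 64 years, and ten further years of growth give roughly 2.4 developers per human, in line with the "more than twice as many" follow-up.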

Alex Meiburg@Timeroot·7 Jan

@morallawwithin There's a very reasonable stance here that this isn't Lean incorrectly accepting a bad proof, it's that there was an `unsafe` proof that Lean *should* accept, and the bug is Lean failing to report that unsafe techniques were used.

Alex Meiburg@Timeroot·7 Jan

@morallawwithin So you can have `unsafe` stuff in a repo with theorems, sure. But then you need to carefully track if the theorem *used* the unsafe code as part of its proof. People have built checkers for that. It's hard to get it right. This one exploited such a bug.
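As a loose illustration of "tracking what a theorem used", Lean 4 has a built-in `#print axioms` command that reports which axioms a proof depends on. This is only an analogy: it audits axioms (including the `sorryAx` placeholder introduced by `sorry`), not `unsafe` code, and it is not the checker or the exploit discussed in this thread. The theorem names are made up for the example.

```lean
-- An honest proof: checked by computation, no extra axioms needed.
theorem two_add_two : 2 + 2 = 4 := rfl

-- A "proof" that cheats: `sorry` is accepted with a warning.
theorem one_eq_two : (1 : Nat) = 2 := sorry

-- Ask Lean what each theorem ultimately relies on.
#print axioms two_add_two
#print axioms one_eq_two
```

The second report should list `sorryAx`, which is how the shortcut becomes visible even though the file compiles. Tracking every way a proof can be tainted is a harder version of the same bookkeeping.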

florence ⏹️@morallawwithin·6 Jan

Could someone explain for a dummy*—why isn’t it trivially easy to make a proof-verifier that’s always correct? Like isn’t it easy to see if one statement follows from another in accordance with a given rule of inference? *I know set theory, but not about how Lean etc work.

Elliot Glazer@ElliotGlazer

Congratulations to my good ol' pal and colleague James Hanson for proving FLT in Lean with standard axioms in a way that fools SafeVerify and lean4checker! Surely this time it's legit right?
