Edward Lockhart

69 posts

@freeetext

Researcher / Engineer at @GoogleDeepMind. Co-led our IMO 🥇 effort. Now working at the intersection of formal math and LLMs.

London, United Kingdom · Joined March 2024

210 Following · 196 Followers

Edward Lockhart@freeetext·31 Mar

@rivermillion @grahamscheper Also minuend and subtrahend for subtraction

Richard Vermillion@rivermillion·31 Mar

@grahamscheper summand: something which is to be added. multiplicand: something which is to be multiplied

Grǣġhama@grahamscheper·31 Mar

There’s a cool feature in Latin, the gerundive, which is a verbal adjective implying that something should/must happen. A few of them survive into English words: If someone should be revered, they are “reverend”. If something must be cut away, it is a “dividend”. My favorite: if a story is so good that it simply must be read by everyone, it is “legend”! Latin reverendus/a, dividendum, and legenda. Pretty cool that what is sometimes a difficult concept for Latin learners is actually present in English, albeit rarely. Do let me know if there are any good ones I’ve missed.

Edward Lockhart@freeetext·15 Mar

@samth @finn_hulse You could use ~abs(popcount - log2 n) but the distribution isn't so easy to work with, and you get less resolution. E.g. for 4 bits, the frequencies are 2 8 6, vs 1 1 2 4 8 for trailing zeroes.
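
The 4-bit frequencies quoted in this reply can be checked by brute force. The sketch below is illustrative only: centring the popcount on half the word size (2 for 4 bits) is an assumption, chosen because it reproduces the 2 8 6 counts, and counting the all-zero word as having 4 trailing zeroes is likewise an assumption. The function names are made up for this example.

```python
from collections import Counter

BITS = 4  # word size used in the example in the tweet


def trailing_zeros(x: int, bits: int = BITS) -> int:
    """Number of trailing zero bits; the all-zero word counts as `bits` (assumption)."""
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1


def popcount_distance(x: int, bits: int = BITS) -> int:
    """Distance of the popcount from bits/2 (the centre is an assumption)."""
    return abs(bin(x).count("1") - bits // 2)


values = range(2 ** BITS)
popcount_freqs = Counter(popcount_distance(x) for x in values)
trailing_freqs = Counter(trailing_zeros(x) for x in values)

print(sorted(popcount_freqs.items()))  # [(0, 6), (1, 8), (2, 2)]
print(sorted(trailing_freqs.items()))  # [(0, 8), (1, 4), (2, 2), (3, 1), (4, 1)]
```

The popcount statistic takes only three distinct values here, against five for trailing zeroes, which is the "less resolution" point.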

Sam Tobin-Hochstadt@samth·15 Mar

@finn_hulse This works for any counting property of the hash, right? Eg you could use popcount.

Finn Hulse@finn_hulse·15 Mar

i had to retire my favorite technical interview problem, so it is time to ask it to my loyal followers (solution in replies). given a stream of N not necessarily distinct integers from an O(N) sized universe, for some massive N, find a way to estimate how many distinct integers appear, only using O(log(log(N))) persistent storage. use 5 lines of pseudocode. there was a time in my life where i wouldn't work with someone who couldn't answer this
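
The replies are not reproduced here, but the mention of trailing zeroes elsewhere in the thread suggests a Flajolet-Martin style estimator. The sketch below is one possible reading, not the poster's own solution: it hashes each item, remembers only the largest number of trailing zero bits seen, and returns 2 to that power. The use of SHA-256, the 32-bit width, and the function names are all assumptions made for illustration.

```python
import hashlib


def trailing_zeros(x: int, width: int = 32) -> int:
    """Trailing zero bits of x; an all-zero hash counts as `width`."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def estimate_distinct(stream, width: int = 32) -> int:
    """Rough distinct-count estimate from the maximum trailing-zero count."""
    max_tz = 0  # the only persistent state: a value of at most `width`
    for item in stream:
        digest = hashlib.sha256(str(item).encode()).digest()
        h = int.from_bytes(digest[:4], "big")  # take 32 bits of the hash
        max_tz = max(max_tz, trailing_zeros(h, width))
    return 2 ** max_tz


print(estimate_distinct(i % 1000 for i in range(100_000)))
```

Because `max_tz` never exceeds about log2(N), storing it takes O(log(log(N))) bits. A single estimator like this is very noisy; it is only accurate to within a constant factor with reasonable probability, and duplicates never change the result since equal items hash identically.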

Edward Lockhart@freeetext·23 Dec

@_Mira___Mira_ I think there may be too many theorems for retrieval on its own to be a useful general strategy.

Mira@_Mira___Mira_·23 Dec

I want the world's biggest proof database. Pose any theorem and bots will try to prove it. If they fail, then it's an open question. People can collaborate to prove it or make partial progress, generalize it, use it as an axiom. As AI improves, we become the "Google of math".

Edward Lockhart@freeetext·11 Dec

@Imported_Fun @IMAO_ Also, on a single-drive machine, A and B were potentially two different removable floppy disks time-sharing the same drive.

American ExPat@Imported_Fun·11 Dec

@IMAO_ A and B were floppy drives and predate hard drives. My first PC only had 2 360k floppy drives. A was for the program and B was for your work. When I got my first 5 MB hard drive, I thought it was so huge I’d never need to get a bigger one.

Frank J. Fleming@IMAO_·11 Dec

Maybe it’s time to admit the C drive should be the A drive (and what even was the B drive?).

Edward Lockhart@freeetext·11 Dec

@paul_wilson_nz @garylfrancione @MForstater Closer, but still not exact

Paul Wilson 🇳🇿🧠🔬🦕🎲👾🏳️‍🌈🦑@paul_wilson_nz·11 Dec

@garylfrancione @MForstater The quote is from a different Court of Appeal case: [2021] EWCA Civ 348

Maya Forstater@MForstater·10 Dec

One of the many things wrong with the Sandie Peggie judgment. This "quote" from my judgment doesn't come from my judgment. It is completely made up. 🤯

Edward Lockhart@freeetext·11 Dec

@CatGodSandHive @HarmonicMath @zjasper Having an LLM compute the entropy / information content is one option
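
As a rough illustration of the information-content idea: if a model assigns each proof step a probability, the total surprisal is the sum of -log2(p) over the steps. The probabilities below are invented for the example, and how one would actually obtain them from an LLM is left open; the function name is hypothetical.

```python
import math


def information_content_bits(step_probs):
    """Total surprisal in bits: sum of -log2(p) over the proof steps."""
    return sum(-math.log2(p) for p in step_probs)


# Hypothetical per-step probabilities, standing in for model outputs.
long_obvious_proof = [0.5, 0.5, 0.5, 0.5]  # four unsurprising steps
short_insight_proof = [0.5, 1 / 64]        # two steps, one very surprising

print(information_content_bits(long_obvious_proof))   # 4.0
print(information_content_bits(short_insight_proof))  # 7.0
```

Under these made-up numbers the shorter proof carries more bits than the longer one, so the measure can disagree with raw proof length.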

CatGod@CatGodSandHive·11 Dec

@freeetext @HarmonicMath @zjasper Yup, the idea of weighing proof steps is cool, but how would you assess the "obviousness" of each step?

Harmonic@HarmonicMath·11 Dec

Now that AI tools such as Aristotle are becoming increasingly powerful, it becomes possible to undertake a rigorous study of proof complexity, in ways that just weren’t feasible prior. Running problems through the same AI model under roughly the same conditions and comparing average proof lengths is likely the recipe for a good quantitative measure of problem difficulty / proof complexity, similar to Kolmogorov complexity. Please reach out to us if you’re interested in undertaking such a study.

Jasper@zjasper

@HarmonicMath It’s possible that Putnam problems might be easier than IMO in terms of logic complexity but just comparing Lean4 proof length for different problems is not rigorous. And even if you try to compare the proof length, how can you make sure you use the shortest proof?

Edward Lockhart@freeetext·11 Dec

@HarmonicMath @zjasper A proof with many obvious steps may be simpler than a shorter proof where some steps require non-obvious insight. My guess is that some measure of the time taken to find the proof would be more informative. Information content / entropy might also be revealing.

Harmonic@HarmonicMath·11 Dec

@zjasper There are likely other factors one would need to consider, such as genus and proof depth, but minimal proof length would certainly be a major factor.

Edward Lockhart@freeetext·8 Dec

@ChShersh Run 100 tasks in parallel, each scanning 1/100th of the file.
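
A minimal sketch of the split-and-scan idea, under strong assumptions: it searches the raw bytes directly, which only helps if the real format can be decoded starting from an arbitrary offset, something the thread does not say. Threads are used for brevity; a CPU-bound decoder would more likely need separate processes or machines. All function names are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def scan_range(path, start, end, needle):
    """Return absolute offsets of `needle` that begin inside [start, end)."""
    overlap = len(needle) - 1  # read a little extra so boundary matches are seen
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start + overlap)
    hits = []
    pos = data.find(needle)
    while pos != -1 and pos < end - start:
        hits.append(start + pos)
        pos = data.find(needle, pos + 1)
    return hits


def parallel_scan(path, needle, tasks=100):
    """Split the file into `tasks` byte ranges and scan them concurrently."""
    if not needle:
        raise ValueError("needle must be non-empty")
    size = os.path.getsize(path)
    step = max(1, -(-size // tasks))  # ceiling division
    ranges = [(s, min(s + step, size)) for s in range(0, size, step)]
    with ThreadPoolExecutor(max_workers=min(tasks, 32)) as pool:
        results = pool.map(lambda r: scan_range(path, r[0], r[1], needle), ranges)
        return sorted(off for hits in results for off in hits)
```

Each task reads `len(needle) - 1` bytes past its range, so a match straddling a boundary is reported exactly once, by the task whose range contains its first byte.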

Dmitrii Kovanikov@ChShersh·8 Dec

Here’s a real task from my job. I have a 100GB binary file. Produced daily. I can’t grep it. But I can decode it. However, I can’t store the decoded version either. It’s too big. How do I efficiently query it? Decoding piped to grep takes 2 minutes. I want 2 seconds.

Edward Lockhart@freeetext·7 Dec

@alexolegimas @alz_zyd_ @danielrock The reasoning approach effectively includes iteration and backtracking in the generation process. The models can (and do) spot mistakes and fix them.

Alex Imas@alexolegimas·6 Dec

@alz_zyd_ @danielrock Yes, but it's interesting that a machine with "only" ~100 billion weights with no loops or ability to iterate forward and backwards can do it.

Alex Imas@alexolegimas·6 Dec

Read Wolfram's excellent "What is ChatGPT Doing..." (h/t @danielrock). He writes that we learned a lot about how language works from the fact that GPT3, with only 175 billion weights, is able to emulate it so well. This implies it's computationally a lot simpler than we may have thought. But what about math? At the time this was written (2023), GPT was still very bad at math. The models became very (very) good at math when the first reasoning model came out (o1), which relied a lot more on reinforcement learning rather than just brute force pretraining. Wonder what this says about math? Conceptually, language is a lot "fuzzier" than math: multiple words can sound "right" in the same spot in a sentence. This is what makes the probabilistic LLM architecture work. Math is less fuzzy. This is perhaps why the more "rule based" RL step was crucial. But this also implies formal math is less computationally complex than we thought. Thoughts? @littmath @alz_zyd_

Edward Lockhart@freeetext·7 Dec

@xzai259 @CarinaLHong @axiommathai Yup, the Putnam problems require more background knowledge but seem to be easier to solve (for reasoning LLMs).

Carina Hong@CarinaLHong·7 Dec

Putnam, the world's hardest undergrad math contest, ended 4pm PT yesterday. By 3:58pm, AxiomProver @axiommathai autonomously solved 8/12 of Putnam2025 in Lean, a 100% verifiable language. Last year, our score would've been #4 of ~4000 and a Putnam Fellow (top 10 in recent yrs)

Edward Lockhart@freeetext·6 Dec

@DoozerDiffuser @TMcirony22088 @pickover Nonstandard models are only equivalent inside PA. From the outside they're clearly different. They're generally excluded by moving to second-order logic and a stronger induction axiom.
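
For reference, the two induction principles can be written side by side in standard textbook notation. The symbols S (successor), \varphi (a first-order formula) and P (a set or predicate variable) are conventional choices and do not appear in the thread.

```latex
% First-order PA: an induction schema, one axiom for each formula \varphi
\bigl(\varphi(0) \land \forall n\,(\varphi(n) \rightarrow \varphi(S(n)))\bigr)
  \rightarrow \forall n\,\varphi(n)

% Second-order PA: a single axiom quantifying over every subset P
\forall P\,\Bigl(\bigl(P(0) \land \forall n\,(P(n) \rightarrow P(S(n)))\bigr)
  \rightarrow \forall n\,P(n)\Bigr)
```

The schema only covers subsets definable by a formula, which leaves room for nonstandard models. The second-order axiom, read with full semantics, covers every subset, including the set of elements reachable from 0 by finitely many successor steps.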

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover First order logic cannot define N up to isomorphism. First order PA has *non-standard models.* And calling them "non-standard" is tantamount to ad hominem. If they match the axioms, they are equally valid, and probably interchangeable.

Cliff Pickover@pickover·5 Dec

Mathematics. A mathematician emerges from a cave, hands you the slip of paper below, and asks "Is the sum of all real numbers equal to zero?" What is your response?

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover PA is defined in terms of first-order logic, which requires formulae to be finite. See for example en.wikipedia.org/wiki/Well-form… You could look at what happens when you remove that restriction, but it won't be the natural numbers any more.

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover Wrong. That's not how N is defined. Neither PA nor N is defined using the term "finite".

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover The natural numbers are by definition each defined by a finite formula, i.e. finitely many applications of the successor function. If you want infinitely many applications of successor you need a different type of logic and what you get won't be the natural numbers.

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover Tell me exactly what I did wrong. For the arrow: you can say that's equals, you could say it's a limit as x approaches infinity. But I think you get exactly what I'm saying but you're assuming I'm wrong because "how could all of mathematics have it wrong?" Right?

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover What does the arrow mean there? I don't think what you're doing is well-defined.

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover I take a subset of N (0,x] not incl. 0 for simplicity. How many elements are in the subset? x. How many times did successor get applied? x. How many digits are in the largest number? x /mod 10. Now x->Infinity: infinite elements, infinite successor, infinite digits.

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover All of its members are finite, but the set itself is infinite.

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover So N is finite.

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover Yes, that's right - you will never actually reach an infinite number when "counting to infinity".

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover Count to infinity by counting a finite number of times. Good luck!

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover It's possible for every element to be finite but also for the whole set to be infinite and unbounded. (Indeed this is the case for the natural numbers)
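
The two halves of this claim can be stated separately. The notation below is the usual one (S for successor) and is not taken from the thread.

```latex
% Every natural number is reached by finitely many successor steps ...
\forall n \in \mathbb{N}:\quad
  n = \underbrace{S(S(\cdots S}_{n\ \text{times}}(0)\cdots))

% ... yet no natural number is the largest, so the set has no maximum
\forall n \in \mathbb{N}\ \ \exists m \in \mathbb{N}:\quad m = S(n) > n
```

The first line is about each individual element and the second is about the set as a whole, so there is no contradiction between them.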

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover It does. Otherwise it would extend to infinity. Is it unbounded? NOPE. R has infinity elements between zero and one, because those elements have *infinite information*. You can NOT store infinite information in a finite space. So you must choose: infinite or finite?

Edward Lockhart@freeetext·5 Dec

@DoozerDiffuser @TMcirony22088 @pickover Yes, if it had a maximum value it would be finite. But it doesn't.

🜛∞@DoozerDiffuser·5 Dec

@freeetext @TMcirony22088 @pickover Ok fine, I won't bother to ask where you got that (finite) rule from, I'll just accept it. So if successor cannot be called infinitely many times, then all numbers in N are finite. If N had a defined maximum value, that maximum value would be a finite number. N is finite.

Edward Lockhart@freeetext·5 Dec

@michaelpforan I think it's a fair question in some ways. It has been clear since the 2022 FWS ruling, if not before, that people without a GRC remained their biological sex. So why did Girl Guides wait until FWS 2025 (plus 8 months) before acting?

Michael Foran@michaelpforan·5 Dec

I know it was a long judgment but the answer to the question vexing the Good Law Project is contained at paragraph 26:

Good Law Project@GoodLawProject

The Supreme Court only ruled on whether people with a Gender Recognition Certificate count as women under the Equality Act. You have to be 18 to get a GRC. But the Guides say they’re kicking out trans girls – u18s – as a result. What’s made them do that? goodlaw.social/phon
