Tom Nicholson

6.2K posts

@TFWNicholson

Building the cognitive architecture for whatever comes next - @_mindanu https://t.co/usNSoYlpuz - AI from @Cambridge_Uni - 20+ years coding experience

Cambridge, England · Joined September 2007

864 Following · 546 Followers

Pinned Tweet

Tom Nicholson@TFWNicholson·17 Sep

@elonmusk If human lifetimes are extended, maybe it doesn't matter. We have AI to come up with new ideas, so the necrosis of thought caused by us not dying doesn't matter. This is neither good nor bad, it just is

Tom Nicholson@TFWNicholson·35m

@arankomatsuzaki Re parallel agents: tree of thoughts type branching etc? At every decision point, try them all and see what the outcome of that step is, rinse, repeat, and backtrack

Aran Komatsuzaki@arankomatsuzaki·2h

i've been running Codex for ~8-24h per open math/physics research problem. few thoughts: parallel agents don't seem to scale that cleanly for a lot of problems. many of these are just extremely sequential. you don't really get to "spawn 50 agents and solve it from nowhere." it's more like: tiny move, check, reframe, tiny move, dead end, try again. hours/days of serial cognition, which honestly rhymes with how these fields move over decades. this updates me a bit against the sci-fi picture of "superhuman math/physics intelligence" as some alien oracle that instantly sees the proof / theory. the actual superhuman-ness is more mundane and maybe more important: the agent has absorbed a huge prior, can read long papers basically instantly, can think/write at >50 tok/s, and you can clone it across dozens of problems. speed + knowledge volume + multiplicability. that's the superpower. also: frontier physics seems much more tractable for these agents than decade-old open math problems. for some physics directions, ~8h is enough to get something paper-shaped and nontrivial. big caveat tho: research taste is still missing. the agent is a pretty good problem-solver, but not yet a top-tier problem-picker. it can push hard once the direction is chosen, but you probably still want a human with taste choosing the problem / framing / bet. current model: agents are becoming very strong research labor, but the bottleneck shifts upward into taste, problem selection, and knowing which hill is worth climbing.
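The "try every branch, check the outcome, backtrack" idea from the reply above can be sketched as a plain depth-first search. This is only an illustrative toy, not how Codex or any agent framework actually works: the function names (`search_with_backtracking`, `reach_target`) and the toy problem (reach a target integer using only `+3` and `*2`) are assumptions made up for this example.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

S = TypeVar("S")


def search_with_backtracking(
    state: S,
    expand: Callable[[S], Iterable[S]],
    is_goal: Callable[[S], bool],
    is_dead_end: Callable[[S], bool],
    max_depth: int,
) -> Optional[List[S]]:
    """Depth-first search: take a small step, check it, backtrack on failure."""
    if is_goal(state):
        return [state]
    if max_depth == 0 or is_dead_end(state):
        return None  # dead end: the caller moves on to its next branch
    for child in expand(state):  # at every decision point, try each option
        path = search_with_backtracking(
            child, expand, is_goal, is_dead_end, max_depth - 1
        )
        if path is not None:
            return [state] + path
    return None  # every branch failed, so backtrack further up


def reach_target(start: int, target: int, max_depth: int = 6) -> Optional[List[int]]:
    """Hypothetical toy problem: reach `target` from `start` using +3 or *2."""
    return search_with_backtracking(
        state=start,
        expand=lambda n: [n + 3, n * 2],
        is_goal=lambda n: n == target,
        is_dead_end=lambda n: n > target,  # overshooting can never recover
        max_depth=max_depth,
    )


if __name__ == "__main__":
    # Explores 1 -> 4 -> 7 -> 10, hits dead ends, then backtracks to 4 -> 8 -> 11.
    print(reach_target(1, 11))  # [1, 4, 8, 11]
```

Note that each step depends on the result of the previous one, so even this branching search is largely sequential along any single path, which is consistent with the point made in the quoted post.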

Tom Nicholson@TFWNicholson·14h

@deanwball Even if there are no more algorithmic and data improvements (a fairly ridiculous assumption), the pace of change will still be fast due to the effects of hardware advancements/investments

Dean W. Ball@deanwball·1d

I feel us approaching yet another summer of discontent with ai, just like last year, when many of my peers in the ai commentariat declared deep learning to have hit a wall because of gpt-5 blah blah blah.

Tom Nicholson@TFWNicholson·23h

@WillrichOstmann @pushmeet @GoogleDeepMind Now we just need to prove Lean is correct

Willrich@WillrichOstmann·1d

@pushmeet @GoogleDeepMind The breakthrough isn't "AI solved 9 Erdős problems" — it's that the proofs are formally CHECKABLE in Lean/Coq. Not natural-language math you have to trust. That's why the 56-yr-old solutions are credible. Standard LLM math claims fail this bar.

Pushmeet Kohli@pushmeet·1d

AI agents are advancing research-level math. 🚀 I’m thrilled to share @GoogleDeepMind’s AlphaProof Nexus - an agentic framework for formal proof search powered by Gemini. When applied to a set of open formal math problems, our agent autonomously solved: ✅ 9 open Erdős problems (including two open for 56 years!) ✅ 44 Online Encyclopedia of Integer Sequences (OEIS) problems ✅ A 15-year-old open problem in algebraic geometry ✅ A 7-year-old open question in min-max optimization We are collaborating with mathematicians across disciplines - from combinatorics and graph theory to quantum optics. Ultimately, these results show the massive potential of even simple agentic loops powered by Gemini. Read the paper here: arxiv.org/abs/2605.22763…

Tom Nicholson@TFWNicholson·4d

@8teAPi People say that every year, we still have a

Prakash@8teAPi·4d

you think you’re down bad.. there are like 10 startups that raised money to build an AI mathematician and OpenAI just used a general model to go beyond the frontier… so what do you do now ? try to position for an acquihire ? because the clock is ticking… you probably have till the end of the year at best

Tom Nicholson@TFWNicholson·17 May

@jxnlco Front-end prototyping

jason@jxnlco·17 May

When do you reach for other models instead of Codex? What can we do better? Hit me with all of your frustrations. dms open. If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model

Tom Nicholson@TFWNicholson·17 May

@thsottiaux Linux version

Tibo@thsottiaux·17 May

For those of you living inside the codex app, what should we prioritize among features, reliability or performance?

Tom Nicholson@TFWNicholson·10 May

@kimmonismus iOS only no doubt

Chubby♨️@kimmonismus·10 May

Looks like we got an answer to that cryptic openai post. codex mobile app. cant verify, hope its real :) would be really cool to see!

Quipra@Quipra_

Hell yeah .

Tom Nicholson@TFWNicholson·9 May

@sama Cost and speed

Sam Altman@sama·9 May

what would you most like to see improve in our next model?

Tom Nicholson@TFWNicholson·9 May

@thsottiaux Show Linux some love!

Tibo@thsottiaux·9 May

As a Codex user, which platform are you on

Tom Nicholson@TFWNicholson·8 May

@OpenAI Linux?

OpenAI@OpenAI·7 May

Codex now works directly in Chrome on macOS and Windows. It’s even better at working with apps and sites in Chrome, and now works in parallel across tabs in the background without taking over your browser. To get started, install the Chrome plugin in the Codex app.

Tom Nicholson@TFWNicholson·7 May

@chrisgpt @basedjensen *yet

Chris@Chrisgpt·7 May

@basedjensen Remember @basedjensen, 200-300k people a year get infected with Hanta in Asia and Europe. But this is the first human-to-human multi-person case. Still not airborne like Covid, though

Hensen Juang@basedjensen·7 May

Aw shit ... Here we go

Insider Paper@TheInsiderPaper

BREAKING: Two Singapore residents isolated, awaiting hantavirus test results, officials say

Tom Nicholson@TFWNicholson·7 May

@thsottiaux Option to use pro. Grammar constrained decoding. More explicit search functionality

Tibo@thsottiaux·7 May

We are seeing strong traction and working to improve Codex for scientists across mathematics, physics, chemistry, biology and more. What do you wish it were capable of that it cannot do today?

Tom Nicholson@TFWNicholson·6 May

@WorldsStrongest "Reverse parked my 18-wheeler first time, get in"

SBD World's Strongest Man@WorldsStrongest·6 May

Caption this 📸

Tom Nicholson@TFWNicholson·6 May

@TodayinHistory The most intelligent thing I've heard out of the White House for a long time

Today in History@TodayinHistory·6 May

This may be the most articulate response I’ve ever heard to this question.

Tom Nicholson@TFWNicholson·6 May

@aidan_mclau @pigeon__s Work on the personality

Aidan McLaughlin@aidan_mclau·6 May

@pigeon__s what can we do to close the gap?

ρ:ɡeσn@pigeon__s·6 May

ok so gpt-5.5-instants personality seems to be like 3598x better than 5.3s and genuinely not slop i would say but i still feel like i prefer claudes a lot and even kimis but its like good enough now that i dont really care that much

Tom Nicholson@TFWNicholson·27 Apr

@IterIntellectus Not irrational, just have bounded rationality, which is also modelled with game theory

vittorio@IterIntellectus·26 Apr

the problem with game theory is that it assumes rational players but most humans are irrational

Tom Nicholson@TFWNicholson·26 Apr

@SimplyKatie___ Learn from them

Katie 🇺🇸@SimplyKatie___·25 Apr

What’s the best way to handle someone who’s more knowledgeable or intelligent than you?

Tom Nicholson@TFWNicholson·25 Apr

@oscpmentor @tenobrus He's not some dude, he's the dude

Bandors@oscpmentor·25 Apr

@tenobrus Did that guy ever turn out to be anything important? Is he actually an insider somewhere? Or just some dude?

Tenobrus@tenobrus·25 Apr

so... where's gpt-6 strawberry boy??

🍓🍓🍓@iruletheworldmo

🚨BREAKING FRONTIER MODEL NEWS gpt-6 set for release april 14th altman's team has been leaking like a sieve lately, here's what openai staff are saying privately. >pretraining completed march 17th. post-training and red-teaming already done. this thing is ready. >benchmarks are absurd. outperforms gpt-5.4 by 40%+ on coding, reasoning, and agentic tasks. >natively multimodal from the ground up. text, audio, images, video one architecture >openai killed sora and redirected every GPU to this model. the billion-dollar disney deal is dead. that's how serious this is. >product org officially renamed to "AGI Deployment." it’s agi time baby. >brockman says AGI is 70-80% achieved. internally they think gpt-6 closes most of the remaining gap. >2 million token context window. double what gpt-5.4 offered. >priced at $2.50/$12 per million tokens. barely above gpt-5.4. so like mythos intelligence, but you can afford it. >safety team moved under the CRO. altman stepped back from safety oversight entirely to focus on data centers. >openai has been in internal "code red" since december 2025. this is their answer. >powers the new desktop "superapp", chatgpt, codex, and atlas browser merged into one agent. the potato is cooked. spud is agi.

Tom Nicholson@TFWNicholson·25 Apr

@iruletheworldmo Rips through tokens like a laxative

🍓🍓🍓@iruletheworldmo·25 Apr

codex feels genuinely remarkable with 5.5, it's hard to overstate just how much this level of intelligence with this level of speed and efficiency changes things. it's picking up nuances and understanding my intent far far better than any other model and rather than feeling like im wrangling my way through one frustrating bottleneck after the next i'm coming away from sessions feeling genuinely delighted. please just point this thing at something outrageous and see if it will do it, you may find yourself blown away by the results

Tom Nicholson@TFWNicholson·25 Apr

@Hitchslap1 I'm doing it right now, so definitely

Hitchslap@Hitchslap1·25 Apr

Serious question. Do you believe time travel is possible?
