Neel Somani

2.7K posts

@neelsomani

Formal methods & mechanistic interpretability. Prev: Founder of Eclipse, QR at Citadel. Proud Cal Bear.

San Francisco, CA · Joined April 2012
253 Following · 19.4K Followers
Pinned Tweet
Neel Somani
Neel Somani@neelsomani·
It's my birthday today. To celebrate, I'm releasing an extension of OpenAI's recent interpretability work. My project extracts Python functions from an LLM, revealing the behaviors encoded in its weights (e.g. quotation closing). Demo + code: github.com/neelsomani/sym…
Leo Gao@nabla_theta

Excited to share our latest work on untangling language models by training them with extremely sparse weights! We can isolate tiny circuits inside the model responsible for various simple behaviors and understand them unprecedentedly well. openai.com/index/understa…

English
17
50
542
148.4K
Tenobrus
Tenobrus@tenobrus·
at google this was known as "buying the gnome". there's like a billion tweets about this already but basically the story goes back in like 2005 or something they were building out their shopping search system, and it was working pretty well. except for the fact that if you searched for sneakers, the top result was a garden gnome. engineers were going crazy trying to fix the ranking bug, but eventually someone noticed that the gnome listing was on ebay, and there was only one of them, and it cost like $50. so they just bought the gnome and suddenly the listing was gone, problem solved. why bother fixing software issues when you can just change the world to fit your software instead?
Tenobrus@tenobrus

if you're about to release a model that you know has the ability to reveal zerodays in every commonly used open source project you could delay release for a few years or spend another ten billion on alignment RL. or you could just secretly fix all the zerodays yourself first.

English
28
236
6.1K
359.3K
Neel Somani
Neel Somani@neelsomani·
Don't buy life insurance. Never be more valuable dead than alive.
English
3
0
19
9K
Neel Somani retweeted
Kpaxs
Kpaxs@Kpaxs·
High-agency people seem to have insane luck. They don't. They just tried 47 things while everyone else tried two and gave up. The conviction that reality is negotiable is generative, it makes you creative. Because if you believe there's always another angle, you start looking for angles other people don't see.
Kpaxs@Kpaxs

High-agency people genuinely believe that reality is negotiable in a "there are always more levers to pull" way. It's about having this bone-deep conviction that if you keep poking at something from different angles, eventually something will give.

English
38
433
4K
294.9K
Neel Somani retweeted
Dwarkesh Patel
Dwarkesh Patel@dwarkesh_sp·
AI has solved 50 Erdős problems in the last year. But on a wider sweep of problems, the models’ success rate is only about 1-2%: labs have just been publishing the wins. This isn’t because AI isn’t useful for mathematicians. Terence Tao thinks the models are currently at the level of a trustworthy coworker. But while they’ve got a strong ability to apply standard math techniques to problems, often more reliably than humans, Terence thinks they currently aren’t great at iterating on partial successes - their understanding of the mathematical object does not advance from session to session. I swear I wasn’t trying to get him to talk about continual learning.
English
51
66
768
144.5K
Neel Somani
Neel Somani@neelsomani·
@samiannoesis @Restructuring__ Or I guess an easier way of thinking about it is just look at a supply/demand chart. You're already used to marginal cost setting the price in an efficient market
English
1
0
1
48
Neel Somani
Neel Somani@neelsomani·
@samiannoesis @Restructuring__ It's approximately DSIC assuming the market participants are all "small". A true Vickrey auction would be too difficult to compute for the multiple timestep version of this optimization
English
1
0
1
41
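
The two replies above describe marginal cost setting the price in an efficient market. A minimal sketch of that idea, assuming a single-timestep uniform-price auction cleared in merit order (cheapest offers first), is shown below. The function name `clear_market`, the offer tuples, and the numbers are hypothetical illustrations and are not taken from the video or the thread.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical offer format: (generator name, quantity in MW, marginal cost in $/MWh)
Offer = Tuple[str, float, float]


def clear_market(offers: List[Offer], demand_mw: float) -> Tuple[Optional[float], Dict[str, float]]:
    """Dispatch the cheapest offers first until demand is met.

    Returns the clearing price (the marginal cost of the last unit dispatched)
    and the quantity dispatched per generator. Every dispatched generator is
    paid the same clearing price, which is why marginal cost sets the price.
    """
    dispatch: Dict[str, float] = {}
    remaining = demand_mw
    price: Optional[float] = None
    for name, quantity, cost in sorted(offers, key=lambda offer: offer[2]):
        if remaining <= 0:
            break
        served = min(quantity, remaining)
        dispatch[name] = served
        remaining -= served
        price = cost  # the most expensive unit needed so far sets the price
    if remaining > 1e-9:
        raise ValueError("offers cannot cover demand")
    return price, dispatch


if __name__ == "__main__":
    example_offers = [("wind", 50.0, 0.0), ("gas", 100.0, 40.0), ("coal", 80.0, 25.0)]
    print(clear_market(example_offers, 120.0))
```

In this toy setup a small generator cannot move the clearing price much by misreporting its cost, which is the intuition behind "approximately DSIC" for small participants. The multi-timestep optimization mentioned in the reply is not modeled here.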
Neel Somani retweeted
Restructuring__
Restructuring__@Restructuring__·
I now understand how electricity is priced better than 99% of people after watching this video 2 minutes well spent, watch this ex-Citadel simply explain the power market
English
25
86
1.4K
179.5K
Neel Somani
Neel Somani@neelsomani·
@chang_defi I just post about different stuff on Instagram lol
English
1
0
25
3.5K
Christos Tzamos
Christos Tzamos@ChristosTzamos·
1/4 LLMs solve research grade math problems but struggle with basic calculations. We bridge this gap by turning them to computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds solving even the hardest Sudokus with 100% accuracy
English
251
812
6.1K
1.8M
Amjad Masad
Amjad Masad@amasad·
Software isn’t merely technical work anymore. It’s creative. Introducing Replit Agent 4. The first AI built for creative collaboration between humans and agents. Design on an infinite canvas, work with your team, run parallel agents, and ship working apps, sites, slides & more.
English
620
676
6.7K
2.9M
Neel Somani retweeted
Kevin Weil 🇺🇸
Kevin Weil 🇺🇸@kevinweil·
If true, this would be the first of @EpochAIResearch's Frontier Math open problems to be resolved by AI. "The result emerged from a single GPT-5.4 Pro run and was subsequently refined into Lean with GPT-5.4 XHigh which ran for a few hours."
spicylemonade@spicey_lemonade

We believe we have fully resolved, in Lean and python, one of @EpochAIResearch Frontier Math open problems: a Ramsey-style problem on hypergraphs. The result emerged from a single GPT-5.4 Pro run and was subsequently refined into Lean with GPT-5.4 XHigh which ran for a few hours. github.com/spicylemonade/… @Jsevillamol

English
15
36
507
75.8K
Neel Somani
Neel Somani@neelsomani·
@eduwithtina @justinskycak For the question shown above, I can guarantee you that they're not asking anything about angles. That's why I'm asking for the specific multiple choice options. Unless SAT math has suddenly become tremendously harder (it hasn't), the method they're doing likely isn't necessary
English
1
0
0
52
Tina Sindwani
Tina Sindwani@eduwithtina·
The questions can be regarding area. e.g. compare the area of the shades vs unshaded area. It could be about angles. e.g. with a triangle inscribed inside a circle, what angle does the inscribed triangle make? Given arc length or 90 degree triangle with some side lengths. Or what the arc length is given the angle. You can test basic algebra, area formulas, or various aspects of trig including sine/cosine/tan. And so on.
English
1
0
0
65
Justin Skycak
Justin Skycak@justinskycak·
This “inscribing” trick is a perfect example of the kind of skill that shows up on the SAT, but most students won’t learn it even if they ace all of their math classes at school. There is a gigantic “missing middle” on these exams, a chasm between the standard curriculum and what’s on the test, the purpose of which is to raise the ceiling of the test’s ability to measure 1. cognitive advantage (IQ, generalization ability, etc.) and 2. willingness to put in the work to train up their skills outside the standard curriculum. But unfortunately, most SAT prep resources either address little to none of this “missing middle,” and whatever is addressed is typically presented with so little pedagogical effort that even highly capable and motivated students find it difficult to process. As a result, the “missing middle” primarily serves to measure (1) and not (2). We are changing that. We identified this gigantic missing middle and added it to our finely scaffolded knowledge graph, so that we can put hardworking students in the best possible position to succeed on these exams. By the way, we’ll be doing this for the ACT too, and all the other common exams. And it’s the same idea for competition math, where the missing middle is even bigger.
Alex Smith@ninja_maths

I'm delighted to announce that Math Academy's SAT Math Prep course is now available for registration! This course functions as an advanced performance-training environment. Students engage exclusively with high-fidelity SAT-style problems mirroring those on the official exam. Link and more info in the comments.

English
9
7
125
13.3K
Neel Somani retweeted
Kahlil Lalji
Kahlil Lalji@bykahlil·
210 days ago, @naturalpay was just a one-pager and a memo. Today, we’re coming out of stealth. Natural is the agentic payments platform powering frictionless money movement between agents, businesses, and consumers. Wallets. Payments. Ledgering. Routing. Identity. Compliance. Credit. Observability. Risk. Everything needed to move money. Engineered for agents and designed for humans. These primitives give you the ability to transact without becoming a payments expert or stitching together a dozen fragmented tools. Huge thanks to our team of 10 (soon to be 25), our early investors, and the supporters who believed in this vision from the start. If you want to help shape how money moves over the coming decades, we’re hiring. And if you’re building agents you should probably be moving money too. Reach out. Read more about Natural and the products we’re launching in the blog below. natural.co/blog/introduci…
San Francisco, CA 🇺🇸 English
106
51
701
141.4K
Neel Somani
Neel Somani@neelsomani·
Disclosure: It takes longer than Brett's 5-minute time limit (~1 hr). There are also two (previously identified) bugs in the regression test website, which I've patched. Open-source repo w/ run instructions: github.com/neelsomani/adc…
English
0
0
6
2.1K
Neel Somani
Neel Somani@neelsomani·
I define a domain-specific language to describe a webpage and the set of permissible browser actions. The LLM iterates with the compiler until it produces valid output. Here's a video of it summarizing info about Ada Lovelace on Wikipedia:
English
1
0
5
2.5K
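
The tweet above describes an LLM iterating with a compiler until it produces valid output. A rough sketch of that generate, validate, retry loop follows. Everything in it is an assumption for illustration: the toy three-action language, the `validate` checker, and the `generate` callable standing in for an LLM call are hypothetical and do not reflect the DSL or code in the linked repo.

```python
from typing import Callable, List

# Hypothetical set of permissible browser actions for a toy DSL.
ALLOWED_ACTIONS = {"click", "type", "read"}


def validate(program: str) -> List[str]:
    """Toy 'compiler' pass: return a list of error messages (empty if valid)."""
    if not program.strip():
        return ["empty program"]
    errors: List[str] = []
    for lineno, line in enumerate(program.strip().splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if parts[0] not in ALLOWED_ACTIONS:
            errors.append(f"line {lineno}: unknown action {parts[0]!r}")
        elif len(parts) < 2:
            errors.append(f"line {lineno}: missing target")
    return errors


def generate_until_valid(
    generate: Callable[[str, List[str]], str],
    task: str,
    max_rounds: int = 5,
) -> str:
    """Call the generator, feeding back validation errors, until the program validates."""
    errors: List[str] = []
    for _ in range(max_rounds):
        program = generate(task, errors)  # errors from the previous round guide the retry
        errors = validate(program)
        if not errors:
            return program
    raise RuntimeError(f"no valid program after {max_rounds} rounds: {errors}")


if __name__ == "__main__":
    # Stand-in for an LLM: emits an invalid action first, then corrects itself.
    def fake_generate(task: str, errors: List[str]) -> str:
        return "click #search" if errors else "hover #search"

    print(generate_until_valid(fake_generate, "open the search box"))
```

Restricting output to a checked language means an invalid action is rejected before it reaches the browser, and the error text gives the model something concrete to fix on the next round.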
Neel Somani
Neel Somani@neelsomani·
I asked the Codex app to build a browser agent more capable than anything that exists today. It burns tokens at $30/hour using GPT-5.2. But it can handle hard web tasks - including @adcock_brett's challenge below:
Brett Adcock@adcock_brett

Solve this in under 5 minutes and I’ll offer you $500k/year in cash plus several million in equity I'm building a Computer-Use team, goal is to use computers better than humans No experience or PhD needed Instructions: 1. Solve all 30 challenges on this website in under 5 minutes: serene-frangipane-7fd25b.netlify.app 2. Feel free to use any tools or vibe code it. Provide us a zip folder with instructions on how to run the agent and reproduce your results, as well your run statistics 3. The agent should be able to solve all the challenges, use browser, and provide overall metrics around time taken, token usage and token cost. Your agent must solve this challenge in under 5 minutes Email your response: agents@brettadcock.com If you have any questions about this challenge, feel free to email us

English
1
3
22
9.1K
Benji
Benji@benjiianc·
Built this app with @blakeandersonw @benwxng in the first few months of joining @10x_apps, right now it is doing $30K a month. Amazing what happens when you're surrounded by people who push you to operate at 10x.
English
16
2
125
39.6K