Jacob Rothfield

2.6K posts

@JacobRothfield

Entrepreneur & Engineer (BSc, BEng, MFin).

Melbourne, Australia · Joined April 2009

1.2K Following · 629 Followers

Jacob Rothfield reposted

Kol Tregaskes@koltregaskes·13h

Many developers have suspected for months that GPT-5.5 outperforms Claude Sonnet for coding. But SWE-Bench reported near-parity, and it made people question what they’d been seeing in practice. DeepSWE aligns more closely with that day-to-day experience: GPT-5.5 scores 70% versus Claude Sonnet at 32%. That difference is substantial. DeepSWE focuses on what tends to matter in real workflows: whether an agent can take a short behavioral prompt, locate the correct area of the codebase, and implement the change cleanly - without needing you to enumerate files, modules, and functions. SWE-Bench often fails to capture that, due to dataset contamination and weaker verification. deepswe.datacurve.ai/blog

Jacob Rothfield@JacobRothfield·1d

Composer 2.5 is excellent. Ironically the weakest point is the harness now.

Jacob Rothfield@JacobRothfield·1d

@Youssofal_ 💯

Youssof Altoukhi@Youssofal_·1d

After spending time with Qwen 3.6 27B in Cursor I’ve come to realise the constraint on local models isn’t intelligence but the harnesses. Local model harnesses are TERRIBLE. Pi, open code etc are genuinely bad. As a community, we need to do better than this.

Jacob Rothfield@JacobRothfield·1d

@antigravity why is it costing me quota when it doesn't work?

Jacob Rothfield@JacobRothfield·1d

@OpenAI @OpenAIDevs please fix this bug

Jacob Rothfield@JacobRothfield·2d

Claude Code is the floor, not the ceiling.

Jacob Rothfield@JacobRothfield·3d

@OpenAI @OpenAIDevs please fix the web UI

Jacob Rothfield@JacobRothfield·3d

@GeminiApp please fix

Jacob Rothfield@JacobRothfield·3d

@0xSero It's quite ineffective.

0xSero@0xSero·3d

Claude Code is unbelievable man, sessions that start with /loop don't save and are not resumable.

Jacob Rothfield reposted

Theo - t3.gg@theo·3d

A lot of people are building with the assumption that the codebases we work in today will still matter next year. I’m not sure if that’s the case.

Jacob Rothfield@JacobRothfield·3d

@L1vsun Include transaction costs, latency, liquidity, and risk. Never confuse a neat state taxonomy with alpha.

Jacob Rothfield@JacobRothfield·3d

@L1vsun Model the market as a probabilistic, partially observed, non-stationary state process. Define state variables carefully. Estimate transition probabilities conditionally on current information. Allow transition probabilities to change. Validate out of sample.
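A minimal sketch of the estimation steps listed in this reply, assuming each period has already been labeled with a state. That assumption sidesteps the "partially observed" part, which would need a hidden-state model instead. The state names, the smoothing constant, and all function names are hypothetical. Rolling re-estimation stands in for "allow transition probabilities to change", and the log-loss helper is one possible out-of-sample check.

```python
import math
from collections import Counter

STATES = ("up", "down", "range")  # hypothetical state labels


def transition_matrix(labels, states=STATES, smoothing=1.0):
    """Estimate row-stochastic transition probabilities from a label sequence.

    Additive smoothing keeps transitions that never appear in a small
    sample from being assigned probability zero.
    """
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for src in states:
        row_total = sum(counts[(src, dst)] for dst in states) + smoothing * len(states)
        matrix[src] = {
            dst: (counts[(src, dst)] + smoothing) / row_total for dst in states
        }
    return matrix


def rolling_matrices(labels, window, states=STATES):
    """Re-estimate the matrix on each trailing window so probabilities can drift."""
    return [
        transition_matrix(labels[end - window:end], states)
        for end in range(window, len(labels) + 1)
    ]


def out_of_sample_log_loss(train, test, states=STATES):
    """Average negative log-likelihood of test transitions under the train matrix."""
    matrix = transition_matrix(train, states)
    pairs = list(zip(test, test[1:]))
    return -sum(math.log(matrix[a][b]) for a, b in pairs) / len(pairs)
```

Comparing the out-of-sample loss against a uniform baseline of `math.log(3)` gives a quick read on whether the fitted matrix carries any information beyond chance.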

Livsun@L1vsun·6d

a citadel quant told me something that broke my entire trading framework

"we don't predict markets. we model the state machine"

he explained markov chains in 90 seconds

the market is never random - it always exists in one of three states

trending up, trending down, ranging - each has a fixed probability of shifting to another

build the transition matrix from real price data:

> trending up -> 68% stays trending, 21% flips to range, 11% reverses
> ranging -> 54% stays range, 28% breaks up, 18% breaks down
> trending down -> 61% stays falling, 24% flips to range, 15% reverses

now you're not guessing, you're playing probability

identify current state, enter with the 68% edge, size with kelly criterion based on that probability

the formula is public - markov published it in 1906

hedge funds use it, the math costs nothing

what costs you is asking the wrong question

"where is price going?" is random

"what state am I in right now?" has an answer

transition matrix built from 10 years of data is your edge

Bookmark it

not a signal, not an indicator - just conditional probability that compounds every single trade

Roan@RohOnChain

x.com/i/article/2053…
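To make the arithmetic in the quoted post concrete, here is a small sketch that encodes the three rows exactly as quoted, checks that each row sums to 1, propagates a state distribution forward, and evaluates the standard Kelly fraction f = p - (1 - p) / b. Treating the 68% persistence probability as a win probability at even payoff (b = 1) is purely an illustrative assumption; the post does not specify payoffs, and the result ignores the costs and risks raised in the replies above. Function names are hypothetical.

```python
STATES = ["up", "range", "down"]

# Transition probabilities as quoted in the post (outer key: current state).
P = {
    "up":    {"up": 0.68, "range": 0.21, "down": 0.11},
    "range": {"up": 0.28, "range": 0.54, "down": 0.18},
    "down":  {"up": 0.15, "range": 0.24, "down": 0.61},
}


def rows_are_stochastic(matrix, tol=1e-9):
    """Every row of a transition matrix must sum to 1."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in matrix.values())


def step(dist, matrix=P):
    """Advance a distribution over states by one period."""
    return {
        dst: sum(dist[src] * matrix[src][dst] for src in STATES)
        for dst in STATES
    }


def kelly_fraction(p, b=1.0):
    """Kelly fraction for a binary bet: win probability p, net payoff odds b."""
    return p - (1.0 - p) / b


start = {"up": 1.0, "range": 0.0, "down": 0.0}  # currently "trending up"
after_two = step(step(start))
print(after_two)             # distribution over states two periods ahead
print(kelly_fraction(0.68))  # fraction of capital under the even-odds assumption
```

Starting from "trending up", the probability of still being in that state two periods later is about 0.54, so the 68% figure only describes a single step.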

Jacob Rothfield@JacobRothfield·3d

@fcoury Thank you.

Felipe Coury 🦀@fcoury·4d

It’s done. Enjoy your weekend with Codex limits reset!

Tibo@thsottiaux

Some of you noticed limits drained faster in Codex, we root caused it to an optimization that we rolled back that had an impact on cache hit rates when compacting across long running sessions. We fixed this and have now reset usage limits for all accounts. Enjoy the weekend.

Jacob Rothfield@JacobRothfield·3d

8/ Cursor is competitive; however, the space moves fast and LLM failure modes are not fully mitigated.

Jacob Rothfield@JacobRothfield·3d

7/ Agents should also stop asking humans to do obvious executable checks. If a smoke test, browser walk, API call, or page inspection is safe and available, the agent should run it, record the proof, and continue. Reserve questions to humans for product judgment, approvals, or irreversible actions.
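One way to picture the policy in this post is a small check runner: anything marked safe is executed and its output kept as evidence, and anything else is deferred to a human. This is a sketch under stated assumptions, not how any particular agent harness works. The `Check` and `Evidence` types, the `safe` flag, and the status strings are all hypothetical.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    command: list  # argv to execute
    safe: bool     # hypothetical flag: read-only or reversible, OK to run unprompted


@dataclass
class Evidence:
    name: str
    status: str    # "passed", "failed", or "needs_human"
    output: str


def run_checks(checks, timeout=60):
    """Run every safe check, record its output, and defer the rest to a human."""
    evidence = []
    for check in checks:
        if not check.safe:
            # Approvals and irreversible actions are not executed automatically.
            evidence.append(Evidence(check.name, "needs_human", ""))
            continue
        result = subprocess.run(
            check.command, capture_output=True, text=True, timeout=timeout
        )
        status = "passed" if result.returncode == 0 else "failed"
        evidence.append(Evidence(check.name, status, result.stdout.strip()))
    return evidence


checks = [
    Check("smoke", [sys.executable, "-c", "print('ok')"], safe=True),
    Check("deploy", [sys.executable, "-c", "print('deploying')"], safe=False),
]
report = run_checks(checks)
```

The returned list doubles as the recorded proof: each entry shows what ran, whether it passed, and what it printed, while deferred items are left for a human decision.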

Jacob Rothfield@JacobRothfield·3d

1/ I appreciate @cursor_ai's generous Composer 2.5 model. I genuinely like 2.5, and I also like the direction of Cursor Cloud Agents. The model quality is not my main concern. The hard part is the control runtime around the models.
