Jacob Rothfield

2.6K posts

@JacobRothfield

Entrepreneur & Engineer (BSc, BEng, MFin).

Melbourne, Australia · Joined April 2009
1.2K Following · 629 Followers
Jacob Rothfield reposted
Kol Tregaskes @koltregaskes
Many developers have suspected for months that GPT-5.5 outperforms Claude Sonnet for coding. But SWE-Bench reported near-parity, and it made people question what they’d been seeing in practice. DeepSWE aligns more closely with that day-to-day experience: GPT-5.5 scores 70% versus Claude Sonnet at 32%. That difference is substantial. DeepSWE focuses on what tends to matter in real workflows: whether an agent can take a short behavioral prompt, locate the correct area of the codebase, and implement the change cleanly - without needing you to enumerate files, modules, and functions. SWE-Bench often fails to capture that, due to dataset contamination and weaker verification. deepswe.datacurve.ai/blog
84
89
1.2K
124.2K
Jacob Rothfield @JacobRothfield
Composer 2.5 is excellent. Ironically the weakest point is the harness now.
0
0
0
17
Youssof Altoukhi @Youssofal_
After spending time with Qwen 3.6 27B in Cursor I’ve come to realise the constraint on local models isn’t intelligence but the harnesses. Local model harnesses are TERRIBLE. Pi, open code etc are genuinely bad. As a community, we need to do better than this.
180
21
753
76.1K
Jacob Rothfield @JacobRothfield
Claude Code is the floor, not the ceiling.
0
0
0
12
0xSero @0xSero
Claude Code is unbelievable man, sessions that start with /loop don't save and are not resumable.
25
0
149
11.2K
Jacob Rothfield reposted
Theo - t3.gg @theo
A lot of people are building with the assumption that the codebases we work in today will still matter next year. I’m not sure if that’s the case.
66
67
2.4K
261.9K
Jacob Rothfield @JacobRothfield
@L1vsun Include transaction costs, latency, liquidity, and risk. Never confuse a neat state taxonomy with alpha.
0
0
0
4
Jacob Rothfield @JacobRothfield
@L1vsun Model the market as a probabilistic, partially observed, non-stationary state process. Define state variables carefully. Estimate transition probabilities conditionally on current information. Allow transition probabilities to change. Validate out of sample.
1
0
0
53
Livsun @L1vsun
a citadel quant told me something that broke my entire trading framework
"we don't predict markets. we model the state machine"
he explained markov chains in 90 seconds
the market is never random - it always exists in one of three states
trending up, trending down, ranging - each has a fixed probability of shifting to another
build the transition matrix from real price data:
> trending up -> 68% stays trending, 21% flips to range, 11% reverses
> ranging -> 54% stays range, 28% breaks up, 18% breaks down
> trending down -> 61% stays falling, 24% flips to range, 15% reverses
now you're not guessing, you're playing probability
identify current state, enter with the 68% edge, size with kelly criterion based on that probability
the formula is public - markov published it in 1906
hedge funds use it, the math costs nothing
what costs you is asking the wrong question
"where is price going?" is random
"what state am I in right now?" has an answer
transition matrix built from 10 years of data is your edge
Bookmark it
not a signal, not an indicator - just conditional probability that compounds every single trade
Roan @RohOnChain

x.com/i/article/2053…

41
134
838
259.5K
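A rough sketch of the workflow described in the thread above: label states, estimate a transition matrix on one slice of data, then score it on a held-out slice. Everything here is an illustrative assumption rather than anyone's actual method: the `label_states` threshold, the additive smoothing, the 70/30 split, and the toy price list are all made up for the example. A fixed matrix like this also ignores the non-stationarity, costs, latency, and liquidity raised in the replies.

```python
from collections import Counter
import math

STATES = ("up", "down", "range")

def label_states(prices, threshold=0.005):
    """Label each step from its simple return (threshold is an arbitrary assumption)."""
    labels = []
    for prev, cur in zip(prices, prices[1:]):
        r = cur / prev - 1.0
        labels.append("up" if r > threshold else "down" if r < -threshold else "range")
    return labels

def transition_matrix(labels, smoothing=1.0):
    """Row-normalised transition counts with additive smoothing to avoid zero rows."""
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for a in STATES:
        total = sum(counts[(a, b)] for b in STATES) + smoothing * len(STATES)
        matrix[a] = {b: (counts[(a, b)] + smoothing) / total for b in STATES}
    return matrix

def avg_log_likelihood(matrix, labels):
    """Mean log-probability the matrix assigns to the observed transitions."""
    pairs = list(zip(labels, labels[1:]))
    return sum(math.log(matrix[a][b]) for a, b in pairs) / len(pairs)

# Toy prices, invented for illustration only.
prices = [100, 101, 102, 101.9, 102, 100.5, 99, 99.1, 99, 100,
          101.2, 102.5, 102.4, 102.6, 101]
labels = label_states(prices)

# Fit on the earlier part, score on the later part (out of sample).
split = int(len(labels) * 0.7)
matrix = transition_matrix(labels[:split])
held_out_score = avg_log_likelihood(matrix, labels[split:])
uniform_baseline = math.log(1 / len(STATES))  # score of a "know nothing" model
print(held_out_score, uniform_baseline)
```

If the held-out score is not clearly above the uniform baseline, the matrix is describing the training window rather than providing an edge.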
Jacob Rothfield @JacobRothfield
8/ Cursor is competitive; however, the space moves fast and the LLM failure modes are not fully mitigated.
0
0
0
15
Jacob Rothfield @JacobRothfield
7/ Agents should also stop asking humans to do obvious executable checks. If a smoke test, browser walk, API call, or page inspection is safe and available, the agent should run it, record the proof, and continue. Ask for product judgment, approvals, or irreversible actions.
1
0
0
13
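One way the policy in 7/ could look in code, as a minimal sketch: checks marked safe are run and their output kept as proof, and everything else is queued for a human. The `Check` and `Report` types, the `safe` flag, and the example commands are hypothetical names invented for this illustration; they do not describe Cursor's or any other agent runtime.

```python
import subprocess
import sys
from dataclasses import dataclass, field

@dataclass
class Check:
    name: str
    command: list   # argv to execute
    safe: bool      # hypothetical flag: True only for read-only or reversible checks

@dataclass
class Report:
    proofs: list = field(default_factory=list)       # evidence from checks that were run
    needs_human: list = field(default_factory=list)  # names of checks left for approval

def run_checks(checks):
    """Run every safe check, record the evidence, and defer the rest to a human."""
    report = Report()
    for check in checks:
        if not check.safe:
            report.needs_human.append(check.name)
            continue
        result = subprocess.run(check.command, capture_output=True, text=True, timeout=60)
        report.proofs.append({
            "name": check.name,
            "exit_code": result.returncode,
            "stdout": result.stdout.strip(),
        })
    return report

checks = [
    # Stand-in for a real smoke test; it just prints "ok".
    Check("smoke test", [sys.executable, "-c", "print('ok')"], safe=True),
    # Irreversible action: never executed here, only escalated.
    Check("drop production table", ["echo", "never run automatically"], safe=False),
]
report = run_checks(checks)
```

The design choice is that the agent never decides safety at run time; the flag is set up front, so anything unlabelled or irreversible defaults to a human decision.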
Jacob Rothfield @JacobRothfield
1/ I appreciate @cursor_ai's generous Composer 2.5 model. I genuinely like 2.5, and I also like the direction of Cursor Cloud Agents. The model quality is not my main concern. The hard part is the control runtime around the models.
1
0
0
43