Tejpal Singh
@tsv650 · 1K posts

building coding agents @ https://t.co/ZJUjbhMXlB

94102 · Joined January 2022
5.1K Following · 1.8K Followers

Pinned tweet
Tejpal Singh @tsv650 ·
Introducing cc-canary: a skill and open-source CLI tool that detects early signs of regressions in Claude Code by analyzing your local session logs.
1 reply · 2 reposts · 5 likes · 683 views
Tejpal Singh reposted
Tanay Jaipuria @tanayj ·
shots fired by Google Cloud CEO Thomas Kurian
22 replies · 64 reposts · 1.1K likes · 88.5K views
Tejpal Singh reposted
Tejpal Singh @tsv650 ·
Anthropic's Opus 4.7 shipped with a new tokenizer, which makes it up to 50% more expensive for some users. I built a skill (/cc-markup) that estimates the price hike, backtested on your past sessions👇
1 reply · 2 reposts · 3 likes · 886 views
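The backtest idea in the tweet above can be sketched roughly as follows. This is a minimal sketch, not the actual /cc-markup implementation: it assumes session logs are JSONL files whose records carry hypothetical `input_tokens`/`output_tokens` fields, and both the price tables and the tokenizer-inflation factor are illustrative placeholders, not real Anthropic pricing.

```python
import json
from pathlib import Path

# Illustrative placeholders only — not real pricing or a measured inflation factor.
OLD_PRICE = {"input": 15.0, "output": 75.0}   # $ per million tokens (assumed)
NEW_PRICE = {"input": 15.0, "output": 75.0}
TOKENIZER_INFLATION = 1.5  # assumed: new tokenizer emits up to 1.5x tokens for the same text

def session_cost(records, price, inflation=1.0):
    """Sum the dollar cost of one session's records at the given price table."""
    cost = 0.0
    for r in records:
        cost += r.get("input_tokens", 0) * inflation / 1e6 * price["input"]
        cost += r.get("output_tokens", 0) * inflation / 1e6 * price["output"]
    return cost

def estimate_markup(log_dir):
    """Backtest: re-price every past session under the assumed new tokenizer."""
    old_total = new_total = 0.0
    for path in Path(log_dir).glob("*.jsonl"):
        records = [json.loads(line) for line in path.open() if line.strip()]
        old_total += session_cost(records, OLD_PRICE)
        new_total += session_cost(records, NEW_PRICE, TOKENIZER_INFLATION)
    return old_total, new_total
```

Comparing `old_total` to `new_total` over a few weeks of logs gives a per-user estimate of the price hike rather than a headline worst case.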
Tejpal Singh @tsv650 ·
In some ways, Haseeb is saying that if labor productivity continues to be decoupled from wages, value is maximized via rent-seeking behavior. This is precisely why public trust in institutions is so low, and it will get much worse with AI: even if broader society adopts AI and becomes more productive, people reap only a fraction of the benefits.
Haseeb >|< @hosseeb

The highest-value human work in the AI era will be in domains with sparse reward signals. Internalize this, or watch your value erode over the next decade.

Math, programming, rote memorization, data science, all fucked. The classic "smart nerd" jobs are exactly where AI is strongest, because the feedback loops are dense. You can check the answer. You can run the test. That means AI can improve quickly, and humans will rapidly fall behind.

Your advantage as a human is in messy domains. Taste. Judgment. Negotiation. Risk-taking. Politics. Sales. Science at the frontier. Anything you can only really learn by doing. Cross-disciplinary stuff. The valuable domains will be the ones guarded by secrets, tacit knowledge, weak labels, long feedback cycles, and ambiguous outcomes. Places where the training data is scarce, the ground truth is disputed, and it's impossible to explain why something is good.

AI will still enter these domains. But we will be slower to trust it unsupervised there, because it will be harder to tell when it is right, harder to prove when it is wrong, and difficult to construct secure sandboxes. The stakes will be too high to YOLO it.

I find myself saying this over and over again to young people today: the future does not belong to people who are able to get good grades on tests. It belongs to people who can operate under uncertainty, in domains where correctness is hard to define. Those domains will become the thin waist of the economy: as productivity everywhere else accelerates, the humans who excel there will become our economic Strait of Hormuz. The best humans in these domains will demand an enormous cut of the growing economic pie.

Your imperative going forward is to make sure you're one of these people. (Or become an electrician. That probably works too.)

0 replies · 0 reposts · 1 like · 185 views
Tejpal Singh @tsv650 ·
If this holds true for one-on-one tutoring with ChatGPT (curious whether anyone at OpenAI has run this experiment), the long-term societal impact of Bloom's 2 Sigma is far greater than that of most other applications of AI
0 replies · 0 reposts · 2 likes · 184 views
Tejpal Singh @tsv650 ·
@yush_g I started it a few weeks ago! Down to do this
0 replies · 0 reposts · 1 like · 141 views
Yush @yush_g ·
anyone else going through CS336 seriously over the next few weeks (at ~1 lecture a day, aiming for full understanding) and want to co-work on it together, or know any good spots in the city where others are doing this?
1 reply · 0 reposts · 14 likes · 1.3K views
Jeremy Nguyen ✍🏼 🚢 @JeremyNguyenPhD ·
Claude Code usage: Matt reports that downgrading to an earlier version (2.1.71) saves 5x usage compared to the current version (2.1.120) with Opus 4.6.

npm install -g @anthropic-ai/claude-code@2.1.71

Anyone have experience with specific older versions? I just downgraded, let's see.
Matt Henderson @matthen2

The latest Claude Code burns $$ roughly 5x faster than version 2.1.71 with Opus 4.6. I tested it today and tracked my usage. I'm downgrading for now!

17 replies · 8 reposts · 152 likes · 32.7K views
Matt Henderson @matthen2 ·
The latest Claude Code burns $$ roughly 5x faster than version 2.1.71 with Opus 4.6. I tested it today and tracked my usage. I'm downgrading for now!
32 replies · 9 reposts · 425 likes · 67.6K views
Tibo @thsottiaux ·
@kevinxu You can still retire with this if you start using Codex instead of the other one
49 replies · 11 reposts · 1.2K likes · 36.8K views
Kevin Xu @kevinxu ·
My net worth is $10,602,789.50. 20 years ago you could retire with this. What happened?
494 replies · 19 reposts · 1.2K likes · 367.4K views
elvis @omarsar0 ·
Things have been degrading super fast in Claude Code. I still use Claude Code, but my default is now Codex. I still prefer Opus models for coding, so I will try again with the fixes. I appreciate the post-mortem, but I don't trust that all issues have been resolved.

Claude Code, in general, has been barely usable for me in the past couple of days. I got excited about Opus 4.7 (1M), but there is something really off about the thinking/reasoning. The model tends to put in either too much or too little effort, no matter the setting. I'd prefer it were smarter about how much effort to apply. Responses are too verbose, and that really degrades the experience. In a lot of cases, I find myself doing things manually that Opus 4.6 had no issues solving for me at all. As they report, it might not be the model. But that means the harness needed a bit more testing.

Not my favorite type of thing to tweet about, but as an avid Claude Code user, I would prefer that the Claude Code team properly test things before shipping them. Look, it's nice to show that you can move fast, and it's necessary in some instances, but the user experience cannot be the tradeoff. There are a lot of people (including myself) who depend on the quality of the product for very important work.

The Claude Code experience has been so bad for a lot of devs I know (and me) that I've recently been more open to exploring other coding agents/harnesses. Also worth checking out: Hermes Agent, pi, and OpenCode.

I have nothing but love for the Claude Code team, but I hope they consider reassessing their strategy for how they roll out improvement releases.
ClaudeDevs @ClaudeDevs

Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116+ and we’ve reset usage limits for all subscribers.

44 replies · 46 reposts · 449 likes · 52.8K views
ℏεsam @Hesamation ·
AMD Senior AI Director confirms Claude has been nerfed. She analyzed Claude's session logs from January to March:
> median thinking dropped from ~2,200 to ~600 chars
> API requests went up 80x from Feb to Mar; less thinking and more failed attempts mean more retries, burning more tokens and spending more on them
> reads-per-edit dropped from 6.6x → 2.0x; the model stops researching code before touching it
> the model tried to bail out or asked "should I continue" 173 times in 17 days (0 times before March 8)
> self-contradiction in reasoning ("oh wait, actually...") tripled
> conventions like CLAUDE.md get ignored because there's less thinking budget to cross-check edits
> 5pm and 7pm PST are the worst hours; late night is significantly better. This means the thinking allocation is most likely GPU-load-sensitive.
327 replies · 1K reposts · 9.6K likes · 3.9M views
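A check like the "median thinking" and "worst hours" findings above can be reproduced locally in a few lines. This is a minimal sketch, not her actual analysis: it assumes session logs are JSONL records with hypothetical `thinking` (string) and `timestamp` (ISO 8601) fields; the real Claude Code log schema may differ.

```python
import json
from collections import defaultdict
from datetime import datetime
from statistics import median

def thinking_stats(lines):
    """Median thinking length overall and per hour of day, from JSONL lines
    with assumed `thinking` (str) and `timestamp` (ISO 8601) fields."""
    lengths = []
    by_hour = defaultdict(list)
    for line in lines:
        r = json.loads(line)
        n = len(r.get("thinking", ""))
        lengths.append(n)
        hour = datetime.fromisoformat(r["timestamp"]).hour
        by_hour[hour].append(n)
    overall = median(lengths) if lengths else 0
    per_hour = {h: median(v) for h, v in sorted(by_hour.items())}
    return overall, per_hour
```

Bucketing by hour is what lets you separate a model regression (uniform drop) from load-sensitive throttling (drop concentrated in peak hours).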
Tejpal Singh @tsv650 ·
Soon, cc-canary will suggest the right reasoning effort and model for your session so your Claude Code experience stops feeling nerfed! Stay tuned by starring the repo: github.com/delta-hq/cc-ca…
0 replies · 0 reposts · 0 likes · 72 views
Tejpal Singh @tsv650 ·
What it looks at:
• Read:edit ratio (edit hygiene)
• Reasoning loops, premature stops, self-admitted errors
• Thinking-signature length (reasoning depth)
• Interrupts, cost/turn, tokens/turn
• User-prompt word-frequency shift

All compared pre/post an auto-detected inflection.
1 reply · 0 reposts · 0 likes · 72 views
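The pre/post comparison of the first metric in that list can be sketched as follows. This is an illustrative sketch, not cc-canary's actual code: it assumes each session record carries a hypothetical `tool` field naming the tool call ("Read", "Edit", ...), and it takes the inflection index as given rather than auto-detecting it.

```python
def read_edit_ratio(records):
    """Reads per edit: how much the model researches code before changing it."""
    reads = sum(1 for r in records if r.get("tool") == "Read")
    edits = sum(1 for r in records if r.get("tool") == "Edit")
    return reads / edits if edits else float("inf")

def pre_post(records, inflection_index):
    """Compare the read:edit ratio before and after a detected inflection point."""
    return (read_edit_ratio(records[:inflection_index]),
            read_edit_ratio(records[inflection_index:]))
```

A drop from ~6.6 to ~2.0 in this ratio, as in the thread above, would mean the model is editing files it has barely read.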