Tadyo

210 posts

@ucfmkr

I test AI tools so you don't have to. 3 real picks/week → The Lever newsletter. No hype. Just tools that move the needle. 🔗 https://t.co/9hOKsmHnKb

Joined May 2024
155 Following · 31 Followers
Tadyo@ucfmkr·
I started timing how long it took my engineers to recover focus after a 30-minute meeting. The average was 23 minutes. A standup at 11am does not cost you 30 minutes. It costs you the morning.
Tadyo@ucfmkr·
A useful agent in production is not a smart agent. It is a narrow one with a loud no.
Tadyo@ucfmkr·
We run one of these for compliance reviews. It does about 11 things. It has rejected requests outside that boundary thousands of times. That refusal rate is what makes the thing trustworthy.
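A minimal sketch of what a "narrow agent with a loud no" could look like, assuming an allowlist design. The task names, `NarrowAgent`, and `OutOfScopeError` are hypothetical illustrations, not the author's system; the post only says the real agent does about 11 things.

```python
class OutOfScopeError(Exception):
    """Raised when a request falls outside the agent's allowlist."""


# Hypothetical task names for illustration only.
ALLOWED_TASKS = frozenset({
    "check_policy_citation",
    "flag_missing_disclosure",
    "summarize_review_findings",
})


class NarrowAgent:
    """Accepts only allowlisted tasks and counts every refusal."""

    def __init__(self, allowed_tasks):
        self.allowed_tasks = frozenset(allowed_tasks)
        self.accepted = 0
        self.refused = 0

    def handle(self, task: str) -> str:
        # Refuse loudly: raise instead of attempting a best-effort answer.
        if task not in self.allowed_tasks:
            self.refused += 1
            raise OutOfScopeError(f"refused: {task!r} is outside this agent's scope")
        self.accepted += 1
        return f"accepted: {task}"

    def refusal_rate(self) -> float:
        total = self.accepted + self.refused
        return self.refused / total if total else 0.0
```

Tracking `refusal_rate()` makes the boundary measurable, which is the property the post ties to trust.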
Tadyo@ucfmkr·
The agents quietly working in production right now share one thing. They refuse most of what users ask them to do.
Tadyo@ucfmkr·
The teams shipping production agents this year are not the ones with the best model access. They are the ones who shrank the workflow until the math stopped fighting them.
Tadyo@ucfmkr·
The fix that actually works is fewer steps. Cut the chain in half. Add a deterministic step where you can. Fail loudly when the model is unsure instead of guessing.
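One way to read "add a deterministic step" and "fail loudly" in code, as a rough sketch. Everything here is an assumption for illustration: the invoice-ID pattern, the `ModelAnswer` shape, the self-reported `confidence` field, and the 0.9 threshold are hypothetical and not taken from the author's stack.

```python
import re
from dataclasses import dataclass


class LowConfidenceError(Exception):
    """Raised when a model answer is too uncertain to act on."""


@dataclass
class ModelAnswer:
    label: str
    confidence: float  # hypothetical self-reported score in [0, 1]


def extract_invoice_id(text: str) -> str:
    """Deterministic step: a regex lookup replaces a model call."""
    match = re.search(r"\bINV-\d{6}\b", text)
    if match is None:
        raise ValueError("no invoice id found")
    return match.group(0)


def require_confident(answer: ModelAnswer, threshold: float = 0.9) -> str:
    """Fail loudly: stop the workflow instead of passing a guess downstream."""
    if answer.confidence < threshold:
        raise LowConfidenceError(
            f"confidence {answer.confidence:.2f} is below threshold {threshold:.2f}"
        )
    return answer.label
```

Each step that moves from a model call to a function like `extract_invoice_id` no longer contributes a model error rate of its own, though the pattern itself can still miss inputs it was not written for.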
Tadyo@ucfmkr·
An agent with 85% accuracy per step running a 10 step workflow succeeds about 20% of the time. The model is not the problem. The math is the problem.
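The arithmetic can be checked in a few lines, assuming each step succeeds or fails independently (an assumption the post implies but does not state). The function name is hypothetical.

```python
def workflow_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Compare the 10-step workflow with shorter chains at the same accuracy.
    for steps in (10, 5, 3):
        rate = workflow_success_rate(0.85, steps)
        print(f"{steps} steps at 85% per step: {rate:.1%}")
```

Under that assumption, ten steps at 85% come out near 19.7%, and halving the chain to five steps lifts the rate to roughly 44% with no change to the model.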
Tadyo@ucfmkr·
@AIHighlight The older, more educated, higher-paid workers being affected first is not a surprise to anyone in compliance or finance ops. Those roles were already 80% structured workflow. That is exactly what frontier models eat first.
AI Highlight@AIHighlight·
🚨BREAKING: Anthropic just published a study mapping exactly which jobs its own AI is replacing right now.

The workers most at risk are not who anyone expected. They are older. They are more educated. They earn 47% more than average. And they are nearly four times more likely to hold a graduate degree than the workers AI is not touching.

The argument is straightforward. Anthropic built a new metric called "observed exposure." Not what AI could theoretically do. What it is actually doing right now in professional settings, measured against millions of real Claude conversations from enterprise users.

For computer and math workers, AI is theoretically capable of handling 94% of their tasks. It is currently handling 33% of them. For office and administrative roles, theoretical capability is 90%. Current observed usage is 40%. The gap between what AI can do and what it is already doing is enormous.

The researchers are explicit about what comes next. As capabilities improve and adoption deepens, the red area grows to fill the blue.

The demographic finding is what makes the paper uncomfortable. The most AI-exposed workers earn 47% more on average than the least exposed group. They are more likely to be female. They are more likely to be college educated. This is not a story about warehouse workers or truck drivers. It is a story about lawyers, financial analysts, market researchers, and software developers. The exact group whose education was supposed to insulate them.

Computer programmers showed the highest observed AI exposure at 74.5%. Customer service representatives at 70.1%. Data entry keyers at 67.1%. Medical record specialists at 66.7%. Market research analysts and marketing specialists at 64.8%. These are not predictions. These are measurements of work that is already happening on AI platforms right now.

Then there is the pipeline finding nobody is talking about loudly enough. Anthropic's researchers found a 14% decline in the job-finding rate for workers aged 22 to 25 in highly exposed occupations since ChatGPT launched. No comparable effect for workers over 25. Entry-level roles were never just jobs. They were the training ground where junior analysts became senior analysts, where junior lawyers learned how arguments hold together. If that layer disappears, nobody has answered the question of where the next generation of senior professionals comes from.

The detail buried in the paper that most coverage missed: 30% of American workers have zero AI exposure at all. Cooks. Mechanics. Bartenders. Dishwashers. The technology reshaping professional careers is completely irrelevant to roughly a third of the workforce. The divide is no longer between high skill and low skill. It is between presence and absence.

The company publishing this study is the same company selling the AI doing the replacing. Anthropic had every commercial incentive to soften these findings. They published them anyway.

If you spent four years and $200,000 on a degree to land a white collar career, the company that builds Claude just confirmed your job is more exposed than the bartender pouring drinks at your graduation party.

Source: Anthropic, "Labor market impacts of AI: A new measure and early evidence"
PDF: anthropic.com/research/labor…
Tadyo@ucfmkr·
The bigger shift is architectural. The default for some workflows is moving from "API call to a hyperscaler" to "checkpoint on a local box." That changes how you think about the whole stack.
Tadyo@ucfmkr·
We have been testing Qwen on internal docs this week. It is not faster than the frontier models. It is fast enough, and it beats nothing, which is what we had before for sensitive workflows.
Tadyo@ucfmkr·
A 27B model that runs locally on a 3090 just matched Claude Opus 4.5 on Terminal-Bench. For anyone in regulated work, that headline reads differently than it does for the rest of the timeline.
Tadyo@ucfmkr·
@rahulgs Matches what we see on compliance reviews. Opus stays cheaper on the smallest diffs, but 5.5 wins everything past a few hundred lines. Cache write pricing is the part most teams haven't accounted for yet.
rahul@rahulgs·
GPT-5.5 is ~39% cheaper than Opus 4.7 across merged PRs bucketed by diff size in Inspect. Despite the higher output token cost, 5.5 is cheaper for input tokens (cache writes are free), is more token efficient, and tokenizes the same text to fewer tokens.