Raymond Valdes

81 posts

@RaymondValdes5

AI agents are becoming an engineering management problem. Engineering leader writing about AI workflows, ownership, and turning demos into reliable systems.

California · Joined October 2021

636 Following · 121 Followers

Pinned Tweet

Raymond Valdes@RaymondValdes5·17 May

i know how complex engineering work actually gets done: hiring, team design, interfaces, requirements, verification, technical judgment, program risk, coordination across specialties. ai agents are entering that system now. most people are only talking about the model.

Raymond Valdes@RaymondValdes5·24 May

Delegating to agents is hard for the same reason delegating to people is hard. You have to be clear. What outcome do we want? What’s in scope? What should not change? When should it stop and ask? The agent exposes the gaps you were carrying in your head.

Raymond Valdes retweeted

Hemachandiran@Hem_chandiran·21 May

@benln Well said. Excellent timing... Here is the YouTube link to watch the AI series : youtube.com/playlist?list=…

Raymond Valdes@RaymondValdes5·21 May

@typesfast Because the code got faster but the roadmap still came from a meeting called “Q3 alignment sync final final.”

Ryan Petersen@typesfast·20 May

With all these AI coding improvements why isn't the software I use everyday getting better?

Raymond Valdes@RaymondValdes5·21 May

@paulg Keep learning fast enough and better options keep showing up. FOMO makes every idea look final.

Paul Graham@paulg·20 May

It's a fallacy to think you should drop out of college because you have an idea that has to be implemented right now, or it will be too late. If you stay in school you'll have other and better ideas.

Raymond Valdes@RaymondValdes5·21 May

@allTheYud A lot of “future-proof AI skills” are just workarounds for current model bugs. Temporary skills still matter.

Eliezer Yudkowsky@allTheYud·20 May

Every month a new guy discovers LLMs; discovers a skill the current LLMs require to get good results; and writes about the future jobs that will always be available for smart people like HIM, that are SKILLED in using LLMs. The next generation of AIs doesn't need his fancy prompt. The image model goes from needing to type in just the right set of weird words and cryptic sorcerous invocations, to most people being able to type in English what they want and get a pretty good result. There are still tasks that require careful invocation. But they are a much smaller fraction of all the tasks people are trying to do, or you can get a bleh result without the elaborate invocation to get it really good. And to improve on the bleh result you need to be substantially more of an expert than back when the Guy was memorizing a rule about adding "trending on Artstation" to the image prompts, as would always require a human paid to do that. Another generation of AIs comes out. The next generation of Clever Skills is obsolete. Image models just obey the instructions for compositing panels without mixing them up, and you don't need to be an expert to get them to do it right. Another human value-add is gone. A wider set of tasks require no human expert. Now a new Guy notices LLMs have become useful in his field for the first time. He discovers they require SKILL to use CORRECTLY. He posts about how there will always be jobs for humans who are SKILLED in using LLMs like HIM. But it is not an infinite cycle. It is not the same each time it repeats. Now the Guy is a highly paid programmer or a career mathematician in 2026, instead of a graphic artist in 2023. In six months the models will no longer require his vaunted Skills. And by then there will be another Guy. But the process doesn't continue forever. The Guys are coming from fields that were harder and harder for AIs. The brief centaur eras are shorter and shorter. 
Today it is writers who are laughing at how bad the LLMs are at their job, and who will perhaps soon be posting about how it takes Skill to get an LLM to do their job Correctly. But the models are coming faster, and the eras of kinds of human value-add in each field are shortening. There is a point when you run out of Guys, either because the centaur eras are too short for people to develop SKILLs and post to Twitter about them; or because there are not lands left for AIs to conquer; or because ordinary people are not reassured by some Nobel laureate proclaiming there will always be jobs for Nobel laureates with the SKILLS to prompt robotized biology labs Correctly. But we'll never run out of amateur economists who assert entirely *without* a brief contemporary example that there will always be jobs for humans skilled at operating AIs! We'll run out of professional economists saying it when nobody is paid for that work anymore. I guess we'll also run out of amateur economists when they're dead.

Raymond Valdes@RaymondValdes5·21 May

@thsottiaux @ajambrosino This is good leadership. Credit the team.

Tibo@thsottiaux·20 May

I like to think that Codex is the work of a great team working together in unison. It's the most collaborative team I've gotten the chance to work with. But I don't think we could have done it without @ajambrosino's magic. He's been the driving force behind what makes the Codex app the app that everyone wants to emulate. And we are barely getting started.

Raymond Valdes@RaymondValdes5·21 May

@willdepue the accountant line is funny because boring domains might be the final boss, bro. math proof: hard problem, clean target. real work: messy context, bad handoffs, unclear ownership, 14 exceptions, and steve from finance saying “quick question” at 4:57pm bro

Raymond Valdes@RaymondValdes5·21 May

@sama Big milestone. Also feels like the next bottleneck becomes verification: not just whether the model found something, but how humans inspect it, trust it, and build on it without turning the process into theater.

Sam Altman@sama·20 May

a general-purpose model solved a major open problem in mathematics. we'll be saying this a lot over the coming years, but this is a kinda big milestone. i'm very excited for AI to greatly extend our understanding of the world, but still, i have complicated feelings today.

Timothy Gowers@wtgowers

If you are a mathematician, then you may want to make sure you are sitting down before reading further.

Raymond Valdes@RaymondValdes5·21 May

@CliftonSellers Same tool, different posture. Some people will use AI to avoid thinking. Some will use it to think sharper, test assumptions faster, and get through feedback loops with better judgment.

Clifton Sellers@CliftonSellers·20 May

I could be COMPLETELY WRONG, but I think AI is going to make a lot of people lose the ability to think and the will to work (basically make people dumb and lazy). That being said, I DO believe the future belongs to the ones who value taste, authenticity, and problem solving.

Raymond Valdes@RaymondValdes5·21 May

This is why “learning to use agents” is less about clever prompts and more about learning to package work. A good handoff has a clear outcome, the right context, boundaries, examples, tests, and a way to know when the agent should stop. If you can’t explain the work clearly, the agent won’t magically fix that.

Raymond Valdes@RaymondValdes5·21 May

Delegating is hard because it forces you to be clear. A vague task you keep in your own head can still kind of work because you’re carrying all the missing context. The moment you hand it to someone else, or to an agent, the gaps show up. What outcome do we want? What is in scope? What is not? What tradeoffs are allowed? When should they stop and ask? Delegation exposes the quality of your thinking.

Raymond Valdes@RaymondValdes5·21 May

Teach ICs to delegate to agents the same way you’d teach them to delegate work to another engineer. Start with a small task. Write down the expected outcome. Name the files or systems in scope. Say what not to touch. Ask for a plan before changes. Review the diff. Run the test. Then make the next task slightly bigger. It’s not prompt magic. It’s reps.
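
The checklist above can be sketched as a small data structure that flags what was left unsaid before a task goes to an agent. This is only an illustration, not a tool the thread describes: `Handoff`, its field names, and `missing_fields` are hypothetical names chosen here.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical record of one delegated task; all field names are assumptions."""
    outcome: str = ""                                       # expected outcome, written down
    in_scope: list[str] = field(default_factory=list)       # files or systems in scope
    do_not_touch: list[str] = field(default_factory=list)   # what the agent must leave alone
    plan_first: bool = True                                 # ask for a plan before changes
    test_command: str = ""                                  # test to run after reviewing the diff


def missing_fields(handoff: Handoff) -> list[str]:
    """Return the names of checklist items that were left empty."""
    gaps = []
    if not handoff.outcome.strip():
        gaps.append("outcome")
    if not handoff.in_scope:
        gaps.append("in_scope")
    if not handoff.do_not_touch:
        gaps.append("do_not_touch")
    if not handoff.test_command.strip():
        gaps.append("test_command")
    return gaps


# A task that exists mostly in the delegator's head.
vague = Handoff(outcome="Fix the login bug")
print(missing_fields(vague))  # ['in_scope', 'do_not_touch', 'test_command']
```

An empty result from `missing_fields` does not mean the handoff is good, only that none of the checklist items were skipped.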

Raymond Valdes@RaymondValdes5·21 May

Gemini Spark points at something interesting for roadmaps. If agents can run in the background, keep progress visible, and handle more of the execution layer, roadmapping becomes even more important. Not because teams need more planning. Because faster execution gives you more chances to learn, adjust, and point the work at what actually matters.

Raymond Valdes@RaymondValdes5·21 May

Google calling Gemini Spark a 24/7 agent is the right framing. The interesting part is not “chatbot gets smarter.” It’s work moving into the background: long-running tasks, progress updates, tool access, and agents that keep going after you close the laptop. That creates a new problem too. If agents are running while we’re not watching, the real product becomes trust: scope, permissions, traceability, and knowing exactly when the human needs to step back in.
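
One way to picture scope, permissions, traceability, and the human step-in point is a small policy gate in front of every agent action. The sketch below is an assumption-laden toy, not how Gemini Spark or any real product works: `AgentPolicy`, `decide`, and the example paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical policy gate; names and path-prefix matching are assumptions."""
    allowed_paths: tuple[str, ...]       # scope: where the agent may act on its own
    needs_approval: tuple[str, ...]      # where a human must step back in
    audit_log: list[str] = field(default_factory=list)  # traceability

    def decide(self, action: str, path: str) -> str:
        """Return 'allow', 'ask_human', or 'deny', and record the decision."""
        # Approval rules are checked first so they win over broader allow rules.
        if any(path.startswith(p) for p in self.needs_approval):
            decision = "ask_human"
        elif any(path.startswith(p) for p in self.allowed_paths):
            decision = "allow"
        else:
            decision = "deny"
        self.audit_log.append(f"{action} {path} -> {decision}")
        return decision


policy = AgentPolicy(allowed_paths=("src/",), needs_approval=("src/auth/",))
print(policy.decide("edit", "src/ui/button.py"))     # allow
print(policy.decide("edit", "src/auth/session.py"))  # ask_human
print(policy.decide("delete", "infra/prod.tf"))      # deny
```

The audit log is the part that matters when nobody is watching: every decision, including denials, leaves a line a human can inspect later.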

Raymond Valdes@RaymondValdes5·21 May

The more I use AI agents, the less I think the hard part is prompting. It feels a lot more like managing work. You have to give enough context, define the edges, check the output, and not pretend “it ran” means “it understood.” Still useful. Just not magic.

Raymond Valdes retweeted

Google Labs@GoogleLabs·19 May

Today, we introduced Gemini For Science, a collection of experimental tools designed to expand the scale and precision of scientific exploration. Included in Gemini for Science are three (!!!) brand new Google Labs experiments. Meet your new AI research partners: 🧵👇

Raymond Valdes@RaymondValdes5·19 May

@chamath The sync is valuable. But the chain can’t stop at requirements. Requirements need to trace back to use cases, business needs, constraints, and actual user value. Otherwise you just get beautifully synchronized drift.

Chamath Palihapitiya@chamath·19 May

A cool feature of 8090’s Software Factory is our ability to bind critical artifacts together: Requirements <—> Blueprints <—> Work Orders <—> Code <—> Tests / Evals. Make a change in any one place and our agent cascades it everywhere so artifacts are updated and stay in sync, minimizing drift.

8090@8090_Factory

Architecture docs die the moment code ships. Code-to-Blueprint sync now catches the drift. Every merged PR triggers an agent in Software Factory to compare the code to the architecture and leaves comments where they disagree. Learn more about Software Factory: 8090.ai

Raymond Valdes@RaymondValdes5·19 May

@49agents Yes, it’s a constant reminder because they’re not magic. We need to keep them grounded and keep them in check.

49 Agents IDE - IDE for Agentic Coding@49agents·19 May

@RaymondValdes5 this is the best description of coding agents ive seen. the amnesia is the real danger - it forgets context mid-task and starts refactoring random stuff. small tasks plus sharp constraints plus tests that bite. thats the only way to use them safely

Raymond Valdes@RaymondValdes5·19 May

Coding agents are genius interns with amnesia and write access. They move fast, misread one sentence, and somehow end up refactoring auth because “it seemed related.” Be kind. But use small tasks, sharp constraints, and tests that bite.

Raymond Valdes@RaymondValdes5·19 May

@jun_song Yes, this exactly. Then ask it to test the harness and scaffolding with increasingly difficult scenarios. It’ll find the gaps and push harder so it orchestrates better. Repeat.

Jun Song@jun_song·18 May

Orchestrating with GPT 5.5 is the way to properly utilize local LLMs. Local LLMs are developing rapidly, so until they reach the frontier level, this method is the most useful. Due to cybersecurity issues, open-source models will be equal to or even stronger than frontier models in 3 months.

Jun Song@jun_song

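One way to make good use of local LLMs: ask the Codex or Claude you are using to build a skill. “From now on you are the orchestrator, and the local LLM is a subordinate that follows your instructions. The local LLM is dumber than you, so your instructions to it must be very specific. From now on your job is to make the plan, write the instructions, and evaluate the work.” Now you can switch from a $200 subscription to a $20 subscription and still have enough usage quota.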
