Tao Dong

1.2K posts

@taodong

Human-Centered AI for Software Engineering @google. Formerly @flutterdev & @dart_lang. PhD in HCI from @umsi. Posts and replies represent personal views.

San Francisco Bay Area, USA · Joined March 2007

854 Following · 917 Followers

Tao Dong reposted

Teng Yan@tengyanAI·1d

something i've noticed: AI agents create a weird new kind of burnout. esp for young people.

a lot of ambitious 22 year olds are going to think the answer is simple:
- spin up more agents
- ship more code
- sleep less
- outwork everyone

and for a while, it will feel incredible. you can keep multiple agents running, feed them tasks, review outputs, fix mistakes, make decisions, and keep the whole loop moving.

the problem is that the work no longer drains you through typing. it drains you through judgment.

More attention. More context switching. More verification. More decisions per hour.

so instead of 8-10 normal productive hours, you might get 4-5 extremely intense hours before your brain is fully cooked. and you feel numb until you sleep properly and reset

some of my friends are already burnt out. they don't say it out loud but i can tell.

the agent can keep working 24/7. the human still has a hard limit

Tao Dong@taodong·2d

This is a nice article demonstrating the trend that interaction design is increasingly about designing the agent-computer interface and agent-to-agent collaboration patterns. Same high-level principles, different subjects.

Teddy Riker@teddy_riker

x.com/i/article/2047…

Tao Dong@taodong·9 Apr

@nlycskn App users are builders 😉

Nilay Coskun@nlycskn·9 Apr

My partner's building an MCP server for basically every app he uses 😂 Now he just orders lunch and browses Zillow through a custom chat app he built himself. He literally never opens the apps anymore. We’re not just in a new era of building, we’re in a new era of app users 😁

Tao Dong@taodong·22 Mar

Great article! I agree that manual error analysis is so often underappreciated. “If you are not willing to look at some data manually on a regular cadence you are wasting your time with evals.”

George from 🕹prodmgmt.world@nurijanian

x.com/i/article/2034…

Tao Dong@taodong·15 Mar

Finally had time to check out this course. It includes a nice explanation of the relationship between skills and MCPs and how these technologies can work together to solve real-world problems using agents. Highly recommend!

Andrew Ng@AndrewYNg

Important new course: Agent Skills with Anthropic, built with @AnthropicAI and taught by @eschoppik!

Skills are constructed as folders of instructions that equip agents with on-demand knowledge and workflows. This short course teaches you how to create them following best practices. Because skills follow an open standard format, you can build them once and deploy across any skills-compatible agent, like Claude Code.

What you'll learn:
- Create custom skills for code generation and review, data analysis, and research
- Build complex workflows using Anthropic's pre-built skills (Excel, PowerPoint, skill creation) and custom skills
- Combine skills with MCP and subagents to create agentic systems with specialized knowledge
- Deploy the same skills across Claude.ai, Claude Code, the Claude API, and the Claude Agent SDK

Join and learn to equip agents with the specialized knowledge they need for reliable, repeatable workflows. deeplearning.ai/short-courses/…

Tao Dong@taodong·15 Mar

Nice article on evaluating agent skills by @tessl_io. +1 to the point that providing public visibility of skill quality for creators and users is critical to ensure the health of the skill ecosystem. tessl.io/blog/anthropic…

Tao Dong@taodong·11 Mar

Great primer on agent harnesses.

Viv@Vtrivedy10

x.com/i/article/2031…

Tao Dong@taodong·2 Mar

Agent engineering is like parenting. Adapt your approaches as the underlying model grows. “But even then we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal. But as models improved, they not only did not need to be reminded of the Todo List but could find it limiting.”

Thariq@trq212

x.com/i/article/2027…

Tao Dong@taodong·1 Mar

“You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.” So beautifully said.

Thariq@trq212

x.com/i/article/2027…

Tao Dong@taodong·21 Feb

@ZhiruoW @FariaHuqOaishi Nice work!

Zora Wang@ZhiruoW·20 Feb

Most agents either run fully autonomously or interrupt at the wrong times. What if agents know when YOU want to step in?

🚀 Introducing PlowPilot - a web agent that adapts to your interaction patterns, achieving +26.5% user-reported usefulness.

Huge credit to @FariaHuqOaishi for leading this project!

Tao Dong@taodong·18 Feb

@sethladd Congrats on shipping!

Seth Ladd@sethladd·18 Feb

One of our most requested features!

NotebookLM@NotebookLM

Because you wouldn’t let it slide… these are rolling out today for our most requested feature: Prompt-Based Revisions: Tweak, tailor, and tune your slides just by prompting the revisions you want PPTX Support: You can now export your Slide Decks (Google Slides coming next!)

Tao Dong@taodong·15 Feb

@OfficialLoganK Well, it’s technically still there but few users will likely see that section at the far bottom. The simplicity of the new design is probably great at letting returning users jump right into actions but it’s at the expense of inspiring first-timers.

Logan Kilpatrick@OfficialLoganK·15 Feb

@taodong still there at the bottom, just trying not to overload people

Logan Kilpatrick@OfficialLoganK·15 Feb

Experimenting with new AI Studio vibe coding start screens today, New vs Old. What do you think?

Tao Dong@taodong·10 Feb

AI is now "managing up" its human supervisors.

Thariq@trq212

We've added a new command to Claude Code called /insights When you run it, Claude Code will read your message history from the past month. It'll summarize your projects, how you use Claude Code, and give suggestions on how to improve your workflow.

Tao Dong@taodong·9 Feb

Gemini's browser use feature within Chrome is very handy when I have to deal with user-unfriendly UIs. It just helped me find a "Rename" button that was hidden while the file was running 🤷🏻‍♂️.

Tao Dong@taodong·20 Jan

Vibe-coding interactive demos to learn Reinforcement Learning in @GoogleAIStudio was fun.

Tao Dong@taodong·10 Jan

@sethladd Congrats, Seth! Labs is lucky to have you!

Seth Ladd@sethladd·9 Jan

Some bittersweet personal news: I’m transitioning to a new adventure within Google joining the Google Labs team.🧪 This means I’ll be stepping back from my active role in the Flutter community here. It’s been a privilege serving you all. I'm now a Flutter user, and I'm cheering on the team and all of you from the sidelines. Thanks for everything, Flutter fam! Keep shipping!💙

Tao Dong@taodong·1 Jan

TL;DR: Show agency and behave like you were already hired.

Ben Lang@benln

x.com/i/article/2006…

Tao Dong@taodong·30 Dec

Sidenote: I also learned about a companion tool called Petri from the post. While Bloom measures, Petri is for exploration. You use it first to broadly audit a model and find weird behaviors. Once you identify a trait, you move to Bloom to quantify it. anthropic.com/research/petri…

Tao Dong@taodong·30 Dec

My takeaway: Bloom scales evaluation by turning anecdotes into statistics. However, it remains a simulation. Use it to quantify trends, but do not treat it as a substitute for real-world testing.

Tao Dong@taodong·30 Dec

I just finished reading up on Bloom, a new open-source framework for automating behavioral evaluations of AI models. It proposes a way to scale up how we measure alignment, but relying on simulations raises some interesting questions. Here are my notes 👇

Anthropic@AnthropicAI

We’re releasing Bloom, an open-source tool for generating behavioral misalignment evals for frontier AI models. Bloom lets researchers specify a behavior and then quantify its frequency and severity across automatically generated scenarios. Learn more: anthropic.com/research/bloom
