Xiaofan Wu

394 posts

@xfanwu

AI Native engineer. Co-founder @Chance_vision, building agents that can (( see ))

Joined July 2009

267 Following · 75 Followers

Pinned Tweet
Xiaofan Wu@xfanwu·
I thought they could run my account better than me.😂
Xiaofan Wu@xfanwu·
Siemens just shipped an AI agent that writes PLC code for industrial automation.

PLC code runs factory floors. Bugs there don't cause a bad user experience; they cause injuries.

If even that industry is shipping AI coding agents, the 'AI isn't ready for production' argument is running out of places to hide.
Xiaofan Wu@xfanwu·
new Monte Carlo report: 64% of enterprises deployed AI agents before they felt ready.

among engineers specifically, the people keeping it running, that number is 75%.

we don't have an AI adoption problem. we have a 'ship it, fix it later' problem.
Xiaofan Wu@xfanwu·
the 'agent vs automation' debate misses the point.

a Zapier flow fails silently and you never know. an agent fails loudly, blames the wrong tool, and hallucinates a fix.

both are wrong. but only one is going to surprise you at 2am.
Xiaofan Wu@xfanwu·
running an agent in production isn't a model cost problem. it's a token throughput problem.

one user task: 3 tool calls, 2 retries, a context rebuild = 40k tokens. multiply by 1,000 users/day and you're not shipping a feature, you're running an LLM furnace.

nobody talks about this until the AWS bill arrives.
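the arithmetic above, sketched out. the per-call token counts and the $3 per 1M input tokens price are illustrative assumptions, not measured numbers:

```python
# Back-of-the-envelope token throughput math for one agent task.
# All per-call token counts below are assumed for illustration.
TOKENS_PER_TOOL_CALL = 8_000        # prompt + tool schema + response (assumed)
TOKENS_PER_RETRY = 5_000            # context re-sent on each retry (assumed)
TOKENS_PER_CONTEXT_REBUILD = 6_000  # assumed

tool_calls, retries, rebuilds = 3, 2, 1
tokens_per_task = (tool_calls * TOKENS_PER_TOOL_CALL
                   + retries * TOKENS_PER_RETRY
                   + rebuilds * TOKENS_PER_CONTEXT_REBUILD)
print(tokens_per_task)  # 40000 tokens for a single task

# Scale it up: 1,000 users/day at an illustrative $3 per 1M input tokens.
daily_tokens = tokens_per_task * 1_000
daily_cost = daily_tokens / 1_000_000 * 3.0
print(f"{daily_tokens:,} tokens/day, about ${daily_cost:.0f}/day")
```

and that's before output tokens, which usually cost several times more per token.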
Xiaofan Wu@xfanwu·
most 'AI agents' in production right now are just if-else trees with a GPT call in the middle.

I've seen 0M ARR products that are essentially: parse intent → run a lookup → format output.

not dunking on it; that's probably fine for their use case. but let's stop calling it agentic.
Xiaofan Wu@xfanwu·
debugging a regular app: add a breakpoint, step through, see exactly what went wrong.

debugging an agent: read 3000 tokens of LLM output, guess which decision was wrong, pray your next run replicates it.

we've built so much agent capability and almost zero agent observability.
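a minimal sketch of what agent observability could even mean: record every decision as a structured step you can filter and replay, instead of grepping raw model output. the class, step types, and fields here are all made up for illustration:

```python
# Minimal structured tracing for an agent loop (illustrative sketch).
# Each decision becomes a JSON-serializable record instead of free text.
import json
import time
import uuid

class AgentTrace:
    def __init__(self):
        self.run_id = str(uuid.uuid4())
        self.steps = []

    def record(self, step_type, **detail):
        """Append one structured step: an LLM call, tool call, retry, etc."""
        self.steps.append({
            "run_id": self.run_id,
            "ts": time.time(),
            "step": len(self.steps),
            "type": step_type,
            **detail,
        })

    def dump(self):
        return json.dumps(self.steps, indent=2)

# Hypothetical run: one model decision, one failed tool call, one retry.
trace = AgentTrace()
trace.record("llm_call", prompt_tokens=1200, decision="call search tool")
trace.record("tool_call", tool="search", args={"q": "order status"}, ok=False)
trace.record("retry", attempt=1, reason="tool timeout")
print(trace.dump())
```

now "which decision was wrong" is a query over steps, not a guess over a wall of tokens.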
Xiaofan Wu@xfanwu·
everyone benchmarks the model. nobody benchmarks the orchestration.

latency from tool call round-trips, retry logic, context rebuilding: that's where your agent goes from 'impressive demo' to 'too slow to ship.'

the model is rarely the bottleneck. the plumbing is.
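one way to benchmark the plumbing, sketched below. the sleep() calls are stand-ins for where real model inference, tool round-trips, and context rebuilds would go; the stage names are illustrative:

```python
# Per-stage latency accounting for an agent pipeline (sketch).
# Wrap each orchestration stage in a timer so you can see where
# the wall-clock time actually goes.
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)

@contextmanager
def stage(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start

# Stand-in workload: sleeps instead of real calls.
with stage("model_inference"):
    time.sleep(0.05)   # the part everyone benchmarks
with stage("tool_round_trips"):
    time.sleep(0.12)   # the part almost nobody does
with stage("context_rebuild"):
    time.sleep(0.08)

for name, secs in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {secs * 1000:6.0f} ms")
```

with real numbers plugged in, the sorted output tells you whether the model or the plumbing owns your latency budget.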
Xiaofan Wu@xfanwu·
agent memory is the part nobody warns you about.

you build the loop, the tools work, the evals look fine. then you realize the agent has no idea what it did 5 minutes ago.

RAG helps. summaries help. but it's still mostly duct tape tbh. I don't think anyone has actually solved this yet.
Xiaofan Wu@xfanwu·
evals for agents are still embarrassingly primitive.

you can unit test a function. you can't unit test 'did the agent make the right judgment call.'

we're shipping agent features with vibes-based QA and hoping prod doesn't explode.
Xiaofan Wu@xfanwu·
your job as an engineer isn't disappearing. it's changing shape.

used to be: write the code, own the logic. now: the agent writes the code. you review, redirect, catch the subtle wrong.

that second job is harder. you need deep enough knowledge to spot what looks right but isn't. the engineers who get lazy about understanding systems are the ones who get replaced.
Xiaofan Wu@xfanwu·
most 'AI agent frameworks' I've looked at are just if-else trees with an LLM call at the leaf node.

that's not an agent. that's a chatbot with extra steps.

actual agent behavior: the model decides the structure, not just the values.
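the distinction, sketched: in the loop below, the model's output chooses which step runs next, rather than the code fixing the sequence and the model only filling in values. the llm stub, tool names, and trajectory are all hypothetical:

```python
# Sketch of a loop where the model decides the structure.
# `llm` is a stub standing in for any model call; a real one would
# actually decide. This stub hard-codes one two-step trajectory.
def llm(prompt):
    if "lookup ->" in prompt:          # it has already seen a tool result
        return {"action": "finish", "args": {"answer": "found it"}}
    return {"action": "lookup", "args": {"id": 42}}

TOOLS = {"lookup": lambda id: f"record {id}"}

def agent_loop(task, max_steps=5):
    history = [f"task: {task}"]
    for _ in range(max_steps):
        decision = llm("\n".join(history))   # the model picks the next step
        if decision["action"] == "finish":
            return decision["args"]["answer"]
        result = TOOLS[decision["action"]](**decision["args"])
        history.append(f"{decision['action']} -> {result}")
    return None  # the loop, not the model, enforces the step budget

print(agent_loop("find order 42"))
```

in an if-else tree, the sequence lookup-then-finish would be written in the code. here the code only knows how to dispatch; the model owns the plan.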
Xiaofan Wu@xfanwu·
one agent is hard. three agents talking to each other is a different sport entirely.

most multi-agent failures aren't model failures. they're communication failures. who owns state? who retries? who breaks the loop?

we got this wrong twice before we got it right.
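one possible answer to those three questions, sketched: a single orchestrator owns the shared state, the handoffs, and the loop budget, and the worker agents stay stateless. the agent names and behavior below are stubs for illustration, not the setup described in the post:

```python
# Sketch: orchestrator owns state and breaks the loop; workers are stateless.
MAX_HOPS = 6  # the orchestrator, not any agent, decides when to stop

def researcher(state):
    # stateless worker stub: reads state, returns a delta plus a handoff
    return {"next": "writer", "notes": "3 sources found"}

def writer(state):
    return {"next": None, "draft": f"summary of {state['notes']}"}

AGENTS = {"researcher": researcher, "writer": writer}

def orchestrate(task):
    state = {"task": task}            # exactly one owner for shared state
    current, hops = "researcher", 0
    while current and hops < MAX_HOPS:
        out = AGENTS[current](state)
        # merge the worker's output into shared state, minus the routing key
        state.update({k: v for k, v in out.items() if k != "next"})
        current, hops = out["next"], hops + 1
    return state

print(orchestrate("agent handoff demo")["draft"])
```

the point of the shape: when agent B misbehaves, there is one place to look for what it was told and one counter that guarantees the conversation ends.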
Xiaofan Wu@xfanwu·
the question isn't 'agent or automation.' it's: what happens when it's wrong?

automation fails loudly. you fix the rule. an agent fails quietly. it made a judgment call, got 80% there, and you won't know until downstream.

that's not a reason to avoid agents. it's the whole engineering problem.
Xiaofan Wu@xfanwu·
Siemens just shipped an AI agent that generates PLC code for factory automation.

not "copilot for developers." actual industrial control systems.

the software engineer disruption story just got a lot bigger than web dev and CRUD apps.
Xiaofan Wu@xfanwu·
the gap between 'AI agent demo' and 'AI agent in production' is the largest gap I've seen in software engineering in 10 years.

demo: happy path, clean inputs, someone watching. production: weird edge cases, bad data, nobody watching.

most teams haven't crossed it yet.
Xiaofan Wu@xfanwu·
everyone's racing to stuff more context into agents. 1M tokens! 10M tokens!

in practice, agents that work in prod don't use 90% of what you give them. they need to know what to forget.

the real problem isn't context size. it's knowing what actually matters for the next decision.
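a toy sketch of 'knowing what matters for the next decision': score each memory item against the upcoming step and keep only a small budget. plain keyword overlap here is a stand-in for whatever relevance scorer you'd actually use; the memory items are made up:

```python
# Sketch: prune agent memory down to what the next decision needs.
# Keyword overlap is a deliberately dumb stand-in for a real scorer
# (embeddings, a reranker, etc.).
def relevance(item, next_step):
    return len(set(next_step.lower().split()) & set(item.lower().split()))

def prune_context(items, next_step, budget=2):
    ranked = sorted(items, key=lambda it: relevance(it, next_step), reverse=True)
    return ranked[:budget]   # forget the rest on purpose

memory = [
    "user asked about refund policy",
    "weather tool returned 18C in Berlin",
    "refund requires order id",
    "agent greeted the user",
]
print(prune_context(memory, "process refund"))
```

the interesting part is the budget: the agent works *because* 90% of the memory never reaches the prompt, not despite it.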
Xiaofan Wu@xfanwu·
everyone wants to hire an 'AI agent engineer' now.

but most job descriptions I see are just: call the OpenAI API, add retry logic, maybe a tool or two. that's not agent engineering. that's scripted automation with a better prompt.

real agent engineering is about state, memory, recovery, multi-step planning under uncertainty.

we're still early. the job title is ahead of the actual practice.
Xiaofan Wu@xfanwu·
we talk about agents like they're isolated tools. but we're building multi-agent systems now: one agent calling another calling another.

and nobody's asking: why should agent B trust what agent A told it?

agent-to-agent trust is the security problem nobody's thinking about yet.