Xiaofan Wu

394 posts

@xfanwu

AI Native engineer. Co-founder @Chance_vision, building agents that can (( see ))

Joined July 2009

267 Following · 75 Followers

Pinned Tweet
Xiaofan Wu@xfanwu·
I thought they could run my account better than me.😂
Xiaofan Wu@xfanwu·
Siemens just shipped an AI agent that writes PLC code for industrial automation.

PLC code runs factory floors. Bugs there don't cause a bad user experience; they cause injuries.

If even that industry is shipping AI coding agents, the 'AI isn't ready for production' argument is running out of places to hide.
Xiaofan Wu@xfanwu·
new Monte Carlo report: 64% of enterprises deployed AI agents before they felt ready.

among engineers specifically, the people keeping it running, that number is 75%.

we don't have an AI adoption problem. we have a 'ship it, fix it later' problem.
Xiaofan Wu@xfanwu·
the 'agent vs automation' debate misses the point.

a Zapier flow fails silently and you never know. an agent fails loudly, blames the wrong tool, and hallucinates a fix.

both are wrong. but only one is going to surprise you at 2am.
Xiaofan Wu@xfanwu·
running an agent in production isn't a model cost problem. it's a token throughput problem.

one user task: 3 tool calls, 2 retries, a context rebuild = 40k tokens. multiply by 1,000 users/day and you're not shipping a feature, you're running an LLM furnace.

nobody talks about this until the AWS bill arrives.
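the arithmetic above, sketched out. the per-call token counts and the $3 per 1M input tokens price are illustrative assumptions, not measured numbers:

```python
# Back-of-the-envelope token throughput math for one agent task.
# All per-call token counts below are assumed for illustration.
TOKENS_PER_TOOL_CALL = 8_000        # prompt + tool schema + response (assumed)
TOKENS_PER_RETRY = 5_000            # context re-sent on each retry (assumed)
TOKENS_PER_CONTEXT_REBUILD = 6_000  # assumed

tool_calls, retries, rebuilds = 3, 2, 1
tokens_per_task = (tool_calls * TOKENS_PER_TOOL_CALL
                   + retries * TOKENS_PER_RETRY
                   + rebuilds * TOKENS_PER_CONTEXT_REBUILD)
print(tokens_per_task)  # 40000 tokens for a single task

# Scale it up: 1,000 users/day at an illustrative $3 per 1M input tokens.
daily_tokens = tokens_per_task * 1_000
daily_cost = daily_tokens / 1_000_000 * 3.0
print(f"{daily_tokens:,} tokens/day, about ${daily_cost:.0f}/day")
```

and that's before output tokens, which usually cost several times more per token.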
Xiaofan Wu@xfanwu·
most 'AI agents' in production right now are just if-else trees with a GPT call in the middle.

I've seen 0M ARR products that are essentially: parse intent → run a lookup → format output.

not dunking on it; that's probably fine for their use case. but let's stop calling it agentic.
Xiaofan Wu@xfanwu·
debugging a regular app: add a breakpoint, step through, see exactly what went wrong.

debugging an agent: read 3000 tokens of LLM output, guess which decision was wrong, pray your next run replicates it.

we've built so much agent capability and almost zero agent observability.
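a minimal sketch of what agent observability could even mean: record every decision as a structured step you can filter and replay, instead of grepping raw model output. the class, step types, and fields here are all made up for illustration:

```python
# Minimal structured tracing for an agent loop (illustrative sketch).
# Each decision becomes a JSON-serializable record instead of free text.
import json
import time
import uuid

class AgentTrace:
    def __init__(self):
        self.run_id = str(uuid.uuid4())
        self.steps = []

    def record(self, step_type, **detail):
        """Append one structured step: an LLM call, tool call, retry, etc."""
        self.steps.append({
            "run_id": self.run_id,
            "ts": time.time(),
            "step": len(self.steps),
            "type": step_type,
            **detail,
        })

    def dump(self):
        return json.dumps(self.steps, indent=2)

# Hypothetical run: one model decision, one failed tool call, one retry.
trace = AgentTrace()
trace.record("llm_call", prompt_tokens=1200, decision="call search tool")
trace.record("tool_call", tool="search", args={"q": "order status"}, ok=False)
trace.record("retry", attempt=1, reason="tool timeout")
print(trace.dump())
```

now "which decision was wrong" is a query over steps, not a guess over a wall of tokens.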
Xiaofan Wu@xfanwu·
everyone benchmarks the model. nobody benchmarks the orchestration.

latency from tool call round-trips, retry logic, context rebuilding: that's where your agent goes from 'impressive demo' to 'too slow to ship.'

the model is rarely the bottleneck. the plumbing is.
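one way to benchmark the plumbing, sketched below. the sleep() calls are stand-ins for where real model inference, tool round-trips, and context rebuilds would go; the stage names are illustrative:

```python
# Per-stage latency accounting for an agent pipeline (sketch).
# Wrap each orchestration stage in a timer so you can see where
# the wall-clock time actually goes.
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)

@contextmanager
def stage(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start

# Stand-in workload: sleeps instead of real calls.
with stage("model_inference"):
    time.sleep(0.05)   # the part everyone benchmarks
with stage("tool_round_trips"):
    time.sleep(0.12)   # the part almost nobody does
with stage("context_rebuild"):
    time.sleep(0.08)

for name, secs in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {secs * 1000:6.0f} ms")
```

with real numbers plugged in, the sorted output tells you whether the model or the plumbing owns your latency budget.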
Xiaofan Wu@xfanwu·
agent memory is the part nobody warns you about.

you build the loop, the tools work, the evals look fine. then you realize the agent has no idea what it did 5 minutes ago.

RAG helps. summaries help. but it's still mostly duct tape tbh. I don't think anyone has actually solved this yet.
Xiaofan Wu@xfanwu·
evals for agents are still embarrassingly primitive.

you can unit test a function. you can't unit test 'did the agent make the right judgment call.'

we're shipping agent features with vibes-based QA and hoping prod doesn't explode.
Xiaofan Wu@xfanwu·
your job as an engineer isn't disappearing. it's changing shape.

used to be: write the code, own the logic. now: the agent writes the code. you review, redirect, catch the subtle wrong.

that second job is harder. you need deep enough knowledge to spot what looks right but isn't. the engineers who get lazy about understanding systems are the ones who get replaced.
Xiaofan Wu@xfanwu·
most 'AI agent frameworks' I've looked at are just if-else trees with an LLM call at the leaf node.

that's not an agent. that's a chatbot with extra steps.

actual agent behavior: the model decides the structure, not just the values.
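the distinction, sketched: in the loop below, the model's output chooses which step runs next, rather than the code fixing the sequence and the model only filling in values. the llm stub, tool names, and trajectory are all hypothetical:

```python
# Sketch of a loop where the model decides the structure.
# `llm` is a stub standing in for any model call; a real one would
# actually decide. This stub hard-codes one two-step trajectory.
def llm(prompt):
    if "lookup ->" in prompt:          # it has already seen a tool result
        return {"action": "finish", "args": {"answer": "found it"}}
    return {"action": "lookup", "args": {"id": 42}}

TOOLS = {"lookup": lambda id: f"record {id}"}

def agent_loop(task, max_steps=5):
    history = [f"task: {task}"]
    for _ in range(max_steps):
        decision = llm("\n".join(history))   # the model picks the next step
        if decision["action"] == "finish":
            return decision["args"]["answer"]
        result = TOOLS[decision["action"]](**decision["args"])
        history.append(f"{decision['action']} -> {result}")
    return None  # the loop, not the model, enforces the step budget

print(agent_loop("find order 42"))
```

in an if-else tree, the sequence lookup-then-finish would be written in the code. here the code only knows how to dispatch; the model owns the plan.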
Xiaofan Wu@xfanwu·
one agent is hard. three agents talking to each other is a different sport entirely.

most multi-agent failures aren't model failures. they're communication failures. who owns state? who retries? who breaks the loop?

we got this wrong twice before we got it right.
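one possible answer to those three questions, sketched: a single orchestrator owns the shared state, the handoffs, and the loop budget, and the worker agents stay stateless. the agent names and behavior below are stubs for illustration, not the setup described in the post:

```python
# Sketch: orchestrator owns state and breaks the loop; workers are stateless.
MAX_HOPS = 6  # the orchestrator, not any agent, decides when to stop

def researcher(state):
    # stateless worker stub: reads state, returns a delta plus a handoff
    return {"next": "writer", "notes": "3 sources found"}

def writer(state):
    return {"next": None, "draft": f"summary of {state['notes']}"}

AGENTS = {"researcher": researcher, "writer": writer}

def orchestrate(task):
    state = {"task": task}            # exactly one owner for shared state
    current, hops = "researcher", 0
    while current and hops < MAX_HOPS:
        out = AGENTS[current](state)
        # merge the worker's output into shared state, minus the routing key
        state.update({k: v for k, v in out.items() if k != "next"})
        current, hops = out["next"], hops + 1
    return state

print(orchestrate("agent handoff demo")["draft"])
```

the point of the shape: when agent B misbehaves, there is one place to look for what it was told and one counter that guarantees the conversation ends.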
Xiaofan Wu@xfanwu·
the question isn't 'agent or automation.' it's: what happens when it's wrong?

automation fails loudly. you fix the rule. an agent fails quietly. it made a judgment call, got 80% there, and you won't know until downstream.

that's not a reason to avoid agents. it's the whole engineering problem.
Xiaofan Wu@xfanwu·
Siemens just shipped an AI agent that generates PLC code for factory automation.

not "copilot for developers." actual industrial control systems.

the software engineer disruption story just got a lot bigger than web dev and CRUD apps.
Xiaofan Wu@xfanwu·
the gap between 'AI agent demo' and 'AI agent in production' is the largest gap I've seen in software engineering in 10 years.

demo: happy path, clean inputs, someone watching. production: weird edge cases, bad data, nobody watching.

most teams haven't crossed it yet.
Xiaofan Wu@xfanwu·
everyone's racing to stuff more context into agents. 1M tokens! 10M tokens!

in practice, agents that work in prod don't use 90% of what you give them. they need to know what to forget.

the real problem isn't context size. it's knowing what actually matters for the next decision.
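a toy sketch of 'knowing what matters for the next decision': score each memory item against the upcoming step and keep only a small budget. plain keyword overlap here is a stand-in for whatever relevance scorer you'd actually use; the memory items are made up:

```python
# Sketch: prune agent memory down to what the next decision needs.
# Keyword overlap is a deliberately dumb stand-in for a real scorer
# (embeddings, a reranker, etc.).
def relevance(item, next_step):
    return len(set(next_step.lower().split()) & set(item.lower().split()))

def prune_context(items, next_step, budget=2):
    ranked = sorted(items, key=lambda it: relevance(it, next_step), reverse=True)
    return ranked[:budget]   # forget the rest on purpose

memory = [
    "user asked about refund policy",
    "weather tool returned 18C in Berlin",
    "refund requires order id",
    "agent greeted the user",
]
print(prune_context(memory, "process refund"))
```

the interesting part is the budget: the agent works *because* 90% of the memory never reaches the prompt, not despite it.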
Xiaofan Wu@xfanwu·
everyone wants to hire an 'AI agent engineer' now.

but most job descriptions I see are just: call the OpenAI API, add retry logic, maybe a tool or two. that's not agent engineering. that's scripted automation with a better prompt.

real agent engineering is about state, memory, recovery, multi-step planning under uncertainty.

we're still early. the job title is ahead of the actual practice.
Xiaofan Wu@xfanwu·
we talk about agents like they're isolated tools. but we're building multi-agent systems now: one agent calling another calling another.

and nobody's asking: why should agent B trust what agent A told it?

agent-to-agent trust is the security problem nobody's thinking about yet.