Henry Mao

6.5K posts

@Calclavia

Co-founder/CEO @ https://t.co/jUkhv9ulAd @SmitheryDotAI, backed by @southpkcommons; Prev. co-founded https://t.co/PrvwcJieE7 (exited)

Singapore · Joined May 2009
594 Following · 6.6K Followers
Pinned Tweet
Henry Mao @Calclavia
The future of the internet will be dominated by tool calls, not clicks. We're building Smithery to orchestrate this new era of AI-native services for AI agents. Read our mission 👇
[attached image]
11 replies · 14 reposts · 116 likes · 27.1K views
Linda Chen @linderps
how to find a gf in sf:
- dress nicely. no tech bro logos. style your hair, wear a nicely fitted tee, go to the gym
- go where she goes. farmer's markets, coffee shops, dinner parties. get a sexy hobby
- say hi. compliment her style, her bag, her hair. find out what she's doing there. don't bring up AI or agents

lastly, don't give up. you don't find her by waiting, you find her by trying. good luck 🫶
Don @donatelli2026

question for the tech bros: what's the best way to find a girlfriend in San Francisco?

98 replies · 15 reposts · 838 likes · 403.4K views
Henry Mao @Calclavia
@kunchenxyz This is a valid criticism. I did a simple transform to control for variability in design; otherwise it would be too hard to compare. Ultimately, given the minor difference in performance, I think a well-crafted CLI/MCP will triumph over the other.
0 replies · 0 reposts · 1 like · 17 views
Kun Chen @kunchenxyz
Great analysis. However the approach of using heuristics to transform REST APIs to CLI and MCP makes this comparison not representative of the true debate. A great MCP server and agent-first CLI should not be made through a generic transformation from REST. They need to be purposefully designed, just like how much time people spend on designing a great user interface for our phones and apps. I know the article touched on the “interface design” element and dismissed it, but I think that’s part of the debate, and since the analysis is benchmarking token efficiency and success rate and trying to settle the debate with today’s models, it can’t ignore how much impact the interface design can have on the results. There’s a massive difference between a CLI / MCP that’s well optimized for agent ergonomics vs a wrapper of REST APIs, and I hope to see an analysis that truly covers that.
1 reply · 0 reposts · 0 likes · 35 views
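The thread contrasts purpose-built agent tools with mechanical REST wrappers. As a hedged illustration of what the "generic transformation" side of the debate looks like, here is a minimal Python sketch that mechanically maps REST operations onto CLI subcommands; the endpoint names, paths, and fields are hypothetical, not taken from the analysis under discussion.

```python
# Hypothetical sketch of a heuristic REST -> CLI transform. Every operation
# becomes a subcommand; every {path} parameter becomes a required flag.
import argparse

# Toy REST spec: method + path per operation (illustrative only).
REST_SPEC = {
    "list_users": {"method": "GET", "path": "/users"},
    "get_user": {"method": "GET", "path": "/users/{id}"},
    "create_user": {"method": "POST", "path": "/users"},
}

def build_cli(spec):
    """Mechanically turn each REST operation into a CLI subcommand."""
    parser = argparse.ArgumentParser(prog="api")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, op in spec.items():
        p = sub.add_parser(name.replace("_", "-"),
                           help=f"{op['method']} {op['path']}")
        # Path parameters like {id} become required flags.
        for part in op["path"].split("/"):
            if part.startswith("{") and part.endswith("}"):
                p.add_argument(f"--{part[1:-1]}", required=True)
    return parser

parser = build_cli(REST_SPEC)
args = parser.parse_args(["get-user", "--id", "42"])
print(args.command, args.id)
```

A purpose-built CLI would instead choose verbs, defaults, and output formats for agent ergonomics; the gap between that and this mechanical mapping is exactly what the thread argues a generic transform cannot close.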
Arvind Jain @jainarvind
MCP isn’t dead – it was just pointed at the wrong problem. On your laptop, CLIs and ad-hoc wrappers win. But at company scale, you need central auth, shared telemetry, and one integration surface for every AI host, and that’s exactly where remote MCP servers start to shine.
Tony Gentilcore @tonygentilcore

x.com/i/article/2033…

8 replies · 7 reposts · 83 likes · 18.7K views
Rhys @RhysSullivan
skills is still not sitting right with me as a concept. i think it's because companies rushed to them as the next big thing, as happens with all ai things now. everyone is shipping their docs as skills, but it's recreating all the issues (authority, up-to-dateness) that docs solved
72 replies · 7 reposts · 263 likes · 29K views
Henry Mao @Calclavia
@brianchew We might start making flame merch if enough people want it 🤔
1 reply · 0 reposts · 1 like · 12 views
Brian Chew @brianchew
@Calclavia props to the design folks who did the OG design... the mascot is really neat! look at how cute it is
[attached image]
1 reply · 0 reposts · 1 like · 87 views
Henry Mao @Calclavia
@zeeg Codebases become hard to understand when you move too fast with LLMs. What helps is trying to move more slowly despite having fast tools. Takes discipline.
0 replies · 0 reposts · 0 likes · 100 views
David Cramer @zeeg
I'm fully convinced that LLMs are not an actual net productivity boost (today). They remove the barrier to get started, but they create increasingly complex software which does not appear to be maintainable. So far, in my situations, they appear to slow down long-term velocity.
467 replies · 227 reposts · 3.5K likes · 657.2K views
Henry Mao @Calclavia
Small feature launch of the day: you can now upload and host your Skills on Smithery instead of being limited to GitHub URLs.
[attached image]
2 replies · 3 reposts · 12 likes · 781 views
Sophia Xu @thesophiaxu
tool i've been building: it OCRs my screen in the background (like Rewind), hierarchically summarizes into a timeline, then makes it available via a local api. then i just point claude code at it and ask it to identify my inefficient workflows, and it found a bunch
9 replies · 4 reposts · 109 likes · 15.2K views
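The pipeline Sophia describes (OCR events rolled up into a hierarchical timeline that an agent can query) can be sketched minimally. This is a hypothetical illustration under stated assumptions: the screen-OCR capture and the local HTTP API are omitted, and `naive_summary` is a placeholder standing in for the LLM summarization step; all names are invented.

```python
# Hypothetical sketch: roll raw (timestamp, OCR text) events up into
# per-hour summaries, i.e. the first level of a hierarchical timeline.
from collections import defaultdict
from datetime import datetime

def naive_summary(texts):
    # Placeholder for an LLM summarization call: keep the first snippet
    # (truncated) plus a count of the remaining events in the bucket.
    if len(texts) > 1:
        return f"{texts[0][:60]} (+{len(texts) - 1} more events)"
    return texts[0][:60]

def build_timeline(events):
    """events: list of (iso_timestamp, ocr_text) -> {hour: summary}."""
    by_hour = defaultdict(list)
    for ts, text in events:
        hour = datetime.fromisoformat(ts).strftime("%Y-%m-%d %H:00")
        by_hour[hour].append(text)
    return {hour: naive_summary(texts) for hour, texts in sorted(by_hour.items())}

events = [
    ("2025-01-06T09:05:00", "Reading pull request #1204 in browser"),
    ("2025-01-06T09:40:00", "Editing server.py in VS Code"),
    ("2025-01-06T10:15:00", "Slack thread about deploy failure"),
]
for hour, summary in build_timeline(events).items():
    print(hour, "-", summary)
```

The same roll-up applied again over hour summaries would give day summaries, which is what makes the timeline cheap for an agent to scan top-down.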
Henry Mao @Calclavia
Annual plans for AI subscriptions are now a nonstarter. You have to be ready to hop to the best tool of the month.
0 replies · 0 reposts · 2 likes · 276 views
Henry Mao @Calclavia
Is it just me, or are CAPTCHAs hard enough now that I'm starting to fail more?
3 replies · 0 reposts · 2 likes · 464 views
Henry Mao @Calclavia
@oscrhong As you're executing a long-running task and you notice it going in the wrong direction, you can point it the right way!
1 reply · 0 reposts · 1 like · 22 views
Henry Mao @Calclavia
Steering on Codex is an underrated feature
2 replies · 0 reposts · 9 likes · 510 views
Yoko @stuffyokodraws
You open @openclaw today thinking you will be vibe automating and vibe coding. 10 hrs later you are just vibe copy-pasting API keys and creating an 11th account for Openclaw. Can someone solve this integration problem for agents
143 replies · 6 reposts · 381 likes · 55K views
Henry Mao @Calclavia
@karpathy Agents today are productive, not yet creative
0 replies · 0 reposts · 0 likes · 111 views
Andrej Karpathy @karpathy
I had the same thought so I've been playing with it in nanochat. E.g. here's 8 agents (4 claude, 4 codex), with 1 GPU each running nanochat experiments (trying to delete logit softcap without regression). The TLDR is that it doesn't work and it's a mess... but it's still very pretty to look at :)

I tried a few setups: 8 independent solo researchers, 1 chief scientist giving work to 8 junior researchers, etc. Each research program is a git branch, each scientist forks it into a feature branch, git worktrees for isolation, simple files for comms, skip Docker/VMs for simplicity atm (I find that instructions are enough to prevent interference). The research org runs in tmux window grids of interactive sessions (like Teams) so that it's pretty to look at, you can see their individual work, and "take over" if needed, i.e. no -p.

But ok, the reason it doesn't work so far is that the agents' ideas are just pretty bad out of the box, even at the highest intelligence. They don't think carefully through experiment design, they run somewhat nonsensical variations, they don't create strong baselines and ablate things properly, and they don't carefully control for runtime or flops. (Just as an example, an agent yesterday "discovered" that increasing the hidden size of the network improves the validation loss, which is a totally spurious result given that a bigger network will have a lower validation loss in the infinite data regime, but then it also trains for a lot longer; it's not clear why I had to come in to point that out.) They are very good at implementing any given well-scoped and described idea, but they don't creatively generate them.

But the goal is that you are now programming an organization (e.g. a "research org") and its individual agents, so the "source code" is the collection of prompts, skills, tools, etc. and processes that make it up. E.g. a daily standup in the morning is now part of the "org code".

And optimizing nanochat pretraining is just one of many tasks (almost like an eval). Then, given an arbitrary task, how quickly does your research org generate progress on it?
Thomas Wolf @Thom_Wolf

How come the NanoGPT speedrun challenge is not fully AI automated research by now?

561 replies · 802 reposts · 8.7K likes · 1.6M views
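One piece of Karpathy's setup, "simple files for comms", can be sketched concretely. This is a hypothetical illustration of the "org code" idea, assuming a file-per-agent inbox layout that is invented here, not his actual implementation:

```python
# Hypothetical sketch of file-based comms for an agent org: a chief
# scientist assigns work by writing TASK.md into each agent's directory,
# and the daily standup reads back whatever STATUS.md each agent wrote.
import pathlib
import tempfile

def assign_tasks(root, tasks):
    """Chief scientist: write one TASK.md per agent."""
    for agent, task in tasks.items():
        inbox = pathlib.Path(root) / agent
        inbox.mkdir(parents=True, exist_ok=True)
        (inbox / "TASK.md").write_text(task + "\n")

def standup(root):
    """Collect each agent's STATUS.md, if written, for the daily standup."""
    reports = {}
    for inbox in sorted(pathlib.Path(root).iterdir()):
        status = inbox / "STATUS.md"
        reports[inbox.name] = status.read_text().strip() if status.exists() else "(no report)"
    return reports

with tempfile.TemporaryDirectory() as root:
    assign_tasks(root, {"claude-1": "ablate logit softcap", "codex-1": "build baseline"})
    # An agent posts a status update by writing its own file.
    (pathlib.Path(root) / "claude-1" / "STATUS.md").write_text("softcap removal regresses val loss\n")
    print(standup(root))
```

The point of the tweet is that this loop (who writes what file, when, reviewed by whom) is itself the program you iterate on, alongside the agents' prompts and tools.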