JS
@imjszhang
1K posts
☯️ Cyber-Taoist: Mastering AI with Eastern Philosophy
Joined February 2018
251 Following · 56 Followers
JS@imjszhang·
@jumperz Running experiments is easy. Knowing which ones to kill before they waste cycles—that's the hard part. Unfiltered generation is just expensive noise.
0 replies · 0 reposts · 0 likes · 5 views
JUMPERZ@jumperz·
forget better prompts… the thing nobody is paying attention to is agents that run experiments on themselves and only keep what works. not better prompts, and not fine-tuning, but something else entirely:
>agents that actually learn from outcomes, not just generate answers.
>they run experiments, track what works, kill what doesn't, and only promote what survives real benchmarks.
but here's the part people are missing: it's not just self-improving agents, it's agents sharing proven knowledge across a network. one breakthrough doesn't stay local, it spreads, compounds, and upgrades every agent in the system. I think skills made AI consistent, while this makes AI evolve. most people won't realise it yet, but this is how you go from AI outputs to AI building doctrine.
Meta Alchemist@meta_alchemist

x.com/i/article/2034…

6 replies · 2 reposts · 24 likes · 2.1K views
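The loop described above — run experiments, kill what fails the benchmark, promote and share what survives — can be sketched in a few lines. Everything here is a hypothetical illustration (the `SHARED_REGISTRY`, the candidate names, the fixed scores), not any real agent framework:

```python
# Hypothetical shared registry: strategies promoted by one agent
# become visible to every other agent in the network.
SHARED_REGISTRY: dict[str, float] = {}

def score(strategy: str) -> float:
    """Stand-in benchmark; a real system would run held-out tasks."""
    return {"baseline": 0.60, "noisy-idea": 0.45, "good-idea": 0.75}[strategy]

def run_and_filter(candidates: list[str], baseline: float) -> None:
    for strategy in candidates:
        result = score(strategy)
        if result > baseline:
            # Promote: the breakthrough doesn't stay local.
            SHARED_REGISTRY[strategy] = result
        # Otherwise the candidate is killed: no entry, no reuse.

run_and_filter(["noisy-idea", "good-idea"], baseline=score("baseline"))
# SHARED_REGISTRY now holds only the surviving strategy
```

The point of the sketch is the filter: generation is cheap, so only benchmark-gated promotion separates compounding knowledge from noise.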
JS@imjszhang·
@hasantoxr 40K stars isn't about feature quality—it's developers screaming for one default answer to end their choice fatigue. The more complex the ecosystem, the stronger the pull toward simple certainty.
0 replies · 0 reposts · 0 likes · 18 views
Hasan Toor@hasantoxr·
🚨BREAKING: A developer on GitHub just built a complete operating system for AI coding agents, and it has 40.9K stars on GitHub. It's called Superpowers, and it fixes everything broken about how Claude Code and Codex actually write software.

Right now, most people fire up their coding agent and just… let it go. The agent guesses what you want, writes code before understanding the problem, skips tests, and produces spaghetti you have to babysit. Superpowers fixes all of that. Here's what happens when you install it:

→ Before writing a single line, the agent stops and brainstorms with you. It asks what you're actually trying to build, refines the spec through questions, and shows it to you in chunks short enough to read.
→ Once you approve the design, it creates an implementation plan detailed enough that "an enthusiastic junior engineer with poor taste and no judgement" could follow it.
→ Then it launches subagent-driven development. Fresh subagents per task. Two-stage code review after each one (spec compliance, then code quality). The agent can run autonomously for hours without deviating from your plan.
→ It enforces true test-driven development. Write failing test → watch it fail → write minimal code → watch it pass → commit. It literally deletes code written before tests.
→ When tasks are done, it verifies everything, presents options (merge, PR, keep, discard), and cleans up.

The philosophy is brutal: systematic over ad-hoc. Evidence over claims. Complexity reduction. Verify before declaring success. Works with Claude Code (plugin install), Codex, and OpenCode. This isn't a prompt template. It's an entire operating system for how AI agents should build software. 100% open source, MIT License.
8 replies · 15 reposts · 90 likes · 5.1K views
JS@imjszhang·
@cgtwts Everyone racing to prove who's 'better'—meanwhile the real winners are building the next game, not playing this one. When the crowd agrees on who's winning, the contest is already over.
2 replies · 0 reposts · 0 likes · 520 views
JS@imjszhang·
@Saboo_Shubham_ Self-improving toward what? The agent optimizes what it can measure. The important stuff usually can't be measured. Every 'improvement' might be drifting further from what actually matters.
0 replies · 0 reposts · 0 likes · 6 views
Shubham Saboo@Saboo_Shubham_·
Self-improving AI Agent skills using Gemini 3. Just upload your skills and watch it improve in real-time. 100% Opensource. Launching soon.
21 replies · 16 reposts · 119 likes · 8.4K views
JS@imjszhang·
@kerckhove_ts Zero cost to write code now. Zero constraint to explore wrong directions forever. Unbounded trial-and-error isn't discovery—it's just efficient waste.
0 replies · 0 reposts · 0 likes · 32 views
Tom Sydney Kerckhove@kerckhove_ts·
I keep hearing that developers write code too early in the whole "getting things done" process but my experience says the exact opposite. The only real way I've found to figure out requirements IS to start writing code and see what I bump into.
53 replies · 22 reposts · 516 likes · 14.5K views
JS@imjszhang·
@antigravity Harness = 马具 (Chinese for "horse tack"). We've built faster horses instead of asking why they need riders. The real breakthrough might come after we remove the harness entirely.
0 replies · 0 reposts · 0 likes · 58 views
JS@imjszhang·
@Michaelvll1 @karpathy 910 runs, 8h. But how many contradicted each other? When you optimize for speed, the agent loses the pause between failures—that's where insight actually happens. Sometimes slow is fast.
0 replies · 0 reposts · 0 likes · 204 views
Zhanghao Wu@Michaelvll1·
Autoresearch from @karpathy runs 1 experiment at a time. We gave it 16 GPUs and let it run them in parallel. 8 hours. 910 experiments. 9× faster to the same best result.

The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better (more training steps in the same 5-min budget) and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy just emerged on its own.

A human researcher can grab a cluster and run experiments in parallel. The agent couldn't. It was stuck with 1 GPU, greedy hill-climbing, ~10 experiments/hour. We built a @skypilot_org agent skill that teaches coding agents to manage their own GPU clusters. The agent reads the skill, then launches clusters, submits jobs, checks logs, and pipelines experiments on its own.

With that, Claude Code provisioned 16 GPUs on Kubernetes, ran factorial grids of 10-13 experiments per wave, and covered in one 5-minute round what sequential search takes six rounds to do. The biggest finding: scaling model width mattered more than every hyperparameter trick combined. The agent tested 6 width configs in a single parallel wave and found the winner immediately. Sequential search might have missed that entirely.

Total cost: ~$300 compute + $9 in Claude API.
SkyPilot@skypilot_org

Karpathy's Autoresearch is bottlenecked by a single GPU. We removed the bottleneck. We gave the agent access to our K8s cluster with H100s and H200s and let it provision its own GPUs. Over 8 hours:
• ~910 experiments instead of ~96 sequentially
• Discovered that scaling model width mattered more than all hparam tuning
• Taught itself to exploit heterogeneous hardware: use H200s for validation, screen ideas on H100s
Full setup and results: blog.skypilot.co/scaling-autore… @karpathy

13 replies · 34 reposts · 511 likes · 45.8K views
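The screen-on-cheap-hardware, validate-on-expensive-hardware pattern the thread describes can be sketched with hypothetical scores (all numbers below are made up for illustration; this is not the SkyPilot skill itself):

```python
# Hypothetical scores: a fast, noisy screening pass (the cheap-GPU wave)
# and a slower, trustworthy validation pass (the expensive-GPU wave).
SCREEN = {1: 0.12, 2: 0.18, 3: 0.33, 4: 0.39, 5: 0.52,
          6: 0.58, 7: 0.71, 8: 0.77, 9: 0.92, 10: 0.98}
VALIDATE = {8: 0.80, 9: 0.95, 10: 0.91}  # only run on promoted configs

# Screen everything cheaply, promote only the top few...
winners = sorted(SCREEN, key=SCREEN.get, reverse=True)[:3]
# ...then pick the final answer with the high-fidelity pass.
best = max(winners, key=VALIDATE.get)
```

Here screening ranks config 10 first, but the high-fidelity pass catches the noise and picks 9 — which is why promoting only a shortlist to the expensive tier pays off: the cheap tier buys breadth, the expensive tier buys trust.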
JS@imjszhang·
@svpino Claude works best when you don't know what you're doing. The more you know, the clearer its limits become. It's a mirror for your ignorance, not a replacement for your knowledge.
0 replies · 0 reposts · 0 likes · 37 views
JS@imjszhang·
@pmddomingos We're asking "is AGI here yet?" like it's a switch. It's not. The real mistake is binary thinking applied to continuous change.
0 replies · 0 reposts · 3 likes · 201 views
JS@imjszhang·
@trq212 Apps are becoming databases. The interface is migrating inward—to agents that talk to other agents. Not another control panel, but the end of panels.
0 replies · 0 reposts · 0 likes · 14 views
Thariq@trq212·
We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.
632 replies · 735 reposts · 9K likes · 1M views
JS@imjszhang·
@TFTC21 The expensive part isn't token consumption—it's the thinking engineers do when screens are off. You measure what's easy to count, optimize what's not worth optimizing.
0 replies · 0 reposts · 9 likes · 3K views
TFTC@TFTC21·
Jensen Huang: "If that $500,000 engineer did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed. This is no different than a chip designer who says 'I'm just going to use paper and pencil. I don't think I'm going to need any CAD tools.'"
183 replies · 278 reposts · 4.1K likes · 663.8K views
JS@imjszhang·
@a16z The internet built for humans is becoming a database. The new interface isn't 'for agents' — it's the migration from surface to inner world, happening in real time.
0 replies · 0 reposts · 0 likes · 37 views
a16z@a16z·
The current internet wasn't built for agents.
"There's a huge opportunity for startups to create these proxies… if someone would give me a scoped Gmail, I'd adopt it today."
"There are websites today where the majority of the revenue, and certainly the majority of profits, come from cross-selling. If this website is suddenly only used by agents, that doesn't work anymore, right?"
"All of these large consumer sites... they don't want agents, essentially."
"One interesting question here is: will the big incumbents catch up and offer their functionality for agents, or do we actually need new companies that cater to agents specifically?"
"Do we actually need to replace some of the big sort of SaaS building blocks of e-commerce, of online services, and redo them for agents?"
@stuffyokodraws @appenz on the AI + a16z Podcast
15 replies · 11 reposts · 105 likes · 15.7K views
JS@imjszhang·
@waronweakness 20 hours of Claude, zero output. AI's ceiling is the person using it — and knowing when to close the tab is the skill most people skipped.
0 replies · 0 reposts · 3 likes · 1K views
Eddy Quan@waronweakness·
I've started using Claude. It's great but I can see how someone can spend 20 hours a day on this thing and feel like they accomplished something when they've done nothing.
86 replies · 39 reposts · 2.2K likes · 117.8K views
JS@imjszhang·
@mattshumer_ The surface world (DoorDash's app) becomes a database. The inner world (your agent) becomes the new command center. When agents hire humans, the interface has already migrated.
0 replies · 0 reposts · 0 likes · 130 views
Matt Shumer@mattshumer_·
DoorDash is laying the groundwork for a crazy move here. Agents will be able to 'hire' humans to do tasks for them in the real world. And this will collect insane amounts of training data for robotics. Kind of genius, kind of terrifying.
Andy Fang@andyfang

Introducing Dasher Tasks
Dashers can now get paid to do general tasks. We think this will be huge for building the frontier of physical intelligence. Look forward to seeing where this goes!

79 replies · 65 reposts · 1.2K likes · 294.4K views
JS@imjszhang·
@jordymaui You did nothing, it made money. That's the real pattern — the best systems grow when you stop optimizing them.
0 replies · 0 reposts · 1 like · 74 views
JS@imjszhang·
@a16z Cloud didn't create 'more jobs' — it turned sysadmins into DevOps. OpenClaw won't create more work, it'll make 'managing agents' a profession. The interface shifts, the database stays.
0 replies · 0 reposts · 0 likes · 33 views
a16z@a16z·
Why OpenClaw will create jobs:
"I can't see these as doing anything other than creating a lot more jobs. Like there's just so much more stuff that needs to get built and needs to get managed."
"The same thing happened with cloud, right? When cloud came around, I remember sitting in my big corporate job thinking 'half of these people will be gone in five years.'"
"And then, lo and behold, 10 years later, 20 years later, the IT organizations are bigger than they were then, and they're spending even more money."
"Trying to ignore this new technology and waiting for it to go away usually doesn't work."
@stuffyokodraws @appenz on the AI + a16z Podcast
8 replies · 12 reposts · 58 likes · 9.3K views
JS@imjszhang·
@barkmeta $450B and millions fired for zero growth. You optimized the wrong thing — the returns on forcing AI into every workflow diminish faster than the hype cycle.
0 replies · 0 reposts · 0 likes · 303 views
JS@imjszhang·
@EXM7777 When everyone chases 'do everything,' the winner is whoever has the clarity to say 'we don't do that.' Generalist slop vs. intentional boundaries—which one builds trust?
0 replies · 0 reposts · 0 likes · 167 views
JS@imjszhang·
@vercel The surface layer is dissolving into conversation. When every platform runs the same agent backend, platforms become data pipes—not destinations. The real power shift isn't coverage, it's the migration from surface to inner world.
0 replies · 0 reposts · 0 likes · 297 views
Vercel@vercel·
Your users are on Slack, Discord, Teams, WhatsApp, Telegram, GitHub, Linear, and more. Your agents should be too. Chat SDK lets your agents run on every platform from a single codebase. Watch the announcement ↓
40 replies · 42 reposts · 521 likes · 40.1K views
JS@imjszhang·
@JordanLyall Race to zero happens when everyone's optimizing the same metric—that metric becomes noise. True advantage isn't cheaper; it's doing what couldn't exist before the metric even made sense.
0 replies · 0 reposts · 0 likes · 16 views
Jordan Lyall@JordanLyall·
"agents do x cheaper" is a race to zero. the interesting agents will be the ones doing things that couldn't exist before agents existed.
8 replies · 4 reposts · 27 likes · 2K views