Daniel Kornev

13.6K posts

Daniel Kornev

@danielko

Agentic AI for Industrial Apps — auditable workflows, real ROI | CEO @AISuccessors (Techstars ’24)

San Mateo, CA 가입일 Mayıs 2008

3.3K 팔로잉1.9K 팔로워

고정된 트윗

Daniel Kornev@danielko·3 Kas

Fuller recording! We won III place in Self-Evolving Agents Hackathon organized by @wandb at the legendary @agihouse_org last night! @AISuccessors rock!!! ❤️❤️❤️❤️❤️

English

476

Daniel Kornev@danielko·12h

@brunoborges @matvelloso Well, who is in control of how LLMs decides to do something?

Foster City, CA 🇺🇸 English

Bruno Borges@brunoborges·13h

@danielko @matvelloso Doesn't matter if the LLM decides to not use them.

English

Mat Velloso@matvelloso·13 Mar

I'd love to see stats of the number of coding tokens consumed by Claude/GPT on a Windows developer machine vs Mac. I'm willing to bet that on Windows you can end up spending 2x or more simply due to the absurd number of retries because it simply keeps struggling with syntax and API issues.

English

12K

Daniel Kornev@danielko·14h

@brunoborges @matvelloso PowerShell has Linux/Unix-like aliases that are as short

Foster City, CA 🇺🇸 English

Bruno Borges@brunoborges·13 Mar

@matvelloso The difference between zsh and PowerShell: pwsh consumes 5 tokens to delete a file. zsh consumes 3 tokens to delete a file. So yes, there is a difference.

English

1.4K

Daniel Kornev@danielko·16h

This is good. We missed Insider Program.

Marcus Ash@marcusash

There’s a lot of work underway that Pavan shared earlier today. Something that I know has been at the top of a lot of people’s minds is the feeling that feedback goes into a void, without real people to review and see it on the other side. To help with that, I’ll be taking on an expanded role as the exec sponsor of the Windows Insider Program to listen, engage, and help shape what’s ahead with the Windows community. Over the coming weeks, we’ll also introduce you to members of the product teams that will help you get answers on the topics you care about most. I’d love to hear from you. What would you like to see more of from us as we get started?

Foster City, CA 🇺🇸 English

Daniel Kornev@danielko·6d

Crazy...

Addy Osmani@addyosmani

Chrome just became massively more agent-friendly 🔥 Your real, signed-in browser can now be natively accessible to any coding agent. No extensions. No headless browser. No screenshots. No separate logins. Just one toggle to enable it. Check this out: developer.chrome.com/blog/chrome-de…

Foster City, CA 🇺🇸 English

Daniel Kornev 리트윗함

Elon Musk@elonmusk·6d

🥧

Teslaconomics@Teslaconomics

Happy 24th Birthday @SpaceX! 🚀 Exactly 24 years ago today - March 14, 2002 - Elon founded SpaceX. It only makes sense to now IPO the world’s most innovative company so any human can own a piece of this multiplanetary future! Ad Astra!

ART

7.7K

18.2K

142.2K

29.5M

Daniel Kornev@danielko·13 Mar

Just wanted to say this. I've spent the last 3+ years working on AI Agents that can control UI. First it was web, and now it's desktop (like "Digital Optimus" by @elonmusk). It's an easy task, and it's a super hard task. Easy - when you hack to API level and avoid touching UI altogether. Super hard when you have to imitate a human being. Most of this time I felt we didn't really nail it, you know? Like, we didn't try hard. Like, we missed something obvious. I felt like we didn't look at the problem from the right way. And of course it's not easy to push your team to do it in the right way cause it's like a feeling, and you can spend years trying to persuade your team mates to look at the problem from the different perspective. I know this feeling. I fought against pure symbolic systems. I fought against pure neural network systems. It was somehow always clear to me that we need to build a combination of System 1 and System 2. And it was also very obvious to me that we as an industry approach computer use systems too naively. We're trying to build an end-to-end system that learns how to control a system without seeing it before not realizing that humans excel at both machine-like execution of the same tasks and at adapting to the changing environment. We don't realize that we need to enable the system to learn in post-training lifetime through observations and experience. That's how you build a Self-Evolving UI AI Agent system. We don't realize that computer use model is merely a model, and that the working solution has to be an AI Harness around the model, like we did that with coding agents. We don't realize that MoE is the way to address the problem depending on the context. We don't realize that we need a long-term memory for this to work. Oh, and don't let me even start talking about evals and reliability. Across all these OS World and Windows Arena benchmarks, none care of the reliability of the real system when things go wrong. At best, GUI-Robust dataset introduces a few cases where web page can fail to load or get you to CAPTCHA. I mean, is that it? It's a good start, but not more. It's a practical joke. Yet without all these things we can't really build a reliable computer-use system. First step in solving the problem is to acknowledge that the problem exists. Computer-use models are not enough to build reliable computer-use systems. Repeat after me @grok. Computer-use models are not enough to build reliable computer-use systems.

Elon Musk@elonmusk

Macrohard or Digital Optimus is a joint xAI-Tesla project, coming as part of Tesla’s investment agreement with xAI. Grok is the master conductor/navigator with deep understanding of the world to direct digital Optimus, which is processing and actioning the past 5 secs of real-time computer screen video and keyboard/mouse actions. Grok is like a much more advanced and sophisticated version of turn-by-turn navigation software. You can think of it as Digital Optimus AI being System 1 (instinctive part of the mind) and Grok being System 2. (thinking part of the mind). This will run very competitively on the super low cost Tesla AI4 ($650) paired with relatively frugal use of the much more expensive xAI Nvidia hardware. And it will be the only real-time smart AI system. This is a big deal. In principle, it is capable of emulating the function of entire companies. That is why the program is called MACROHARD, a funny reference to Microsoft. No other company can yet do this.

English

115

Daniel Kornev@danielko·12 Mar

@MikhailBurtsev @emiliiale Also, an earlier blog post from 2024: danielko.medium.com/autonomous-ai-… - on origins of the Autonomous AI Agents

English

Daniel Kornev@danielko·12 Mar

Next post in series: x.com/danielko/statu… As always, thanks to @MikhailBurtsev and @emiliiale for making this post better.

Daniel Kornev@danielko

Web UI agents aren’t one thing—they’re a set of design choices. I break down 5 reliability levers: packaging, observation, planning, action, privilege + why evals are still early. danielko.medium.com/dissecting-web… #AIAgents #AGI #AICoworkers

English

Daniel Kornev@danielko·24 Oca

A few weeks ago (Nov 2025) I took 🥉 at the Self-Evolving AI Agents Hackathon @agihouse_org. Since then, I’ve been focused on building an environment for creating, teaching, testing, and running AI Co-workers for industry. While “computer-use” models can figure out Slack on their own, real enterprises run on dense, legacy UIs—think Schlumberger Omega or AutoCAD. Automating that is a very different beast. I wrote about some of the on-the-surface challenges AI UI agents face with this kind of software here: danielko.medium.com/on-inconsisten… More on AI Co-workers soon—stay tuned.

English

231

Daniel Kornev@danielko·12 Mar

English

126

Daniel Kornev 리트윗함

a16z@a16z·10 Mar

Frontier models are exceptionally efficient, intelligent, and useful. For agents, context is now the bottleneck. Enter the context layer, which bridges the gap from an enterprise's messy data to actionable context, packaged for agents. We're seeing three distinct verticals emerge in the context layer space: - Data gravity platforms - Existing AI data analysts - New, dedicated context layer companies Read the full piece by @JasonSCui and @JenniferHli: a16z.news/p/your-data-ag…

Jason Cui@JasonSCui

x.com/i/article/2031…

English

119

288.6K

Daniel Kornev@danielko·9 Mar

Agreed.

Astasia Myers@AstasiaMyers

Coding agents didn’t eliminate PMs, designers, or engineers. They eliminated implementation as the bottleneck. Now the constraint is taste, system thinking, and review capacity. Execution is cheap. Judgment is scarce.

Foster City, CA 🇺🇸 English

Daniel Kornev@danielko·7 Mar

Thought it's obvious from old platform wars....

Aakash Gupta@aakashgupta

The “Claude Marketplace” sounds like a procurement simplification tool. Enterprises can use existing Anthropic spend commitments to buy partner solutions. Anthropic just told you which AI applications it plans to build next and nobody is paying attention. Look at the launch partners. GitLab (code review). Harvey (legal). Lovable (app building). Replit (development). Rogo (finance). Snowflake (data). These are the six workflow categories where enterprises are already paying real money for Claude-powered tools. Anthropic is running at ~$19B in annualized revenue. 80% enterprise. Over 500 customers at $1M+ per year. Those committed spend pools are now flowing through a marketplace Anthropic controls. Which means Anthropic gets granular data on exactly which partner tools enterprises buy, how much they spend, which workflows drive the most usage, and where the willingness to pay is highest. This is the AWS Marketplace playbook. Amazon launched Marketplace to help enterprises consolidate cloud procurement. Then it watched which SaaS categories grew fastest. Then it built those products itself. Amazon RDS, Amazon Connect, AWS Lambda, all started as categories where third-party tools were thriving on AWS. Every partner joining the Claude Marketplace is handing Anthropic a roadmap. Harvey proves legal AI has enterprise willingness to pay at scale? Anthropic already has Claude for Financial Services and Claude for Life Sciences. You think Claude for Legal isn’t coming? The partners benefit in the short term. Fortune 10 access with pre-approved budgets is a cold-start solution most developer tools spend years trying to build. But the long game favors the platform. Meanwhile, every partner selling through Anthropic has switching costs compounding quarterly. Anthropic handles invoicing, procurement, distribution. The enterprise buyer consolidates AI spend under one commitment. Try moving that to OpenAI when your CFO just approved a $3M Anthropic commitment that covers six different tools. Six partners today. The real number to watch is which categories Anthropic enters directly within 18 months. The marketplace is the map. Anthropic is reading it.

Foster City, CA 🇺🇸 English

190

Daniel Kornev@danielko·7 Mar

Yeah, it's interesting to see that 5.4 Pro isn't as good of an improvement over 5.2 Pro as it would be expected. Similar findings.

Mikhail Parakhin@MParakhin

I have been testing GPT-5.4 Pro extensively (while it was on a flight and after the release). For my high-complexity, math-heavy tasks it is a step backward. My best guess is, the agentic harness pollutes the context: it keeps looking for skills, surprised the folder is empty.

Foster City, CA 🇺🇸 English

Daniel Kornev@danielko·6 Mar

AI coworkers didn’t appear out of nowhere. They’re the latest step in a 25-year arc: scripts -> macros -> workflow automation -> RPA -> UI autonomy. I wrote the montage here: @danielko/a-25-year-montage-from-scripts-on-rails-to-the-first-glimpse-of-autonomy-aaa5b74906da" target="_blank" rel="nofollow noopener">medium.com/@danielko/a-25… The hard part isn’t intelligence. It’s reliability. #AGI #AICoworkers #ComputerUse #AIAgents Thanks to @MikhailBurtsev and @emiliiale with this blog post!

English

Daniel Kornev@danielko·28 Şub

And this will all end up with the idea that surprise surprise some sort of KG is the key. Isn't it, @KGConference? :)

elvis@omarsar0

The key to better agent memory is to preserve causal dependencies.

Foster City, CA 🇺🇸 English

213

Daniel Kornev 리트윗함

Andrej Karpathy@karpathy·24 Şub

CLIs are super exciting precisely because they are a "legacy" technology, which means AI agents can natively and easily use them, combine them, interact with them via the entire terminal toolkit. E.g ask your Claude/Codex agent to install this new Polymarket CLI and ask for any arbitrary dashboards or interfaces or logic. The agents will build it for you. Install the Github CLI too and you can ask them to navigate the repo, see issues, PRs, discussions, even the code itself. Example: Claude built this terminal dashboard in ~3 minutes, of the highest volume polymarkets and the 24hr change. Or you can make it a web app or whatever you want. Even more powerful when you use it as a module of bigger pipelines. If you have any kind of product or service think: can agents access and use them? - are your legacy docs (for humans) at least exportable in markdown? - have you written Skills for your product? - can your product/service be usable via CLI? Or MCP? - ... It's 2026. Build. For. Agents.

Suhail Kakar@SuhailKakar

introducing polymarket cli - the fastest way for ai agents to access prediction markets built with rust. your agent can query markets, place trades, and pull data - all from the terminal fast, lightweight, no overhead

English

654

1.1K

11.8K

2.1M

Daniel Kornev@danielko·25 Şub

42M parameter model?! Great Scott!

Lior Alexander@LiorOnAI

Robotics just proved it can scale like language models. SONIC trained a 42 million parameter model on 100 million frames of human motion and achieved 100% success transferring to real robots with zero fine-tuning. The breakthrough isn't the robot doing backflips. It's that someone finally found the "next token prediction" equivalent for physical movement. For years, training robots meant hand-crafting reward functions for every single skill. Want your robot to walk? Design rewards for balance, foot placement, energy efficiency. Want it to dance? Start over with entirely new rewards. This approach hits a wall because humans can't manually specify every nuance of natural movement. SONIC replaces this with motion tracking: the robot learns by watching 700 hours of motion capture data and trying to mimic it, frame by frame. The data itself becomes the reward function. Scale the data, scale the model, scale the compute, and performance improves predictably. Just like GPT. This unlocks something robotics has never had: a universal control interface. One policy handles: 1. VR teleoperation using head and hand tracking 2. Live webcam feeds converted to robot motion in real-time 3. Text commands like "walk sideways" or "dance like a monkey" 4. Music audio where the robot matches tempo and rhythm 5. Vision-language models for autonomous tasks (95% success rate) All inputs get encoded into the same token space, then decoded into motor commands. No retraining. No reward engineering. No manual retargeting between human and robot skeletons. If this holds, robotics just closed a 5-year gap with AI. Language models scaled by finding one task (predict the next word) that generalizes to everything. Vision models did the same with image classification. Robotics now has motion tracking. Expect the next wave of humanoid companies to train on billions of frames, not millions.

Foster City, CA 🇺🇸 English

Daniel Kornev@danielko·25 Şub

Still remember my team 3 years ago initially being skeptical that interface to AI would be a bot living in your existing environment. But it's just it.

Guillermo Rauch@rauchg

𝚗𝚙𝚖 𝚒 𝚌𝚑𝚊𝚝 Every company will have an agentic interface. But it won't just be on your turf, your .𝚌𝚘𝚖. It'll also be on @slack, @discord, @microsoftteams, @googleworkspace, and more. I was at a hackathon in SF the other day and I watched this unfold IRL. Many startups just presented their agents as Slack @ mentions. They incidentally tended to be the products that were easiest to grasp and adopt. Excited for the Chat SDK to do for the UI of Agents what @aisdk did for models.

Foster City, CA 🇺🇸 English

탐색

@brunoborges @matvelloso @elonmusk @grok @MikhailBurtsev @emiliiale @agihouse_org @JasonSCui