Dmitry Filimonov

583 posts

@dmi3f

Building Agentic AI systems. Co-founder @pyroscopeio.

Bay Area · Joined September 2012
924 Following · 352 Followers
Dmitry Filimonov
I tried all of the TUI coding agents and not a single one of them feels perfect, yet all have some unique features I like. Maybe the future is everyone building their own. It's easy to start and when you need new features you just say "hey copy feature X from agent Y".
Dmitry Filimonov
Assuming it's legally clean: if they have a model that is just as capable but much faster and cheaper, that's more or less all you need to justify a valuation.
Aakash Gupta@aakashgupta

Cursor is raising at a $50 billion valuation on the claim that its “in-house models generate more code than almost any other LLMs in the world.” Less than 24 hours after launching Composer 2, a developer found the model ID in the API response: kimi-k2p5-rl-0317-s515-fast. That’s Moonshot AI’s Kimi K2.5 with reinforcement learning appended.

A developer named Fynn was testing Cursor’s OpenAI-compatible base URL when the identifier leaked through the response headers. Moonshot’s head of pretraining, Yulun Du, confirmed on X that the tokenizer is identical to Kimi’s and questioned Cursor’s license compliance. Two other Moonshot employees posted confirmations. All three posts have since been deleted.

This is the second time. When Cursor launched Composer 1 in October 2025, users across multiple countries reported the model spontaneously switching its inner monologue to Chinese mid-session. Kenneth Auchenberg, a partner at Alley Corp, posted a screenshot calling it a smoking gun. KR-Asia and 36Kr confirmed both Cursor and Windsurf were running fine-tuned Chinese open-weight models underneath. Cursor never disclosed what Composer 1 was built on. They shipped Composer 1.5 in February and moved on.

The pattern: take a Chinese open-weight model, run RL on coding tasks, ship it as a proprietary breakthrough, publish a cost-performance chart comparing yourself against Opus 4.6 and GPT-5.4 without disclosing that your base model was free, then raise another round.

That chart from the Composer 2 announcement deserves its own paragraph. Cursor plotted Composer 2 against frontier models on a price-vs-quality axis to argue they’d hit a superior tradeoff. What the chart doesn’t show is that Anthropic and OpenAI trained their models from scratch. Cursor took an open-weight model that Moonshot spent hundreds of millions developing, ran RL on top, and presented the output as evidence of in-house research. That’s margin arbitrage on someone else’s R&D dressed up as a benchmark slide.

The license makes this more than an attribution oversight. Kimi K2.5 ships under a Modified MIT License with one clause designed for exactly this scenario: if your product exceeds $20 million in monthly revenue, you must prominently display “Kimi K2.5” on the user interface. Cursor’s ARR crossed $2 billion in February. That’s roughly $167 million per month, 8x the threshold. The clause covers derivative works explicitly.

Cursor is valued at $29.3 billion and raising at $50 billion. Moonshot’s last reported valuation was $4.3 billion. The company worth 12x more took the smaller company’s model and shipped it as proprietary technology to justify a valuation built on the frontier lab narrative.

Three Composer releases in five months. Composer 1 caught speaking Chinese. Composer 2 caught with a Kimi model ID in the API. A P0 incident this year. And a benchmark chart that compares an RL fine-tune against models requiring billions in training compute without disclosing the base was free.

The question for investors in the $50 billion round: what exactly are you buying? A VS Code fork with strong distribution, or a frontier research lab? The model ID in the API answers that.

If Moonshot doesn’t enforce this license against a company generating $2 billion annually from a derivative of their model, the attribution clause becomes decoration for every future open-weight release. Every AI lab watching this is running the same math: why open-source your model if companies with better distribution can strip attribution, call it proprietary, and raise at 12x your valuation?

kimi-k2p5-rl-0317-s515-fast is the most expensive model ID leak in the history of AI licensing.

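The revenue-threshold math in the quoted thread is easy to check. A minimal sketch, using only the figures the thread itself cites ($2B ARR against a $20M/month license threshold):

```python
# Figures from the quoted thread: Cursor ARR of $2B, and a Modified MIT
# clause whose attribution requirement kicks in above $20M monthly revenue.
ARR = 2_000_000_000            # annual recurring revenue, USD
THRESHOLD = 20_000_000         # monthly revenue threshold in the license, USD

monthly_revenue = ARR / 12     # ≈ $167M per month, matching the thread
multiple = monthly_revenue / THRESHOLD

print(f"Monthly revenue ≈ ${monthly_revenue / 1e6:.0f}M, "
      f"{multiple:.1f}x the attribution threshold")
```

The thread's "roughly $167 million per month, 8x the threshold" checks out (8.3x before rounding).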
Dmitry Filimonov
The most surprising thing to me is that usually it's the bigger company (by headcount) that runs into this issue, but here it's the other way around.
Dmitry Filimonov@dmi3f
@bcherny pls make it so that i can talk to it the same way as i talk to chatgpt voice
Dmitry Filimonov@dmi3f
@clairevo 100% — lack of sleep messes with the rational parts of the brain. It was a big “aha” moment for me when the baby started sleeping through the night.
claire vo 🖤@clairevo
This reminds me of the best advice I got about having babies: No big decisions or statements of certainty until the *youngest* is well over two. No divorce, no quitting work forever, no deciding you hate motherhood and going on public record to say so (yikes!)

When you have toddlers around, no one is sleeping, everyone is drowning, and it will *feel* terrible many days. Mostly for moms, who, like it or not, have to carry much of the physical and emotional burden those early years.

But once they’re all out of diapers, you’re not nap trapped, they can make themselves a snack, and you consistently sleep through the night? The light comes back.

I wish more moms were supported and gently guided in those hard early years vs exploited for clicks and quotes. It gets so much better!
Stephanie H. Murray@stephmurrayyyy

Ngl I think there is something kind of sinister about showcasing moms who are actively struggling through the early and notoriously-often-very-difficult-especially-if-you-are-undersupported stages of motherhood in a piece supposedly about "parental regret."

Dmitry Filimonov@dmi3f
@mcuban labs likely run models at high margins. open-source models could lower costs short term. running locally means paying mainly for hardware and electricity. but i guess if everyone does that, hardware and power costs rise so adoption won’t be instant. but long term agents will win
Mark Cuban@mcuban
This is the smartest counter I’ve seen to AI taking over jobs, in the short term.

Is ((aggregate token cost to do what an employee does + fully burdened developer and maintenance costs) / (fully burdened employee cost)) <= productivity?

If it takes 8 Claude agents, at $300 per day for tokens, plus $200 per day in dev/maintenance, to do what an employee does per day at a fully burdened cost of $1,200, that’s 2600/1200. But then you need to factor in the productivity rate. Is it more than 2.16x as productive?

Are there qualitative issues like morale, morality, whatever, that can’t be quantified, that need to go into the decision? What is the going-forward progression of burdened costs for the tokens?

Curious what people think about this?
The All-In Podcast@theallinpod

What Happens When AI Tokens Cost More Than Your Employees? @Jason: “We, with our agents, hit $300/day per agent using the Claude API, like instantly. And that was doing, maybe, 10 or 20%. That's $100k/year per agent.” @chamath: “We're getting to a place where we have to basically now say, ‘What is the token budget that we're willing to give our best devs?’” “And then if you aggregate it across all people, you can clearly see a trend where you're like, ‘Well, hold on a second, now they need to be at least 2x as productive as another employee.’” “That is actively happening inside my business, because otherwise I'll run out of money.” Jason: “Yeah. This is a very interesting trend that you're not going to hear anybody else talk about, but when do tokens outpace the salary of the employee?” “Because you're about to hit it. I'm about to hit it.”

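Cuban's break-even arithmetic sketches as a few lines of Python, using the hypothetical dollar figures from his own example:

```python
# Break-even check from Cuban's framing: an AI setup only pays off if its
# productivity multiple covers (agent token cost + dev/maintenance cost)
# divided by the fully burdened cost of the employee it replaces.

def required_productivity(agents: int, token_cost_per_agent: float,
                          dev_maint_cost: float, employee_cost: float) -> float:
    """Minimum productivity multiple for the AI setup to break even."""
    ai_daily_cost = agents * token_cost_per_agent + dev_maint_cost
    return ai_daily_cost / employee_cost

# His numbers: 8 Claude agents at $300/day in tokens, $200/day dev/maint,
# versus an employee at $1,200/day fully burdened -> 2600/1200.
ratio = required_productivity(agents=8, token_cost_per_agent=300.0,
                              dev_maint_cost=200.0, employee_cost=1200.0)
print(f"AI setup must be at least {ratio:.2f}x as productive")  # ~2.17x
```

This only captures the quantifiable side; the qualitative factors he lists (morale, morality) sit outside the ratio by design.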
Dmitry Filimonov@dmi3f
@owengretzinger haha, saw your previous tweet and started building the same thing. I love how quickly you can just build things these days
Scott Stevenson@scottastevenson
3 months later, the GPT3 API was launched. Absurd.
Dmitry Filimonov@dmi3f
@alexeykozy I’ve found that for performance or memory leak issues, connecting Cursor to an MCP server with profiling data gives really good results.
Alexey@alexeykozy
Been hacking on a new kind of debugger — one where the model sees the runtime. Early signals are strong. Watching it dig through the toughest bugs in a massive codebase in real time is wild 🤯
Dmitry Filimonov@dmi3f
@Sirupsen A lot of database-on-top-of-S3 solutions are essentially adding a buffering layer that holds recent data, ensures durability, and flushes it to S3 in larger, consistent chunks. With conditional writes, much of the complexity can go away — really cool to see you all using it.
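The commit pattern the S3 tweet describes can be sketched with the object store simulated by a dict. This is a hedged illustration, not any particular database's implementation: the point is that a put-if-absent primitive (what S3 exposes via If-None-Match on PutObject) turns the manifest flush into a compare-and-set, removing the need for an external lock service.

```python
# Sketch of the buffering-layer commit: writers flush buffered data and then
# try to publish manifest version N. With conditional writes, exactly one
# writer wins the race; the loser retries against the next version.
# The object store is simulated here — real S3 would return HTTP 412 on conflict.

class FakeS3:
    """Stand-in for an object store with put-if-absent semantics."""
    def __init__(self):
        self.objects: dict[str, bytes] = {}

    def put_if_absent(self, key: str, data: bytes) -> bool:
        # Real S3 equivalent: put_object(..., IfNoneMatch="*")
        if key in self.objects:
            return False
        self.objects[key] = data
        return True

def flush_manifest(store: FakeS3, version: int, manifest: bytes) -> bool:
    """Two writers racing to commit version N: exactly one succeeds."""
    return store.put_if_absent(f"manifest/{version:08d}", manifest)

store = FakeS3()
first = flush_manifest(store, 1, b"chunks: [a, b]")
second = flush_manifest(store, 1, b"chunks: [c]")   # loses the race
print(first, second)  # True False
```

Before conditional writes landed in S3, this coordination needed DynamoDB or a similar side channel, which is much of the buffering-layer complexity the tweet says can now go away.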
Dmitry Filimonov@dmi3f
Writing the best code of my career but dId YoU kNoW tHe MoDeLs aReNt AcTuAlLy ReAsOnInG??