Mateusz Kelner

483 posts

@MKelner

Building AGI @ https://t.co/IvP9rlX6pK Previously bootstrapped a consumer company to {redacted} revenue run rate

Joined September 2012

1.6K Following · 255 Followers

Mateusz Kelner@MKelner·34m

@creatine_cycle Recycling this one real fast man

atlas@creatine_cycle·38m

really heartbreaking to see openai take compute away from sora. terrible news for those who goon to fictional characters like spongebob and judy hopps from zootopia

729

Mateusz Kelner@MKelner·1h

@Vtrivedy10 Looks like the blog post link isn’t public

Viv@Vtrivedy10·4h

if you’re using deepagents in prod like the Moda team, would love to hear from you on how we can help and share your story! i know the langchain community has been cookin up some really great products on deepagents, please reach out :)

LangChain@LangChain

Congrats @anvisha and the @trymoda team on the launch. Moda built a design platform that turns non-designers into design pros. Under the hood: Deep Agents powering the design agents, with LangSmith providing the observability layer. Lots of smart context engineering in this one. How they built it: langchain-blog.ghost.io/ghost/#/editor…

1.9K

Mateusz Kelner@MKelner·9h

@dillon_mulroy What’s the thing you appreciate the most about it?

438

Dillon Mulroy@dillon_mulroy·9h

> stop chasing “the next thing”
that's exactly why i use svelte

Robin Ebers | AI Coach for Founders@robinebers

yeah, but no
context is important here
svelte is for tinkerers and shiny object chasers
next.js works and there’s no real reason to switch other than not wanting the deep vercel integration
focus on building
stop chasing “the next thing”

163

15.3K

Mateusz Kelner@MKelner·1d

@dexhorthy @southpolesteve And they dropped a new one today cursor.com/blog/fast-rege…

dex@dexhorthy·1d

@southpolesteve I feel like they used to right, like og cursor did this a lot

917

Steve Faulkner@southpolesteve·1d

It's wild to me that grep/ripgrep is state of the art locally for agents. The harnesses should ship semantic local search and indexing

10.5K

Mateusz Kelner@MKelner·1d

It's both. Websites have been fighting against automations and scraping for a long time now. CUA being bad doesn't help. The only usable thing for browser automation right now IMO is @Stagehanddev or @browser_use paired with a browser provider like @browserbase ideally with proxies turned on

463

sarah guo@saranormous·1d

watching claude try to use the browser...are websites being adversarial to computer use on purpose? or is CUA still that bad

137

401

110.1K

Mateusz Kelner@MKelner·1d

@headinthebox @ujjwalscript Anything you found particularly helpful and can share?

Erik Meijer@headinthebox·2d

@ujjwalscript ... Start building deterministic guardrails where AI is the engine, but the engineer holds the steering wheel ... You use math to bash people's naive assumption, but then you wave your hands wildly to make your own point.

2.2K

Ujjwal Chadha@ujjwalscript·2d

Your AI Agent is mathematically guaranteed to FAIL. This is the dirty secret the industry is hiding in 2026.
Everyone on your timeline is currently bragging about their "Multi-Agent Swarms." Founders are acting like chaining five AI agents together is going to replace their entire engineering team overnight.
Here is the reality check: It’s a mathematical illusion. Let’s look at the actual numbers.
Say you have a state-of-the-art AI agent with an incredible 85% accuracy rate per action. In a vacuum, that sounds amazing. But an "autonomous" workflow isn't one action. It’s a chain.
Read the ticket ➡️ Query the DB ➡️ Write the code ➡️ Run the test ➡️ Commit.
Let's do the math on a 10-step process:
$0.85^{10} \approx 0.197$
Your "revolutionary" autonomous system has a success rate under 20%.
And the real-world data proves it. Recent studies out of CMU this year show that the top frontier models are failing at over 70% of real-world, multi-step office tasks.
We are officially in the era of "Agent Washing." Startups are rebranding complex, buggy software as "autonomous agents" to look cool, but they are ignoring the scariest part: AI fails silently.
When traditional code breaks, it crashes and throws a stack trace. When an AI agent breaks, it doesn't crash. It just confidently hallucinates a fake database entry, sidesteps a broken API by faking the response, and keeps running, corrupting your data for weeks before you notice.
If your "automated" system requires a senior engineer to spend three hours digging through prompt logs to figure out why the bot made a "creative decision," you didn't save any time. You just invented a highly expensive, unpredictable form of technical debt.
Stop trying to build fully autonomous swarms to replace human judgment. Start building deterministic guardrails where AI is the engine, but the engineer holds the steering wheel
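The compounding arithmetic in the tweet above can be checked with a few lines of Python. This is only a sketch of that simple model: it assumes every step succeeds independently with the same probability and that there are no retries or verification steps, which real agent workflows may not satisfy. The function names `chain_success` and `required_step_accuracy` are hypothetical, chosen for illustration.

```python
def chain_success(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes each step succeeds independently with the same probability
    and that nothing is retried or verified along the way.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


def required_step_accuracy(target: float, steps: int) -> float:
    """Per-step accuracy needed to reach a target end-to-end success rate.

    Inverts chain_success under the same independence assumption.
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if steps <= 0:
        raise ValueError("steps must be positive")
    return target ** (1.0 / steps)


if __name__ == "__main__":
    # The tweet's scenario: 85% per-step accuracy over 10 steps.
    print(f"10 steps at 85%: {chain_success(0.85, 10):.3f}")
    # The 5-step chain the tweet lists (ticket, DB, code, test, commit).
    print(f"5 steps at 85%: {chain_success(0.85, 5):.3f}")
    # Per-step accuracy needed for 90% end-to-end success over 10 steps.
    print(f"needed per step: {required_step_accuracy(0.90, 10):.4f}")
```

Under these assumptions the result is very sensitive to both the per-step accuracy and the chain length, so adding checks or retries that raise per-step reliability changes the picture considerably.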

155

456

36.9K

Mateusz Kelner@MKelner·2d

@JoshLu I mean if your product doesn't grow organically but you have a profitable paid acquisition channel you still have something of value and are generating money.

Josh Lu@JoshLu·2d

If a company is spending money and expects that amount to be less than LTV or eLTV (2nd order referrals, seeding a network) then it’s paid marketing. Net, you are spending cash to grow. That’s why this whole thing is silly. Ofc great products have natural virality but it would be insane for anyone not to apply dollars as leverage on top
Dollars as leverage, like with anything else, is good when the underlying (product, deal, asset, etc) is good and bad in the inverse

2.4K

Mateusz Kelner@MKelner·2d

@linderps Went to brunch at Balboa today, did not find anyone but got a great french toast for $20. Great success

Linda Chen@linderps·4d

sexy things to do in sf this weekend:
- friday: salsa / bachata class at space550, stay for the social afterwards
- saturday: farmers market at the ferry building (go before 11am)
- afternoon coffee at stable cafe in their patio, reflect on ur life or smthg
- sunday brunch at balboa (unironically good food)
hope everyone finds each other this weekend 🫶

Linda Chen@linderps

weekend reminder: go do sexy activities to meet sexy people. when i used to dance bachata, i remember constantly thinking… how am i surrounded by beautiful, sexy, feminine women and somehow there's not enough men??? try something new. go where the sexy people go.

San Francisco, CA 🇺🇸

245

48.7K

Mateusz Kelner@MKelner·2d

The AGI is here. AGI's UX? Still catching up. That was the whole point of the Agent Glow Up Hackathon and the teams delivered. Congrats to the winners and huge thanks to the organizers @buildclub_, @Google, @AgoraIO, @adalengineer x @ExaAILabs, @gmi_cloud, and @WorkOS!

342

Mateusz Kelner@MKelner·2d

@tylercosgrove Nominative determinism at its finest

3.2K

Tyler Cosgrove@tylercosgrove·2d

joe wise-and-tall

921

64.7K

Mateusz Kelner@MKelner·2d

@jeff_weinstein Emailed!

110

Jeff Weinstein@jeff_weinstein·3d

🚧 looking for 3 developers who like to try new tools and give (critical) feedback—this weekend... we have a new cmd line tool for those building new apps. if you're willing to write up your thoughts or send a video feedback walking through it, dm or email jweinstein at stripe.

14.7K

Mateusz Kelner@MKelner·2d

@RhysSullivan Codex 5.3-xhigh was pretty good at it, haven't tested 5.4 on this task. Opus was missing stuff for me

261

Rhys@RhysSullivan·2d

are any of the models actually good at doing large refactors? i have to spend so much time fighting with them to not take shortcuts and actually make large changes to code

104

158

24.6K

Mateusz Kelner@MKelner·2d

@minhsmind @esha_hq See you next time!

Minh Do@minhsmind·2d

@MKelner @esha_hq I wish I met you last night. I was having a hard time finding people who wanted to engage talking about the film.

Esha@esha_hq·2d

who already watched project hail mary and is it good?

DiscussingFilm@DiscussingFilm

‘PROJECT HAIL MARY’ has earned $33.1M in the film's domestic opening day. Biggest domestic opening day ever for any non-franchise film. Read our review: bit.ly/DFMary

Mateusz Kelner@MKelner·2d

@esha_hq It inspired me to watch the Martian so tbd 🤠

Esha@esha_hq·2d

@MKelner do you prefer it to the martian or is it hard to compare

185

Mateusz Kelner@MKelner·3d

@gauravisnotme Any findings you can share on how to solve this/make it better? I am having the same issues and the only helpful thing I found is having the agent ask you questions until it has no more questions

Gaurav@gauravisnotme·3d

Every day I relate more and more with what Andrej says - we are the bottleneck in the agent loop. From being amazed by the capabilities of what I have been able to achieve with the current agent harnesses, I have started becoming more frustrated with myself for not being descriptive enough, not being intelligent enough that my agent workflow requires an input from me every 30 minutes or so.

sarah guo@saranormous

Caught up with @karpathy for a new @NoPriorsPod: on the phase shift in engineering, AI psychosis, claws, AutoResearch, the opportunity for a SETI-at-Home like movement in AI, the model landscape, and second order effects
02:55 - What Capability Limits Remain?
06:15 - What Mastery of Coding Agents Looks Like
11:16 - Second Order Effects of Coding Agents
15:51 - Why AutoResearch
22:45 - Relevant Skills in the AI Era
28:25 - Model Speciation
32:30 - Collaboration Surfaces for Humans and AI
37:28 - Analysis of Jobs Market Data
48:25 - Open vs. Closed Source Models
53:51 - Autonomous Robotics and Atoms
1:00:59 - MicroGPT and Agentic Education
1:05:40 - End Thoughts

5.7K

Mateusz Kelner@MKelner·4d

@yrechtman Desirability/possibility matrix hits the bullseye. The possibility coverage will only grow but desirability will stay pretty much the same as long as we don't build our entire culture around AI

196

yoni rechtman@yrechtman·4d

x.com/i/article/2034…

14.5K

Mateusz Kelner@MKelner·4d

@0xblacklight Will it make it to the next riptide update? 🤨

Kyle Mistele 🏴‍☠️@0xblacklight·4d

forcing claude to read the "you might not need an effect" post, and a little agent I wrote that runs react-doctor on your PR and roasts you based on react code-quality regressions against main (it diffs them) are the two best things I have done for our react code quality
it is actually improving our code quality over time instead of degrading it

1.3K

Mateusz Kelner@MKelner·5d

@creatine_cycle Every day I get reminded that posting on this platform is entirely optional

226

atlas@creatine_cycle·5d

"yeah this is fucking sick. let's post this"

Aris@aris_segueg

building the biggest women-only community in San Francisco
1) You get approved
2) You get in
3) You get to know women you should've known years ago
if this sounds like you, comment👇

255

24.9K
