cagin

1.2K posts

@0xcgn

full stack dev | athlete | nomad~ish | building side projects one line at a time.

localhost:4000 Joined December 2009

150 Following 401 Followers

cagin@0xcgn·7h

it is still an early poc, but the idea is quite simple:
- parse all the session logs
- put them on a dashboard-like UI
- try extracting valuable information
- add a couple of automations to see if the agent does what you think it does
this is roughly how it looks. not sure if it's something worth pursuing yet tbh, but I have been enjoying looking at some numbers about the projects and my AI usage

Nico Baier@nbbaier·9h

Hey @0xcgn can you share anything about this project?

cagin@0xcgn·8h

@nbbaier @TommyFalkowski thank you! still iterating over it but I like the look and feel so far.

Nico Baier@nbbaier·9h

@TommyFalkowski @0xcgn Not directly relevant to the post but this @0xcgn your website is COOL

Tommy Falkowski@TommyFalkowski·4d

I just stumbled upon this awesome article by @0xcgn and can't believe how much it resonates with me. It's very similar to my experience of agentic whiplash: "The cost of building has collapsed, but the cost of maintaining, supporting, and committing to the thing you built hasn't changed at all." also: "Your agent setup is your vim config now. But for the cognitive layer." Highly recommend it: cagin.dev/writings/pi-lo…

cagin@0xcgn·1d

you are correct, I do not know the specifics of how it all works under the hood or their infra. I only have some assumptions and things I've read/heard. I've worked on products where a group of users did not use it the way the product was intended or abused services that we provided. There is always the option of killing a specific use case... but you can also try to understand why and improve your offering or even change the incentives. I just don't see an attempt on either side... example for this case, they could say: if you are using your claude max on a 3rd party tool, you cannot benefit from the subsidized usage, and let users decide if it's worth letting go of the additional benefits of a subscription just to use those 3rd party tools. or create a separate subscription for 3rd party tools entirely. that way you satisfy people who can only get access to a claude subscription (because of their employers) but do not like claude code as a product... I know I'm probably oversimplifying and I don't know the right answer, but I think it is too early to kill that use case for the subscription service they provide. we are still in the early stages and choice is important.

Mario Zechner@badlogicgames·1d

> many ways of dealing
everybody is hurting for gpus atm. that's not easy to fix. in lieu of hardware you can try to make the software side more efficient, i.e. serve more tokens per hardware unit. but that too has limits. putting "demand" and "usage patterns" in quotes just tells me you don't know how this stuff works under the hood.

Mario Zechner@badlogicgames·1d

entirely expected and makes sense under their ToS. Codex, OpenCode Zen, MiniMax, z.ai. You have many options.

dancube@ctx_dan

Well, it’s been a good run @badlogicgames and pi-coding-agent

cagin@0xcgn·1d

I absolutely understand the reasoning. But Anthropic has damaged my trust in them so much that I cannot stop myself from suspecting malicious intent... there are many ways of dealing with such cases. If the "demand" and "usage patterns" are different from what you expected when you built a product, you can also think about welcoming the change and updating your assumptions and infra accordingly to satisfy that market demand. They had many opportunities to reflect/rethink and open up but they have chosen to go for full lock-in at each turn. not a good long term strategy and definitely not the type of behavior I want to support with my money...

Mario Zechner@badlogicgames·1d

my heart goes out to @trq212 who will surely get some of the worst twitter messages in the next few days. personally think clarity is good. i like to think that this could have been avoided if some harnesses didn't behave like elephants in the prompt cache shop. tried to make pi a model prompt cache citizen because of that. alas.

cagin@0xcgn·1d

@OpenAI I think it is the perfect time to add a new subscription tier somewhere between $20 and $200?

Boris Cherny@bcherny

Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw. You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

cagin@0xcgn·1d

@mitsuhiko dammit.

Armin Ronacher ⇌@mitsuhiko·1d

I understand it, but it makes me sad. I really like Opus in pi, but paying for API token prices is just not a reasonable option today. x.com/bcherny/status…

cagin@0xcgn·2d

I've been running deep research on perplexity, chatgpt & claude.
- perplexity: finished in 3 mins, 45 sources. really good surface level information. enough to get me started
- chatgpt: finished in 13 mins. in depth analysis. 400+ citations from scientific papers, posts, discussions ...etc. goes really deep into the topic.
- claude: still retrying after I've received the error for the 8th time...

cagin@0xcgn·3d

your code base is the agent's memory

cagin@0xcgn·4d

@ThePrimeagen I mean it's still better than New York Times website.

ThePrimeagen@ThePrimeagen·4d

guys, i honestly do not like clowning on Gary. I don't find being the butt of a joke funny, so I imagine he does not either. But, this is what worries me about where we are going. We are actively encouraging an entire generation that the tech is there when its not, and a couple of silly mistakes made on a website isn't the end of the world, but people's data and breaches are serious. We are entering a very VERY hackable world, and I do not like it one bit.

gregorein@Gregorein

so... I audited Garry's website after he bragged about 37K LOC/day and a 72-day shipping streak. here's what 78,400 lines of AI slop code actually looks like in production. a single homepage load of garryslist.org downloads 6.42 MB across 169 requests. for a newsletter-blog-thingy. 1/9🧵

cagin@0xcgn·4d

@TommyFalkowski very interesting indeed!

Tommy Falkowski@TommyFalkowski·4d

I love cloning interesting repos and have the agent evaluate which parts might be useful for the stuff that I'm building.

cagin retweeted

beginbot 🃏@beginbot·4d

wow @garrytan just exposed Anthropic as total frauds Claude Code was ONLY 512K LOC ☹️ Gary is shipping 37K LOCs PER DAY so Gary could recreate all of Claude Code in ONLY 13 days! a supposedly $380 billion is big trouble

cagin@0xcgn·5d

@0xSero and breaking markdown paragraphs on multiple lines on review mode...

0xSero@0xSero·5d

All Zed needs to do is make their terminal good man, it's perfect in every other way. The terminal causes too much flickering with agents.

cagin@0xcgn·5d

@badlogicgames ahahahah.

Mario Zechner@badlogicgames·5d

chat, is he serious?

cagin retweeted

David J Phillips@davj·6d

"Make no mistakes DO NOT HALLUCINATE. YOU ARE AN EXPERT SOFTWARE ENGINEER"

cagin retweeted

Feross@feross·5d

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
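
One possible way to act on the "audit your lockfiles" advice, as a minimal Python sketch rather than an official tool. It assumes an npm `package-lock.json` in the v2/v3 layout (a top-level `packages` map keyed by `node_modules/...` paths) and only checks the two versions named in the post; the function names are hypothetical.

```python
import json
from pathlib import Path

# Versions named in the post above; extend this set as advisories are updated.
FLAGGED = {("axios", "1.14.1"), ("plain-crypto-js", "4.2.1")}


def package_name(lock_key: str) -> str:
    """Turn a lockfile key like 'node_modules/a/node_modules/b' into 'b'."""
    return lock_key.rsplit("node_modules/", 1)[-1]


def find_flagged(lock: dict) -> list:
    """Return (lock_key, name, version) for every flagged entry in a parsed lockfile."""
    hits = []
    for key, meta in lock.get("packages", {}).items():
        if not key:  # the empty key describes the root project itself
            continue
        name = package_name(key)
        version = meta.get("version", "")
        if (name, version) in FLAGGED:
            hits.append((key, name, version))
    return hits


def audit(path: str = "package-lock.json") -> list:
    """Load a lockfile from disk and report flagged entries (assumed v2/v3 layout)."""
    return find_flagged(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `audit()` in a project root returns an empty list when neither flagged version is locked. It does not inspect `node_modules` on disk or other lockfile formats.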

cagin@0xcgn·6d

@UtkarshUsername thank you. tried to keep it honest but failed on the short part :)

Utkarsh@UtkarshUsername·6d

@0xcgn Well written

cagin@0xcgn·6d

cagin.dev/writings/pi-lo…

cagin@0xcgn·6d

If you were following, you might have noticed that I haven't been pushing updates for quite a while and decided to shelve the project. my current setup is just pi tui + cmux, which happens to cover all the things I need from the GUI perspective. I have decided to use that energy + time to focus on experimenting with + improving my agents setup. here is a small reflection on the rationale. I learned a valuable lesson and am going to be more mindful when I have another idea: cagin.dev/writings/pi-lo…

cagin@0xcgn·Feb 8

day-4: analytics + chat controls
- support for steering messages & follow-up queue (interrupt/queue while agent runs)
- thinking level control with Shift+Tab cycling
- auto-retry with countdown banner & cancel
- event pipeline refactor + 19 new RPC commands
also got some free credits and used @v0 from @vercel to pimp the gui a little. very smooth experience except some minor issues I reported.
and the biggest change was the analytics dashboard I set up with: daily reports, deltas, drill-down, budget guardrails, efficiency metrics, heat-map toggles...etc.
it exists because I'm running on $20 claude, chatgpt & cursor plans instead of one unlimited $200 plan. need to keep an eye on usage and not get too excited shipping everything at once. so I added daily & weekly budget targets to stay within limits. will probably add per-model/subscription budgets later. who knows, maybe it can even help me touch some grass.
also helps me understand my prompting patterns and be more effective. it's all local. no telemetry, no cloud.
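
The daily & weekly budget targets mentioned above could be checked roughly like this. This is a hedged Python sketch, not pi-lot's actual code: the `(date, cost)` record shape, the function name, and the use of ISO weeks are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def budget_status(usage, daily_target, weekly_target):
    """Return (days, ISO weeks) whose summed cost exceeds the given targets.

    usage is an iterable of (date, cost) pairs; this record shape is hypothetical.
    """
    per_day = defaultdict(float)
    per_week = defaultdict(float)
    for day, cost in usage:
        per_day[day] += cost
        iso_year, iso_week, _ = day.isocalendar()
        per_week[(iso_year, iso_week)] += cost
    over_days = sorted(d for d, total in per_day.items() if total > daily_target)
    over_weeks = sorted(w for w, total in per_week.items() if total > weekly_target)
    return over_days, over_weeks
```

Per-model or per-subscription budgets would fit the same shape by adding the model or plan name to the grouping key.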

cagin@0xcgn·Feb 6

devlog: i'm building pi-lot, a desktop GUI for pi.dev . I started pi-lot to automate my own dev work + learn agentic engineering hands-on. Now turning it into a tool other devs can use too. will post updates as replies below. thank you @badlogicgames for enabling me with your awesome work and great documentation!

cagin@0xcgn·28 Mar

@badlogicgames @mitsuhiko I was just writing a post and gave it a shot. oh boy... pro tip: add `roast me` at the end of the prompt

Mario Zechner@badlogicgames·28 Mar

anytime i finish a blog post, i feed it to an LLM asking it to produce 20-40 HN or Reddit comments. immensely effective. stole that idea from @mitsuhiko

Andrej Karpathy@karpathy

- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol
The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

cagin@0xcgn·28 Mar

I'd happily donate data from my personal projects. 964 sessions (~3B tokens) in 50 projects spanning over 3 months and growing every day. however the data from pi might not be that useful as of now. maybe the pi-core needs to add otel traces first so people can already start collecting & hand labeling them locally, until it is figured out where to dump them? there is no "human in the loop" on pi, meaning there is no "accept | decline" data from the output. so probably a way to give feedback/label the output would be a good extension? it'd also probably be interesting to see which files are updated, when, how long the agent's code lives ...etc. so we can maybe derive the acceptance rate or lifespan of the code? also a way to scrub sensitive data on both sessions and traces, which imho is the most complex part given agents can read any file on the disk... I've been trying to extract & parse some information from what pi sessions offer; it is tricky to get valuable data as of now, especially for training. my preferred way would be to give people the infrastructure to see, understand, label their own data and verify that there is nothing sensitive being leaked and then upload/sync/dump the data.
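
To make the scrubbing step above a bit more concrete, here is a minimal Python sketch of regex-based redaction over session text. The patterns (an `sk-` style key, email addresses, `password`/`token`/`secret` assignments) are illustrative assumptions, not pi's format or a complete rule set; as the post says, agents can read any file, so real scrubbing needs far more than this.

```python
import re

# Illustrative patterns only; a real scrubber needs a much broader, reviewed rule set.
PATTERNS = [
    (re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b"), "[REDACTED_KEY]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
    (re.compile(r"(?i)\b(password|token|secret)\s*[=:]\s*\S+"), r"\1=[REDACTED]"),
]


def scrub(text: str) -> str:
    """Apply each redaction pattern in order and return the scrubbed text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Running this locally before any upload keeps the "verify nothing sensitive is leaked" step in the user's hands, which matches the local-first workflow described in the post.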

Mario Zechner@badlogicgames·28 Mar

we as software engineers are becoming beholden to a handful of well funded corporations. while they are our "friends" now, that may change due to incentives. i'm very uncomfortable with that. i believe we need to band together as a community and create a public, free to use repository of real-world (coding) agent sessions/traces. I want small labs, startups, and tinkerers to have access to the same data the big folks currently gobble up from all of us. So we, as a community, can do what e.g. Cursor does below, and take back a little bit of control again. Who's with me? cursor.com/blog/real-time…
