cagin

1.2K posts

@0xcgn

full stack dev | athlete | nomad~ish | building side projects one line at a time.

localhost:4000 · Joined December 2009
150 Following · 401 Followers
cagin
cagin@0xcgn·
it is still an early poc, but the idea is quite simple:
- parse all the session logs
- put them on a dashboard-like UI
- try extracting valuable information
- add a couple of automations to see if the agent does what you think it does

this is roughly what it looks like. not sure if it's something worth pursuing yet tbh, but I have been enjoying looking at some numbers about the projects and my AI usage
Nico Baier
Nico Baier@nbbaier·
Hey @0xcgn can you share anything about this project?
Tommy Falkowski
Tommy Falkowski@TommyFalkowski·
I just stumbled upon this awesome article by @0xcgn and can't believe how much it resonates with me. It's very similar to my experience of agentic whiplash: "The cost of building has collapsed, but the cost of maintaining, supporting, and committing to the thing you built hasn't changed at all." also: "Your agent setup is your vim config now. But for the cognitive layer." Highly recommend it: cagin.dev/writings/pi-lo…
cagin
cagin@0xcgn·
you are correct, I do not know the specifics of how it all works under the hood or their infra. I only have some assumptions and things I've read/heard. I've worked on products where a group of users did not use it the way the product was intended or abused services that we provided. There is always the option of killing a specific use case... but you can also try to understand why and improve your offering or even change the incentives. I just don't see an attempt on either side... example for this case, they could say: if you are using your claude max on a 3rd party tool, you cannot benefit from the subsidized usage, and let users decide if it's worth letting go of the additional benefits of a subscription just to use those 3rd party tools. or create a separate subscription for 3rd party tools entirely. that way you satisfy people who can only get access to a claude subscription (because of their employers) but do not like claude code as a product... I know I'm probably oversimplifying and I don't know the right answer, but I think it is too early to kill that use case for the subscription service they provide. we are still in the early stages and choice is important.
Mario Zechner
Mario Zechner@badlogicgames·
> many ways of dealing

everybody is hurting for gpus atm. that's not easy to fix. in lieu of hardware you can try to make the software side more efficient, i.e. serve more tokens per hardware unit. but that too has limits. putting "demand" and "usage patterns" in quotes just tells me you don't know how this stuff works under the hood.
cagin
cagin@0xcgn·
I absolutely understand the reasoning. But Anthropic has damaged my trust in them so much that I cannot stop myself from suspecting malicious intent... there are many ways of dealing with such cases. If the "demand" and "usage patterns" are different from what you expected when you built a product, you can also welcome the change and update your assumptions and infra accordingly to satisfy that market demand. They had many opportunities to reflect/rethink and open up, but they have chosen to go for full lock-in at each turn. not a good long-term strategy and definitely not the type of behavior I want to support with my money...
Mario Zechner
Mario Zechner@badlogicgames·
my heart goes out to @trq212 who will surely get some of the worst twitter messages in the next few days. personally think clarity is good. i like to think that this could have been avoided if some harnesses didn't behave like elephants in the prompt cache shop. tried to make pi a model prompt cache citizen because of that. alas.
cagin
cagin@0xcgn·
I've been running deep research on perplexity, chatgpt & claude.
- perplexity: finished in 3 mins, 45 sources. really good surface-level information. enough to get me started
- chatgpt: finished in 13 mins. in-depth analysis. 400+ citations from scientific papers, posts, discussions ...etc. goes really deep into the topic.
- claude: still retrying after I've received the error for the 8th time...
cagin
cagin@0xcgn·
your code base is the agent's memory
cagin
cagin@0xcgn·
@ThePrimeagen I mean it's still better than New York Times website.
ThePrimeagen
ThePrimeagen@ThePrimeagen·
guys, i honestly do not like clowning on Gary. I don't find being the butt of a joke funny, so I imagine he does not either. But, this is what worries me about where we are going. We are actively telling an entire generation that the tech is there when it's not. a couple of silly mistakes made on a website isn't the end of the world, but people's data and breaches are serious. We are entering a very VERY hackable world, and I do not like it one bit.
gregorein@Gregorein

so... I audited Garry's website after he bragged about 37K LOC/day and a 72-day shipping streak. here's what 78,400 lines of AI slop code actually looks like in production. a single homepage load of garryslist.org downloads 6.42 MB across 169 requests. for a newsletter-blog-thingy. 1/9🧵

Tommy Falkowski
Tommy Falkowski@TommyFalkowski·
I love cloning interesting repos and have the agent evaluate which parts might be useful for the stuff that I'm building.
cagin reposted
beginbot 🃏
beginbot 🃏@beginbot·
wow @garrytan just exposed Anthropic as total frauds. Claude Code was ONLY 512K LOC ☹️ Gary is shipping 37K LOCs PER DAY, so Gary could recreate all of Claude Code in ONLY 13 days! a supposedly $380 billion company is in big trouble
cagin
cagin@0xcgn·
@0xSero and breaking markdown paragraphs on multiple lines on review mode...
0xSero
0xSero@0xSero·
All Zed needs to do is make their terminal good man, it's perfect in every other way. The terminal causes too much flickering with agents.
Mario Zechner
Mario Zechner@badlogicgames·
chat, is he serious?
cagin reposted
David J Phillips
"Make no mistakes DO NOT HALLUCINATE. YOU ARE AN EXPERT SOFTWARE ENGINEER"
cagin reposted
Feross
Feross@feross·
🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.

The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware.

axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.

Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
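One way to act on the "audit your lockfiles" advice is a small script. The following is a minimal sketch, not an official tool: it assumes an npm `package-lock.json` with lockfileVersion 2 or 3 (where installed packages sit under the top-level `packages` map) and flags only the two versions named in the post. The names `SUSPECT`, `package_name`, `find_suspects`, and `audit` are made up for illustration.

```python
import json

# Versions named in the post above; extend as advisories are updated.
SUSPECT = {"axios": {"1.14.1"}, "plain-crypto-js": {"4.2.1"}}


def package_name(lock_key: str) -> str:
    """Turn a key like 'node_modules/a/node_modules/axios' into 'axios'."""
    return lock_key.rpartition("node_modules/")[2]


def find_suspects(lock: dict) -> list[tuple[str, str, str]]:
    """Return (lockfile key, package name, version) for each flagged entry."""
    hits = []
    for key, meta in lock.get("packages", {}).items():
        name = package_name(key)
        version = meta.get("version", "")
        if version in SUSPECT.get(name, set()):
            hits.append((key, name, version))
    return hits


def audit(path: str = "package-lock.json") -> list[tuple[str, str, str]]:
    """Load a lockfile from disk and report flagged entries."""
    with open(path, encoding="utf-8") as fh:
        return find_suspects(json.load(fh))
```

Calling `audit()` in a project root returns an empty list when neither version is present. Older lockfileVersion 1 files use a nested `dependencies` tree instead and would need a different walk.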
cagin
cagin@0xcgn·
@UtkarshUsername thank you. tried to keep it honest but failed on the short part :)
cagin
cagin@0xcgn·
If you were following, you might have noticed that I haven't been pushing updates for quite a while, and I have decided to shelve the project. my current setup is just pi tui + cmux, which happens to cover all the things I need from the GUI perspective. I have decided to use that energy + time to focus on experimenting with + improving my agent setup. here is a small reflection on the rationale. I learned a valuable lesson and am going to be more mindful when I have another idea: cagin.dev/writings/pi-lo…
cagin
cagin@0xcgn·
day-4: analytics + chat controls
- support for steering messages & follow-up queue (interrupt/queue while agent runs)
- thinking level control with Shift+Tab cycling
- auto-retry with countdown banner & cancel
- event pipeline refactor + 19 new RPC commands

also got some free credits and used @v0 from @vercel to pimp the gui a little. very smooth experience except some minor issues I reported.

and the biggest change was the analytics dashboard I set up with: daily reports, deltas, drill-down, budget guardrails, efficiency metrics, heat-map toggles...etc.

it exists because I'm running on $20 claude, chatgpt & cursor plans instead of one unlimited $200 plan. need to keep an eye on usage and not get too excited shipping everything at once. so I added daily & weekly budget targets to stay within limits. will probably add per-model/subscription budgets later. who knows, maybe it can even help me touch some grass.

also helps me understand my prompting patterns and be more effective. it's all local. no telemetry, no cloud.
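The daily & weekly budget targets could look roughly like the sketch below. This is not pi-lot's code: the target values, the choice of USD as the unit, and the names `DAILY_TARGET`, `WEEKLY_TARGET`, and `budget_report` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical targets in USD; the post does not give actual numbers or units.
DAILY_TARGET = 2.0
WEEKLY_TARGET = 10.0


def budget_report(usage: list[tuple[date, float]]) -> dict:
    """Sum cost per day and per ISO week, then list the periods over target."""
    per_day = defaultdict(float)
    per_week = defaultdict(float)
    for day, cost in usage:
        per_day[day] += cost
        iso = day.isocalendar()  # (ISO year, ISO week number, weekday)
        per_week[(iso[0], iso[1])] += cost
    return {
        "days_over": sorted(d for d, c in per_day.items() if c > DAILY_TARGET),
        "weeks_over": sorted(w for w, c in per_week.items() if c > WEEKLY_TARGET),
    }
```

Keying weeks by ISO year and week number keeps a week that straddles New Year in a single bucket. Per-model or per-subscription budgets would only need an extra key on each record.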
cagin
cagin@0xcgn·
devlog: i'm building pi-lot, a desktop GUI for pi.dev. I started pi-lot to automate my own dev work + learn agentic engineering hands-on. Now turning it into a tool other devs can use too. will post updates as replies below. thank you @badlogicgames for enabling me with your awesome work and great documentation!
cagin
cagin@0xcgn·
@badlogicgames @mitsuhiko I was just writing a post and gave it a shot. oh boy... pro tip: add `roast me` at the end of the prompt
cagin
cagin@0xcgn·
I'd happily donate data from my personal projects. 964 sessions (~3B tokens) in 50 projects spanning over 3 months and growing every day. however, the data from pi might not be that useful as of now.

maybe pi-core needs to add otel traces first, so people can already start collecting & hand-labeling them locally until it is figured out where to dump them?

there is no "human in the loop" in pi, meaning there is no "accept | decline" data for the output. so a way to give feedback on/label the output would probably be a good extension?

it'd also probably be interesting to see which files are updated, when, how long the agent's code lives ...etc. so we can maybe derive the acceptance rate or lifespan of the code?

also a way to scrub sensitive data from both sessions and traces, which imho is the most complex part given agents can read any file on the disk...

I've been trying to extract & parse some information from what pi sessions offer. it is tricky to get valuable data as of now, especially for training. my preferred way would be to give people the infrastructure to see, understand, and label their own data, verify that nothing sensitive is being leaked, and then upload/sync/dump the data.
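To make the scrubbing idea concrete, here is a minimal sketch of a regex-based redaction pass over session text. It is illustrative only: the three patterns (an AWS access key ID shape, bearer tokens, email addresses) are assumptions about what "sensitive" might include, and the names `PATTERNS` and `scrub` are hypothetical. A real scrubber would need far broader rules, and pattern matching alone cannot catch everything an agent may have read from disk.

```python
import re

# Illustrative patterns only; real coverage would need many more rules.
PATTERNS = [
    ("AWS_ACCESS_KEY_ID", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("BEARER_TOKEN", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]{20,}")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def scrub(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a labeled placeholder and count hits per label."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[REDACTED:{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

Returning the per-label counts alongside the cleaned text lets a local review UI show what was removed, which fits the "see, understand, label, verify" flow described above.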
Mario Zechner
Mario Zechner@badlogicgames·
we as software engineers are becoming beholden to a handful of well-funded corporations. while they are our "friends" now, that may change due to incentives. i'm very uncomfortable with that. i believe we need to band together as a community and create a public, free-to-use repository of real-world (coding) agent sessions/traces. I want small labs, startups, and tinkerers to have access to the same data the big folks currently gobble up from all of us. So we, as a community, can do what e.g. Cursor does below, and take back a little bit of control again. Who's with me? cursor.com/blog/real-time…