Daniel Marbach🇨🇭

181 posts

@danielmarbach

He / him

Switzerland · Joined January 2011
784 Following · 2.6K Followers
Daniel Marbach🇨🇭 retweeted
Peter Steinberger 🦞
Folks: when you write skills, ask your agent to be token efficient, relax grammar. I see too many skills that write books in the skill description, and all that crap is loaded into every context. I wrote a skill that finds the worst offenders. github.com/steipete/agent…
English
187
389
5K
318.9K
Daniel Marbach🇨🇭 retweeted
Aaron Stannard
Aaron Stannard@Aaronontheweb·
How LLMs are destroying OSS trust signals - what should maintainers do about it? (I have one idea)
English
5
3
36
3.4K
Daniel Marbach🇨🇭 retweeted
Aaron Stannard
Aaron Stannard@Aaronontheweb·
The thing I really appreciate about local AI and home-rolled agent harnesses is that this is where the people who are in software for the love of the game are creating a garage-hacker movement. OSSing everything. Building and sharing tools. Experimenting. 100000% contrast to the AI doomer / grifters / slop-maxxing / course sellers. If you are feeling down or worried about how AI is going to impact your career, this is where you're going to find your love of building + learning things again and probably drastically increase your market value as a developer right now too. You can probably jump in with an old gaming rig or a Mac and start right away - you won't be replacing frontier models with that gear, but you'll be surprised what you can automate and accomplish even with small models.
English
4
3
42
3.1K
Aaron Stannard
Aaron Stannard@Aaronontheweb·
Gave Qwen3-Coder-vNext a shot on one of my DGX-Sparks yesterday and boy, it does NOT do well on anything larger than a well-scoped bug fix on legacy code bases. Real "dog off a leash" vibe.
English
3
0
3
957
Daniel Marbach🇨🇭
Daniel Marbach🇨🇭@danielmarbach·
@KooKiz Yes I always activate my subagent skills for investigations and big fixes. It improves the accuracy and time to real solution a lot
English
0
0
1
56
Kevin Gosse
Kevin Gosse@KooKiz·
A failure mode I often see is:
- It doesn't work, the model makes a theory about why
- The fix doesn't work, the model doubles down with a different fix for the same theory
- At this point, the model has reframed the problem it tries to solve, and gets stuck in a dead-end

A recent example on my side: the AI was building a custom theme for an electron-based app. I test it, the pictures don't show. The model theorizes: "it must be the z-index". It sets z-index, still no pictures. The model sets a bigger z-index: still no pictures. The model adds !important: still no pictures. At this point, the model is completely polluted by its own context and is trying to solve the problem "why is z-index not applied correctly" instead of "why are the pictures not showing", and starts doing crazy stuff like overriding the z-index of the *other* elements.

I stopped it and said: "you tried z-index 3 times and failed. Either come up with an experiment to unambiguously demonstrate that z-index is indeed the problem, or start considering other theories". It took a step back, returned to the original problem, and quickly realized that the page didn't have permission to load pictures from that folder.

Clearing/compacting the context is a good way to fix this. Lowering the context window can help by forcing more compactions, at the expense of the model forgetting some instructions during compaction (so not a silver bullet). I have a hunch that forcing the model to use subagents to verify its theories when debugging would provide a significant improvement, but I don't have enough isolated test-cases to experiment with.
WebDevCody@webdevcody

@ChadMoran it wasn't context rot, opus just sucks at solving some bugs, but I'm 100% going to create that skill now and use it

English
1
0
5
968
Kevin Gosse
Kevin Gosse@KooKiz·
@mkristensen To me the threshold was Opus 4.6. Before that I felt like agentic coding was wasting more time than it saved. Since then I've barely written any code. 4.7 is different, noticeably better on analytical tasks, way worse on initiative.
English
2
0
4
1.9K
Mads Kristensen
Mads Kristensen@mkristensen·
From my real-world use cases, I haven’t seen any significant improvements in coding models since Opus 4.5 and GPT-5.3 Codex. The newer releases feel like incremental updates that don’t deliver meaningful gains for my workflows.
English
22
7
104
13.5K
Daniel Marbach🇨🇭 retweeted
Het Mehta
Het Mehta@hetmehtaa·
Be Anthropic
> Give people Opus 4.6
> People love it.
> For 2 months you degrade Opus 4.6
> You give back normal Opus 4.6 and call it Opus 4.7.
> People love it.
That's the business model.
English
254
692
16K
592.8K
Aaron Stannard
Aaron Stannard@Aaronontheweb·
@danielmarbach Nice, it supports Claude Code and OpenCode hooks. I'll have to give that a shot
English
2
0
0
58
Aaron Stannard
Aaron Stannard@Aaronontheweb·
Not sure I'm going to make it (this is with 1m token context lol)
Aaron Stannard tweet media
English
3
0
4
1.4K
Daniel Marbach🇨🇭 retweeted
James Montemagno
James Montemagno@JamesMontemagno·
I got sick of being forced to see tons of ads and to log in just to visualize and export a Mermaid diagram... Mermaid Studio is LIVE! mermaidstudio.app And @jongalloway approved
English
5
21
179
9.5K
Daniel Marbach🇨🇭
Daniel Marbach🇨🇭@danielmarbach·
@Aaronontheweb @ICooper I only discovered it today. I saw that they have a skill eval mechanism that looks handy and having a package manager for skills is kinda nice
English
1
0
0
33
Ian Cooper
Ian Cooper@ICooper·
Another example of Claude Code misbehaving around memory here @Aaronontheweb. I asked it to update the dotnet style guide in the repository and it ignored me and updated a memory file. Super-annoying and unhelpful behavior. A real danger that Claude is iterating too fast
Ian Cooper tweet media
English
3
0
3
1.2K
Aaron Stannard
Aaron Stannard@Aaronontheweb·
@danielmarbach @ICooper I wrote my own version of this last summer that also ports my Claude Code agents / skills to OpenCode - that'll probably be less necessary the more things get standardized
English
1
0
3
101