nilenso

1.9K posts


@nilenso

Employee-owned programmer cooperative in Bangalore.

Joined May 2013
475 Following · 1.7K Followers
Pinned Tweet
nilenso @nilenso
In 2013, a group of makers got together to find new ways to work together. A lot has happened since. We recently celebrated our 10th birthday :) Over the years, we've had the privilege of working with some exceptional organizations and doing work we're proud of. 1/2
nilenso retweeted
Srihari Sriraman @SrihariSriraman
I did a compaction analysis on a couple of recent claude code sessions, and thought I'd share here too. You can use the link to explore further if you're interested. These are good compaction examples. I wish I could do the same analysis with some bad examples. nilenso.github.io/context-viewer…
nilenso retweeted
Drew Breunig @dbreunig
Somehow I didn't fully appreciate how strongly Claude Code's prompt has to fight against the weights to make parallel tool calls. blog.nilenso.com/blog/2026/02/1…
nilenso retweeted
Govind Krishna Joshi @govindkrjoshi
Something I've been thinking for a while, but finally got to writing it down. The core thesis is that building reliable AI applications requires a harness to be able to tinker, experiment and iterate, without which the project gets stuck in the prototyping phase. blog.nilenso.com/blog/2026/02/1…
nilenso retweeted
Srihari Sriraman @SrihariSriraman
I've been studying the effect of system prompts in the model + tools + system-prompt + harness stack. So, I ran the same SWE-Bench-Pro task with Opus + Claude Code, but with different system prompts: one run used Codex's system prompt, and another used Claude's. The workflows on the two runs are different, and mirror these kinds of sentiments. You can see the corresponding differences in the system prompts too. We may be mis-attributing some of these behaviours to the model when they're attributable to the system prompt.
nilenso retweeted
Yash Gandhi @yashgandhi_
Atharva Raykar from @nilenso will tell us how you're not a programmer anymore: you're coordinating a complex system. Systems thinking, feedback loops, scientific reasoning. The skills that actually matter when building AI. Unlearning and relearning the new rules of the game.
nilenso retweeted
atharva @AtharvaRaykar
I have collected some thoughts on how to look at benchmarks that are rarely expressed elsewhere. I believe it's useful and tenable for people and organisations to build their own "minimum viable benchmark" to really make sense of LLM capabilities.
nilenso retweeted
Srihari Sriraman @SrihariSriraman
I just published the next article in the "How to work with Product" series. This one is called: "Taste and Adjust", and it's about finding ways to "taste" your product at every stage, by consciously building a product development flywheel. Link: blog.nilenso.com/blog/2025/11/2…
nilenso retweeted
atharva @AtharvaRaykar
I let Codex CLI rip over the @nilenso website code to optimise performance. It scripted a benchmark, applied some changes and reran the bench to confirm that its changes sped things up by ~5x. Our website sends ~10x less data as well. We had been putting off the website optimisation work due to other priorities, but these days the friction to take up this kind of work is really low.
nilenso retweeted
Srihari Sriraman @SrihariSriraman
Sometimes I just want to give a GitHub URL and a prompt to semantically search. Similar to web search tools, but for GitHub/GitLab. I made a tool that does this, following @thorstenball's "How to Build an Agent" and @nickbaumann_'s "What Makes a Coding Agent?" blog posts. I just use GitHub/GitLab's APIs instead of the filesystem. I use this now in storymachine, because product managers or business folks don't have a repo cloned or an agentic CLI running on their machines.
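The approach in the tweet, letting the agent query a hosted forge's API instead of a local checkout, can be sketched roughly like this. Everything here is illustrative (the actual storymachine tool is not public); the GitHub code-search endpoint and headers are the documented public ones, but the function names and response handling are assumptions:

```python
# Minimal sketch: an agent "search" tool backed by GitHub's code-search
# API rather than a cloned repo on disk. Illustrative, not the real tool.
import json
import urllib.parse
import urllib.request


def build_search_url(repo: str, query: str) -> str:
    """Build a GitHub code-search URL scoped to one repository."""
    q = urllib.parse.quote(f"{query} repo:{repo}")
    return f"https://api.github.com/search/code?q={q}"


def search_repo(repo: str, query: str, token: str) -> list[dict]:
    """Return matching file paths and links, no local clone required."""
    req = urllib.request.Request(
        build_search_url(repo, query),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        items = json.load(resp).get("items", [])
    return [{"path": i["path"], "url": i["html_url"]} for i in items]
```

An agent loop would expose `search_repo` as a tool and let the model iterate on queries, which is what makes this usable by people without a dev environment.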
nilenso retweeted
Srihari Sriraman @SrihariSriraman
@dbreunig My blog post has a lot more detail about all this. Check out this section for details on why existing observability tools don't cut it:
nilenso retweeted
Srihari Sriraman @SrihariSriraman
You know how your LLM context is a giant wall of text, and mostly an opaque box that you don't open? I built a tool to open it up, and pull it apart for you so that you can actually do "context engineering". You can just drag-drop your conversation.json into it, and it will show you the components growing through time.
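The core idea behind such a viewer can be sketched as a tally of how much of the context each component occupies as the conversation grows. The message shape assumed here (a flat list of role/content entries) is a simplification; real `conversation.json` dumps differ by tool:

```python
# Sketch: walk a transcript and report cumulative size per component
# (here, per role) after each message, so growth over time is visible.
# The {"role": ..., "content": ...} message shape is an assumption.
from collections import defaultdict


def component_growth(messages: list[dict]) -> list[dict]:
    """Cumulative character count per role after each message."""
    totals: dict[str, int] = defaultdict(int)
    timeline = []
    for msg in messages:
        totals[msg.get("role", "unknown")] += len(str(msg.get("content", "")))
        timeline.append(dict(totals))
    return timeline
```

Running a dumped transcript through `json.load` and plotting each snapshot gives the growth-over-time view the tweet describes.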
nilenso retweeted
atharva @AtharvaRaykar
Wrote a new post about a trend I'm seeing. Designing a good AI-integrated application requires trading off tricks that improve performance *today* against preparing for the bitter lesson that will strike in the future.
nilenso @nilenso
@tod thanks for sharing our blogs on hn :) hope you enjoy reading!
nilenso @nilenso
Throwing out some ideas for improving benchmarks.
nilenso @nilenso
What popular SWE/coding benchmarks are measuring. 1) SWE-Bench Verified: patches that close GitHub issues, mostly for Python library repositories (and of those, mostly Django). Pass criteria: get the unit tests to pass.
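That pass criterion amounts to: apply the candidate patch, run the designated tests, and pass iff they exit cleanly. A tiny illustrative harness, not the official SWE-Bench runner:

```python
# Sketch of the pass criterion above: a candidate "solves" a task iff
# its patch applies cleanly and the designated unit tests then pass.
import subprocess


def passed(apply_rc: int, test_rc: int) -> bool:
    """Pass criterion as exit codes: patch applied, then tests green."""
    return apply_rc == 0 and test_rc == 0


def evaluate(repo_dir: str, patch_file: str, test_cmd: list[str]) -> bool:
    apply_rc = subprocess.run(["git", "apply", patch_file], cwd=repo_dir).returncode
    if apply_rc != 0:
        return False
    test_rc = subprocess.run(test_cmd, cwd=repo_dir).returncode
    return passed(apply_rc, test_rc)
```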