Nikita Nosov

190 posts

@nik1t7n

i am nikita, founder & software engineer

Palo Alto, California · Joined October 2022
58 Following · 38 Followers
Pinned Tweet
Nikita Nosov
Nikita Nosov@nik1t7n·
I built context-firewall for coding agents.

agents waste context on terminal noise: test output, diffs, logs, rg results, giant JSON.

context-firewall lets agents run the same commands they already run, but shows them only the useful summary instead of flooding their context. the full output is still saved locally, so the agent can pull exact lines later if it needs them.

feel free to use it, test it, challenge it, contribute and give your agent more room for the actual work.

in a real release audit:
300k raw tokens → 16k agent-visible tokens
94.6% less context noise

brew install nik1t7n/tap/cfw
npx @nik1t7n/context-firewall --help

github.com/nik1t7n/contex…
English
0
1
5
638
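The idea in the pinned tweet above (run the command as usual, show the agent a short summary, keep the full output on disk for later lookup) can be sketched in Python. This is a minimal illustration under stated assumptions, not how context-firewall itself is implemented: `run_summarized`, `read_lines`, the head/tail truncation rule, and the log directory are all hypothetical names and choices.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_summarized(cmd, log_dir, head=5, tail=5):
    """Run cmd, save its full output to a log file, return (summary, log_path).

    Hypothetical helper: the head/tail rule is an assumption for illustration.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    output = result.stdout + result.stderr

    # Keep the complete output on disk so exact lines can be fetched later.
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    fd, name = tempfile.mkstemp(suffix=".log", dir=log_dir)
    with os.fdopen(fd, "w") as f:
        f.write(output)

    # Show only the first and last few lines, plus a pointer to the full log.
    lines = output.splitlines()
    if len(lines) > head + tail:
        omitted = len(lines) - head - tail
        marker = f"... {omitted} lines omitted, see {name} ..."
        lines = lines[:head] + [marker] + lines[len(lines) - tail:]
    summary = "\n".join([f"exit code: {result.returncode}", *lines])
    return summary, Path(name)


def read_lines(log_path, start, end):
    """Return lines start..end (1-indexed, inclusive) from a saved log."""
    lines = Path(log_path).read_text().splitlines()
    return lines[start - 1:end]
```

An agent wrapper would hand `summary` to the model and call `read_lines` only when the model asks for a specific range, so the raw output never enters the context unless requested.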
Nikita Nosov
Nikita Nosov@nik1t7n·
I turned @itsreallyvivek's “how to be good at research” essay into an agent skill.

research-craft helps agents plan better research loops: choose problems, forecast experiments, keep logs, inspect failures, and tighten iteration.

npx -y skills add nik1t7n/research-craft-skill --all

github.com/nik1t7n/resear…
English
6
18
171
14.7K
Ish
Ish@DecisionTree_gg·
@nik1t7n @itsreallyvivek i have truly become addicted to people converting @X wisdom into repos. ty, king
English
1
0
1
176
Nikita Nosov
Nikita Nosov@nik1t7n·
@untethered_sid @itsreallyvivek agree but my thinking was just to give the agent the same mindset i got from reading the article - to align my vision with the agent’s vision. because in the future it’s going to assist me in any type of research, and it must understand the paradigm i’m using
English
1
0
2
29
Sid
Sid@untethered_sid·
@itsreallyvivek @nik1t7n after reading what you wrote (was very well articulated btw) i felt that the “skill” is truly an innate mindset problem. regardless of the acceleration of the process with an agent skill, there is a pre-req to that, and it starts with the person doing the research
English
2
0
0
150
Nikita Nosov
Nikita Nosov@nik1t7n·
@itsreallyvivek thanks. your article felt like a breath of fresh air for me. cause lately i've been a bit stuck on finding truly novel ideas while building my startup, and i was wondering what to do about it. the article helped a lot. it was genuinely refreshing.
English
0
0
2
315
vivek
vivek@itsreallyvivek·
@nik1t7n nice one nikita
Filipino
1
0
5
800
Nikita Nosov
Nikita Nosov@nik1t7n·
Everyone is talking about Loops for Agents. But have you noticed that after a few hours, your agent can start getting dumber? Actually, mine did. Codex would begin strong, then slowly drift: repeat work, forget recent decisions, reopen the same files, and continue from an older version of the task. I traced the problem to context compaction and fixed it with a checkpoint hook. Wrote the full breakdown here: x.com/nik1t7n/status…
Nikita Nosov@nik1t7n

x.com/i/article/2066…

English
0
0
4
306
Nikita Nosov retweeted
Mark Valorian
Mark Valorian@markvalorian·
@AnthropicAI You people torpedoed your own initiative with this fear mongering nonsense. Just supply the models to willing buyers and please keep the pseudophilosophical pontification to yourselves. This is extraordinarily frustrating to deal with as a user. I hope you understand that.
English
79
147
5.1K
575.1K
Nikita Nosov retweeted
Silverrock
Silverrock@nwabuakwu·
The great divide coming in the future is not going to be about access to money, it's going to be access to top tier AI models.
English
4
3
18
24.8K
Nikita Nosov retweeted
Telegram Messenger
Telegram Messenger@telegram·
if I had to look for a job on @LinkedIn I would fucking kill myself
English
1.3K
6.3K
57.6K
2.8M
Nikita Nosov
Nikita Nosov@nik1t7n·
@sudoingX infrastructure is the stuff you stop noticing. I have never met anyone who regretted setting up tailscale early, only people who put it off.
English
0
0
0
21
Sudo su
Sudo su@sudoingX·
anything that is cheap to start with and brutally expensive to add later is foundational. that is the whole test. the five hit it. once you actually need tailscale, half your configs already assume public ips, and once you want git as the memory layer, months of context are already trapped in chat windows you cannot recover. retrofitting eats months, not weekends. foundation is just another word for what does not retrofit.
English
2
0
5
1.1K
Sudo su
Sudo su@sudoingX·
anyone thinking about, learning, or already working with agentic systems, you should know this. the first few steps of your setup matter more than any model or framework you pick later. get them right and you never lose your flow.

the foundation nobody posts about:

> 1. tailscale. a private mesh network across every machine you own. laptop, desktop, rented node, all on one secure tailnet, reachable from anywhere. nothing else works well until this does.
> 2. termius, over that tailnet. one SSH client that reaches every node, phone included. you are never away from your stack.
> 3. tmux. persistent sessions. disconnect, close the laptop, come back, every session exactly where you left it. agentic work runs long, your terminal has to survive that.
> 4. a private git repo. the one i am most glad i found. it is the memory layer across all my agents, they pull, they work, they merge back, the codebase stays alive between sessions. context that would die in a chat window lives in the repo instead.
> 5. script everything from day one. ssh aliases for every node, setup scripts, the boring boilerplate automated. if you will do a thing more than twice, it is a script.

everything past these five is decorative. know these cold.

and the habit that ties it together: ask the AI itself. for the config, for the error, for any of it, let the agent do the lifting, then double check what it hands you.

lock the five, build the habit, and you make it. skip it, anon, and you ngmi.
English
117
169
2.3K
223K
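Point 5 in the tweet above ("ssh aliases for every node") can be scripted along these lines. This is a rough sketch, assuming the aliases live as `Host` blocks in an OpenSSH client config. The node names, hostnames, and user are made-up placeholders, and `ssh_config_block` and `render_ssh_config` are hypothetical helpers, not anything the author published.

```python
def ssh_config_block(alias, hostname, user):
    """Render one OpenSSH client config entry for a node."""
    return f"Host {alias}\n    HostName {hostname}\n    User {user}\n"


def render_ssh_config(nodes):
    """Render all entries, separated by a blank line."""
    return "\n".join(ssh_config_block(**node) for node in nodes)


if __name__ == "__main__":
    # Placeholder inventory: replace with your own machines.
    nodes = [
        {"alias": "desk", "hostname": "desk.example.ts.net", "user": "me"},
        {"alias": "gpu", "hostname": "gpu.example.ts.net", "user": "me"},
    ]
    # Review the output, then append it to ~/.ssh/config by hand.
    print(render_ssh_config(nodes))
```

Keeping the node list in one place means adding a machine is a one-line change, after which `ssh desk` works from any device that shares the config.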
Nikita Nosov
Nikita Nosov@nik1t7n·
@ItsWillHenry the 20% that survive past year one are the ones solving a real workflow. everything else is an API call with a login page.
English
0
0
0
2
Will Henry
Will Henry@ItsWillHenry·
80% of “AI startups” are just:
- Paste your prompt into GPT
- Add a dashboard
- Rename it “copilot”
- Charge you $25/month
English
1
1
5
395
Nikita Nosov
Nikita Nosov@nik1t7n·
@sudoingX everyone is faking it. the ones shipping have just made peace with the fact that the tooling is half-baked and the edges are sharp. the gap between a tweetstorm about agents and a working multi-agent loop is about 200 unglamorous hours of debugging.
English
0
0
0
10
Sudo su
Sudo su@sudoingX·
if you are working with agentic systems and you quietly feel like you are faking it, like everyone else got the memo and you are just barely keeping up, i want to tell you what is actually happening.

it is not a talent gap. it is a setup gap. the people who look fluent are not smarter than you. they set up the boring foundation early, so the friction you fight all day, they simply do not have anymore.

every hour you lose to context dying in a closed window, to a long run dropping on a disconnect, to a machine you cannot reach, that hour feels like proof you are not good enough. it is not. it is proof you are under tooled.

i felt like an imposter for a long time too. the feeling did not go away when i got smarter, it went away when i set up the five and the friction stopped.

you are not behind. you are one weekend of setup away from feeling like you belong.
English
20
13
238
14.8K
Nikita Nosov
Nikita Nosov@nik1t7n·
@NousResearch @browserbase browser skills as composable modules is the right abstraction. everyone was writing custom scrapers because there was no way to discover what already existed.
English
0
0
0
1.3K
Nous Research
Nous Research@NousResearch·
Hermes Agent now has access to hundreds of browser skills through @browserbase’s new Browse.sh hub, so agents can more reliably perform any task on the internet. You can try a skill from their catalog or contribute your own.
English
106
195
2.4K
546.1K
Nikita Nosov
Nikita Nosov@nik1t7n·
@akshay_pachaar went back and forth between code-gen agents and function-calling agents for months. code is more expressive but structured outputs are more reliable. a hard problem to pick sides on.
English
0
0
0
140
Akshay 🚀
Akshay 🚀@akshay_pachaar·
code as agent harness.

a 102-page survey from Stanford, Meta, and UIUC on agent harnesses.

the paper argues that code is no longer just the thing agents produce. it’s the medium through which they reason, act, and represent their environment. it calls this “code as agent harness” and covers three layers: code as the interface between agents and their tasks; the mechanisms that keep agents reliable over long-horizon execution (planning, memory, tool use, verification); and how multi-agent systems coordinate through shared code artifacts.

core findings:

the paper introduces “evolution agents” that treat the harness itself as the optimization target. they collect telemetry, diagnose failures, propose infrastructure changes, and promote only mutations that pass regression. the harness improves itself.

in multi-agent systems, topology complexity inversely correlates with infrastructure quality. teams with better shared state use simpler coordination. teams without it build increasingly elaborate workarounds.

finally, the paper concludes that future agent systems need four properties:
- executable
- inspectable
- stateful
- governed

read more: arxiv.org/abs/2605.18747

i also published this deep dive (article) on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent. the article is quoted below.
Akshay 🚀@akshay_pachaar

x.com/i/article/2040…

English
40
112
749
122.3K
Nikita Nosov
Nikita Nosov@nik1t7n·
@tonysimons_ shared catalogs handle discovery well. the real test is how fast selectors decay relative to catalog update cycles.
English
0
0
1
13
Tony Simons
Tony Simons@tonysimons_·
A shared catalog of browser skills beats writing custom prompts for every site. Browse.sh + Hermes means your agent inherits hundreds of battle-tested site workflows instead of you debugging selectors at 11pm. Can I get a hell yeah in the comments?!
English
3
0
9
1.7K