Shivam Nayak

47 posts

@shivam56296

Building secure runtime for AI agents | IIIT Hyderabad

Joined August 2025

93 Following · 13 Followers

Shivam Nayak@shivam56296·17h

@benswerd sounds sick. frankly we are not at the stage where we can open source - but would love to exchange notes sometime.

Ben ☁️@benswerd·1d

I'm actually working on a 1ms VM Demoware that I'm gonna opensource this week, would love to work with you on it if you're interested. Basically a "here's how to do shitty sandboxes like those 1ms VM people" [ I'm assuming Garrison letting you top his leaderboards means your sandboxes are not shit ]

Ben ☁️@benswerd·3d

Anyone can start a VM in 1 millisecond. It's not that deep: chop a 64 GB computer into eight 8 GB slices, and poof, you have a 1 millisecond start time. So much sandbox talk is demoware BS; every millisecond Freestyle.sh spends on starting our VMs is a cost we chose to pay to make them good.

Shivam Nayak@shivam56296·1d

@benswerd appreciate it man. agree -- if "1ms for 1VM" numbers aren't replicated in real production workflows at scale, it is demoware.

Ben ☁️@benswerd·1d

I agree and disagree. There are a lot of things we simply do better than people before us; resource sharing, scheduling, vmm steps/boot system, virtio and device virtualization etc. But certain things are tradeoffs, and a lot of the milliseconds being cut are great exclusively for demoware. 1ms for 1 VM demos are in my opinion incredibly lame, because as someone who's spent years on the milliseconds, virtualizing computers well takes more than 1.

Shivam Nayak@shivam56296·3d

@confusedqubit @pavitrabhalla That's interesting - lazy restore with predictive page pre-fetching. Makes sense for time-to-first-instruction. What does time-to-first-useful-response look like though? (e.g. sandbox can serve a command)

Shivansh Vij@confusedqubit·3d

@shivam56296 @pavitrabhalla Because Firecracker needs to load the entire snapshot before starting the vCPUs. We don't; we know exactly what instruction is going to be executed as soon as the vCPU starts. We also have a good idea of which ones are going to come next.

Shivansh Vij@confusedqubit·4d

One of the things I’ve realized with AI is that the edge isn’t the implementation anymore. It’s about the knowledge, the techniques. For example, *WE* know how to start a VM from a snapshot in under 1ms. It’s not an optimization problem, it’s an architectural one - and unless you already know how, you’re unlikely to discover it blindly with Claude.

Shivam Nayak@shivam56296·3d

@AnandShivansh @computesdk @modal @tensorlake @RunloopAI @northflank @declaw_ai @e2b @daytonaio +1, would be great to see @daytonaio in there too

Shivansh Anand Srivastava@AnandShivansh·3d

@computesdk @modal @tensorlake @RunloopAI @northflank @declaw_ai @e2b Would be really interesting to see @daytonaio included in the 100k scale benchmark as well

ComputeSDK@computesdk·3d

🚨🚨🚨 We're back with Benchmark Friday, but in addition to our daily tests we've been working on the Scale Invitational. The 100k Scale Invitational is the most ambitious sandbox benchmark. See the thread🧵

Shivam Nayak@shivam56296·3d

@confusedqubit @pavitrabhalla Isn't preinitializing disks, hypervisor, etc. and handing out a ready VM just a well-implemented warm pool? From a warm pool, sub-ms is fine. But with snapshots getting pulled at start time, without pre-loading, doesn't Firecracker's snapshot/load alone take multiple ms even from local NVMe?
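
The warm-pool distinction raised in this reply can be illustrated with a minimal Python sketch. It is not any provider's implementation: `WarmPool`, `simulated_boot`, and the 5 ms delay are hypothetical names and values chosen only for illustration. The point it shows is that a pool pays the boot cost before a request arrives, so acquisition latency says little about how long a cold start or snapshot load takes.

```python
import time
from collections import deque


def simulated_boot(delay_s=0.005):
    """Hypothetical stand-in for a cold start (snapshot load + vCPU start)."""
    time.sleep(delay_s)
    return {"booted_at": time.monotonic()}


class WarmPool:
    """Hands out pre-booted sandboxes; falls back to a cold boot when empty."""

    def __init__(self, boot_fn, size):
        self._boot_fn = boot_fn
        # The boot cost is paid here, up front, not at request time.
        self._ready = deque(boot_fn() for _ in range(size))

    def acquire(self):
        """Return (sandbox, was_warm)."""
        if self._ready:
            return self._ready.popleft(), True
        # Pool exhausted: the caller now sees the full boot latency.
        return self._boot_fn(), False

    def refill(self, target):
        """Boot new sandboxes in the background until `target` are ready."""
        while len(self._ready) < target:
            self._ready.append(self._boot_fn())


if __name__ == "__main__":
    pool = WarmPool(simulated_boot, size=2)
    for _ in range(3):
        start = time.perf_counter()
        _, warm = pool.acquire()
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"warm={warm} acquire took {elapsed_ms:.3f} ms")
```

In this toy model the first two acquisitions are served from the pool and the third pays the simulated boot delay, which is the gap the question above is asking about.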

Shivam Nayak@shivam56296·18 May

@galdayan1895 @a16z @speedrun declaw -- secure runtime for production AI agents (Firecracker sandbox + guardrails + action controls + observability, all in one SDK), built for regulated industries where security is a Day 0 problem.

Gal Dayan@galdayan1895·17 May

Last day to apply to @a16z @speedrun SR007. As an a16z speedrun scout, I've already referred 2 startups who got accepted. Sell me your product in one sentence. I will rate each of them with one-sentence feedback. Those who are above 9, I will reach out directly and schedule an instant meeting.

Shivam Nayak@shivam56296·18 May

Would you let a suspect investigate their own crime, decide what counts as evidence, and write the verdict? Then why are we doing exactly that with AI agents? Security in the same trust boundary as the threat isn't security. It's a suggestion.

Shivam Nayak@shivam56296·15 May

Didn't clear the YC interview this batch. But the prep forced us to confront questions we'd been ducking, and the partner team's feedback was honest and specific. Will come back stronger. Building declaw.ai -- secure runtime for AI agents.

Shivam Nayak@shivam56296·11 May

@confusedqubit "Agent in the VM that survives a VM rollback" is contradictory -- the agent's memory is part of the VM state. The only workaround I see is persisting agent state outside the VM and reimporting after restore -- so remote execution nonsense is the only solution?
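
The workaround described in this reply (persist agent state outside the VM, then reimport it after restore) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyVM`, `rollback_preserving_agent`, and the dictionary used as the external store are hypothetical and do not correspond to any real hypervisor or sandbox API.

```python
import copy


class ToyVM:
    """Toy model: everything inside `state` is rewound by restore()."""

    def __init__(self):
        self.state = {"files": {}, "agent_memory": []}

    def snapshot(self):
        # A full snapshot captures agent memory along with everything else.
        return copy.deepcopy(self.state)

    def restore(self, snap):
        self.state = copy.deepcopy(snap)


def rollback_preserving_agent(vm, snap, external_store):
    """Roll the VM back to `snap` while keeping the agent's memory."""
    # 1. Export agent memory to storage that lives outside the VM.
    external_store["agent_memory"] = list(vm.state["agent_memory"])
    # 2. Restore the whole VM; this also rewinds the in-VM agent memory.
    vm.restore(snap)
    # 3. Reimport the exported memory into the restored VM.
    vm.state["agent_memory"] = list(external_store["agent_memory"])


if __name__ == "__main__":
    vm = ToyVM()
    vm.state["files"]["app.py"] = "v1"
    snap = vm.snapshot()
    vm.state["files"]["app.py"] = "v2"
    vm.state["agent_memory"].append("v2 broke the build")
    rollback_preserving_agent(vm, snap, external_store={})
    print(vm.state)
```

The sketch makes the dependency explicit: the agent's memory only survives because step 1 copies it somewhere the restore cannot reach.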

Shivansh Vij@confusedqubit·10 May

Snapshot/Restore for sandboxes is NOT implemented correctly by any provider. Some are snapshotting just disks and calling it a day (looking at you smolvm). That's not enough. Others allow you to take full VM (memory, etc.) snapshots, but rolling back the VM state interrupts the agent running inside. This is the next UX hurdle. I want my agent to be able to rollback my VM, without it getting rolled back. And I want it in the VM, none of this remote execution nonsense.

Shivam Nayak@shivam56296·11 May

Dirty Frag (CVE-2026-43284): a Linux kernel LPE that turns "agent in sandbox" into "root on host." Containers share a kernel. Break it once, every tenant on the box is exposed. The fix isn't a faster patch cadence. It's not sharing a kernel. medium.com/@shivamnayak52…

Shivam Nayak@shivam56296·10 May

@confusedqubit Sandbox snapshots replacing git commits? Reproducibility was never git's job -- diff/merge/blame is. Diffing snapshots is fine; merge and blame need ancestry and intent, not end states. Sandboxes replace the laptop, can't really see it replacing the repo.

Shivansh Vij@confusedqubit·6 May

I think VMs (sandboxes) will become the next Git. Folks have been out in full force on Twitter arguing about what to do about git, and I think most (not all) are completely off the mark. If you want agent-native git, look to sandboxes. The folks using LLMs to create applications (see: your local plumber and dentist) are not going to think about deploying those apps. This is why @Lovable is so popular - if the app works in the sandbox, ship the sandbox. Same pattern as @Replit. Your sandbox VMs are where agents develop applications, and where you (the human) validate the work. My bet is that, long-term, those sandboxes are what will get shipped to prod. The sandbox snapshot will be the record of your code, of the LLM's work. New changes will be forks of the current sandbox, and when they work, you'll promote those to production. Debugging will happen in forks of the production sandbox, etc. I'm not sure how long it will take to get here, but this would be my bet.

Shivam Nayak@shivam56296·9 May

@jn_jackk INVEST

JN Jack | Cold Email@jn_jackk·8 May

Need investor contacts for your startup? I've built a searchable database of 9,000+ investors - angels, VCs, and accelerators you can reach out to immediately. It comes with verified emails, LinkedIn, and even phone numbers. Want access? Like this post, comment "INVEST", and follow me so I can DM you the link. I'll send it over ASAP. P.S.: If you are serious about fundraising (now or in the future), you should grab it right away.

Shivam Nayak@shivam56296·8 May

@NorthflankWill Strong agree. Containers were never a security boundary -- shared kernel means a single exploit gets you lateral across the node. microVMs give you a real isolation primitive: separate kernel, separate network namespace, no shared attack surface.

Will Stewart@NorthflankWill·8 May

In recent weeks, Northflank's PaaS and BYOC microVMs have protected customers from multiple critical CVEs. Copy Fail. Copy Fail 2. DirtyFrag. Recent container escape vulnerabilities are a wake-up call for anyone running infrastructure on containers. Containers are not a strong security boundary. If workloads share the same kernel, your nodes, clusters, IP, and customer data can be exposed. Northflank's secure runtime uses microVM isolation to protect thousands of companies across dev, preview, sandbox, and production workloads. Scanning matters. Runtime detection matters. But neither replaces strong isolation at runtime. If you're running standard Kubernetes or plain containers, it's time to sandbox everything. Come build with, or on top of, Northflank. We run millions of workloads every month, including databases, microservices, builds, AI agents, and GPU workloads, sandboxed by default.

Shivam Nayak@shivam56296·8 May

AI agents are unpredictable. They can call the wrong APIs, leak data -- and report success either way. Declaw is the runtime solution that fixes it. Sandbox, guardrails, controls at every layer, full observability. Live on Product Hunt today 🚀 producthunt.com/products/decla…

Shivam Nayak@shivam56296·7 May

launching @declaw_ai startups program today. $10K in credits. complete runtime solution for AI agents: → firecracker microVMs (sub-50ms boot) → network controls (L3/L4 + L7) -- others stop at L4 → AI guardrails → full agent audit trail apply → declaw.ai/startups

Shivam Nayak@shivam56296·1 May

@theCTO try out @declaw_ai -> console.declaw.ai - Sandbox as a service - Firecracker microVMs, spin up in ms - credit-based, you pay for what you use - Persistent storage - volumes that survive across sessions + snapshot/restore - no tier gating - full outgoing internet access

adam@theCTO·28 Apr

I need a Sandbox service. I need the Sandbox to not bill me per hour/minute. I need it to have persistent storage. I need it to have outgoing internet access. Who should I use?

Shivam Nayak@shivam56296·1 May

@TencentAI_News Real respect for the drop. RustVMM + KVM done well is hard 🫡 Now do it in public. @computesdk public benchmarks - @declaw_ai #1 at 30ms median P95. Sign up. Same test, same conditions. Let the numbers talk 🦾

Tencent AI@TencentAI_News·21 Apr

🥳We just open-sourced Cube Sandbox! An instant, concurrent, secure and lightweight sandbox runtime for AI Agents. Built with RustVMM and KVM, it achieves the perfect balance of security and performance: → Sub-60ms cold start (2.5-50x faster) → Under 5MB memory overhead per instance (6x less memory) → Dedicated kernel per sandbox (hardware-level isolation) → Thousands of concurrent sandboxes per node → 100% E2B SDK compatible. Swap the endpoint, zero code changes Full-stack capability, one-click deployment. 3 steps to spin up your own private AI sandbox 👇 🔗 github.com/TencentCloud/C…

Shivam Nayak@shivam56296·29 Apr

@GergelyOrosz actually we built a mac app based on this exact premise - any ai app/agent running on your system would be sandboxed (not exactly sandboxed per se- we just intercepted network and enforced guardrails) the app was 1.2 gb 🥲

Gergely Orosz@GergelyOrosz·19 Apr

AI agents are far more capable when they have full system access; but when they do, they can mess a lot of stuff up (not unique to any one model). AI harnesses have guardrails, but those can fail. I wonder if we’ll need OS-level “sandbox primitives” to deal with this better?

Elliot Arledge@elliotarledge

just woke up to opus 4.7 nuking one of my projects during an overnight session. luckily i was able to get it back easily

Shivam Nayak@shivam56296·29 Apr

declaw.ai is now open for everyone. Sign up and spawn your first sandbox in 2 minutes — $300 in free platform credits to start. What declaw is: secure runtime for AI agents. Sandbox, guardrails, policy enforcement, and full audit trail — one platform
