Shivasurya

4.2K posts

@sshivasurya

senior software engineer | security + AI | @UWaterloo @Dropbox @Zoho alum | building https://t.co/bMnGeuZ1tX | 🍁 🇨🇦

Waterloo, Ontario · Joined June 2013
425 Following · 696 Followers
Shivasurya@sshivasurya·
@djcows A $6 DigitalOcean droplet can do the same!
0 replies · 0 reposts · 1 like · 130 views
djcows@djcows·
a $100 raspberry pi can do exactly the same thing as a $500 mac mini btw
120 replies · 15 reposts · 655 likes · 101.4K views
Shivasurya@sshivasurya·
me in the morning: "NVIDIA DGX Spark is literally the only thing standing between me and greatness. $8,000? Pocket change."
me in the afternoon: "Ah yes, renting GPUs. A wise, rational decision. Very mature of me."
me in the evening: "Stick to Claude Code Max Plan. YOLO."
me in the night: "keep reading baseten Inference engineering book..."
next day, 9am: "So anyway, about this DGX Spark…" 🔁
0 replies · 0 reposts · 2 likes · 92 views
Shivasurya@sshivasurya·
@allgarbled And those gradient colours, and every colorful div too 😂
0 replies · 0 reposts · 1 like · 555 views
Matthew Berman@MatthewBerman·
.@nvidia hand delivered a pre-production unit of the @Dell Pro Max with GB300 to my house. 100lbs beast with 750GB+ of unified memory to power the best open-source models in the world. What should I test first?
299 replies · 102 reposts · 1.9K likes · 252.6K views
Shivasurya@sshivasurya·
Opus 4.6 with the 1M context window on Claude Code was exciting at first, but beyond 300k tokens every request lags and slows down.
0 replies · 1 repost · 2 likes · 103 views
Pranit@Pranit·
Anthropic just pulled the oldest trick in SaaS pricing. I pay $200/mo for Claude Max. My limits have been noticeably worse this past week. Now they announce 2x off-peak usage for two weeks. Sounds generous.

But here's what actually happens: limits quietly drop, a temporary 2x makes the reduced limit feel normal, the promo ends, and you're left at a baseline lower than where you started. You just didn't notice the downgrade because the 2x absorbed the transition.

These AI plans are massively subsidized. The raw compute behind a heavy user costs multiples of the subscription price. Every move like this is the subsidy quietly correcting. Very sneaky, Anthropic.
Claude@claudeai

A small thank you to everyone using Claude: We’re doubling usage outside our peak hours for the next two weeks.

385 replies · 311 reposts · 7K likes · 1.2M views
Shivasurya@sshivasurya·
@ctbbpodcast Or you can use codepathfinder.dev as a tool for the LLM to drive when tracing the code path 😉 more deterministically. Having tried Gemini beyond its context window size, the results get poorer and it misses code paths.
0 replies · 0 reposts · 3 likes · 146 views
Critical Thinking - Bug Bounty Podcast
If you're running AI agents for hacking/research (you should), one of the coolest tips we got from dawgyg is to make them log what failed too. Brain dump it every few minutes to a file: everything it tried, everything that didn't work. Without it, agents loop back into the same dead ends and you won't notice until you've wasted a ton of time. If the agent cracks something and you only see the result, you've got nothing for next time. Also: never let them delete files.

For Chrome specifically, dawgyg only uses Gemini because it's the only model with enough context window to trace a code path across hundreds of files without losing state. RCA that used to take 1-2 hours manually now takes about 2 minutes with AI help.
1 reply · 16 reposts · 159 likes · 7.7K views
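The failure-logging tip above can be sketched in a few lines. This is a minimal sketch, not dawgyg's actual setup: `attempts.log`, `log_attempt`, and `known_dead_ends` are hypothetical names, and the idea is simply that failed attempts are appended to a file the agent re-reads before trying anything new.

```python
import json
import time
from pathlib import Path

LOG = Path("attempts.log")  # hypothetical log file the agent never deletes

def log_attempt(action: str, result: str, worked: bool) -> None:
    """Append every attempt -- especially failures -- as one JSON line,
    so the agent can re-read this file and avoid repeating dead ends."""
    entry = {"ts": time.time(), "action": action, "result": result, "worked": worked}
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def known_dead_ends() -> set[str]:
    """Load previously failed actions; inject these into the agent's prompt
    every few minutes so it doesn't loop back into the same dead ends."""
    if not LOG.exists():
        return set()
    entries = [json.loads(line) for line in LOG.read_text().splitlines()]
    return {e["action"] for e in entries if not e["worked"]}

# Example: record a failed probe, then confirm it is remembered
log_attempt("fuzz /api/v1/upload with oversized payload",
            "400, input capped server-side", worked=False)
assert "fuzz /api/v1/upload with oversized payload" in known_dead_ends()
```

Successful attempts are logged too but excluded from the dead-end set, so the agent keeps both the "what worked" trail and the "don't retry this" list.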
Dwayne@CtrlAltDwayne·
The best argument for Rust in 2026 is not memory safety or performance. It is that AI writes better Rust than it writes C++. The compiler feedback loop is so tight that models self-correct in real time. Every error message is a free training signal. Rust was accidentally designed for AI-assisted development 10 years before anyone knew that mattered.
110 replies · 172 reposts · 2.5K likes · 171K views
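The tight compiler feedback loop described above can be sketched as a generic check/revise cycle. Everything here is a hypothetical illustration, not anyone's actual tooling: `self_correct`, `check`, and `revise` are made-up names, and `cargo_check` is just one possible `check` implementation that shells out to the Rust compiler.

```python
import subprocess

def cargo_check(project_dir: str) -> str:
    """One possible `check` function: run the Rust compiler and return
    its diagnostics (empty string means a clean build)."""
    proc = subprocess.run(
        ["cargo", "check", "--message-format=short"],
        cwd=project_dir, capture_output=True, text=True,
    )
    return "" if proc.returncode == 0 else proc.stderr

def self_correct(source: str, check, revise, max_rounds: int = 5) -> str:
    """Tight feedback loop: compile, feed the error text back to the model,
    retry. `check(source)` returns error text or "" when clean;
    `revise(source, errors)` returns a new candidate source."""
    for _ in range(max_rounds):
        errors = check(source)
        if not errors:
            return source  # compiles cleanly, loop terminates
        source = revise(source, errors)  # each error message is a free signal
    return source
```

The point of the sketch is the shape of the loop: because Rust's compiler rejects so much incorrect code with specific messages, each iteration gives the model concrete, machine-checkable feedback rather than silent misbehavior at runtime.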
Shivasurya@sshivasurya·
@49agents Exactly! Writing a tech spec and being more specific is the way to go!
0 replies · 0 reposts · 2 likes · 25 views
49 Agents - Agentic Coding IDE
@sshivasurya claude code overnight is the new alpha. specs before sleep, wake up to implemented and tested code. the workflow shift is real - writing the spec well matters more than writing the code
1 reply · 0 reposts · 1 like · 20 views
Shivasurya@sshivasurya·
Spent the evening reviewing a tech spec, scrutinizing every detail, and splitting it into stacked PRs. Woke up to all of them implemented and tested. 🤯 Claude Code overnight is the new **alpha**?
1 reply · 1 repost · 2 likes · 164 views
Shivasurya retweeted
alphaXiv@askalphaxiv·
If doomscrolling X is part of your research workflow, we built something for you. Introducing Paperscrolling 🚀 Get the most trending research with key ideas, figures, and audio explanations from alphaXiv Briefs
63 replies · 126 reposts · 1.4K likes · 82K views
Shivasurya@sshivasurya·
@ycocerious Curating all the edge cases, working on another tech spec, or reading a book!
0 replies · 0 reposts · 2 likes · 15 views
punarv@ycocerious·
What do you guys do while your claude code is running?
501 replies · 2 reposts · 293 likes · 59K views
Shivasurya@sshivasurya·
@AaronCQL Thanks for sharing! Will give it a try.
0 replies · 0 reposts · 1 like · 498 views
AaronCQL@AaronCQL·
Spent an hour with pencil.dev and I'm sold. If you're an engineer who has strong design opinions but zero design skills (ie. me), this is your tool. Free, runs on all platforms, uses your own claude sub with no setup, and the built-in prompts actually work great.
42 replies · 100 reposts · 2.5K likes · 270.3K views
Thariq@trq212·
We just added /btw to Claude Code! Use it to have side chain conversations while Claude is working.
1.2K replies · 1.6K reposts · 26K likes · 2.7M views
Shivasurya@sshivasurya·
@claudeai If it costs $25 per PR review, RIP code review startups 😆.
0 replies · 0 reposts · 2 likes · 106 views
Claude@claudeai·
Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.
2.1K replies · 5.2K reposts · 62.9K likes · 22.7M views
Shivasurya retweeted
Andrej Karpathy@karpathy·
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor… Part code, part sci-fi, and a pinch of psychosis :)
[tweet image]
1K replies · 3.6K reposts · 28.2K likes · 10.8M views
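The autonomous loop Karpathy describes can be sketched abstractly. This is a hypothetical stand-in, not code from the actual repo: `autoresearch_loop` and its three callbacks are made-up names capturing the shape of the workflow (agent edits the script, a fixed-budget run scores it, only improvements get committed).

```python
def autoresearch_loop(agent_edit, run_training, commit, rounds: int = 10) -> float:
    """Sketch of the outer loop: the agent mutates the training script,
    each candidate gets a fixed-budget (~5 min) training run, and only
    improvements in validation loss get committed to the feature branch."""
    best = float("inf")
    for i in range(rounds):
        agent_edit("train.py")               # AI iterates on the .py
        val_loss = run_training("train.py")  # one complete 5-minute run
        if val_loss < best:
            best = val_loss
            commit(f"round {i}: val loss {val_loss:.4f}")  # e.g. `git commit -am ...`
    return best
```

Here `commit` might shell out to git on the feature branch, while the human's half of the loop is editing the prompt (.md) that steers `agent_edit`; comparing `best` across different prompts or agents is what makes the runs comparable.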
Eric Zhang@ekzhang1·
By popular request, next month at NYSRG is on program analysis. Maybe you use fuzzers or sanitizers, but how do they work? How about symbolic execution and formal verification? Some cool and useful tools that help people build the most complex systems :) notes.ekzhang.com/events/nysrg
3 replies · 4 reposts · 117 likes · 16.8K views
Shivasurya@sshivasurya·
@jordanreviewsit Same with Apple: make them use Finder search and Spotlight search to find an app 😉
0 replies · 0 reposts · 1 like · 254 views
Jordanreviewsittt@jordanreviewsit·
Make the Microsoft CEO search for an email on Outlook
29 replies · 159 reposts · 2.2K likes · 115.7K views
Karan Bansal@karanb192·
@SebAaltonen Totally agree on tooling leverage. I wrote the original deep-dive on this (that diagram is from my blog) and the token efficiency gains are just as significant as the speed: ~75% fewer tokens on semantic queries. Full breakdown with benchmarks: karanbansal.in/blog/claude-co…
1 reply · 0 reposts · 3 likes · 380 views