Varun Singh

9.8K posts

@vr000m

@trydaily @pipecat_ai. ex-CEO @callstatsio acq’d by $eght. earlier multimedia protocols and video. Focus on growth, revenue. 🇺🇸🇫🇮🇮🇳

San Francisco, CA · Joined August 2007
2.3K Following · 1.6K Followers
Varun Singh reposted
Linus ✦ Ekenstam@LinusEkenstam·
Google just took a massive stab at Figma. Unpopular take: Figma is still the GOAT. But this clearly creates a massive crater, a void that will re-shuffle the map. Where will entry-level designers go? $10.000/M designers, gone. Everyone gets better design? We're accelerating
Stitch by Google@stitchbygoogle

Meet the new Stitch, your vibe design partner. Here are 5 major upgrades to help you create, iterate and collaborate:
🎨 AI-Native Canvas
🧠 Smarter Design Agent
🎙️ Voice
⚡️ Instant Prototypes
📐 Design Systems and DESIGN.md
Rolling out now. Details and product walkthrough video in 🧵

45 replies · 18 reposts · 201 likes · 63.7K views
Varun Singh@vr000m·
Agree on router > monolith. We have skills that chain together: /dev-plan -> /review-plan -> /fan-out -> /deep-review -> /update-docs, each invokable by the others. It's an infinite loop: build a skill, incorporate it into existing ones, repeat. The /deep-review skill was reworked after gstack dropped; before that it just invoked /security and /security-review in a subagent. Based on the latest interaction, I feel Claude itself prefers smaller skills. In reviews it asked me: do you want all of this in one pass, or should I provide the initial output for the model to act on? The Claude and Codex skills are here: github.com/vr000m/skills.…
dex@dexhorthy

Tried plan-review-ceo from gstack yesterday. I'm not sure if this is good or bad, intentional or not, but when I felt like pushing back on the agent*, something in my brain felt like I was arguing with Garry directly 🤣

Anyways, milestone 1 of a big feature is shipping with RPI/QRSPI + Gstack today; will report back.

* (which @garrytan had stated is part of the process: "your job is to know when the model is gassing you up and call it out" or something)

I have some technical concerns about the sheer volume of instructions in the prompt and the amount of adherence you will actually get (@0xblacklight cited an interesting arxiv paper in the post linked below). I think we might be better served by a router that routes to specific modes, rather than explaining every single mode in a single monolithic prompt, but there are tradeoffs to consider in plumbing and UX for the end user.

I think some may complain that it's overly verbose and thoughtful and brings up things that are irrelevant, but I actually think that's good. I want a clean braindump of everything that might be relevant so I can edit and prune down to just what's important.
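Not the actual implementation from either thread, but the router-over-monolith idea can be sketched as a tiny dispatch table where each small skill optionally names the next one. The skill names mirror the chain above; the handler bodies are placeholder assumptions:

```python
# Minimal sketch of a skill router: each skill is a small handler that
# mutates shared context and may return the name of the next skill,
# instead of one monolithic prompt describing every mode.
SKILLS = {}

def skill(name):
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("/dev-plan")
def dev_plan(ctx):
    ctx["plan"] = f"plan for {ctx['task']}"  # placeholder body
    return "/review-plan"                    # chain to the next skill

@skill("/review-plan")
def review_plan(ctx):
    ctx["review"] = "looks good"             # placeholder body
    return None                              # chain ends here

def route(entry, ctx):
    step = entry
    while step is not None:                  # follow the chain until a skill stops
        step = SKILLS[step](ctx)
    return ctx

result = route("/dev-plan", {"task": "add feature"})
```

The router only needs to know the entry point; each skill stays small and individually invokable, which matches the "build a skill, incorporate it into existing ones, repeat" loop.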

0 replies · 0 reposts · 1 like · 89 views
Varun Singh reposted
Jean P.D. Meijer ― 🇪🇺 eu/acc
EU Inc. proposal right now:
- max. 48 hrs and €100 to incorporate
- registration through a common EU portal, automatic tax registration
- fully digital process, no notaries
- EU employee stock option plans, taxed only when sold
- simplified insolvency procedures
European Commission@EU_Commission

We are introducing EU Inc. To make building and growing a business across the EU faster, simpler, and smarter.
🔸 Start a company in less than 48 hours
🔸 No minimum capital requirement
🔸 Fully online and borderless

138 replies · 191 reposts · 3.6K likes · 660.9K views
Varun Singh@vr000m·
My code reviewer today: Bohr, Fermat, Hegel, and Singer. No pressure or anything.
Varun Singh tweet media
0 replies · 0 reposts · 0 likes · 53 views
Varun Singh reposted
kwindla@kwindla·
Come by and see @EvanGrenda at the AWS booth at GTC. @tavus video avatars, voice agents built with NVIDIA Nemotron models, and new realtime AI architecture patterns in @pipecat_ai!
kwindla tweet media
1 reply · 2 reposts · 8 likes · 1.3K views
Varun Singh reposted
Latent.Space@latentspacepod·
🆕 Claude Cowork, Skills, and the Future of AI Coworkers latent.space/p/felix-anthro… @felixrieseberg has spent years working at the interface layer, from Electron and the Slack desktop app to now helping build @claudeai Cowork. In this episode, Felix explains why execution is getting so cheap that teams can “build all the candidates,” why Anthropic is betting on local-first agent workflows, and why the future of AI products may belong less to chatbots and more to systems that can actually do knowledge work.
12 replies · 11 reposts · 98 likes · 56.8K views
Varun Singh@vr000m·
These days, finding a new skills.md feels like trying a new mashup. I'm thinking of running this on something like mediasoup or Pion to see how many of the RFCs are reproduced
Tobi Lehman@tlehmanifold

@dexhorthy this is why I created the literate programming skill: github.com/tlehman/litpro… you run /literate-programming in Claude Code and it ingests the code, produces a project.lit.md, then tangles it back to source code, and also produces a beautiful PDF to print out and study every line
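The "tangle" step described above (turning the literate document back into source files) can be sketched in a few lines. This is not litpro's actual mechanism; the `python:path` fence annotation is an assumed convention for mapping each code block to its target file:

```python
# Hedged sketch of literate-programming "tangle": extract annotated
# code fences from a .lit.md document and reassemble source files.
import re

FENCE = "`" * 3  # triple backtick, built programmatically for clarity

# Assumed annotation format: a fence opening like  ```python:app.py
PATTERN = re.compile(FENCE + r"python:(\S+)\n(.*?)" + FENCE, re.S)

def tangle(lit_md: str) -> dict:
    """Map each annotated code fence back to its target file's contents."""
    files = {}
    for path, body in PATTERN.findall(lit_md):
        files[path] = files.get(path, "") + body  # concatenate blocks per file
    return files

doc = "Intro prose.\n" + FENCE + "python:app.py\nprint('hi')\n" + FENCE + "\nMore prose.\n"
sources = tangle(doc)
```

Blocks sharing a path concatenate in document order, which is the classic literate-programming behavior of weaving scattered fragments into one file.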

1 reply · 0 reposts · 1 like · 114 views
Tobi Lehman@tlehmanifold·
@vr000m RFC finder skill would be awesome, could link to it instead of reproducing it. Is that what you had in mind?
1 reply · 0 reposts · 0 likes · 76 views
Varun Singh@vr000m·
For the technically curious: we use @trychroma's ChromaDB + SQLite FTS5 with reciprocal rank fusion. Local all-MiniLM-L6-v2 embeddings for docs parsing. AST extraction for class definitions, method signatures, and full function bodies. 16K indexed chunks. github.com/vr000m/pipecat…
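The reciprocal rank fusion step mentioned above is simple enough to sketch. This is a generic RRF implementation, not the repo's actual code; the document IDs and the conventional k=60 constant are illustrative assumptions:

```python
# Hedged sketch of reciprocal rank fusion (RRF): merge ranked result
# lists from two retrievers, e.g. ChromaDB vector search and SQLite
# FTS5 keyword search. score(d) = sum over lists of 1 / (k + rank).

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs into one ranking by RRF score."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk_a", "chunk_b", "chunk_c"]   # assumed vector-search order
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]  # assumed FTS5 order
fused = rrf_fuse([vector_hits, keyword_hits])
# chunk_b ranks first: it scores well in both lists, even though
# neither retriever ranked it #1 by itself.
```

RRF only needs ranks, not raw scores, which is why it works well for fusing embedding distances with BM25-style keyword scores that live on different scales.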
0 replies · 0 reposts · 0 likes · 52 views
Varun Singh@vr000m·
@pipecat_ai releases updates every week. We also write a lot of sample code for our customers. I realised that coding agents end up grepping through .venv or spend cycles correcting the code after implementation, i.e., during testing they realise something worked differently from what was expected.

We built pipecat-context-hub for @pipecat_ai, an MCP server giving coding agents structured access to Pipecat's framework docs, example code, and AST-indexed source. We use it daily. Anytime the coding agent starts grepping through .venv, that is input for improvement. The initial lesson: just providing the latest docs was not enough.

Two real examples from this week's sessions. An agent researching Pipecat's context summarization API used search_api and get_code_snippet to build a reference doc. The tools surfaced that LLMContextSummarizationConfig is deprecated in favor of LLMAutoContextSummarizationConfig (v0.0.104+); without that, the agent would have used the old class name and had to fix deprecation warnings during testing. Exactly @AndrewYNg's point about outdated APIs. In another session, an agent was estimating LLM token usage by dividing character count by 4. Six MCP calls traced the real metrics path. The key finding was that MetricsFrame is emitted for every LLM call, including tool follow-ups; without that information the agent would have undercounted tokens. (Summarized by Claude looking at the jsonl.)

My pipecat-context-hub indexes code beyond the core pipecat-ai repo. I found that adding additional examples helps build faster. My local copy has indexed 12 repos (added to `PIPECAT_HUB_EXTRA_REPOS=` in the .env). And because Pipecat's API evolves fast, the db index tracks commit SHAs and doc content hashes. Agents get recent code, with older examples as fallback. We could potentially use the database to find stale examples and unused code paths, and generate new examples.
Varun Singh tweet media
Andrew Ng@AndrewYNg

I'm excited to announce Context Hub, an open tool that gives your coding agent the up-to-date API documentation it needs. Install it and prompt your agent to use it to fetch curated docs via a simple CLI. (See image.)

Why this matters: Coding agents often use outdated APIs and hallucinate parameters. For example, when I ask Claude Code to call OpenAI's GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this.

Context Hub is also designed to get smarter over time. Agents can annotate docs with notes: if your agent discovers a workaround, it can save it and doesn't have to rediscover it next session. Longer term, we're building toward agents sharing what they learn with each other, so the whole community benefits.

Thanks Rohit Prsad and Xin Ye for working with me on this!

npm install -g @aisuite/chub
GitHub: github.com/andrewyng/cont…
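The token-undercounting failure described in the thread is easy to illustrate. This is not Pipecat's actual API; the dict-shaped frames and numbers below are assumptions standing in for per-call metrics events:

```python
# Hedged sketch: why summing per-call metrics beats the chars/4
# heuristic. A tool call triggers a follow-up LLM invocation, so one
# user turn can emit multiple metrics events (here: plain dicts as
# stand-ins for Pipecat's MetricsFrame).

def estimate_tokens_naive(transcript: str) -> int:
    # common rule of thumb: ~4 characters per token; ignores the fact
    # that tool follow-ups re-send the growing prompt to the model
    return len(transcript) // 4

def total_tokens_from_metrics(metrics_frames) -> int:
    # one metrics event per LLM call, *including* tool-call follow-ups
    return sum(f["prompt_tokens"] + f["completion_tokens"] for f in metrics_frames)

frames = [
    {"prompt_tokens": 900, "completion_tokens": 40},   # initial call -> tool call
    {"prompt_tokens": 980, "completion_tokens": 120},  # follow-up with tool result
]

naive = estimate_tokens_naive("x" * 4000)   # transcript-based guess: 1000
actual = total_tokens_from_metrics(frames)  # metrics-based total: 2040
```

The gap grows with every tool round-trip, since each follow-up call re-submits the accumulated context; only the per-call metrics path captures that.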

1 reply · 0 reposts · 1 like · 91 views
Varun Singh reposted
kwindla@kwindla·
NVIDIA Nemotron 3 Super launches today! We've been building voice agents with Super's pre-release checkpoints and running all our various tests and benchmarks.

Nemotron 3 Super matches both GPT-5.4 and GPT-4.1 in tool calling and instruction following performance on our realtime conversation, long context, real-world benchmarks. GPT-4.1 is the most widely used LLM today for production voice agents. So an open model that performs as well as GPT-4.1 on hard, voice-specific benchmarks is a big deal.

(Side note: we don't think a benchmark "tells the story" about a model's voice agent performance unless it tests model correctness across at least 20 human/agent conversation turns.)

The Nemotron models are *fully* open: weights, data sets, training code, inference code. Nemotron 3 Super is 120B params, with a hybrid Mamba-Transformer MoE architecture for efficient inference. You can run it on NVIDIA data center hardware or on a DGX Spark mini-desktop machine. 1M token context.

Blog post with full benchmarks, thinking budget notes, inference setup on @Modal, and where we think this goes next. 👇
kwindla tweet media
13 replies · 34 reposts · 231 likes · 19.3K views
Varun Singh reposted
Hume AI@hume_ai·
Today we're releasing our first open source TTS model, TADA! TADA (Text Audio Dual Alignment) is a speech-language model that generates text and audio in one synchronized stream to reduce token-level hallucinations and improve latency. This means:
→ Zero content hallucinations across 1,000+ test samples
→ 5x faster than similar-grade LLM-based TTS
→ Fits much longer audio: 2,048 tokens cover ~700 seconds with TADA vs. ~70 seconds in conventional systems
→ Free transcript alongside audio with no added latency
98 replies · 312 reposts · 2.9K likes · 256.4K views
Varun Singh reposted
Andrej Karpathy@karpathy·
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.

github.com/karpathy/autor…

Part code, part sci-fi, and a pinch of psychosis :)
Andrej Karpathy tweet media
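The autonomous loop described above is, at its core, hill climbing on validation loss with git commits as the accept step. This is a toy sketch, not the autoresearch code: `train_and_eval` stands in for a full 5-minute training run, and `propose_edit` stands in for the agent editing the training script:

```python
# Hedged sketch of the outer "autoresearch" loop: propose an edit,
# run a training job, keep ("commit") the change only if validation
# loss improved. All functions are toy stand-ins.
import random

def train_and_eval(config):
    # stand-in for one full training run; pretend optimal lr is 0.001
    return (config["lr"] - 0.001) ** 2 + 1.0

def propose_edit(config, rng):
    # stand-in for the agent mutating the training script's settings
    new = dict(config)
    new["lr"] = max(1e-5, config["lr"] + rng.uniform(-5e-4, 5e-4))
    return new

def research_loop(steps=50, seed=0):
    rng = random.Random(seed)
    best = {"lr": 3e-3}                     # initial training script
    best_loss = train_and_eval(best)
    for _ in range(steps):                  # each dot = one complete run
        candidate = propose_edit(best, rng)
        loss = train_and_eval(candidate)
        if loss < best_loss:                # "git commit" only improvements
            best, best_loss = candidate, loss
    return best_loss

final_loss = research_loop()
```

Comparing prompts or agents then amounts to comparing how fast `best_loss` falls per unit of compute, which is what the dot plot in the image visualizes.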
1K replies · 3.6K reposts · 28.2K likes · 10.8M views
Varun Singh reposted
Awni Hannun@awnihannun·
According to benchmarks Qwen3.5 4B is as good as GPT 4o. GPT 4o came out ~2 years ago (May 2024). Qwen 3.5 4B runs easily on modern mobile devices. So the gap between frontier intelligence in a datacenter and running a model of equal quality on your iPhone could be 2-3 years. (Probably closer to 3 assuming Qwen3.5 4B is more benchmaxxed than 4o) I don't expect the trend of increasing intelligence-per-watt to change. So in 2-3 years it's plausible we will be running GPT 5.x quality models on an iPhone. Pretty wild.
125 replies · 150 reposts · 2K likes · 198K views
Varun Singh reposted
Addy Osmani@addyosmani·
Introducing the Google Workspace CLI: github.com/googleworkspac… - built for humans and agents. Google Drive, Gmail, Calendar, and every Workspace API. 40+ agent skills included.
654 replies · 1.6K reposts · 15K likes · 5.4M views
Varun Singh reposted
Aida Baradari@aidaxbaradari·
Today, we're introducing Spectre I, the first smart device to stop unwanted audio recordings. We live in a world of always-on listening devices. Smart devices and AI dominate our world in business and private conversations. With Deveillance, you will @be_inaudible.
1.1K replies · 5K reposts · 42.5K likes · 4.4M views
otso veistera@OtsoVeistera·
You're wasting half your context window. We’re launching @thetokenco (YC W26) today. We compress LLM inputs before they reach the model. Fewer tokens, lower cost, faster inference. Models also perform better. In customer case studies we’ve seen a +5% lift in user purchases due to higher preference for outputs from compressed prompts. The API is live. Link in the comments
76 replies · 57 reposts · 507 likes · 91.5K views