Varun Singh

9.8K posts

@vr000m

@trydaily @pipecat_ai. ex-CEO @callstatsio acq’d by $eght. earlier multimedia protocols and video. Focus on growth, revenue. 🇺🇸🇫🇮🇮🇳

San Francisco, CA · Joined August 2007
2.3K Following · 1.6K Followers
Varun Singh reposted
Linus ✦ Ekenstam@LinusEkenstam·
Google just took a massive stab at Figma. Unpopular take: Figma is still the GOAT. But this clearly creates a massive crater, a void that will re-shuffle the map. Where will entry-level designers go? $10.000/M designers, gone. Everyone gets better design? We're accelerating
Stitch by Google@stitchbygoogle

Meet the new Stitch, your vibe design partner. Here are 5 major upgrades to help you create, iterate and collaborate:
🎨 AI-Native Canvas
🧠 Smarter Design Agent
🎙️ Voice
⚡️ Instant Prototypes
📐 Design Systems and DESIGN.md
Rolling out now. Details and product walkthrough video in 🧵

45 replies · 18 reposts · 201 likes · 63.7K views
Varun Singh@vr000m·
Agree on router > monolith. We have skills that chain together: /dev-plan -> /review-plan -> /fan-out -> /deep-review -> /update-docs, each invokable by the others. It's an infinite loop: build a skill, incorporate it into existing ones, repeat. The /deep-review skill was reworked after gstack dropped; before that it just invoked /security and /security-review in a subagent. Based on the latest interaction, I feel Claude itself prefers smaller skills. In reviews it asked me: do you want all of this in one pass, or should I provide the initial output for the model to act on? The Claude and Codex skills are here: github.com/vr000m/skills.…
dex@dexhorthy

Tried plan-review-ceo from gstack yesterday. I'm not sure if this is good or bad, intentional or not, but when I felt like pushing back on the agent*, something in my brain felt like I was arguing with Garry directly 🤣

Anyways, milestone 1 of a big feature is shipping with RPI/QRSPI + Gstack today; will report back.

* (which @garrytan had stated is part of the process: "your job is to know when the model is gassing you up and call it out" or something)

I have some technical concerns about the sheer volume of instructions in the prompt and the amount of adherence you will actually get (@0xblacklight cited an interesting arxiv paper in the post linked below). I think we might be better served by a router that routes to specific modes, rather than explaining every single mode in a single monolithic prompt, but there are tradeoffs to consider in plumbing and UX for the end user.

I think some may complain that it's overly verbose and thoughtful and brings up things that are irrelevant, but I actually think that's good. I want a clean braindump of everything that might be relevant so I can edit and prune down to just what's important.
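Not the actual implementation from either thread, but the router-over-monolith idea can be sketched as a tiny dispatch table where each small skill optionally names the next one. The skill names mirror the chain above; the handler bodies are placeholder assumptions:

```python
# Minimal sketch of a skill router: each skill is a small handler that
# mutates shared context and may return the name of the next skill,
# instead of one monolithic prompt describing every mode.
SKILLS = {}

def skill(name):
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("/dev-plan")
def dev_plan(ctx):
    ctx["plan"] = f"plan for {ctx['task']}"  # placeholder body
    return "/review-plan"                    # chain to the next skill

@skill("/review-plan")
def review_plan(ctx):
    ctx["review"] = "looks good"             # placeholder body
    return None                              # chain ends here

def route(entry, ctx):
    step = entry
    while step is not None:                  # follow the chain until a skill stops
        step = SKILLS[step](ctx)
    return ctx

result = route("/dev-plan", {"task": "add feature"})
```

The router only needs to know the entry point; each skill stays small and individually invokable, which matches the "build a skill, incorporate it into existing ones, repeat" loop.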

0 replies · 0 reposts · 1 like · 89 views
Varun Singh reposted
Jean P.D. Meijer ― 🇪🇺 eu/acc
EU Inc. proposal right now:
- max. 48 hrs and €100 to incorporate
- registration through a common EU portal, automatic tax registration
- fully digital process, no notaries
- EU employee stock option plans, taxed only when sold
- simplified insolvency procedures
European Commission@EU_Commission

We are introducing EU Inc. To make building and growing a business across the EU faster, simpler, and smarter.
🔸 Start a company in less than 48 hours
🔸 No minimum capital requirement
🔸 Fully online and borderless

138 replies · 191 reposts · 3.6K likes · 660.9K views
Varun Singh@vr000m·
My code reviewer today: Bohr, Fermat, Hegel, and Singer. No pressure or anything.
Varun Singh tweet media
0 replies · 0 reposts · 0 likes · 53 views
Varun Singh reposted
kwindla@kwindla·
Come by and see @EvanGrenda at the AWS booth at GTC. @tavus video avatars, voice agents built with NVIDIA Nemotron models, and new realtime AI architecture patterns in @pipecat_ai!
kwindla tweet media
1 reply · 2 reposts · 8 likes · 1.3K views
Varun Singh reposted
Latent.Space@latentspacepod·
🆕 Claude Cowork, Skills, and the Future of AI Coworkers latent.space/p/felix-anthro… @felixrieseberg has spent years working at the interface layer, from Electron and the Slack desktop app to now helping build @claudeai Cowork. In this episode, Felix explains why execution is getting so cheap that teams can “build all the candidates,” why Anthropic is betting on local-first agent workflows, and why the future of AI products may belong less to chatbots and more to systems that can actually do knowledge work.
12 replies · 11 reposts · 98 likes · 56.8K views
Varun Singh@vr000m·
These days, finding a new skills.md feels like trying a new mashup. I'm thinking of running this on something like mediasoup or Pion to see how many of the RFCs are reproduced
Tobi Lehman@tlehmanifold

@dexhorthy this is why I created the literate programming skill: github.com/tlehman/litpro… you run /literate-programming in Claude Code and it ingests the code, produces a project.lit.md, then tangles it back to source code, and also produces a beautiful PDF to print out and study every line
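The "tangle" step described above (turning the literate document back into source files) can be sketched in a few lines. This is not litpro's actual mechanism; the `python:path` fence annotation is an assumed convention for mapping each code block to its target file:

```python
# Hedged sketch of literate-programming "tangle": extract annotated
# code fences from a .lit.md document and reassemble source files.
import re

FENCE = "`" * 3  # triple backtick, built programmatically for clarity

# Assumed annotation format: a fence opening like  ```python:app.py
PATTERN = re.compile(FENCE + r"python:(\S+)\n(.*?)" + FENCE, re.S)

def tangle(lit_md: str) -> dict:
    """Map each annotated code fence back to its target file's contents."""
    files = {}
    for path, body in PATTERN.findall(lit_md):
        files[path] = files.get(path, "") + body  # concatenate blocks per file
    return files

doc = "Intro prose.\n" + FENCE + "python:app.py\nprint('hi')\n" + FENCE + "\nMore prose.\n"
sources = tangle(doc)
```

Blocks sharing a path concatenate in document order, which is the classic literate-programming behavior of weaving scattered fragments into one file.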

1 reply · 0 reposts · 1 like · 114 views
Tobi Lehman@tlehmanifold·
@vr000m RFC finder skill would be awesome, could link to it instead of reproducing it. Is that what you had in mind?
1 reply · 0 reposts · 0 likes · 76 views
Varun Singh@vr000m·
For the technically curious: we use @trychroma's ChromaDB + SQLite FTS5 with reciprocal rank fusion. Local all-MiniLM-L6-v2 embeddings for docs parsing. AST extraction for class definitions, method signatures, and full function bodies. 16K indexed chunks. github.com/vr000m/pipecat…
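The reciprocal rank fusion step mentioned above is simple enough to sketch. This is a generic RRF implementation, not the repo's actual code; the document IDs and the conventional k=60 constant are illustrative assumptions:

```python
# Hedged sketch of reciprocal rank fusion (RRF): merge ranked result
# lists from two retrievers, e.g. ChromaDB vector search and SQLite
# FTS5 keyword search. score(d) = sum over lists of 1 / (k + rank).

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs into one ranking by RRF score."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk_a", "chunk_b", "chunk_c"]   # assumed vector-search order
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]  # assumed FTS5 order
fused = rrf_fuse([vector_hits, keyword_hits])
# chunk_b ranks first: it scores well in both lists, even though
# neither retriever ranked it #1 by itself.
```

RRF only needs ranks, not raw scores, which is why it works well for fusing embedding distances with BM25-style keyword scores that live on different scales.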
0 replies · 0 reposts · 0 likes · 52 views
Varun Singh@vr000m·
@pipecat_ai releases updates every week. We also write a lot of sample code for our customers. I realised that coding agents end up grepping through .venv or spend cycles correcting the code after implementation, i.e., during testing they realise something worked differently from what was expected.

We built pipecat-context-hub for @pipecat_ai, an MCP server giving coding agents structured access to Pipecat's framework docs, example code, and AST-indexed source. We use it daily. Anytime the coding agent starts grepping through .venv, that is input for improvement. The initial lesson: just providing the latest docs was not enough.

Two real examples from this week's sessions. An agent researching Pipecat's context summarization API used search_api and get_code_snippet to build a reference doc. The tools surfaced that LLMContextSummarizationConfig is deprecated in favor of LLMAutoContextSummarizationConfig (v0.0.104+); without that, the agent would have used the old class name and had to fix deprecation warnings during testing. Exactly @AndrewYNg's point about outdated APIs. In another session, an agent was estimating LLM token usage by dividing character count by 4. Six MCP calls traced the real metrics path. The key finding was that MetricsFrame is emitted for every LLM call, including tool follow-ups; without that information the agent would have undercounted tokens. (Summarized by Claude looking at the jsonl.)

My pipecat-context-hub indexes code beyond the core pipecat-ai repo. I found that adding additional examples helps build faster. My local copy has indexed 12 repos (added to `PIPECAT_HUB_EXTRA_REPOS=` in the .env). And because Pipecat's API evolves fast, the db index tracks commit SHAs and doc content hashes. Agents get recent code, with older examples as fallback. We could potentially use the database to find stale examples and unused code paths, and generate new examples.
Varun Singh tweet media
Andrew Ng@AndrewYNg

I'm excited to announce Context Hub, an open tool that gives your coding agent the up-to-date API documentation it needs. Install it and prompt your agent to use it to fetch curated docs via a simple CLI. (See image.)

Why this matters: Coding agents often use outdated APIs and hallucinate parameters. For example, when I ask Claude Code to call OpenAI's GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this.

Context Hub is also designed to get smarter over time. Agents can annotate docs with notes: if your agent discovers a workaround, it can save it and doesn't have to rediscover it next session. Longer term, we're building toward agents sharing what they learn with each other, so the whole community benefits.

Thanks Rohit Prsad and Xin Ye for working with me on this!

npm install -g @aisuite/chub
GitHub: github.com/andrewyng/cont…
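The token-undercounting failure described in the thread is easy to illustrate. This is not Pipecat's actual API; the dict-shaped frames and numbers below are assumptions standing in for per-call metrics events:

```python
# Hedged sketch: why summing per-call metrics beats the chars/4
# heuristic. A tool call triggers a follow-up LLM invocation, so one
# user turn can emit multiple metrics events (here: plain dicts as
# stand-ins for Pipecat's MetricsFrame).

def estimate_tokens_naive(transcript: str) -> int:
    # common rule of thumb: ~4 characters per token; ignores the fact
    # that tool follow-ups re-send the growing prompt to the model
    return len(transcript) // 4

def total_tokens_from_metrics(metrics_frames) -> int:
    # one metrics event per LLM call, *including* tool-call follow-ups
    return sum(f["prompt_tokens"] + f["completion_tokens"] for f in metrics_frames)

frames = [
    {"prompt_tokens": 900, "completion_tokens": 40},   # initial call -> tool call
    {"prompt_tokens": 980, "completion_tokens": 120},  # follow-up with tool result
]

naive = estimate_tokens_naive("x" * 4000)   # transcript-based guess: 1000
actual = total_tokens_from_metrics(frames)  # metrics-based total: 2040
```

The gap grows with every tool round-trip, since each follow-up call re-submits the accumulated context; only the per-call metrics path captures that.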

1 reply · 0 reposts · 1 like · 91 views
Varun Singh reposted
kwindla@kwindla·
NVIDIA Nemotron 3 Super launches today! We've been building voice agents with Super's pre-release checkpoints and running all our various tests and benchmarks.

Nemotron 3 Super matches both GPT-5.4 and GPT-4.1 in tool calling and instruction following performance on our realtime conversation, long context, real-world benchmarks. GPT-4.1 is the most widely used LLM today for production voice agents. So an open model that performs as well as GPT-4.1 on hard, voice-specific benchmarks is a big deal.

(Side note: we don't think a benchmark "tells the story" about a model's voice agent performance unless it tests model correctness across at least 20 human/agent conversation turns.)

The Nemotron models are *fully* open: weights, data sets, training code, inference code. Nemotron 3 Super is 120B params, with a hybrid Mamba-Transformer MoE architecture for efficient inference. You can run it on NVIDIA data center hardware or on a DGX Spark mini-desktop machine. 1M token context.

Blog post with full benchmarks, thinking budget notes, inference setup on @Modal, and where we think this goes next. 👇
kwindla tweet media
13 replies · 34 reposts · 231 likes · 19.3K views
Varun Singh reposted
Hume AI@hume_ai·
Today we're releasing our first open source TTS model, TADA! TADA (Text Audio Dual Alignment) is a speech-language model that generates text and audio in one synchronized stream to reduce token-level hallucinations and improve latency. This means:
→ Zero content hallucinations across 1,000+ test samples
→ 5x faster than similar-grade LLM-based TTS
→ Fits much longer audio: 2,048 tokens cover ~700 seconds with TADA vs. ~70 seconds in conventional systems
→ Free transcript alongside audio with no added latency
98 replies · 312 reposts · 2.9K likes · 256.4K views
Varun Singh reposted
Andrej Karpathy@karpathy·
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.

github.com/karpathy/autor…

Part code, part sci-fi, and a pinch of psychosis :)
Andrej Karpathy tweet media
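The autonomous loop described above is, at its core, hill climbing on validation loss with git commits as the accept step. This is a toy sketch, not the autoresearch code: `train_and_eval` stands in for a full 5-minute training run, and `propose_edit` stands in for the agent editing the training script:

```python
# Hedged sketch of the outer "autoresearch" loop: propose an edit,
# run a training job, keep ("commit") the change only if validation
# loss improved. All functions are toy stand-ins.
import random

def train_and_eval(config):
    # stand-in for one full training run; pretend optimal lr is 0.001
    return (config["lr"] - 0.001) ** 2 + 1.0

def propose_edit(config, rng):
    # stand-in for the agent mutating the training script's settings
    new = dict(config)
    new["lr"] = max(1e-5, config["lr"] + rng.uniform(-5e-4, 5e-4))
    return new

def research_loop(steps=50, seed=0):
    rng = random.Random(seed)
    best = {"lr": 3e-3}                     # initial training script
    best_loss = train_and_eval(best)
    for _ in range(steps):                  # each dot = one complete run
        candidate = propose_edit(best, rng)
        loss = train_and_eval(candidate)
        if loss < best_loss:                # "git commit" only improvements
            best, best_loss = candidate, loss
    return best_loss

final_loss = research_loop()
```

Comparing prompts or agents then amounts to comparing how fast `best_loss` falls per unit of compute, which is what the dot plot in the image visualizes.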
1K replies · 3.6K reposts · 28.2K likes · 10.8M views
Varun Singh reposted
Awni Hannun@awnihannun·
According to benchmarks Qwen3.5 4B is as good as GPT 4o. GPT 4o came out ~2 years ago (May 2024). Qwen 3.5 4B runs easily on modern mobile devices. So the gap between frontier intelligence in a datacenter and running a model of equal quality on your iPhone could be 2-3 years. (Probably closer to 3 assuming Qwen3.5 4B is more benchmaxxed than 4o) I don't expect the trend of increasing intelligence-per-watt to change. So in 2-3 years it's plausible we will be running GPT 5.x quality models on an iPhone. Pretty wild.
125 replies · 150 reposts · 2K likes · 198K views
Varun Singh reposted
Addy Osmani@addyosmani·
Introducing the Google Workspace CLI: github.com/googleworkspac… - built for humans and agents. Google Drive, Gmail, Calendar, and every Workspace API. 40+ agent skills included.
654 replies · 1.6K reposts · 15K likes · 5.4M views
Varun Singh reposted
Aida Baradari@aidaxbaradari·
Today, we're introducing Spectre I, the first smart device to stop unwanted audio recordings. We live in a world of always-on listening devices. Smart devices and AI dominate our world in business and private conversations. With Deveillance, you will @be_inaudible.
1.1K replies · 5K reposts · 42.5K likes · 4.4M views
otso veistera@OtsoVeistera·
You're wasting half your context window. We’re launching @thetokenco (YC W26) today. We compress LLM inputs before they reach the model. Fewer tokens, lower cost, faster inference. Models also perform better. In customer case studies we’ve seen a +5% lift in user purchases due to higher preference for outputs from compressed prompts. The API is live. Link in the comments
76 replies · 57 reposts · 507 likes · 91.5K views