Dhruv Agarwal

870 posts

@furst_fly

21 | Building Donna | Won 100k cloud funding from Google | Beat the Anthropic open recruitment challenge | Stanford ASES fellow | @join_ef

Joined April 2022
1.2K Following · 1.3K Followers
Pinned Tweet
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
I will defeat death and achieve biological and absolute freedom
English
1
0
5
739
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
Thinking of starting a community for people who want to learn biotech, drug discovery or anything bio in general. Reply to the tweet if you want in. Let's kill Death together
Dhruv Agarwal tweet media
English
82
16
200
10.6K
Vatsal Sanghvi
Vatsal Sanghvi@vatsal_sanghvi·
the goal is not to live forever, but to create something that will
English
9
1
28
920
🧬💊 PharmaAlerts 🚨
@furst_fly Love the ambition. A community focused on first principles in drug discovery could help filter out a lot of the noise in biotech.
English
1
0
1
229
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
@reddy2go If the game got deleted from ur laptop the first time u died, it would be a pretty suboptimal game, no? The reason we enjoy games is because we can always try things without fear that there is a true end
English
1
0
2
237
reddy2go
reddy2go@reddy2go·
@furst_fly wait... where's the fun in playing the game if you can't get depleted, defeated, and destroyed?
GIF
English
1
0
1
316
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
@jrkelly @Ginkgo I have never regretted not being in boston as much 😭, please build a lab in India!! (or fly me out to visit ^_^)
English
0
0
1
33
Jason Kelly
Jason Kelly@jrkelly·
Come by for a tour of our newly expanded autonomous lab at @Ginkgo - now at 150 lab devices and 100+ robots! Happy hour tonight at 4:30 whether you are in town for #BioITExpo or just want to visit! Sign up here: luma.com/t2tiq9u4
English
4
12
70
9.1K
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
@AlexanderKalian Same energy as claiming humans will never discover all laws of physics because earth is a tiny part of the universe
English
2
0
0
130
Dr Alexander D. Kalian
Dr Alexander D. Kalian@AlexanderKalian·
AI will never feasibly "solve" drug discovery. There are an estimated 10^63 possible small druggable molecules (1 followed by 63 zeros). To truly cover druggable biochemical space, AI would need to learn how all possible chemotypes causally interact across complex biological systems and multi-omics layers. Biochemical space is better thought of as a giant dense knowledge graph with 10^63 nodes. Even with extremely generous assumptions (e.g. one training sample informing the model about 1 trillion nearby related molecular nodes), you would still need 10^51 training examples. That alone breaks the scalability of any current or near-future AI architecture - as well as modern computers themselves. And this is before adding quantum-mechanical descriptors, physicochemical properties, pharmacokinetics, toxicological pathways, and all the other rich data layers. We currently have meaningful data on only ~10^8 molecules in open databases like PubChem - a tiny fraction of what would be required. And we haven't even discussed AI drug discovery's navigation of edge cases, larger druggable molecules, antibodies, nanoparticles, or chemical mixtures. Building the data required for a true "solution" is beyond human civilisation's capacity. We will possibly someday be an interstellar civilisation and still be working on stubborn pain points in AI drug discovery. That said, AI can still meaningfully improve drug discovery - generating better candidates, improving virtual screening, and modestly raising clinical success rates. That's valuable and worth pursuing. But a rigorous "solving" of drug discovery? Completely unfeasible.
English
44
17
174
15.7K
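The arithmetic in the post above can be checked with a short sketch. The figures are the poster's own estimates (10^63 molecules, one sample covering 1 trillion nodes, ~10^8 molecules with data) and are not verified here; the constant and function names are assumptions made for illustration.

```python
# Figures quoted in the post above (the poster's estimates, not verified here).
CHEMICAL_SPACE = 10**63      # estimated small druggable molecules
NODES_PER_SAMPLE = 10**12    # "1 trillion nearby related molecular nodes"
KNOWN_MOLECULES = 10**8      # rough size of open databases like PubChem


def required_samples(space: int, coverage_per_sample: int) -> int:
    """Training examples needed if each one covers `coverage_per_sample` nodes."""
    return -(-space // coverage_per_sample)  # ceiling division on integers


samples = required_samples(CHEMICAL_SPACE, NODES_PER_SAMPLE)
shortfall = samples // KNOWN_MOLECULES  # how many times more data than exists today

print(f"required samples: 10^{len(str(samples)) - 1}")
print(f"shortfall vs. known data: 10^{len(str(shortfall)) - 1}x")
```

Under these assumptions the gap is the post's 10^51 examples. The reply above disputes whether exhaustive coverage is the right bar, not the arithmetic.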
Rounak Adhikary
Rounak Adhikary@Rounacc·
We reinvented the computer to end hardware dependency across all devices
English
27
7
92
18.6K
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
@parmita I love this post and I love what you have been doing, Godspeed
English
1
0
6
132
Parmita Mishra
Parmita Mishra@parmita·
It is easy to be a naysayer and a skeptic. Societally, it is. It is truly, genuinely, hard to be an optimist. I see that in my haters. I love engaging with them, it is like a survey. And they are genuinely so stupid. Genuinely. So skeptical. They don't realize there is an information asymmetry happening. They really think they're on to something. They think I just say things for money, when raising for a dumber idea than this one is probably easier for me to do, and I don't need any more social media presence to do it... They say ridiculous things because they don't know what they don't know. "Are you using your TC hood as a prop for a video?" "You know you're bullshitting, right?" The stuff we are building internal to our org, that I will talk to investors about, advisors about, but never talk about in public until it's fully done, the data, the plans. So many ground-breaking, larger companies probably feel the same way today. Especially in deep tech.
English
7
4
41
3.2K
Savio
Savio@saviomartin7·
We got into Y Combinator (P26) After scaling SimpleClaw to $40k MRR in <3 weeks, we learnt what our users were trying to accomplish - build companies with agents. SimpleClaw is shutting down; I’m now 19, and made the hard decision to skip college to build Result, my biggest bet so far. Company announcement tomorrow. ok, back to work.
Savio tweet media
English
143
24
1K
80.8K
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
Just watched @parmita's podcast - always good to find more longevity believers, there should be more sane people in the world. Here are some thoughts from a noob:
1) Her point on CRISPR was brutal but spot-on: tweak one gene for “higher intelligence” and suddenly your lungs are different. Single edits don’t work in complex systems. But this reminds me of @demishassabis's talk with @garrytan where he advised to hunt areas with insane combinatorial spaces, a clear success metric, and a way in via data or simulators. Maybe AI could be the unlock here?
2) I'd like to push back on her point about LLMs being good at code because all the code already exists in a dataset. I think it's more because of verifiable rewards. The game changer might be a way to set up good, fast feedback loops... imagine agents actually running labs and equipment 24/7.
3) I like her points about how a company should ideally have some IP involved or a regulation unlock involved, hits especially hard bcoz we seem to be stuck in pivot hell lol
Loved the quote "Every day you wake up and you're blinded by your mission. You are not married to a technique", I might end up stealing this. ^_^ Thanks @b1shtream for recommending this to me.
English
1
2
25
5.1K
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
@KitF_T I loved the blog! reading the paper and code rn, this is my first introduction to NLAs and I love that AI safety is being taken seriously! ^_^
English
0
0
1
108
Dhruv Agarwal
Dhruv Agarwal@furst_fly·
The cruelest thing about job hunting isn't the rejections. It's that you will never know why you got rejected. Well, here's Donna. She gives you 1) specific feedback on why you got rejected 2) a side-by-side comparison with the candidates who actually got shortlisted. Try it out!
English
4
0
16
1.1K
Ananya Gupta
Ananya Gupta@an2_yea·
read the best book I've ever read this weekend, no exaggeration and I am gonna cry not because it was a sob story but the Writing is SO eloquent?? gonna remember this one for a long time
English
1
0
3
487
arnav sonavane
arnav sonavane@w2sgarnav·
i have like 150+ ideas in ml, need to make an academia group ig
arnav sonavane@w2sgarnav

aiming for 5 research topics for the upcoming few months, if yall want to join in pls do so, GPU shortage wont be there (hopefully) (worked on these problem statements a bit previously, and have run a few experiments on each) find them below:
ps 1 : Process Reward Models Beyond Outcome Supervision Without the need for human-labeled trajectories, we provide a completely automated approach for training Process Reward Models (PRMs) that either meet or surpass the quality of gold step-level annotations. We create dense Monte-Carlo Tree Search (MCTS) rollouts with depth d ≥ 32 and branching factor b = 8, starting from a base policy π_θ trained via SFT on chain-of-thought data. Each intermediate step is scored using an ensemble of outcome verifiers (ORMs) bootstrapped from self-consistency and LLM-as-judge signals under temperature T = 0.7. A process-DPO variation with step-wise Bradley-Terry losses weighted by MCTS visit counts and calibrated via Platt scaling on a short held-out verification set is introduced to reduce verifier noise. By simultaneously optimising the PRM and policy under a single RLVR goal that alternates between process-level preference optimisation and outcome-level PPO updates, with adaptive mixing ratio λ_t planned via cosine annealing, our method closes the annotation gap. Our auto-annotated PRM delivers +14.7% pass@1 over outcome-only RM baselines at 7B scale and transfers to code and scientific reasoning domains with 3% deterioration following LoRA adaptation on 2k domain-specific trajectories, according to extensive ablation on GSM8K, MATH, and HumanEval. We present the multi-domain PRM benchmark, the distilled verifier weights, and the whole MCTS annotation program, offering the first production-ready recipe for frontier-scale process supervision.
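A minimal sketch of two pieces named in ps 1: a step-wise Bradley-Terry loss weighted by MCTS visit counts, and a cosine-annealed mixing ratio λ_t. The function names, the normalisation of weights by total visits, and the λ range from 1.0 down to 0.0 are assumptions for illustration, not the author's implementation.

```python
import math


def bradley_terry_loss(score_preferred, score_rejected, weight=1.0):
    """Weighted negative log-likelihood that the preferred step beats the rejected one."""
    margin = score_preferred - score_rejected
    # -log(sigmoid(margin)), written in a numerically stable softplus form
    return weight * (max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))


def visit_weighted_loss(pairs):
    """pairs: iterable of (score_preferred, score_rejected, visit_count).

    Assumption: each pair's weight is its share of the total visit count.
    """
    pairs = list(pairs)
    total_visits = sum(visits for _, _, visits in pairs)
    return sum(
        bradley_terry_loss(preferred, rejected, visits / total_visits)
        for preferred, rejected, visits in pairs
    )


def cosine_mixing_ratio(step, total_steps, lam_start=1.0, lam_end=0.0):
    """Cosine-annealed mixing ratio between process-level and outcome-level updates."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lam_end + 0.5 * (lam_start - lam_end) * (1.0 + math.cos(math.pi * progress))


# Hypothetical step pairs: heavily visited pairs dominate the loss.
example_pairs = [(2.0, 0.5, 30), (0.1, 0.3, 2)]
example_loss = visit_weighted_loss(example_pairs)
```

Weighting by visit share means step pairs that the search explored often contribute more, which is one plausible reading of "weighted by MCTS visit counts".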
ps 2 : Computer-Use Agents and GUI Grounding In addition to introducing a large-scale synthetic data engine that uses Playwright + Android Emulator instrumentation to generate 500k grounded interaction traces across web, mobile, and desktop environments, we formalise GUI grounding failures through a tripartite decomposition: perception (pixel-to-semantic mapping), planning (high-level action sequence), and execution (low-level mouse/keyboard trajectories). Pixel-level segmentation masks, accessibility tree annotations, and oracle action sequences obtained via deterministic UI state diffing are linked with each trace. Using a hybrid loss that combines contrastive screen embedding alignment (using InfoNCE on cropped UI elements), autoregressive action token prediction, and auxiliary bounding-box regression heads that function at 4× downsampled resolution to maintain fine-grained OCR and icon semantics, we train a multimodal VLA policy on top of a Qwen2-VL-7B backbone. A domain-adversarial training objective that aligns screen embeddings across platforms while maintaining task-specific action distributions is combined with test-time adaptation using a lightweight 256M adapter that conditions on platform-specific accessibility trees to achieve cross-platform zero-shot transfer. Our model decreases end-to-end grounding error from 48% (Claude-3.5 baseline) to 19% on the recently released GUI-Grounding-Bench (which includes 12k actual jobs from WebArena, AndroidWorld, and OSWorld), with the biggest improvements in perception-heavy mobile UIs. We provide the cross-platform VLA checkpoint, the failure atlas taxonomy, and the complete synthetic trace generator, creating the first reproducible benchmark and recipe for reliable computer-use agents. 
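The contrastive term in ps 2 is InfoNCE on cropped UI elements. A minimal pure-Python sketch of that loss follows, assuming each anchor embedding has exactly one matching positive at the same index and treats the other positives in the batch as negatives. The function name and the temperature value of 0.07 are assumptions, not taken from the post.

```python
import math


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss where positives[i] is the match for anchors[i]."""

    def normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    anchors = [normalize(v) for v in anchors]
    positives = [normalize(v) for v in positives]

    losses = []
    for i, anchor in enumerate(anchors):
        # Cosine similarity to every candidate, scaled by temperature
        logits = [
            sum(x * y for x, y in zip(anchor, candidate)) / temperature
            for candidate in positives
        ]
        # log-sum-exp with max subtraction for numerical stability
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: aligned pairs should score a much lower loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```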
ps 3 : Agent Memory Architectures Beyond RAG We present TypedAgentMemory, a modular memory substrate controlled by a differentiable memory controller trained end-to-end with the agent policy that explicitly distinguishes episodic, semantic (dense vector summaries with SAE-derived concept tags), procedural, and working (short-term KV cache compression) memories. A 128-dim uncertainty head that thresholds epistemic uncertainty from an ensemble of forward passes gates memory writes. The controller uses a hierarchical policy over four memory operations: write, consolidate (graph-based merging with GNN message passing), forget (learned eviction via eligibility traces and recency + relevance scores), and retrieve (hybrid dense + symbolic query routing). Explicit memory consolidation every 50 steps is used to evaluate long-horizon tasks on τ-bench, WebArena, and GAIA. This results in a 2.3× decrease in context length and a 31% improvement in success rate over flat vector-store RAG baselines. Per-memory-type differential privacy approaches, such as homomorphic encryption for procedural skill graphs, concept-level k-anonymity on semantic features, and ε = 0.5 noise injection on episodic writing, are used to ensure privacy. Ablations show that typed memory facilitates effective cross-task transfer through procedural memory reuse and prevents catastrophic forgetting on 200-step agent trajectories. We provide the first rational substitute for monolithic RAG for production-grade autonomous agents by making the whole TypedAgentMemory library (based on LangGraph + FAISS + Neo4j), the long-horizon evaluation harness, and pretrained memory controllers for Llama-3.1-8B and Qwen2.5-72B open-source.
ps 4 : SAE Universality Across Model Families By training 128k-feature JumpReLU SAEs (expansion factor 64, k = 32) on residual streams of Llama-3.1-8B, Qwen2.5-72B, Gemma-2-27B, Mistral-Large-2, and DeepSeek-V3 with the same hyperparameters and reconstruction objectives, we perform the first extensive cross-family SAE universality investigation. A bipartite matching that quantifies pairwise overlap at both neuron-level (cosine similarity > 0.85) and concept-level (via automated interpretation pipelines using 512 probe prompts per feature) is obtained by performing feature matching via optimal transport with the Sinkhorn algorithm on normalised decoder weight matrices. By grouping similar features from different families into 4.2k platonic concepts and annotating each concept with activation data, downstream steering efficacy, and causal mediation scores calculated via path patching, we further build a universal feature library. Steering vectors created from the universal library outperform within-family SAEs on out-of-distribution tasks and enhance zero-shot generalisation on MMLU-Pro, GPQA, and LiveCodeBench by an average of 9.4% when transferred between families, according to downstream transfer studies. We make available the whole SAE training software, the universal concept library with 4.2k interpreted features, the cross-family matching dataset (which includes optimal transport plans), and a plug-and-play steering toolkit that works with Hugging Face Transformers and vLLM. In order to facilitate transfer learning, model merging, and safety interventions within the existing frontier model ecosystem, this study offers the first rigorous atlas and infrastructure for mechanistic universality.
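The matching step in ps 4 (optimal transport with Sinkhorn on normalised decoder weights, then a cosine similarity > 0.85 cut-off) can be sketched in pure Python. This is an illustration under assumptions: uniform marginals, cost defined as 1 minus cosine similarity, and the entropic regularisation strength `epsilon` are all hypothetical choices, as are the function names.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def sinkhorn(cost, epsilon=0.05, iterations=200):
    """Entropy-regularised transport plan between two uniform distributions."""
    rows, cols = len(cost), len(cost[0])
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u, v = [1.0] * rows, [1.0] * cols
    for _ in range(iterations):
        # Alternate scaling so row sums approach 1/rows and column sums 1/cols
        u = [(1.0 / rows) / sum(kernel[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [(1.0 / cols) / sum(kernel[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(cols)] for i in range(rows)]


def match_features(decoder_a, decoder_b, threshold=0.85):
    """Pair each feature in A with its highest-mass partner in B, keeping similar pairs."""
    sims = [[cosine(a, b) for b in decoder_b] for a in decoder_a]
    plan = sinkhorn([[1.0 - s for s in row] for row in sims])
    matches = []
    for i, row in enumerate(plan):
        j = max(range(len(row)), key=row.__getitem__)
        if sims[i][j] > threshold:
            matches.append((i, j))
    return matches


# Toy decoder directions: model B holds the same two features, permuted and rescaled.
features_a = [[1.0, 0.0], [0.0, 1.0]]
features_b = [[0.0, 2.0], [3.0, 0.0]]
matches = match_features(features_a, features_b)
```

With 128k features per model a dense pure-Python plan is not practical; the sketch only shows the shape of the computation.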
ps 5 : Synthetic Data Generation Without Mode Collapse We provide an iterated synthetic data pipeline that explicitly characterises the collapse threshold ρ*(q) as a function of generator quality q (as determined by the activation entropy of the SAE feature and the entropy of the output distribution H_π). Using temperature-annealed sampling (T=1.0 → 0.7) supplemented with SAE-guided rejection sampling, we create synthetic corpora at different mixing ratios ρ ∈ {0, 0.1,…, 1.0} starting from a 7B base policy π_θ trained on 200B tokens of FineWeb-Edu. At each generation, we train a 128k-feature JumpReLU SAE (expansion factor 64, k=32) on the residual stream of the current model and filter synthetic samples whose top-activating features show activation entropy below a calibrated threshold τ derived from the real-data reference distribution. Our experiments provide the first empirical collapse-threshold map ρ*(q) at 1.3B–7B scale, demonstrating that SAE-guided diversity sampling extends the safe mixing ratio by 2.3× compared to persona-conditioned or temperature-only baselines, while generator entropy H_π ≥ 4.2 nats delays the onset of measurable perplexity degradation on a held-out real validation set until generation 7 under accumulation (versus generation 3 under pure replacement). A closed-form constraint on variance contraction rate under synthetic mixing is derived theoretically, connecting the number of safe iterations before tail probability mass falls below 10^{-3} to the spectral gap of the generator's transition kernel.
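The filtering rule in ps 5 drops synthetic samples whose top-activating SAE features have activation entropy below a threshold τ. The post does not define the entropy exactly, so the sketch below assumes Shannon entropy (in nats) of the normalised activation values; the sample layout and function names are hypothetical.

```python
import math


def activation_entropy(activations):
    """Shannon entropy in nats of non-negative activations, normalised to sum to 1."""
    total = sum(activations)
    if total <= 0:
        return 0.0
    probabilities = [a / total for a in activations if a > 0]
    return -sum(p * math.log(p) for p in probabilities)


def filter_samples(samples, tau):
    """Keep samples whose activation entropy is at least tau."""
    return [s for s in samples if activation_entropy(s["activations"]) >= tau]


# Hypothetical samples: one spreads activation over four features, one collapses onto a single feature.
candidates = [
    {"text": "diverse sample", "activations": [1.0, 1.0, 1.0, 1.0]},
    {"text": "collapsed sample", "activations": [4.0, 0.0, 0.0, 0.0]},
]
kept = filter_samples(candidates, tau=1.0)
```

In the described pipeline τ would be calibrated against the real-data reference distribution; the value 1.0 here is only a placeholder.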

English
9
0
86
6.9K