Dhruv Agarwal

870 posts

Dhruv Agarwal

@furst_fly

Joined April 2022

1.2K Following · 1.3K Followers

Pinned Tweet

Dhruv Agarwal@furst_fly·11 May

I will defeat death and achieve biological and absolute freedom

English

752

Dhruv Agarwal@furst_fly·2d

@BagOfNeurons LFG!!

213

Colin Kakama@BagOfNeurons·2d

@furst_fly Let's go!!!!

English

230

Dhruv Agarwal@furst_fly·2d

Thinking of starting a community for people who want to learn biotech, drug discovery or anything bio in general. Reply to the tweet if you want in. Let's kill Death together

English

202

10.8K

Dhruv Agarwal@furst_fly·2d

@vatsal_sanghvi Why not do both

English

Vatsal Sanghvi@vatsal_sanghvi·2d

the goal is not to live forever, but to create something that will

English

932

Dhruv Agarwal@furst_fly·2d

@PharmaAlerts No bots in the community!!

English

197

🧬💊 PharmaAlerts 🚨@PharmaAlerts·2d

@furst_fly Love the ambition. A community focused on first principles in drug discovery could help filter out a lot of the noise in biotech.

English

234

Dhruv Agarwal@furst_fly·2d

@singhalkarunx Awesome!! Lmk if you find any interesting bio folks!

English

293

Karun Singhal@singhalkarunx·2d

@furst_fly noob but def find it something interesting to learn about

English

290

Dhruv Agarwal@furst_fly·2d

@reddy2go If the game got deleted from ur laptop the first time u died, it would be a pretty suboptimal game, no? The reason we enjoy games is that we can always try things without fear that there is a true end

English

243

reddy2go@reddy2go·2d

@furst_fly wait... where's the fun in playing the game if you can't get depleted, defeated, and destroyed?

GIF

English

325

Dhruv Agarwal@furst_fly·3d

@DanielMukasa1 sent a DM!

English

Daniel Mukasa@DanielMukasa1·3d

Excited for this next step! Reach out for more info

Y Combinator@ycombinator

Abinitio Bio is building the foundation models for biomanufacturing, turning 6-18 month process decisions into hours of compute and saving pharma $100M+ per month of delay on blockbusters. Congrats on the launch, @DanielMukasa1! ycombinator.com/launches/QS3-a…

English

1.3K

Dhruv Agarwal@furst_fly·6d

@jrkelly @Ginkgo I have never regretted not being in boston as much 😭, please build a lab in India!! (or fly me out to visit ^_^)

English

Jason Kelly@jrkelly·19 May

Come by for a tour of our newly expanded autonomous lab at @Ginkgo - now at 150 lab devices and 100+ robots! Happy hour tonight at 4:30 whether you are in town for #BioITExpo or just want to visit! Sign up here: luma.com/t2tiq9u4

English

9.1K

Dhruv Agarwal@furst_fly·17 May

@AlexanderKalian Same energy as claiming humans will never discover all laws of physics because earth is a tiny part of the universe

English

130

Dr Alexander D. Kalian@AlexanderKalian·16 May

AI will never feasibly "solve" drug discovery. There are an estimated 10^63 possible small druggable molecules (1 followed by 63 zeros). To truly cover druggable biochemical space, AI would need to learn how all possible chemotypes causally interact across complex biological systems and multi-omics layers. Biochemical space is better thought of as a giant dense knowledge graph with 10^63 nodes. Even with extremely generous assumptions (e.g. one training sample informing the model about 1 trillion nearby related molecular nodes), you would still need 10^51 training examples. That alone breaks the scalability of any current or near-future AI architecture - as well as modern computers themselves. And this is before adding quantum-mechanical descriptors, physicochemical properties, pharmacokinetics, toxicological pathways, and all the other rich data layers. We currently have meaningful data on only ~10^8 molecules in open databases like PubChem - a tiny fraction of what would be required. And we haven't even discussed AI drug discovery's navigation of edge cases, larger druggable molecules, antibodies, nanoparticles, or chemical mixtures. Building the data required for a true "solution" is beyond human civilisation's capacity. We will possibly someday be an interstellar civilisation and still be working on stubborn pain points in AI drug discovery. That said, AI can still meaningfully improve drug discovery - generating better candidates, improving virtual screening, and modestly raising clinical success rates. That's valuable and worth pursuing. But a rigorous "solving" of drug discovery? Completely unfeasible.
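The arithmetic in the post above can be checked in a few lines. This is a minimal sketch that only reproduces the post's own assumptions (10^63 molecules, one training sample covering 10^12 neighbours, about 10^8 molecules with data); the function and constant names are illustrative and not from the original.

```python
# Back-of-envelope check of the figures quoted in the post.
# All values are base-10 exponents taken from the post itself.
CHEMICAL_SPACE_EXP = 63       # ~10^63 druggable small molecules
COVERAGE_PER_SAMPLE_EXP = 12  # one sample informs ~1 trillion (10^12) nodes
KNOWN_MOLECULES_EXP = 8       # ~10^8 molecules in open databases


def required_samples_exp(space_exp: int, coverage_exp: int) -> int:
    """Exponent of the sample count needed to cover the whole space.

    Dividing powers of ten is a subtraction of exponents:
    10^space / 10^coverage = 10^(space - coverage).
    """
    return space_exp - coverage_exp


needed_exp = required_samples_exp(CHEMICAL_SPACE_EXP, COVERAGE_PER_SAMPLE_EXP)
shortfall_exp = needed_exp - KNOWN_MOLECULES_EXP

print(f"training examples needed: 10^{needed_exp}")
print(f"gap to currently available data: a factor of 10^{shortfall_exp}")
```

The result matches the 10^51 figure in the post. Whether the underlying assumptions are the right ones is exactly what the reply above it disputes.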

English

174

15.7K

Dhruv Agarwal@furst_fly·16 May

@Rounacc I need thisss

English

Rounak Adhikary@Rounacc·15 May

We reinvented the computer to end hardware dependency across all devices

English

18.6K

Dhruv Agarwal@furst_fly·16 May

@parmita I love this post and I love what you have been doing, Godspeed

English

132

Dhruv Agarwal@furst_fly·16 May

@saviomartin7 @kushwah_aaryan Can't wait for the company announcement, congratssss

English

Savio@saviomartin7·15 May

We got into Y Combinator (P26). After scaling SimpleClaw to $40k MRR in <3 weeks, we learnt what our users were trying to accomplish - build companies with agents. SimpleClaw is shutting down; I’m now 19, and made the hard decision to skip college to build Result, my biggest bet so far. Company announcement tomorrow. ok, back to work.

English

143

80.9K

Dhruv Agarwal@furst_fly·16 May

This is possibly the most inspiring picture ever.

Jake Wintermute 🧬/acc@SynBio1

Doing biotech is so metal

English

584

Dhruv Agarwal@furst_fly·13 May

Just watched @parmita's podcast - always good to find more longevity believers, there should be more sane people in the world. Here are some thoughts from a noob:

1) Her point on CRISPR was brutal but spot-on: tweak one gene for “higher intelligence” and suddenly your lungs are different. Single edits don’t work in complex systems. But this reminds me of @demishassabis's talk with @garrytan, where he advised hunting for areas with insane combinatorial spaces, a clear success metric, and a way in via data or simulators. Maybe AI could be the unlock here?

2) I'd like to push back on her point about LLMs being good at code because all the code already exists in a dataset. I think it's more because of verifiable rewards. The game changer might be a way to set up good, fast feedback loops... imagine agents actually running labs and equipment 24/7.

3) I like her points about how a company should ideally have some IP or a regulatory unlock involved; hits especially hard bcoz we seem to be stuck in pivot hell lol

Loved the quote "Every day you wake up and you're blinded by your mission. You are not married to a technique", I might end up stealing this. ^_^

Thanks @b1shtream for recommending this to me.

English

5.1K

Dhruv Agarwal@furst_fly·8 May

@KitF_T I loved the blog! reading the paper and code rn, this is my first introduction to NLAs and I love that AI safety is being taken seriously! ^_^

English

108

Kit Fraser-Taliente@KitF_T·7 May

trained the first natural language autoencoder on gpt-2 almost a year ago, now we have one on mythos.🥲 do read the paper/play with the live demo! so excited it's finally out.

Anthropic@AnthropicAI

New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.

English

207

13.1K

Dhruv Agarwal@furst_fly·5 May

find Donna at trydonna.net

English

213

Dhruv Agarwal@furst_fly·5 May

The cruelest thing about job hunting isn't the rejections. It's that you will never know why you got rejected. Well, here's Donna. She gives you 1) specific feedback on why you got rejected 2) a side-by-side comparison with the candidates who actually got shortlisted. Try it out!

English

1.1K

Dhruv Agarwal@furst_fly·3 May

@an2_yea C'mon, what kind of cliffhanger is this? Tell us the name

Filipino

Ananya Gupta@an2_yea·3 May

read the best book I've ever read this weekend, no exaggeration and I am gonna cry not because it was a sob story but the Writing is SO eloquent?? gonna remember this one for a long time

English

488

Dhruv Agarwal@furst_fly·3 May

@w2sgarnav lol, join erdos

English

114

arnav sonavane@w2sgarnav·3 May

i have like 150+ ideas in ml, need to make an academia group ig

arnav sonavane@w2sgarnav

aiming for 5 research topics for the upcoming few months, if yall want to join in pls do so, GPU shortage wont be there (hopefully) (worked on these problem statements a bit previously, and have run a few experiments on each) find them below:
ps 1 : Process Reward Models Beyond Outcome Supervision Without the need for human-labeled trajectories, we provide a completely automated approach for training Process Reward Models (PRMs) that match or surpass the quality of gold step-level annotations. We create dense Monte-Carlo Tree Search (MCTS) rollouts with depth d ≥ 32 and branching factor b = 8, starting from a base policy π_θ trained via SFT on chain-of-thought data. Each intermediate step is scored using an ensemble of outcome verifiers (ORMs) bootstrapped from self-consistency and LLM-as-judge signals under temperature T = 0.7. A process-DPO variant with step-wise Bradley-Terry losses weighted by MCTS visit counts and calibrated via Platt scaling on a short held-out verification set is introduced to reduce verifier noise. By simultaneously optimising the PRM and policy under a single RLVR objective that alternates between process-level preference optimisation and outcome-level PPO updates, with adaptive mixing ratio λ_t scheduled via cosine annealing, our method closes the annotation gap. Our auto-annotated PRM delivers +14.7% pass@1 over outcome-only RM baselines at 7B scale and transfers to code and scientific reasoning domains with 3% deterioration following LoRA adaptation on 2k domain-specific trajectories, according to extensive ablation on GSM8K, MATH, and HumanEval. We present the multi-domain PRM benchmark, the distilled verifier weights, and the whole MCTS annotation program, offering the first production-ready recipe for frontier-scale process supervision.
ps 2 : Computer-Use Agents and GUI Grounding In addition to introducing a large-scale synthetic data engine that uses Playwright + Android Emulator instrumentation to generate 500k grounded interaction traces across web, mobile, and desktop environments, we formalise GUI grounding failures through a tripartite decomposition: perception (pixel-to-semantic mapping), planning (high-level action sequence), and execution (low-level mouse/keyboard trajectories). Pixel-level segmentation masks, accessibility tree annotations, and oracle action sequences obtained via deterministic UI state diffing are linked with each trace. Using a hybrid loss that combines contrastive screen embedding alignment (using InfoNCE on cropped UI elements), autoregressive action token prediction, and auxiliary bounding-box regression heads that function at 4× downsampled resolution to maintain fine-grained OCR and icon semantics, we train a multimodal VLA policy on top of a Qwen2-VL-7B backbone. A domain-adversarial training objective that aligns screen embeddings across platforms while maintaining task-specific action distributions is combined with test-time adaptation using a lightweight 256M adapter that conditions on platform-specific accessibility trees to achieve cross-platform zero-shot transfer. Our model decreases end-to-end grounding error from 48% (Claude-3.5 baseline) to 19% on the recently released GUI-Grounding-Bench (which includes 12k actual jobs from WebArena, AndroidWorld, and OSWorld), with the biggest improvements in perception-heavy mobile UIs. We provide the cross-platform VLA checkpoint, the failure atlas taxonomy, and the complete synthetic trace generator, creating the first reproducible benchmark and recipe for reliable computer-use agents. 
ps 3 : Agent Memory Architectures Beyond RAG We present TypedAgentMemory, a modular memory substrate controlled by a differentiable memory controller trained end-to-end with the agent policy that explicitly distinguishes episodic, semantic (dense vector summaries with SAE-derived concept tags), procedural, and working (short-term KV cache compression) memories. A 128-dim uncertainty head that thresholds epistemic uncertainty from an ensemble of forward passes gates memory writes. The controller uses a hierarchical policy over four memory operations: write, consolidate (graph-based merging with GNN message passing), forget (learned eviction via eligibility traces and recency + relevance scores), and retrieve (hybrid dense + symbolic query routing). Explicit memory consolidation every 50 steps is used to evaluate long-horizon tasks on τ-bench, WebArena, and GAIA. This results in a 2.3× decrease in context length and a 31% improvement in success rate over flat vector-store RAG baselines. Per-memory-type differential privacy approaches, such as homomorphic encryption for procedural skill graphs, concept-level k-anonymity on semantic features, and ε = 0.5 noise injection on episodic writes, are used to ensure privacy. Ablations show that typed memory facilitates effective cross-task transfer through procedural memory reuse and prevents catastrophic forgetting on 200-step agent trajectories. We provide the first principled alternative to monolithic RAG for production-grade autonomous agents by making the whole TypedAgentMemory library (based on LangGraph + FAISS + Neo4j), the long-horizon evaluation harness, and pretrained memory controllers for Llama-3.1-8B and Qwen2.5-72B open-source.
ps 4 : SAE Universality Across Model Families By training 128k-feature JumpReLU SAEs (expansion factor 64, k = 32) on residual streams of Llama-3.1-8B, Qwen2.5-72B, Gemma-2-27B, Mistral-Large-2, and DeepSeek-V3 with the same hyperparameters and reconstruction objectives, we perform the first extensive cross-family SAE universality investigation. A bipartite matching that quantifies pairwise overlap at both neuron-level (cosine similarity > 0.85) and concept-level (via automated interpretation pipelines using 512 probe prompts per feature) is obtained by performing feature matching via optimal transport with the Sinkhorn algorithm on normalised decoder weight matrices. By grouping similar features from different families into 4.2k platonic concepts and annotating each concept with activation data, downstream steering efficacy, and causal mediation scores calculated via path patching, we further build a universal feature library. Steering vectors created from the universal library outperform within-family SAEs on out-of-distribution tasks and enhance zero-shot generalisation on MMLU-Pro, GPQA, and LiveCodeBench by an average of 9.4% when transferred between families, according to downstream transfer studies. We make available the whole SAE training software, the universal concept library with 4.2k interpreted features, the cross-family matching dataset (which includes optimal transport plans), and a plug-and-play steering toolkit that works with Hugging Face Transformers and vLLM. In order to facilitate transfer learning, model merging, and safety interventions within the existing frontier model ecosystem, this study offers the first rigorous atlas and infrastructure for mechanistic universality.
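The feature-matching step in ps 4 (optimal transport with Sinkhorn on normalised decoder weights, then a cosine-similarity cut-off of 0.85) might look roughly like the sketch below. It is illustrative only, not the author's pipeline. It assumes both decoders already live in a shared space of equal dimension, which the post does not explain for models of different width. The function names, the regularisation strength `reg`, and the iteration count are hypothetical choices.

```python
import math


def normalise(rows):
    """Scale each decoder row (one feature direction) to unit L2 norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row])
    return out


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Entropy-regularised transport plan between two uniform marginals."""
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def match_features(dec_a, dec_b, min_cos=0.85):
    """Map each feature of model A to its best partner in model B.

    Assumption: rows of dec_a and dec_b have the same length (shared space).
    A pair is kept only if its cosine similarity exceeds min_cos.
    """
    a, b = normalise(dec_a), normalise(dec_b)
    cos = [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]
    cost = [[1.0 - c for c in row] for row in cos]
    plan = sinkhorn_plan(cost)
    matches = {}
    for i, row in enumerate(plan):
        j = max(range(len(row)), key=row.__getitem__)
        if cos[i][j] > min_cos:
            matches[i] = j
    return matches


# Toy decoders: B is a shuffled, rescaled, slightly perturbed copy of A.
toy_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_b = [[0.0, 2.0, 0.1], [0.1, 0.0, 3.0], [1.0, 0.05, 0.0]]
print(match_features(toy_a, toy_b))
```

Pure-Python loops keep the sketch dependency-free. At the 128k-feature scale quoted in the post, a vectorised or GPU implementation would be needed.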
ps 5 : Synthetic Data Generation Without Mode Collapse We provide an iterated synthetic data pipeline that explicitly characterises the collapse threshold ρ*(q) as a function of generator quality q (as determined by the activation entropy of the SAE feature and the entropy of the output distribution H_π). Using temperature-annealed sampling (T=1.0 → 0.7) supplemented with SAE-guided rejection sampling, we create synthetic corpora at different mixing ratios ρ ∈ {0, 0.1,…, 1.0} starting from a 7B base policy π_θ trained on 200B tokens of FineWeb-Edu. At each generation, we train a 128k-feature JumpReLU SAE (expansion factor 64, k=32) on the residual stream of the current model and filter synthetic samples whose top-activating features show activation entropy below a calibrated threshold τ derived from the real-data reference distribution. Our experiments provide the first empirical collapse-threshold map ρ*(q) at 1.3B–7B scale, demonstrating that SAE-guided diversity sampling extends the safe mixing ratio by 2.3× compared to persona-conditioned or temperature-only baselines, while generator entropy H_π ≥ 4.2 nats delays the onset of measurable perplexity degradation on a held-out real validation set until generation 7 under accumulation (versus generation 3 under pure replacement). A closed-form constraint on variance contraction rate under synthetic mixing is derived theoretically, connecting the number of safe iterations before tail probability mass falls below 10^{-3} to the spectral gap of the generator's transition kernel.
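The entropy-based rejection step in ps 5 can be sketched as follows. The post does not define "activation entropy" precisely, so this sketch assumes it is the Shannon entropy (in nats) of a sample's non-negative feature activations after normalising them to sum to one. The names `activation_entropy` and `filter_low_entropy`, and the sample layout, are hypothetical.

```python
import math


def activation_entropy(activations):
    """Shannon entropy in nats of activations normalised to a distribution.

    Assumption: activations are non-negative (as with ReLU-style SAE features).
    Returns 0.0 when nothing is active.
    """
    total = sum(activations)
    if total <= 0:
        return 0.0
    probs = [a / total for a in activations if a > 0]
    return -sum(p * math.log(p) for p in probs)


def filter_low_entropy(samples, tau):
    """Keep samples whose activation entropy is at least the threshold tau."""
    return [s for s in samples if activation_entropy(s["features"]) >= tau]


# Toy usage: one diverse sample and one dominated by a single feature.
samples = [
    {"id": "diverse", "features": [1.0, 1.0, 1.0, 1.0]},
    {"id": "collapsed", "features": [5.0, 0.0, 0.0, 0.0]},
]
kept = filter_low_entropy(samples, tau=1.0)
print([s["id"] for s in kept])
```

In the post, τ is calibrated against a real-data reference distribution. The value 1.0 here is only a placeholder for the toy data.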

English

6.9K
