Aksh Garg

234 posts

@AkshGarg03

@mercor_ai, CS @stanford | ex @point72, @tesla, @spacex, @deshaw

Joined January 2022

301 Following · 1.6K Followers

Pinned Tweet

Aksh Garg@AkshGarg03·15 May

(1/5) @CKT_Conner, @dill_pkl, @emilyzsh, and I are excited to introduce Shard - a proof-of-concept for an infinitely scalable distributed system composed of consumer hardware for training and running ML models!

Features:
- Data + Pipeline Parallel for handling arbitrarily large models
- Algorithmic load balancing for throughput optimization
- Fault tolerance for unreliable machines

205

85.5K

Aksh Garg@AkshGarg03·3d

there's so much to software beyond just code - excited to see us taking more steps towards automating the full cycle shoutout @AbhiKottamasu and @adarsh_exe can't wait for v2

adarsh@adarsh_exe

Traditional coding benchmarks do not reflect how software is actually built and maintained. That's why we built a new benchmark, APEX-SWE, in partnership with @cognition. It measures whether AI models can perform complex, real-world software engineering work to ship systems that work and debug them when they don't. @OpenAI GPT 5.3 Codex (High) tops the leaderboard at 41.5% on Pass@1.

1.5K

Aksh Garg@AkshGarg03·3d

@adarsh_exe @cognition extremely exciting

579

Aksh Garg retweeted

adarsh@adarsh_exe·3d

123

152

878

184.9K

Aksh Garg@AkshGarg03·10 Feb

huge

Shaivi@ShaiviRau

We're joining @ycombinator this summer! We built @UseLitmus because technical hiring is broken—slow, expensive, & low-signal. Building a company is a privilege. Doing it with @elenaxzhao makes it really fun :) Thanks @snowmaker @gustaf for betting on us early. See you in SF!

1.6K

Aksh Garg@AkshGarg03·26 Jan

@ralliesai neat

3.5K

Rallies Arena@ralliesarena·26 Jan

@AkshGarg03 Because this is a live experiment, there's no past data involved.

24.8K

Rallies Arena@ralliesarena·25 Jan

CLAUDE IS DESTROYING THE S&P 500 Claude is not only good at coding but it's also good at investing?!?! We gave 8 different AI models $100K and let them loose in the stock market starting from the end of November ... and Claude is in first place Claude is beating all the other models and the S&P 500 Claude: +8.7%🟢 S&P 500: +1.9%🟢

174

274

3.6K

1.6M

Aksh Garg@AkshGarg03·5 Jan

started 2026 resolving to read more papers… …and then spent Sunday building a research agent that ranks top AI papers + explains why they matter in 10 minutes first one’s live: akshgarg07.substack.com/p/daily-digest… laziness remains an engineer’s best friend

1.5K

Aksh Garg@AkshGarg03·27 Dec

@robbymanihani retrieval master doing retrieval master things

Robby Manihani@robbymanihani·26 Dec

Agents in production don't fail because of bad models. They fail because of bad context. Everyone's focused on expanding and leveraging context windows. But bigger windows ≠ better focus - models still have limited working memory. You can't just dump everything in. The bottleneck is figuring out which information actually matters before it ever hits the model. This summer I built the COR engine @pacecom to solve this, an AI-native BPO for insurance carriers now in production serving leading insurance carriers such as Prudential, NewFront, and The Mutual Group. It reads documents the way humans do - hierarchical structure, context, what governs what. The result with the same models? An increase from 70% to 95%+ accuracy in production agents.

2.6K

Aksh Garg retweeted

Ivan Zhao@ivanhzhao·22 Dec

x.com/i/article/2003…

373

2.8K

13.3K

Aksh Garg@AkshGarg03·2 Oct

@BrendanFoody time to unleash the evals to RL tuning pipeline

1.9K

Aksh Garg@AkshGarg03·16 Sep

@BrendanFoody @mercor_ai legendary

960

Brendan (can/do)@BrendanFoody·15 Sep

Mercor (@mercor_ai) scaled from $1-500M in revenue run rate in the last 17 months, making us the fastest growing company of all time. Our growth is accelerating. We averaged 11% week over week growth in July, 18% WoW growth in August, and 19% WoW growth in September. One trend driving this meteoric growth: the Economy is Becoming an RL Environment Machine. Reinforcement learning is becoming so effective that agents can hillclimb any benchmark, but humans need to define the rewards to automate everything. While everyone fears job loss, we’re creating a new category of knowledge work faster than any other time in history. The future of work will converge on training agents. We're paying out over $1M / day to people in our marketplace and hiring experts rapidly across nearly every domain: software engineers, doctors, lawyers, consultants, bankers, and many more.

149

196

1.5K

613.9K

Aksh Garg@AkshGarg03·28 Jul

@omlondhe2133 legend

231

Om Londhe@omlondhe2133·27 Jul

1,000 at last!

3.3K

Aksh Garg@AkshGarg03·16 Jul

@brendanm0407 @reflection_ai hyped for this mate

228

Aksh Garg retweeted

Brendan McLaughlin@brendanm0407·16 Jul

Thrilled to share that I’ve joined @reflection_ai! We’re building superintelligent autonomous systems by co-designing research and product. Today, we’re launching Asimov. As AI benchmarks saturate, evaluation will increasingly live inside real-world products that are reward-bearing RL environments in disguise. Reach out if you’re interested in working at the intersection of RL, LLMs, and agents alongside some of the scientists and engineers behind AlphaGo, PaLM, GPT-4, Gemini!

Misha Laskin@MishaLaskin

Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.

15.3K

Aksh Garg retweeted

Tanvir Bhathal@BhathalTanvir0·24 Jun

Super excited to announce Weaver! Check it out to see the strongest way to verify LM Generations while maintaining compute efficiency!

Jon Saad-Falcon@JonSaadFalcon

How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning models like Llama 3.3 70B Instruct! 🧵(1 / N)

11.3K

Aksh Garg@AkshGarg03·17 Jun

@sahiladhawade @CrosbyLegal @Ryanjdaniels @jsarihan Exciting times ahead

136

Aksh Garg retweeted

Sahil Adhawade@sahiladhawade·17 Jun

Crosby is rewriting an entire industry. We are a law firm powered by modern software and AI, and we have barely scratched the surface. On a personal note, I am pumped to be joining the @CrosbyLegal team with @Ryanjdaniels and @jsarihan! Check us out: crosby.ai

John Sarihan@jsarihan

Today we’re introducing Crosby, a hybrid AI law firm that helps rapidly growing businesses execute faster. Contracts are connection points. They allow companies to transact with one another and create economic growth. But while every aspect of business has sped up, the way we negotiate contracts hasn’t changed in 50 years. Crosby is building the API for human agreement. We combine the speed and intelligence of AI with the safety of lawyers-in-the-loop to review contracts in under an hour. Since quietly launching in January, we’ve reviewed over 1,000 MSAs, DPAs and NDAs for some of the fastest growing companies in history, including Cursor, Clay and UnifyGTM. Speed to execution is our north star, and today our median review time is 58 minutes. GTM teams call Crosby a secret weapon to close deals 80% faster. We’re just getting started. Today, we’re also excited to share that we’ve raised $5.8m from Sequoia Capital and Bain Capital Ventures, as well as the founders of Ramp, Instacart, Flatiron Health, and others. Crosby is a small, talent-dense team, combining lawyers from Harvard, Stanford, and Columbia Law with engineers from Ramp, Vanta, Meta, and Google. Every engineer on our team today is a former founder. We work in person in New York City. If our mission resonates with you, we are looking for technologists, legal experts, and former founders to join us. For high-growth companies looking to execute faster, we’ve opened up Early Access. Sign up on our website and we’ll be in touch.

3.5K

Aksh Garg retweeted

Valerio Pepe@ValerPepe·8 Jun

New blog post with @ArmaanTip! Following Emergent Misalignment, we show that finetuning even a single layer via LoRA on insecure code can induce toxic outputs in Qwen2.5-Coder-32B-Instruct, and that you can extract steering vectors to make the base model similarly misaligned 🧵

1.3K

Aksh Garg@AkshGarg03·8 May

@ycombinator @WillowVoiceAI @_allanguo @LiuLawrence45 Let's goooo - hyped for this @WillowVoiceAI and @LiuLawrence45

585

Aksh Garg retweeted

Y Combinator@ycombinator·8 May

Willow (@WillowVoiceAI) is the future of voice-first computing. Thousands of people have replaced their keyboards with Willow to get emails, docs, and AI prompting workflows done 4x faster. Congrats on the launch, @_allanguo and @LiuLawrence45!

134

113

757

293K
