Logan Grasby

2.4K posts

@LoganGrasby

Training something. Prev: ML Eng @ Cloudflare

Canada · Joined August 2021

1.7K Following · 2.7K Followers

Logan Grasby@LoganGrasby·1d

There's levels to this

Mason Pierce@mlpierce22

"Alright, I just launched a training job from my phone. The future is now" ~ @LoganGrasby

152

Logan Grasby@LoganGrasby·5d

Claude code is all you need

Jackson Stokes@jackson_stokes

We partnered with @mercor_ai to test a simple idea: What if knowledge-work agents were just… coding agents? Result: +25% performance, 2x faster, cheaper, and new SOTA on APEX-Agents. @josancamon19

149

Logan Grasby retweeted

Yann LeCun@ylecun·10 May

@eladgil BS.
Attention was born in Montréal
PyTorch in NYC.
AlphaGo in London
AlphaFold in London
ESMFold in NYC
Llama 1 in Paris.
Llama 2 in Paris+NYC+SV
DeepSeek in Hangzhou
Plus:
DINO in Paris
JEPA in Montréal+Paris+NYC
SV is 3 mos ahead on topics SV is singularly obsessed with.

183

492

7.8K

726.4K

Logan Grasby@LoganGrasby·8 May

Completely agree. Most files I send on slack now are HTML. Most teams I've been working with are doing this now.

Thariq@trq212

x.com/i/article/2052…

166

Logan Grasby@LoganGrasby·2 May

@uhsheeeesh someone has to stand up to big RL

105

Ashish@uhsheeeesh·2 May

@LoganGrasby

GIF

Logan Grasby@LoganGrasby·2 May

I said what I said

Jackson Stokes@jackson_stokes

“Online RL is midwit” -@LoganGrasby

240

Logan Grasby@LoganGrasby·29 Apr

Off policy RL is interesting for training models that continuously improve from production traces. Together with @pathos we looked at applying OAPL as an off policy RL solution when training a medical reasoning task. Check it out!

Jackson Stokes@jackson_stokes

can we train a model in single RL step? During recent experiments, @Logangrasby found that a single step of OAPL increased model performance from ~0 to 48% on a clinical reasoning and prediction task. Turns out, data staleness might matter less than we think. with @pathos :

265

Logan Grasby retweeted

Teknium 🪽@Teknium·25 Apr

I literally run 12 hermes agent instances every day in parallel to build Hermes Agent, and its now a top 100 GitHub repositories of all time. Agents do bring value and do create substantive software and work.

David Cramer@zeeg

Everyone is slowly coming to this realization, and I assure you, no one is running multitudes of agents overnight. No one that is doing anything of substance at least. There _are_ people pretending to be scientists, or fully caught up in their drug infused AI overdose, that think their slop machines are changing the world. They're not tho, and they're just wasting a bunch of money and compute to create a lot of LoC that will just get thrown away. The state of the art is still "can we even one shot a production quality patch that we wont regret later", and its rarer than you'd expect based on discourse.

149

113

180.1K

Logan Grasby@LoganGrasby·25 Apr

I have several codex instances which have been working on a problem non-stop for nearly 72 hours atm. They are running in the codex desktop app. It's a problem that is verifiable and benefits from many experiments. Running 20 agents overnight is really not that crazy?

Ronan Berder@hunvreus

Talking to smarter folks than me, I'm convinced many of the AI folks in my timeline are full of shit. Nobody is "running 20 agents over night" and building stuff for actual users. Maybe some are building internal tools or disposable software. Maybe. But building software people like using? That doesn't get hacked on day one or blow up after the 3rd user? Nope. I don't even understand what that's supposed to look like. Do you work out a 57 pages document that perfectly describes what you want to build and then summon 14 agents and have them run wild for 6 hours? And what comes out on the other end isn't a broken pile of shit? Nope. Not buying it. PS: it may also be that I have an IQ of 82 and can't figure it out.

387

Logan Grasby retweeted

Jackson Stokes@jackson_stokes·17 Apr

We trained LoRA adapters of different ranks to understand training dynamics, finding that adapters for GSM8k live in a surprisingly vast, low-rank solution space. This hints that some model skills are easy to learn, and training is more forgiving than we think. @hasith_v 1/6 🧵

254

22.6K

Logan Grasby@LoganGrasby·9 Apr

For the past few months I've been working with @TrainLoop_ai on post training models for healthcare use cases. We're sharing some early results from one of these projects! The impact AI is having on healthcare is just astounding and we're only scratching the surface.

Jackson Stokes@jackson_stokes

We post-trained MedGemma to be SoTA in visual medicine ddx, outperforming Opus 4.6, Gemini 3.1 and GPT-5.4 while running at ~1/30th the cost. @getnolla Part 1 - improving visual reasoning 🧵1/6

463

Logan Grasby retweeted

Ash Vardanian@ashvardanian·2 Apr

SimSIMD (renamed to NumKong) is my first package to cross 1M weekly pulls on PyPI 🥳 Should be optimal for small-scale & local RAG across 6 programming languages, 20 numeric formats, covering hardware from Arm phones and x86 desktops to IBM mainframes and Chinese gov Loongsons

Logan Grasby@LoganGrasby·31 Mar

Interesting claude code instruction: The prompt itself is only: `IMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases.`

160

Logan Grasby@LoganGrasby·24 Mar

@alexalbert__ But please don't actually get rid of it. That might be the one thing that would make me switch to codex.

149

Alex Albert@alexalbert__·24 Mar

Goodbye --dangerously-skip-permissions, hello auto mode

Claude@claudeai

New in Claude Code: auto mode. Instead of approving every file write and bash command, or skipping permissions entirely, auto mode lets Claude make permission decisions on your behalf. Safeguards check each action before it runs.

177

102

2.4K

275.4K

Logan Grasby retweeted

Sid Sijbrandij@sytses·11 Mar

Looking forward to speaking at OpenAI Forum in a week on how I leveraged ChatGPT to find cancer treatment options after doctors said there was nothing left for me to do. forum.openai.com/public/events/…

617

337.1K

Logan Grasby retweeted

Ash Vardanian@ashvardanian·20 Mar

My biggest open-source release! NumKong: 2'000+ SIMD kernels for mixed-precision numerics, from Float6 to Float118. Started in 2023. Opened the PR in 2024. Finally, merged this week!

RISC-V, Intel AMX & AVX-512, Apple SME & SVE, WASM Relaxed SIMD. 200'000 lines of code in a 5 MB binary. Same scale as OpenBLAS. Available for C 99, C++ 23, Python 3, Rust, Swift, GoLang, & JavaScript.

Int4 dot products via nibble algebra.
Ozaki Float64 GEMMs on Float32 tile hardware.
6-bit and 8-bit floats back-ported to 10-year-old CPUs.
5'300x faster Geospatial metrics than GeoPy.
200x faster Kabsch than BioPython.
0 ULP where OpenBLAS hits 56... and a lot more!

pip install numkong
Or pull it from NPM, Crates, GitHub... and let me know what breaks 🤗

Links & highlights ⬇️

468

24.9K

Logan Grasby@LoganGrasby·16 Mar

"The broader vision is that future AI systems will not just use software; they will contain it, integrating learned representations with compiled algorithms inside a single computational substrate. In that world, software itself becomes part of the model." 🤯

Christos Tzamos@ChristosTzamos

1/4 LLMs solve research grade math problems but struggle with basic calculations. We bridge this gap by turning them to computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds solving even the hardest Sudokus with 100% accuracy

225

Logan Grasby retweeted

Bo Wang@BoWang87·16 Mar

Everyone is talking about personalized mRNA cancer vaccines. I want to share two recent Nature papers that cut through the excitement and reveal something the viral posts aren't telling you: the approach works, but only in patients whose immune system actually responds to the vaccine. In the PDAC trial, that was half.

Papers:
- TNBC-MERIT trial (Nature 2026): nature.com/articles/s4158…
- PDAC 3-year follow-up (Nature 2025): nature.com/articles/s4158…

Here's the exact number that explains why.

The PDAC trial: at 3.2 years median follow-up, vaccine responders had median recurrence-free survival that was never reached. Non-responders: 13.4 months. HR = 0.14. The T cell memory is real: some clones are projected to persist for over a decade.

The TNBC trial: 10 of 14 patients remained relapse-free at 5 years. One patient has been in remission for over 6 years, with neoantigen-specific T cells still circulating at ~2% of her CD8 repertoire.

So what separates responders from non-responders? Across both trials: only 41 of 251 neoantigens actually triggered a T cell response. That's 16%. Each vaccine encodes up to 20 neoantigens, the algorithm's best guess at which tumor mutations will be immunogenic. Most don't work. Half the PDAC patients didn't respond, not because they couldn't mount an immune response (they responded fine to concurrent COVID vaccines), but because their selected neoantigens happened to miss.

This is the core unsolved problem: predicting, from sequence alone, which mutations will produce peptides that a specific patient's immune system will actually recognize. It sounds like an MHC binding problem. It isn't. Tools like NetMHCpan handle binding affinity reasonably well. What they miss is the full causal chain:

1. Proteasomal processing: will the protein actually be cleaved into this exact peptide?
2. TAP transport: will it reach the ER for MHC loading?
3. HLA-peptide stability: across the patient's specific HLA alleles (10,000+ variants in the population)
4. T cell repertoire availability: has central tolerance already deleted the clones that would recognize it?
5. Tumor clonal architecture: is this mutation in every tumor cell, or just 30%? Targeting subclonal neoantigens leaves most of the tumor untouched.

Every step is a filter. Current prediction stops at step one.

Compounding everything: average manufacturing time in the TNBC trial was 69 days (range: 34–125) from sample to vaccine release. For pancreatic cancer, where non-responders recur at 13.4 months post-surgery, that's not a footnote. It's a window closing.

The good news: the T cell biology is sound. The mRNA platform works. The immunology is spectacular, when it works. The bottleneck is the first step: choosing which 20 neoantigens go in the vaccine. Get that prediction right, and the responder rate moves.

This is where AI in cancer immunotherapy has to go next. Not mRNA design. Not LNP formulation. Immunogenicity prediction: integrating mutation calling, HLA typing, T cell repertoire sequencing, and single-cell tumor expression simultaneously, as a causal inference problem, not a binding affinity lookup.

We don't have a model that does this well. That's the gap.

122

584

54.2K

Logan Grasby@LoganGrasby·16 Mar

@cremieuxrecueil The same thing happened with nuclear energy nytimes.com/interactive/20…

Crémieux@cremieuxrecueil·15 Mar

Reminder: Regulatory barriers are such a huge problem in biotech, that when China lowered theirs, they became the top pharmaceutical innovator in less than a decade. In far too many cases, from cancer vaccines to gene therapies, red tape is harder than science.

Crémieux@cremieuxrecueil

Whatever you think of China, their recent drug development success is incredibly impressive. Pharmaceutical developments were practically unheard of in China prior to their 2016 reforms, and now they're either ahead of the U.S. or close to it. From zero to leading in <10 years.

199

1.7K

59.8K
