William Fleshman

337 posts


@willcfleshman

US Army & PhD student at Johns Hopkins University

Joined August 2017
153 Following · 424 Followers
Pinned Tweet
William Fleshman@willcfleshman·
Did you know that LoRA A matrices can be frozen at init w/o degrading performance? 🤯 We leverage this trick to construct an unsupervised routing procedure that achieves identical performance to the previous best with orders of magnitude fewer FLOPs and ~50% less GPU memory. 🧵
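For anyone curious what "freezing A at init" looks like in practice, here is a minimal PyTorch sketch (my own illustration, not the paper's code): the LoRA down-projection A is left at its random initialization with gradients disabled, and only B is trained. The rank, scaling, and init scheme below are placeholder choices.

```python
import torch
import torch.nn as nn

class FrozenALoRALinear(nn.Module):
    """Base linear layer plus a LoRA update (alpha / r) * x @ A.T @ B.T,
    where A stays at its random init (frozen) and only B is trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # frozen pretrained weights
        self.scaling = alpha / r
        # A: random down-projection, frozen at initialization (never updated).
        self.A = nn.Parameter(torch.randn(r, base.in_features) / r ** 0.5,
                              requires_grad=False)
        # B: zero-initialized up-projection, the only trainable parameters.
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.A.T) @ self.B.T
```

With A frozen, its gradients and optimizer state are never allocated, which is presumably where a chunk of the memory savings comes from.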
William Fleshman retweeted
Benjamin Van Durme@ben_vandurme·
JHU mmBERT extended from 8k to 32k token length by the vLLM Semantic Router Team. Cutting-edge results on 1,800+ languages, now with longer context! huggingface.co/llm-semantic-r…
William Fleshman@willcfleshman·
@ChromeHODLs @hillery_dan That's definitely easier, just might not be optimal depending on your tax situation. Assuming you can almost capture both rates, swapping back and forth with T-bills would compound faster up to a 25% tax. Tax-free accounts, if available, are the real way to go.
William Fleshman@willcfleshman·
@ChromeHODLs @hillery_dan Opportunity cost. If you can capture the dividend by only tying up your capital for a couple of days then the rest of the month that capital can be making money elsewhere.
Dan Hillery@hillery_dan·
To be eligible for January's STRC dividend, you had to be recorded holding STRC at market open today, January 15th. Therefore, you can sell STRC today and still get the dividend. STRC sold off less than the monthly dividend payment. There is insatiable demand, and I don't know why.
Edward Raff@EdwardRaffML·
I just did a literature review on a specific topic to add more related work to a paper. Asked GPT5 to do the same to see how well it would compare. GPT5 had 0% recall and 0% precision on its returned list. At least they were real papers and not hallucinated though 🤷
Jack Jingyu Zhang@jackjingyuzhang·
I’m super thrilled and honored to be named an Amazon AI PhD Fellow 💫 Huge thanks to @AmazonScience for generously supporting our research at JHU! We’ll be advancing AI alignment in collaboration with folks at Amazon.
Rohit Prasad@RohitPrasadAI

Excited to announce @amazon's new AI PhD Fellowship Program supporting 100+ students across 9 universities like Carnegie Mellon, MIT & Stanford. Fellows will be paired with senior scientists working in related fields, plus receive financial support and AWS credits for research. Learn more: amazon.science/news/amazon-la…

William Fleshman@willcfleshman·
SEQR provably routes to the same adapters as SpectR, yielding the same high level of task performance at a fraction of the cost 🤑. Like previous unsupervised approaches, SEQR is secure, with no risk of data leakage if LoRA B matrices are kept private!🔐
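I don't know SEQR's exact scoring rule from the tweet alone, but the security claim implies routing can be computed from the A matrices only. As a purely illustrative sketch (my assumption, not the paper's algorithm), one could score each candidate adapter by the norm of its A-projection of a query activation and never touch the private B matrices:

```python
import torch

def route_by_A_projection(x: torch.Tensor, A_list: list[torch.Tensor]) -> int:
    """Score each candidate LoRA adapter using only its A (down-projection)
    matrix, so the private B matrices never leave their owners. The norm-based
    score is a stand-in for illustration, not SEQR's actual rule.

    x:      (d_in,) query activation, e.g. a pooled hidden state
    A_list: one (r, d_in) LoRA A matrix per candidate adapter
    """
    scores = torch.stack([torch.linalg.vector_norm(A @ x) for A in A_list])
    return int(torch.argmax(scores))  # index of the selected adapter
```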
William Fleshman retweeted
Orion Weller@orionweller·
XLM-R has been SOTA for 6 years for multilingual encoders. That's an eternity in AI 🤯 Time for an upgrade. Introducing mmBERT: 2-4x faster than previous models ⚡ while even beating o3 and Gemini 2.5 Pro 🔥 + open models & training data - try it now! How did we do it? 🧵
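A quick, hedged sketch of "try it now" with Hugging Face transformers; the model id is my guess at the released checkpoint name and mean pooling is just a placeholder, so adjust both to whatever the official release specifies (a recent transformers version may be needed for the modern encoder architecture):

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "jhu-clsp/mmBERT-base"  # assumed Hub id; check the release for the real name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

texts = ["Multilingual encoders are back.",
         "Los codificadores multilingües han vuelto."]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state   # (batch, seq, dim) token embeddings
embeddings = hidden.mean(dim=1)                 # crude mean pooling, just for a smoke test
print(embeddings.shape)
```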
William Fleshman retweeted
Marc Marone@ruyimarone·
3T tokens, ~1800 languages, 2 models - we’re releasing mmBERT, a modern multilingual encoder model!
William Fleshman@willcfleshman·
@jxmnop Cool stuff, when we did RE-Adapt (arxiv.org/abs/2405.15007) with Llama we saw that many of the base->instruct weight updates were approximately low-rank, but some layers were not. You could repeat your experiment with the Llama instruct models to see how close to base you actually get.
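The "approximately low-rank" observation is easy to test on any checkpoint pair; here is a small sketch (not RE-Adapt itself) that measures how much of a base->instruct weight delta's energy is captured by its top-r singular values:

```python
import torch

def lowrank_energy(w_base: torch.Tensor, w_instruct: torch.Tensor, r: int = 64) -> float:
    """Fraction of the squared Frobenius norm of (W_instruct - W_base)
    captured by its top-r singular values; values near 1.0 mean the
    update for this layer is approximately rank-r."""
    delta = (w_instruct - w_base).float()
    s = torch.linalg.svdvals(delta)            # singular values, descending
    return float((s[:r] ** 2).sum() / (s ** 2).sum())

# Run this per layer over paired base/instruct checkpoints to spot which
# layers' updates are (and are not) well approximated at low rank.
```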
dr. jack morris@jxmnop·
OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only... or is it? turns out that underneath the surface, there is still a strong base model. so we extracted it. introducing gpt-oss-20b-base 🧵
William Fleshman@willcfleshman·
Obviously have to attack you as my main AAAI contact 🤣
William Fleshman@willcfleshman·
@investingidiocy Thanks for answering my question on TTU. I'm looking forward to the new series of blog posts. Cheers!