Hasith Vattikuti

23 posts

@hasith_v

Visiting researcher at TrainloopAI, Incoming CS PhD @Caltech, Previously @UTAustin, https://t.co/AaX3Qiyhu0

Austin, TX · Joined September 2025
346 Following · 51 Followers
Hasith Vattikuti retweeted
Simran Arora
Simran Arora@simran_s_arora·
AI compute and inference are increasingly $$$. How can we change the unit economics of AI to improve accessibility? It's been fun working with @prlnet to release the first model endpoint that simultaneously generates tokens **and** a digital asset that can subsidize inference! 🪙 Check it out, links below 🚀
English
3
2
54
5.5K
Hasith Vattikuti retweeted
Jackson Stokes
Jackson Stokes@jackson_stokes·
We trained LoRA adapters of different ranks to understand training dynamics, finding that adapters for GSM8k live in a surprisingly vast, low-rank solution space. This hints that some model skills are easy to learn, and training is more forgiving than we think. @hasith_v 1/6 🧵
English
5
26
254
22.6K
Hasith Vattikuti retweeted
Jackson Stokes
Jackson Stokes@jackson_stokes·
We post-trained MedGemma to be SoTA in visual medicine ddx, outperforming Opus 4.6, Gemini 3.1 and GPT-5.4 while running at ~1/30th the cost. @getnolla Part 1 - improving visual reasoning 🧵1/6
English
6
9
34
3.3K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@jxmnop This is cool, I've always been slightly uncomfortable with treating *everything* unlabeled as a negative. I wonder if using LLMs to produce rankings (even a somewhat noisy one) would be better than a binary classification. Perhaps we can weight according to rank in softmax?
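One possible reading of "weight according to rank in softmax", sketched in plain Python. The inverse-rank weighting and the function names (`rank_weights`, `rank_weighted_loss`, `binary_loss`) are assumptions made for illustration; neither the reply nor the post it answers specifies a scheme. The idea is to replace the one-hot target of the usual contrastive loss with a soft target that gives higher-ranked candidates more weight.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of similarity scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_weights(ranks):
    """Turn ranks (1 = most relevant) into a soft target distribution.

    Inverse-rank weighting is an assumption; any decreasing function works.
    """
    inv = [1.0 / r for r in ranks]
    total = sum(inv)
    return [v / total for v in inv]


def rank_weighted_loss(scores, ranks):
    """Cross-entropy between the rank-derived target and softmax(scores)."""
    probs = softmax(scores)
    weights = rank_weights(ranks)
    return -sum(w * math.log(p) for w, p in zip(weights, probs))


def binary_loss(scores, positive_idx):
    """Baseline: one labeled positive, everything else treated as a negative."""
    return -math.log(softmax(scores)[positive_idx])


# Hypothetical scores for three candidates and a (possibly noisy) LLM ranking.
scores = [3.0, 2.0, 1.0]
agreeing = rank_weighted_loss(scores, ranks=[1, 2, 3])
disagreeing = rank_weighted_loss(scores, ranks=[3, 2, 1])
```

Under this sketch the loss is lower when the model's scores agree with the ranking than when they contradict it, and a candidate ranked second is no longer penalized as hard as one ranked last.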
English
0
0
0
681
Hasith Vattikuti retweeted
William Gilpin
William Gilpin@wgilpin0·
How do time series foundation models forecast unseen dynamical systems? In new experiments, we find that small transformers learn to approximate transfer operators in-context. (1/N) arxiv.org/abs/2602.18679
English
3
78
382
29.1K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@jxmnop Will code be released? Interested in playing around with this
English
0
0
0
15
dr. jack morris
dr. jack morris@jxmnop·
here's a link to the paper on ArXiv! thanks to my collaborators at FAIR: Niloofar Mireshghallah, Mark Ibrahim, Saeed Mahloujifar arxiv.org/abs/2602.04118 (i left FAIR in october; it just took a while to get the paper out for a number of logistical reasons)
English
4
7
148
8.9K
dr. jack morris
dr. jack morris@jxmnop·
at long last, the final paper of my phd 🧮 Learning to Reason in 13 Parameters 🧮 we develop TinyLoRA, a new ft method. with TinyLoRA + RL, models learn well with dozens or hundreds of params example: we use only 13 parameters to train 7B Qwen model from 76 to 91% on GSM8K 🤯
English
60
232
2.1K
182.1K
Rohan Pandey
Rohan Pandey@khoomeik·
yo @LEGO_Group when are we getting an ASML High-NA EUV Photolithography Machine build set i kinda need this lego is danish, asml is dutch. this collab is written in the stars. make it happen.
English
3
6
102
7.1K
Hasith Vattikuti retweeted
Yasa Baig
Yasa Baig@BaigYasa·
Great to see high quality software dev in comp bio. It still amazes me how much of computational biology is based on single-thread processing of large .txt files with minimal application-specific-optimization.
Arc Institute@arcinstitute

Arc bioinformatics scientists @noamteyssier and @a_dobin have just released cyto, an ultra-high throughput processor specifically optimized for @10xGenomics Flex single-cell data. We are excited to make this resource open source: biorxiv.org/content/10.648…

English
2
1
9
590
sidbing 🪽
sidbing 🪽@sidbing·
tomorrow is reading and pondering day. gonna finish up all my backlog of reading papers, blogs, X threads and sit and ponder and talk to claude. can't wait.
English
3
0
18
1.2K
Andrej Karpathy
Andrej Karpathy@karpathy·
Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising, top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right bottom) is the dominant paradigm in text. For audio I've seen a bit of both. A lot of diffusion papers look a bit dense but if you strip the mathematical formalism, you end up with simple baseline algorithms, e.g. something a lot closer to flow matching in continuous, or something like this in discrete. It's your vanilla transformer but with bi-directional attention, where you iteratively re-sample and re-mask all tokens in your "tokens canvas" based on a noise schedule until you get the final sample at the last step. (Bi-directional attention is a lot more powerful, and you get a lot stronger autoregressive language models if you train with it, unfortunately it makes training a lot more expensive because now you can't parallelize across sequence dim). So autoregression is doing an `.append(token)` to the tokens canvas while only attending backwards, while diffusion is refreshing the entire token canvas with a `.setitem(idx, token)` while attending bidirectionally. Human thought naively feels a bit more like autoregression but it's hard to say that there aren't more diffusion-like components in some latent space of thought. It feels quite possible that you can further interpolate between them, or generalize them further. And it's a component of the LLM stack that still feels a bit fungible. Now I must resist the urge to side quest into training nanochat with diffusion.
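The `.append(token)` versus `.setitem(idx, token)` contrast in the post can be made concrete with a toy sketch. Everything here is an assumption for illustration: `toy_model` is a hypothetical oracle that simply returns the target character (a real network would predict a distribution from the visible context), and the linear re-masking schedule is just one simple choice of noise schedule.

```python
import random

MASK = "_"


def toy_model(canvas, idx, target):
    """Hypothetical stand-in for a trained network.

    It ignores the canvas and returns the target character, so the example
    only demonstrates the control flow of the two decoding loops.
    """
    return target[idx]


def autoregressive_decode(target):
    """Left to right: each step appends one token, seeing only the prefix."""
    canvas = []
    for idx in range(len(target)):
        canvas.append(toy_model(canvas, idx, target))
    return "".join(canvas)


def diffusion_decode(target, steps=4, seed=0):
    """Iterated denoising: re-sample every position, then re-mask a shrinking
    fraction of the canvas until nothing is masked at the last step."""
    rng = random.Random(seed)
    n = len(target)
    canvas = [MASK] * n
    for step in range(1, steps + 1):
        # Refresh the whole canvas in place (the "setitem" view).
        for idx in range(n):
            canvas[idx] = toy_model(canvas, idx, target)
        # Assumed linear schedule: the masked fraction falls to zero.
        n_masked = round(n * (1 - step / steps))
        for idx in rng.sample(range(n), n_masked):
            canvas[idx] = MASK
    return "".join(canvas)


text = "to be or not to be"
ar_out = autoregressive_decode(text)
diff_out = diffusion_decode(text)
```

With the oracle in place both loops recover the same string. The difference is in how they get there: the first grows the canvas one position at a time, the second rewrites the full-length canvas on every step.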
Nathan Barry@nathanrs

BERT is just a Single Text Diffusion Step! (1/n) When I first read about language diffusion models, I was surprised to find that their training objective was just a generalization of masked language modeling (MLM), something we’ve been doing since BERT from 2018. The first thought I had was, “can we finetune a BERT-like model to do text generation?”

English
268
533
5.2K
866.1K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@karpathy I actually hacked nanogpt sometime ago to become a diffusion llm. Results were pretty decent on shakespeare with character-level tokenization. Honestly was just surprised it even learned to spell words and pick up on basic grammar. Link in reply
English
3
2
46
1.8K
Joanna
Joanna@materzynska·
I am looking for motivated students to join my team at @AIatMeta FAIR for a summer internship. If you have experience with motion modeling / diffusion models and/or social AI please feel free to reach out! 🤖✨
English
18
33
324
30K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@a16z @LiamFedus @LiamFedus what are yalls methods to verify what the LLMs are discovering? How do you make sure it’s ‘understanding’ current physics correctly? I have lots of thoughts on this as a physics student doing AI research if you want to chat
English
0
0
1
167
a16z
a16z@a16z·
“Foundation models but for quantum mechanics, will be the next frontier for LLMs.” Periodic Labs’ Ekin Dogus Cubuk says logic and math gave AI its first proofs. At the quantum scale, where biology, chemistry and materials converge, models could begin inventing new matter itself. @ekindogus @periodiclabs
a16z@a16z

Building an AI Physicist: ChatGPT Co-Creator’s Next Venture Scaling laws took us from GPT-1 to GPT-5 Pro. But in order to crack physics, we need a new approach. We sat down with Liam Fedus (co-creator of ChatGPT) and Ekin Dogus Cubuk (ex-materials science and chemistry lead at Google DeepMind) to talk about their new startup @PeriodicLabs and their plan to automate discovery in the hard sciences. 00:00 LLMs in physics and chem research 03:53 What is Periodic Labs? 14:45 Building the team 17:29 Superconductivity 27:39 Periodic's mission and applications 35:38 Mid-training and model performance 49:49 What makes a great researcher @AnjneyMidha @LiamFedus @EkinDogus

English
26
66
310
96.7K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@khoomeik @periodiclabs @LiamFedus Very excited to see where periodic will go next! Extremely bullish on trying to get tangible alpha from AI models in natural sciences--it really plays to my background of first doing physics research and then doing AI research
English
0
0
2
399
Rohan Pandey
Rohan Pandey@khoomeik·
fav part about working at @periodiclabs: when i rabbithole on a quantum mechanics textbook i just tell my boss @LiamFedus that i’m reading training data 😎😉
English
16
6
274
56.1K
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@CFGeek Yes it is, happy to discuss and get feedback. All is welcome
English
1
0
2
33
Hasith Vattikuti
Hasith Vattikuti@hasith_v·
@CFGeek To be fair, I also think it will be hard to get it to work, and it might not even. But the negative result plus the rl env will leave us things to learn from. Cause I’m pretty confident that LLMs will be using internal reasoning techniques only a few years down the line.
English
1
0
1
52
Charles Foster
Charles Foster@CFGeek·
FWIW it seems unlikely that the proposal in the quoted tweet would actually work. That’s maybe an even better reason to explore some other project idea!
English
2
0
7
609