Punit Vara

1.6K posts

Punit Vara banner
Punit Vara

@punitvara

RNN - LSTM - Attention - Transformer - LLM - Increase context window - Quantization for on device - Reasoning LM - Tiny RM Machine Learning Engineer

India · Joined November 2010
2.4K Following · 262 Followers
Pinned Tweet
Punit Vara
Punit Vara@punitvara·
There is no such thing as a limit. It's in your brain. Transcend yourself. Don't fucking give up, no matter what
English
0
0
5
0
Archie Sengupta
Archie Sengupta@archiexzzz·
Bengal's contribution to world GDP was 12% in the 1700s. Yes, 12%. This is an important day in Indian politics. The BJP swept West Bengal after 34 years of Communist rule and 15 years of Mamata's govt. This changes everything. I am a Bengali, I know the level of corruption, extortion, cut money, illegal immigration, and scams that were happening. Bengalis are intellectuals - if they decide they've had enough, they become a swing state like UP. Kolkata and Bengal reject staying in the past. Go against any party or member you want - but never, ever in your wildest dreams go against our gods. Maa Durga and Maa Kali watch over everyone. Lord Krishna will every single day save dharma when adharma and evil are at their peak. Jai Shri Ram! 🕉️
English
46
203
2.2K
52.1K
Punit Vara
Punit Vara@punitvara·
@SwiggyCares Extremely disappointing. Paid for Swiggy Black express → order grouped & delayed (~1 hr), delivered last. Received wrong item + poor support (false promises, no resolution). What am I paying extra for? Order ID: 236790601360905
English
1
0
0
24
Paras Chopra
Paras Chopra@paraschopra·
Woke up to see my paper accepted at ICML 2026 :) 🕺🏽🚀 My first one at an A* conference! This position paper was a result of me deep diving into the philosophy of science to figure out what counts as “understanding something”. This drove me to Wittgenstein, Quine, William James and other Pragmatism advocates. I then applied what these philosophers talk about to the field of AI, and recommend how to make sense of ambiguous (and often metaphysically loaded) questions such as “what is AGI” or “can LLMs feel emotions”. Philosophical clarity is a necessary prerequisite for attacking unsolved problems, and I’ve found Pragmatism to be the most sensible position for making progress as its focus is always on empirical consequences of concepts (and not their metaphysical status). Will release the paper soon!
Paras Chopra tweet media
English
32
21
612
23.2K
Punit Vara
Punit Vara@punitvara·
@adxtyahq I wish this were true. But most companies, at least in India, will not hire after this type of gap.
English
0
0
0
151
aditya
aditya@adxtyahq·
we should normalize 1-year career breaks. this guy worked at linkedin for years, took a year off, now at meta
aditya tweet media
English
149
1K
10.7K
1.1M
Kirill Skrygan
Kirill Skrygan@kskrygan·
Would you be interested if JetBrains releases a totally local AI agent, working 100% on your laptop, using our code insight engine and deeply integrated into the IDE? Yes, it will probably be 1 month behind the very recent frontier models, but no token blood bath anymore. WDYT?
English
809
235
7.2K
486.5K
Punit Vara
Punit Vara@punitvara·
Amazing write-up
Brivael Le Pogam@brivael

Elon Musk once said something about resource allocation that stuck with me. In essence: past a certain level of wealth, money is no longer consumption, it's capital allocation. That sentence changes everything. The economy, at bottom, is just an allocation problem. You have finite resources and infinite uses. Who decides what goes where? Imagine a schoolyard. 100 kids, packs of Pokémon cards handed out at random. You let it run. Very quickly, an order emerges. The good players accumulate the rare cards, the collectors sort, the negotiators find deals. Nobody planned anything. And yet every card ends up in the hands of whoever gets the most value out of it. The system maximizes the total happiness of the schoolyard. That's the invisible hand. Now bring in the teacher. She finds it unfair. Léo has 50 cards, Tom has 3. She confiscates, redistributes, imposes equality. Three immediate effects. The good players stop playing; what's the point. The bad ones have no more reason to improve; they'll get their share anyway. Trading collapses. The schoolyard is equal, and dead. She maximized equality; she destroyed happiness. The teacher's problem is that she cannot have the information the schoolyard had collectively. That is Mises's economic calculation problem, formulated in 1920. The USSR tried to solve it for 70 years with Gosplan. Result: shortages, queues, collapse. Not because the Soviets were stupid, but because the problem is mathematically unsolvable in centralized form. When Musk has 200 billion, he doesn't consume it, he allocates it. SpaceX, Starlink, Neuralink, xAI. Every dollar is a bet on the future. And he has a track record. PayPal, Tesla, SpaceX. He has shown he can identify immense problems and allocate resources to them with spectacular returns. The State also has a track record.
Hospitals collapsing, education declining, debt exploding, public services degrading despite constantly rising budgets. The market identifies the good allocators; politics identifies the good communicators. Profit is not an end in itself, it's a signal. It says: you allocated scarce resources to a use that people value enough to pay for. The bigger the profit, the greater the value creation. When Starlink is profitable, it means millions of people in rural areas finally have internet. When a ministry runs a deficit, it means it consumes more than it produces. One creates, the other destroys, and we call that redistribution. In our societies there are two categories of actors. Entrepreneurs and bureaucrats. The entrepreneur takes a personal risk to identify a problem, mobilize resources, and create a solution. If he's wrong, he loses. If he's right, his customers win, his employees win, his suppliers win, and the State collects taxes. He is the basic cell of human progress. The bureaucrat takes no personal risk. His salary is guaranteed. At best he maintains an existing rent. At worst he destroys it through excessive regulation, forced misallocation, and perverse incentives that discourage those who produce. But in no case does he create. Look at the last 50 years. The iPhone, the civilian internet, SpaceX, Tesla, Google, Amazon, Stripe, mRNA, ChatGPT. All private inventions, carried by entrepreneurs, financed by venture capital. Not a single ministry has invented anything that changed your daily life. France has become the world's laboratory of bureaucratic drift. Public spending at 57% of GDP, an absolute record. A sprawling administration, a tax system that penalizes wealth creation. Result: falling behind the United States, Germany, and Switzerland. Brain drain. Deindustrialization.
Exploding debt. And the worst part is that the misallocation is self-reinforcing. The more the State takes, the less entrepreneurs create. The less they create, the smaller the tax base. The more the State borrows and taxes. A perfect negative feedback loop. The teacher thinks she's helping, and every year the schoolyard produces less. In our societies it is the entrepreneurs, always, who move civilization forward. Bureaucrats at best maintain a rent, at worst destroy it. No society has ever progressed by taxing its creators to subsidize its managers. The question is never who has how much. It's who allocates the next unit of resource best to maximize humanity's future. The answer hasn't changed in 200 years. It isn't the civil servants.

English
0
0
0
6
Punit Vara
Punit Vara@punitvara·
I believe that as AI and our understanding of it advance, more and more automation will take over the world. The only way I see is to keep learning and jumping to higher abstractions. Honestly, I don't see any other strategy working out
English
0
0
0
6
Punit Vara
Punit Vara@punitvara·
Can it invent things like Thomas Edison used to?
Nick Levine@status_effects

New work with @AlecRad and @DavidDuvenaud: Have you ever dreamed of talking to someone from the past? Introducing talkie, a 13B model trained only on pre-1931 text. Vintage models should help us to understand how LMs generalize (e.g., can we teach talkie to code?). Thread:

English
0
0
0
14
Arnav Gupta
Arnav Gupta@championswimmer·
@prflgupta
- Walkable roads (without pavements caving into drains)
- serious efforts to fix AQI
- in BLR, public transport (Delhi has this sorted)
- meaningful reversal in communal hatred from current levels to what it was earlier
English
26
25
504
16.4K
Arnav Gupta
Arnav Gupta@championswimmer·
I stayed in India for 8 years after graduating. In those 8 years I ran a mildly successful startup, creating 70+ jobs, and a lot of taxes since we were profitable. My Whatsapp inbox from my teaching time is full of thousands of messages of people getting prestigious high paying tech jobs. After that as an engineering leader at various orgs, hired at least 50+ more people. Paid multiple crores of personal income tax as well. I did more than my fair share of creating employment, creating human capital and contributed enough to pay off my subsidised education (I’ll not deny I got more than good enough education both from school and university, both of which are partly paid for by the government) In those 8 years, of my closest 10 friends, slowly slowly I found 8 of them now have moved outside India. More than half my college group is outside. And eventually on the balance of things, it really started feeling like I’m getting the short end of the bargain and those others who left were getting a better deal in life. I had always assured myself that a) I can always go out whenever I want, I am here by choice b) I’ve consciously stuck to faster growth roles orgs but if I ever wanted to, I can go to big tech too Finally a switch clicked in the head saying if you’re so sure of (a) and (b) why don’t you really just go and see. You can always come back. In the long run I might be proven wrong (I’m aware of stories of one health scare or racism incident or ailing parents that pushes the pendulum back for many), but for now my lived experience only taught me exactly opposite of what Vembu has said below. In fact if 10 years ago I would have even slightly been convinced by the below tweet, today I’m convinced even less by it. From my perspective of the things I sought in life, the equation has only gotten worse not better.
Sridhar Vembu@svembu

Open letter to Indians in America. -- Dear brothers and sisters from Bharat: Like I did 37 years ago, you arrived in America with no money but with a good education and cultural heritage from Bharat. You achieved outstanding success. America was good to us. For that we must remain grateful - gratitude is our Bharatiya way. Yet today, a significant number of Americans, may be not the majority but not too far from it either, believe that Indians "take away" American jobs and our success in America was unfairly earned. You may think the next election will fix this, but your choice would be between people who hate our Bharatiya civilisation and people who hate civilisation itself. That is the "hard right" vs "woke left" battle. You are mere bystanders to that conflict. Meanwhile there is one thing that is true now and will be true in the future: the respect Indians command world-wide will substantially depend on the fortunes of India herself. If India remains poor, the woke left will give us moral lectures with pity and the hard right, different moral lectures with scorn ("hellhole") and we must not confuse either with respect. Respect in today's world, along with prosperity and security, comes from one source: a nation's technological prowess. India produces sufficient brain power to achieve that prowess but alas we exported so much of that talent, particularly to America. As we develop that prowess in India, our civilisational strength will assert itself. As difficult as it is for many of you to contemplate this, please come back home. Bharat Mata needs your talent. Our vast youthful population needs the technology leadership you gained over the years to guide them towards prosperity. Let's do it with a missionary zeal. Respectfully Sridhar Vembu

English
36
162
2.3K
368.8K
Punit Vara
Punit Vara@punitvara·
Great advice to follow in the current world.
Startup Archive@StartupArchive_

Sam Altman: “Most people don’t take enough risk” “I think people have terrible risk calculus in general… almost always A) you’re wrong about what is risky and what is not risky, and B) most people don’t take enough risk—especially early in your career. Being young, unknown, and poor, is actually a great gift in terms of the amount of risk you can take.” Sam continues: “I think what risk actually looks like is not doing something that you will spend the rest of your life regretting… So if you really believe in something—if there’s an idea you’re super passionate about—and you take a calculated risk to start a company realizing you may forego a couple of years of steady income and maybe people call you a failure, that’s a great risk to take. And if you don’t take that risk, I think you have a very high chance that you end up regretting that.” Sam believes most people overrate the risk of reputation damage and embarrassment from trying and failing. It’s worse to not even try: “One really important thing to strive for in your career is to be a doer, not a talker. And the reason that people don’t do stuff is 1) it’s hard, and 2) it’s risky. And so you have these people that want to dabble in a bunch of different projects, but never be all-in on one… I think that’s really bad. I think history belongs to the doers, and I think you should take a risk and actually do something.” Video source: @ycombinator (2016)

English
0
0
0
5
Punit Vara retweeted
Julien Chaumond
Julien Chaumond@julien_c·
This is where we are right now. And I'm not gonna lie, it feels pretty magical 🧚‍♀️ Qwen3.6 27B running inside of Pi coding agent via Llama.cpp on the MacBook Pro. For non-trivial tasks on the @huggingface codebases, this feels very, very close to hitting the latest Opus in Claude Code, or whatever shiny monopolistic closed source API of the day is. In full airplane mode. Most people haven't realized this yet. If you have, it means you have a huge headstart to what I call the second revolution of AI. Powerful local models for efficiency, security, privacy, sovereignty 🔥
Julien Chaumond tweet media
English
260
443
5.2K
616.1K
Arnav Gupta
Arnav Gupta@championswimmer·
This house is less than ₹5cr (£0.5M). Less than a 30min commute to anywhere in central London. Indian property markets make no sense to me after seeing the market here 🥲 (+ has a huge backyard kids can play football in)
Arnav Gupta tweet media
English
586
620
9K
945.5K
Punit Vara retweeted
Aksel
Aksel@akseljoonas·
For the last 72 hours since ml-intern launched we have had over 500 autonomous AI research projects running on the Space at all times. Some insane ones I saw:
1. A new AI paradigm from scratch — trying to replace transformers with a reasoning architecture based on energy minimization, binary sparse address tables and circular convolution binding. No GPU, no gradients, no training data — pure bitwise operations. Years of research done in 2 days. huggingface.co/Harry00/MLE-Mo…
2. Someone took LoopLM (ByteDance's recurrent depth transformer with shared layers and infinite depth via looping) and crossed it with BitNet b1.58 (ternary 1.58-bit weights). The result: a model that's both infinitely deep AND uses almost no memory per parameter.
3. Designing a new attention mechanism modeled on the thalamo-cortical circuit in the human brain. Pulling from 2025/2026 research out of MIT, Harvard, and UF. The thalamus gates what information reaches the cortex. They're building a learnable gate that mimics this for transformer attention heads, combined with EEG datasets and a reinforcement learning loop. huggingface.co/spaces/daniel8…
The use cases people bring are cooler and more impressive than anything we imagined when we built this.
Aksel@akseljoonas

Introducing ml-intern, the agent that just automated the post-training team @huggingface
It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.
It can pull off crazy things: We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.
In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.
For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.
How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like. Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI: github.com/huggingface/ml… Web + mobile: huggingface.co/spaces/smolage… And the best part? We also provisioned 1k$ GPU resources and Anthropic credits for the quickest among you to use.

English
26
87
763
99.7K
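The BitNet b1.58 "ternary 1.58-bit weights" idea mentioned in the thread above quantizes every weight to {-1, 0, +1} using a per-tensor absmean scale. A minimal NumPy sketch of that scheme (an illustration only, not the projects' actual code; the function name is ours):

```python
import numpy as np

def absmean_ternary_quantize(W, eps=1e-5):
    """Quantize a weight tensor to {-1, 0, +1}, BitNet b1.58-style.

    Scale by the mean absolute value of the tensor, then round and
    clip into the ternary set. Returns the int8 ternary weights plus
    the scale, so W is approximately recovered as scale * Wq.
    """
    scale = np.abs(W).mean() + eps             # per-tensor absmean scale
    Wq = np.clip(np.round(W / scale), -1, 1)   # ternary values only
    return Wq.astype(np.int8), scale

# Example: quantize a random weight matrix and check the value set
W = np.random.randn(4, 8).astype(np.float32)
Wq, s = absmean_ternary_quantize(W)
assert set(np.unique(Wq).tolist()).issubset({-1, 0, 1})
```

Storing only the ternary codes plus one float scale is what makes the memory-per-parameter claim in the tweet plausible: each weight needs under 2 bits instead of 16 or 32.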
Punit Vara retweeted
MIT CSAIL
MIT CSAIL@MIT_CSAIL·
Today, MIT & the IMO released MathNet, the world’s largest dataset of International Math Olympiad problems & solutions 🌍 MathNet is 5x larger than previous datasets & is sourced from over 40 countries across 4 decades: bit.ly/4u1bhBC
MIT CSAIL tweet media
English
15
543
2.1K
193.6K
Ivan Landabaso
Ivan Landabaso@IvanLandabaso·
Meta's CTO guide to starting a new job. One of the most useful things I learned there:
Ivan Landabaso tweet mediaIvan Landabaso tweet media
English
31
493
6.6K
459.5K
Punit Vara
Punit Vara@punitvara·
After asking an AI agent to solve this problem, I understood the real skill is making the AI agent do deep search and find new algorithms when it says "I declined to fabricate a non-verified answer." Steering them to do novel tasks will be the FUTURE skill caisc2026.github.io/verifiable-pro…
English
0
0
0
12
Punit Vara retweeted
Aksel
Aksel@akseljoonas·
Introducing ml-intern, the agent that just automated the post-training team @huggingface
It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.
It can pull off crazy things: We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.
In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.
For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.
How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like. Releasing it today as a CLI and a web app you can use from your phone/desktop.
CLI: github.com/huggingface/ml… Web + mobile: huggingface.co/spaces/smolage… And the best part? We also provisioned 1k$ GPU resources and Anthropic credits for the quickest among you to use.
English
132
622
4.6K
1.2M
Punit Vara
Punit Vara@punitvara·
@thsottiaux Make it work on iPhone and Android. We want CLI agents for mobile devices
English
0
0
0
14
Tibo
Tibo@thsottiaux·
Hello builders. What are we getting wrong with Codex, what can we improve?
English
2.4K
64
2.9K
325.6K