Avijit Ghosh

5.7K posts

@evijit

Lead Technical AI Policy Researcher @huggingface 🤗. Working on @evaluatingevals and @huggingscience

Boston, Massachusetts · Joined January 2012

1.5K Following · 2.9K Followers

Avijit Ghosh@evijit·19h

@rtwlz @neil_chilson The people yearn for public compute

Riley Walz@rtwlz·23h

[media-only post, no text]

Avijit Ghosh@evijit·1d

Great piece! theverge.com/tech/928905/vi…

Avijit Ghosh reposted

Jonas Geiping@jonasgeiping·2d

We’re training models wrong and it’s due to chatGPT. Even the modern coding agents used daily still use message-based exchanges: They send messages to users, to themselves (CoT) and to tools, and receive messages in turn. This bottlenecks even very intelligent agents to a single stream. The models cannot read while writing, cannot act while thinking and cannot think while processing information.

In our new paper, see below, we discuss LLMs with parallel streams. We show that multi-stream LLMs can …
🔵 Be created by instruction-tuning for the stream format
🔵 Simplify user and tool use UX removing many pain points with agents and chat models (such as having to interrupt the model to get a word in)
🔵 Multi-Stream LLMs are fast, they can predict+read tokens in all streams in parallel in each forward pass, improving latency
🔵 LLMs with multiple streams have an easier time encoding a separation of concerns, improving security
🔵 LLMs with many internal streams provide a legible form of parallel/cont. reasoning. Even if the main CoT stream is accidentally pressured or too focused on a particular task to voice concerns, other internal streams can subvocalize concerns that would otherwise not be verbalized.

Does this sound related to a recent thinky post :) - Yes, but I don’t feel so bad about being outshipped with such a cool report on their side by 23 hours. I’ll link a 2nd thread below with a more direct comparison. I actually think both are complementary in interesting ways.

Avijit Ghosh reposted

MATS Research@MATSprogram·1d

1/ 🚨 MATS Autumn 2026 applications are now open. 10-week fully-funded fellowship for aspiring AI alignment, security & governance researchers and field-builders.
📍 Berkeley + London
📅 Sep 28 – Dec 4, 2026
💰 $5,000/month stipend + $8,000/month compute
Apply by June 7 AoE ↓

Avijit Ghosh@evijit·1d

Whatever happened to Diffusion Language Models? Don’t hear about them anymore

Avijit Ghosh@evijit·2d

Open model development does not usually have dedicated compute/data center buildouts like frontier labs do. This is very heartening to see and something I’ve been explicitly advocating for. When physical artifacts of technology cohabit communities, those communities should benefit from them.

Anissa Gardizy@anissagardizy8

New: The charitable foundation tied to Nvidia CEO Jensen Huang and his wife, Lori Huang, has agreed to rent GPUs from CoreWeave. The Huang Foundation plans to donate the GPU hours to “university and other non-profit research institutes to develop open science and AI research.” It has donated $108 million in “GPU compute time grants” to date. story: theinformation.com/briefings/nvid…

Avijit Ghosh@evijit·4d

Should all our resources go towards building chatbots? What if we built systems that actually give people meaningful agency? Finally out as an accepted @FAccTConference paper! Joint work with @SourojitGhosh3, @PranavVenkit and @Sanjana08395511. huggingface.co/papers/2605.07…

Avijit Ghosh@evijit·4d

@gajesh @controlpaths Just from pure token throughput I imagine it will unlock different kinds of productivity than currently possible

Gajesh@gajesh·4d

@controlpaths out of curiosity: what can you achieve with 35B-A3B model? how much demand is there? what can you do locally that’s super good? - why embed model into the board if they keep updating every other month? - how much does it cost to manufacture these?

Pablo T.@controlpaths·5d

A 35B-parameter model (with 3B active per token) on an embedded board. The next few months are going to be insane.

Sipeed@SipeedIO

High performance #RISCV (RVA23) K3 SBC coming soon! Up to 32GB DDR5, 60T int4 NPU, able to run Qwen3.5 35B-A3B @ 15tps~ Support Ubuntu2604 ! Vote for your preferred config and get early access when it launches next month! sipeed.com/k3/vote

Avijit Ghosh reposted

Quentin Gallouédec@QGallouedec·5d

releasing hf-sandbox 🥡

Avijit Ghosh@evijit·5d

@focusfronting @Miles_Brundage I’m sad for you fren Coke Zero is amazing

Miles Brundage@Miles_Brundage·5d

Diet Coke is a great human achievement Delicious 100% of the time in canned/bottled form (but also customizable with fountain machines), produced and distributed at an unimaginable scale, probably OK for you, no animal-derived ingredients, loved across political + other lines

Avijit Ghosh reposted

Miles Brundage@Miles_Brundage·6d

The world is sleeping on robotics

Avijit Ghosh@evijit·8 May

@tongzhou_mu More fingers in 2027!

Tongzhou Mu 🤖🦾🦿@tongzhou_mu·6 May

The evolution of the "UMI-style" data collection tool is moving fast, from 2-finger grippers to full 5-finger humanoid hands. The race for the ultimate data collector is on.

Avijit Ghosh reposted

Jiafei Duan@DJiafei·7 May

If Caffe, TensorFlow, or PyTorch had been closed to only a few; if the Transformer was never published; if ResNet had stayed inside MSR; or if ImageNet and Common Crawl had never been made available, we would not have the ChatGPT moment we see today. Openness is not just a choice. It is a responsibility. We are excited that MolmoAct 2 from @allen_ai can contribute, even in a small way, toward bringing the robotics community closer to its own ChatGPT moment. Thanks @stepjamUK for featuring!

Stephen James@stepjamUK

Most open VLA models are not really open. They release weights and call it reproducibility. The training data is withheld. The training code is withheld. The deployment pipeline is withheld. You get a checkpoint file and a paper. You cannot verify the data quality. You cannot reproduce the training run. You cannot adapt it to your robot without starting from scratch.

Researchers from Allen AI released MolmoAct2, the first VLA that is open. Weights, training code, complete datasets.
• MolmoAct2-BimanualYAM Dataset: 720 hours of teleoperated trajectories across 28 real-world tasks, the largest open bimanual dataset available.
• MolmoAct2-SO100/101 Dataset: 38,059 episodes curated from 1,222 public datasets.
• MolmoAct2-DROID Dataset: Quality-filtered Franka trajectories with re-annotated instructions.

The system deploys out-of-the-box on three platforms spanning the low-to-medium cost range: Bimanual YAM, SO-100/101, DROID Franka. No additional fine-tuning required.

The backbone is Molmo2-ER, trained on a 3.3M sample corpus for embodied reasoning: metric distance estimation, free space detection, cross-view object tracking, scene geometry reconstruction. The skills general-purpose VLMs do not test.

Results look promising: 63.8% average across 13 embodied reasoning benchmarks. Outperforms GPT-5 and Gemini Robot ER-1.5 on 9 of 13 tasks. Outperforms π0.5 across 7 simulation and real-world benchmarks.

The architecture uses per-layer KV conditioning between the VLM and a flow-matching action expert trained with DiT-style transformers. This bridges discrete reasoning tokens to continuous control trajectories while exposing the attention state the VLM itself uses.

This is the deployment model NeuraCore advocates for: standardized ecosystems with reproducible training data. Custom infrastructure for every embodiment is technical debt that prevents fleet scaling.

Nice work from @hq_fang, @DJiafei, and the team at @allen_ai

Avijit Ghosh reposted

Ai2@allen_ai·7 May

@NSF @nvidia @Cirrascale Our research estimates that in today’s model training efforts, 82% of compute goes into exploratory work. At closed labs, the output of that work stays within those labs. In an open system, models, datasets, & methods are shared, and the value compounds across the field.

Avijit Ghosh@evijit·8 May

😍 (and also, how does one train with proprioception data? That’s very cool)

Genesis AI@gs_ai_

We are back. After one year of quiet building. Introducing GENE-26.5, our first robotic brain that takes a major step toward human-level capability.

For years, robotics has struggled to learn from the world’s largest and most valuable data source: Humans. Solving it means rethinking the whole stack from the ground up:
- A robotics-native foundation model.
- A 1:1 human-like robotic hand.
- A noninvasive data collection glove for motion, force, and touch.
- A simulator that turns weeks of experiments into minutes.

GENE-26.5 is trained across language, vision, proprioception, tactile, and action. We designed a set of tasks to test how far we can go with this new paradigm. Fully autonomous, 1x speed, one model, same weights. (Enjoy with sound on)

We are approaching the endgame for robotics. And this is just a beginning.

Avijit Ghosh@evijit·7 May

Another American open release! Trained on AMD chips 🚀 huggingface.co/Zyphra/ZAYA1-8B

Zyphra@ZyphraAI

Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size on math and reasoning, closing in on DeepSeek-V3.2 and GPT-5-High with test-time compute. 🧵

Avijit Ghosh@evijit·7 May

231 pages across 5 NeurIPS submissions later

Avijit Ghosh@evijit·5 May

It after all is art (I hated that book)

Catherine Yeo@catherinehyeo

Love seeing Naomi Osaka honor the CLRS Algorithms textbook at this year's Met Gala
