Avijit Ghosh

5.7K posts

@evijit

Lead Technical AI Policy Researcher @huggingface 🤗 . Working on @evaluatingevals and @huggingscience

Boston, Massachusetts · Joined January 2012
1.5K Following · 2.9K Followers
Avijit Ghosh reposted
Jonas Geiping @jonasgeiping
We’re training models wrong and it’s due to ChatGPT. Even the modern coding agents used daily still use message-based exchanges: they send messages to users, to themselves (CoT) and to tools, and receive messages in turn. This bottlenecks even very intelligent agents to a single stream. The models cannot read while writing, cannot act while thinking, and cannot think while processing information. In our new paper, see below, we discuss LLMs with parallel streams. We show that multi-stream LLMs can …
🔵 Be created by instruction-tuning for the stream format
🔵 Simplify user and tool-use UX, removing many pain points with agents and chat models (such as having to interrupt the model to get a word in)
🔵 Be fast: they can predict and read tokens in all streams in parallel in each forward pass, improving latency
🔵 Have an easier time encoding a separation of concerns, improving security
🔵 Provide a legible form of parallel/continuous reasoning with many internal streams. Even if the main CoT stream is accidentally pressured or too focused on a particular task to voice concerns, other internal streams can subvocalize concerns that would otherwise not be verbalized.
Does this sound related to a recent thinky post? :) Yes, but I don’t feel so bad about being outshipped with such a cool report on their side by 23 hours. I’ll link a 2nd thread below with a more direct comparison. I actually think both are complementary in interesting ways.
[GIF]
41 replies · 166 reposts · 1.3K likes · 146.1K views
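As a toy illustration of the latency claim above (not the paper's implementation; the stream names and the one-token-per-stream-per-pass assumption are mine), parallel streams finish in as many forward passes as the longest stream, while a single message-based stream pays for every token sequentially:

```python
# Toy sketch: decoding latency for one message-based stream vs. parallel
# streams, assuming the model emits one token per stream per forward pass.

def sequential_steps(streams):
    """Single-stream model: streams are emitted one after another."""
    return sum(len(tokens) for tokens in streams.values())

def parallel_steps(streams):
    """Multi-stream model: one token per stream per pass, so the total
    number of forward passes equals the longest stream."""
    return max(len(tokens) for tokens in streams.values())

streams = {
    "user_reply": ["Sure,", "here", "is", "the", "fix."],
    "cot":        ["check", "types", "then", "edge", "cases", "first"],
    "tool_call":  ["run_tests()"],
}

print(sequential_steps(streams))  # 12 passes, one stream at a time
print(parallel_steps(streams))    # 6 passes when streams decode together
```

The gap widens as more streams (tools, internal monologues) are added, since the sequential cost grows with the total token count but the parallel cost only with the longest stream.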
Avijit Ghosh reposted
MATS Research @MATSprogram
1/ 🚨 MATS Autumn 2026 applications are now open. 10-week fully-funded fellowship for aspiring AI alignment, security & governance researchers and field-builders.
📍 Berkeley + London
📅 Sep 28 – Dec 4, 2026
💰 $5,000/month stipend + $8,000/month compute
Apply by June 7 AoE ↓
7 replies · 83 reposts · 674 likes · 105.2K views
Avijit Ghosh @evijit
Whatever happened to Diffusion Language Models? Don’t hear about them anymore
1 reply · 0 reposts · 2 likes · 479 views
Avijit Ghosh @evijit
Open model development does not usually have dedicated compute/data center buildouts like frontier labs do. This is very heartening to see and something I’ve been explicitly advocating for. When the physical artifacts of a technology cohabit with communities, those communities should benefit from them.
Anissa Gardizy @anissagardizy8

New: The charitable foundation tied to Nvidia CEO Jensen Huang and his wife, Lori Huang, has agreed to rent GPUs from CoreWeave. The Huang Foundation plans to donate the GPU hours to “university and other non-profit research institutes to develop open science and AI research.” It has donated $108 million in “GPU compute time grants” to date. story: theinformation.com/briefings/nvid…

1 reply · 2 reposts · 6 likes · 731 views
Avijit Ghosh @evijit
@gajesh @controlpaths Just from pure token throughput I imagine it will unlock different kinds of productivity than currently possible
0 replies · 0 reposts · 2 likes · 226 views
Gajesh @gajesh
@controlpaths Out of curiosity:
- What can you achieve with a 35B-A3B model? How much demand is there? What can you do locally that’s super good?
- Why embed the model into the board if they keep updating every other month?
- How much does it cost to manufacture these?
4 replies · 1 repost · 10 likes · 1.9K views
Pablo T. @controlpaths
A 35B-parameter model (with 3B active per token) on an embedded board. The next few months are going to be insane.
Sipeed @SipeedIO

High-performance #RISCV (RVA23) K3 SBC coming soon! Up to 32GB DDR5, 60T int4 NPU, able to run Qwen3.5 35B-A3B at ~15 tps. Supports Ubuntu 26.04! Vote for your preferred config and get early access when it launches next month! sipeed.com/k3/vote

23 replies · 116 reposts · 1.8K likes · 204.7K views
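The ~15 tps claim is plausible from simple bandwidth arithmetic. A rough sketch, under my own assumptions (decode is memory-bandwidth bound, weights are int4, and only the ~3B active parameters are streamed per token):

```python
# Back-of-envelope: memory bandwidth needed for a 35B-A3B MoE at int4.
# Assumption: each decoded token requires reading roughly the active
# weights once, so bandwidth = active bytes per token * tokens/sec.

active_params = 3e9          # 3B active parameters per token (the "A3B")
bytes_per_param = 0.5        # int4 quantization: 4 bits = 0.5 bytes
tokens_per_sec = 15          # Sipeed's claimed throughput

bytes_per_token = active_params * bytes_per_param          # ~1.5 GB/token
required_bw_gbs = bytes_per_token * tokens_per_sec / 1e9   # ~22.5 GB/s

print(required_bw_gbs)
```

Roughly 22.5 GB/s of effective weight-streaming bandwidth, which is within reach of a DDR5-equipped SBC; a dense 35B model would need about ten times that, which is why the sparse A3B design is what makes the embedded target feasible.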
Avijit Ghosh reposted
Quentin Gallouédec @QGallouedec
releasing hf-sandbox 🥡
12 replies · 58 reposts · 439 likes · 118.1K views
Miles Brundage @Miles_Brundage
Diet Coke is a great human achievement: delicious 100% of the time in canned/bottled form (but also customizable with fountain machines), produced and distributed at an unimaginable scale, probably OK for you, no animal-derived ingredients, loved across political and other lines.
228 replies · 114 reposts · 1.9K likes · 355.5K views
Avijit Ghosh reposted
Miles Brundage @Miles_Brundage
The world is sleeping on robotics
76 replies · 72 reposts · 777 likes · 63.7K views
Tongzhou Mu 🤖🦾🦿 @tongzhou_mu
The evolution of the "UMI-style" data collection tool is moving fast, from 2-finger grippers to full 5-finger humanoid hands. The race for the ultimate data collector is on.
[image]
7 replies · 18 reposts · 226 likes · 17.3K views
Avijit Ghosh reposted
Jiafei Duan @DJiafei
If Caffe, TensorFlow, or PyTorch had been closed to only a few; if the Transformer was never published; if ResNet had stayed inside MSR; or if ImageNet and Common Crawl had never been made available, we would not have the ChatGPT moment we see today. Openness is not just a choice. It is a responsibility. We are excited that MolmoAct 2 from @allen_ai can contribute, even in a small way, toward bringing the robotics community closer to its own ChatGPT moment. Thanks @stepjamUK for featuring!
Stephen James @stepjamUK

Most open VLA models are not really open. They release weights and call it reproducibility. The training data is withheld. The training code is withheld. The deployment pipeline is withheld. You get a checkpoint file and a paper. You cannot verify the data quality. You cannot reproduce the training run. You cannot adapt it to your robot without starting from scratch.
Researchers from Allen AI released MolmoAct2, the first VLA that is open: weights, training code, complete datasets.
• MolmoAct2-BimanualYAM Dataset: 720 hours of teleoperated trajectories across 28 real-world tasks, the largest open bimanual dataset available.
• MolmoAct2-SO100/101 Dataset: 38,059 episodes curated from 1,222 public datasets.
• MolmoAct2-DROID Dataset: Quality-filtered Franka trajectories with re-annotated instructions.
The system deploys out-of-the-box on three platforms spanning the low-to-medium cost range: Bimanual YAM, SO-100/101, and DROID Franka. No additional fine-tuning required.
The backbone is Molmo2-ER, trained on a 3.3M-sample corpus for embodied reasoning: metric distance estimation, free-space detection, cross-view object tracking, scene geometry reconstruction. The skills general-purpose VLMs do not test.
Results look promising: 63.8% average across 13 embodied reasoning benchmarks. Outperforms GPT-5 and Gemini Robotics-ER 1.5 on 9 of 13 tasks. Outperforms π0.5 across 7 simulation and real-world benchmarks.
The architecture uses per-layer KV conditioning between the VLM and a flow-matching action expert trained with DiT-style transformers. This bridges discrete reasoning tokens to continuous control trajectories while exposing the attention state the VLM itself uses.
This is the deployment model NeuraCore advocates for: standardized ecosystems with reproducible training data. Custom infrastructure for every embodiment is technical debt that prevents fleet scaling.
Nice work from @hq_fang, @DJiafei, and the team at @allen_ai

0 replies · 4 reposts · 32 likes · 3.2K views
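Flow matching itself is a standard recipe. A minimal NumPy sketch of the objective (the generic formulation, not MolmoAct2's actual training code; the shapes and the oracle "model" here are illustrative):

```python
import numpy as np

def flow_matching_loss(pred_velocity_fn, actions, noise, t):
    """Generic flow-matching objective for action trajectories.

    actions: (batch, horizon, dof) ground-truth trajectories (x1)
    noise:   same shape, samples from N(0, I)            (x0)
    t:       (batch, 1, 1) interpolation times in [0, 1]
    """
    x_t = (1 - t) * noise + t * actions   # point on the straight path x0 -> x1
    target_v = actions - noise            # constant velocity along that path
    pred_v = pred_velocity_fn(x_t, t)     # the action expert predicts velocity
    return float(np.mean((pred_v - target_v) ** 2))

rng = np.random.default_rng(0)
actions = rng.standard_normal((4, 8, 7))  # 4 trajectories, 8 steps, 7-DoF
noise = rng.standard_normal(actions.shape)
t = rng.uniform(size=(4, 1, 1))

# An oracle that returns the true velocity drives the loss to zero:
oracle = lambda x_t, t: actions - noise
print(flow_matching_loss(oracle, actions, noise, t))  # 0.0
```

At inference time the learned velocity field is integrated from noise toward an action trajectory, which is what lets the expert output continuous control conditioned on the VLM's discrete reasoning state.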
Avijit Ghosh reposted
Ai2 @allen_ai
@NSF @nvidia @Cirrascale Our research estimates that in today’s model training efforts, 82% of compute goes into exploratory work. At closed labs, the output of that work stays within those labs. In an open system, models, datasets, & methods are shared, and the value compounds across the field.
[image]
2 replies · 3 reposts · 17 likes · 7.3K views
Avijit Ghosh @evijit
231 pages across 5 NeurIPS submissions later
[image]
0 replies · 0 reposts · 6 likes · 346 views