John deVadoss

13 posts

@john_devadoss

co-Founder NeuralFabric acq. by @Cisco | co-Founder @IntWorkAll | Board @GBBC_io | General Manager @Microsoft | PhD RL research @UMassAmherst

Joined June 2019

2K Following · 9.6K Followers

Pinned Tweet

John deVadoss@john_devadoss·Aug 25

A Public AI Wealth fund, not 'basic income'. It is time for Congress to act. thehill.com/opinion/techno…

60.9K

John deVadoss retweeted

Souradip Chakraborty@SOURADIPCHAKR18·6d

🚨Typical RL algorithms and on-policy distillation methods are blind samplers: they use privileged info to score rollouts, but not to *find* them. We ask: can we use privileged info to *actively sample* the rollouts RL wishes it can stumble upon with compute? ⤵️ Pedagogical RL

475

109.1K

John deVadoss retweeted

Nathan Lambert@natolambert·May 7

Work led by @jacobcares showed that little compute for building an LLM is actually in the final runs. The vast majority of compute goes to developing a recipe. Creating the recipe openly is a huge lever in making sure the research community's compute pushes to new knowledge.

Ai2@allen_ai

Today we’re bringing new NSF OMAI compute online with NVIDIA Blackwell Ultra-powered systems, turning a $152M national investment from @NSF & @NVIDIA into a foundation for truly open AI research. 🧵

114

19.6K

John deVadoss retweeted

Xiaomi MiMo@XiaomiMiMo·Apr 27

Xiaomi MiMo-V2.5 is now officially open-sourced! MIT License, supporting commercial deployment, continued training, and fine-tuning - no additional authorization required. Two models, both supporting a 1M-token context window: • MiMo-V2.5-Pro: built for complex agent and coding tasks, ranking No.1 among open-source models on GDPVal-AA and ClawEval • MiMo-V2.5: a native omni-modal model with strong agent capabilities A model's value isn't measured by rankings alone - it's measured by the problems it solves. Let's build with MiMo now! 🤗 Weights: huggingface.co/collections/Xi… 📄 Blog: mimo.xiaomi.com/index#blog

144

463

3.4K

773.6K

John deVadoss retweeted

alex zhang@a1zhang·Apr 26

New mini experiment + blogpost + trajectories! tldr; we boost performance of RLM(GPT-5.2) to double the best performing number (38.7% --> 65.6%) on LongCoT-mini without any training! An example of the mismanaged geniuses hypothesis (MGH) we (@zli11010, @lateinteraction) proposed earlier this month. The LongCoT benchmark showed that frontier LMs and RLMs struggled to solve difficult compositional reasoning tasks. The paper generally attributes this to the RLMs' inability to perform task decomposition, but we argue this is more our fault in how we prompt them; this capability is fully available to GPT-5.2 with an RLM harness! Building on @raw_works's insightful blogpost and @sumeetrm / @CharlieLondon02 et al.'s incredibly useful benchmark, where they originally found RLMs to be incapable of solving the MATH and CS splits altogether. We did not train anything since the release of the initial benchmark. To be fully transparent, these results are not meant to be added to their leaderboard either; benchmarks measure isolated capabilities, and we focus on showing (through different, rather specific prompting) that the capabilities required to solve these tasks are available to the models without additional training! It also has implications about how we would go about training these systems. Full blog below, it's a nice read :)

490

41.2K

John deVadoss@john_devadoss·Apr 26

@RLanceMartin Thank you for writing this up.

Lance Martin@RLanceMartin·Apr 26

a few lessons we’ve learned on context management + long-term memory

Lance Martin@RLanceMartin

x.com/i/article/2047…

561

124K

John deVadoss retweeted

Lance Martin@RLanceMartin·Apr 24

x.com/i/article/2047…

455

191.9K

John deVadoss retweeted

DeepSeek@deepseek_ai·Apr 24

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/De… 🤗 Open Weights: huggingface.co/collections/de… 1/n

1.6K

7.7K

45.4K

9.7M

John deVadoss retweeted

Qwen@Alibaba_Qwen·Apr 22

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

544

1.7K

12.5K

3.7M

John deVadoss retweeted

Arthur Douillard@Ar_Douillard·Apr 23

The DiLoCo team at Google DeepMind and Google Research is proud to release Decoupled DiLoCo, the next frontier for resilient AI pre-training. Decoupled DiLoCo enables training with datacenters across the world, using heterogeneous hardware, and never halting the system despite hardware failures.

609

2.7M

John deVadoss retweeted

Hayden Prairie@hayden_prairie·Apr 15

We’ve been thinking a lot about scaling laws, wondering if there is a more effective way to scale FLOPs without increasing parameters. Turns out the answer is YES – by looping blocks of layers during training. We find that predictable scaling laws exist for layer looping, allowing us to use looping to achieve the quality of a Transformer twice the size. Our scaling laws suggest that for a fixed parameter budget, data and looping should be increased in tandem! 🧵👇

179

1.3K

292.1K

John deVadoss retweeted

Ian Osband@IanOsband·Mar 24

Scaling up distributed RL is the big challenge in AI. At its core the issue is that the actor != learner. The standard fix is importance weighting p_learn/p_act. It kind of works if you tune/clip... but not very well. Delightful Policy Gradient solves it. arxiv.org/abs/2603.20521

245

68K
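For readers unfamiliar with the "standard fix" mentioned in the post above, here is a minimal sketch of clipped importance weighting, assuming per-sample log-probabilities are available from both the learner and the actor. The function name and the `clip` threshold are hypothetical choices for illustration. This shows only the conventional baseline (the ratio p_learn / p_act, truncated from above), not the Delightful Policy Gradient method from the linked paper.

```python
import math

def clipped_importance_weights(logp_learner, logp_actor, clip=2.0):
    """Return min(p_learn / p_act, clip) for each sample.

    logp_learner: log-probabilities of the sampled actions under the learner policy.
    logp_actor:   log-probabilities of the same actions under the actor policy
                  that generated the data.
    clip:         illustrative upper bound on the ratio (an assumption, tuned in practice).
    """
    weights = []
    for lp_learn, lp_act in zip(logp_learner, logp_actor):
        # The ratio is computed in log space, then exponentiated.
        ratio = math.exp(lp_learn - lp_act)
        # Truncating large ratios limits variance at the cost of some bias.
        weights.append(min(ratio, clip))
    return weights

# Each weight would multiply that sample's advantage in a policy-gradient update.
weights = clipped_importance_weights(
    logp_learner=[math.log(0.1), math.log(0.9)],
    logp_actor=[math.log(0.4), math.log(0.3)],
)
```

When the actor and learner agree, every weight is 1 and the update reduces to the on-policy case. The further the two policies drift apart, the more samples hit the clip, which is one way to read the post's remark that this approach only "kind of works".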
