John deVadoss

@john_devadoss

co-Founder NeuralFabric acq. by @Cisco | co-Founder @IntWorkAll | Board @GBBC_io | General Manager @Microsoft | PhD RL research @UMassAmherst

Joined June 2019

2K Following · 9.6K Followers

Pinned Tweet

John deVadoss@john_devadoss·25 Aug

A Public AI Wealth fund, not 'basic income'. It is time for Congress to act. thehill.com/opinion/techno…

60.3K

John deVadoss retweeted

Ian Osband@IanOsband·24 Mar

Scaling up distributed RL is the big challenge in AI. At its core the issue is that the actor != learner. The standard fix is importance weighting p_learn/p_act. It kind of works if you tune/clip... but not very well. Delightful Policy Gradient solves it. arxiv.org/abs/2603.20521
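A minimal sketch of the "standard fix" the tweet mentions, assuming per-sample log-probabilities from the learner and actor policies are available. It computes the ratio p_learn/p_act and truncates it at a threshold. The function names and the `clip` value are hypothetical, chosen for illustration; this is not the Delightful Policy Gradient method from the linked paper.

```python
import math


def clipped_importance_weights(logp_learner, logp_actor, clip=2.0):
    """Per-sample ratio p_learn / p_act, truncated at `clip`.

    `clip` is an illustrative threshold (an assumption), not a value
    taken from the tweet or the linked paper.
    """
    weights = []
    for lp_learn, lp_act in zip(logp_learner, logp_actor):
        # Work in log space, then exponentiate to get the ratio.
        ratio = math.exp(lp_learn - lp_act)
        weights.append(min(ratio, clip))
    return weights


def weighted_advantages(logp_learner, logp_actor, advantages, clip=2.0):
    """Scale each advantage by its clipped importance weight."""
    weights = clipped_importance_weights(logp_learner, logp_actor, clip)
    return [w * a for w, a in zip(weights, advantages)]


if __name__ == "__main__":
    # Sample 1: both policies agree. Sample 2: the learner now rates the
    # action far more likely than the actor that generated it did.
    logp_learner = [math.log(0.5), math.log(0.9)]
    logp_actor = [math.log(0.5), math.log(0.1)]
    print(clipped_importance_weights(logp_learner, logp_actor))
```

Truncation bounds the variance contributed by samples the actor rarely produced, at the cost of bias, which is one reason the threshold needs tuning.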

244

67.3K

John deVadoss retweeted

Kimi.ai@Kimi_Moonshot·16 Mar

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…

334

2.1K

13.6K

4.9M

John deVadoss retweeted

Ai2@allen_ai·5 Mar

Introducing Olmo Hybrid, a 7B fully open model combining transformer and linear RNN layers. It decisively outperforms Olmo 3 7B across evals, w/ new theory & scaling experiments explaining why. 🧵

129

785

168.4K

John deVadoss retweeted

Felix Rieseberg@felixrieseberg·25 Feb

A software genie in a lamp is hard to explain. The better the models get, the more you can just ask for what you want - and if no specific tool exists, they’ll often just build it. That’s why Cowork gives Claude a VM: it can write software on the fly to do whatever you need. But as an industry, I think we haven’t figured out how to teach users outside the bubble that apps like Claude Code or Cowork can handle a huge range of work without a dedicated “do X” button. Especially since precisely stating what you want has always been hard, AI or not.

Chris@chatgpt21

Claude Cowork was making a spreadsheet for me in Google Sheets; it realized taking screenshots and trying to edit on the screen was too slow. Went into some JavaScript - don’t even remember what it was > needed my Google permissions > coded the whole thing on the backend > invisible layers I can’t even see. Flawless, beautiful spreadsheet. Didn’t need too much hand-holding and was as efficient as I would be.

15.7K
