/MachineLearning

15.7K posts

@slashML

Cloud Joined December 2016

0 Following 123.8K Followers

/MachineLearning retweeted

Jianyang Gao@gaoj0017·1d

The TurboQuant paper (ICLR 2026) contains serious issues in how it describes RaBitQ, including incorrect technical claims and misleading theory/experiment comparisons. We flagged these issues to the authors before submission. They acknowledged them, but chose not to fix them. The paper was later accepted and widely promoted by Google, reaching tens of millions of views. We’re speaking up now because once a misleading narrative spreads, it becomes much harder to correct. We’ve written a public comment on openreview (openreview.net/forum?id=tO3AS…). We would greatly appreciate your attention and help in sharing it.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

699

4.8K

559.4K

/MachineLearning retweeted

Agentica@agenticasdk·1d

We scored 36.08% on ARC-AGI-3 in one day using the Agentica SDK.

132

1.4K

395.9K

/MachineLearning retweeted

Lisan al Gaib@scaling01·2d

my read on the ARC-AGI-3 situation is that models were too good with harness so they decided no harness at all

Lisan al Gaib@scaling01

this is pretty much worst case performance no harness at all and very simplistic prompt

225

17.9K

/MachineLearning retweeted

Artur Chakhvadze@norpadon·21 Mar

The main goal of Bayesian ML research is to show that all methods which have previously been shown to work well in practice are somehow approximately Bayesian

CLaE@leafs_s

Transformers are Bayesian Networks arxiv.org/abs/2603.17063

122

2.2K

139.2K

/MachineLearning retweeted

The Spectator Index@spectatorindex·5 Mar

Anthropic is resuming negotiations with the Pentagon for a deal on artificial intelligence, according to FT report.

150

275

2.8K

523.1K

/MachineLearning@slashML·18 Feb

post-agi career choices

Amy Tam@amytam01

x.com/i/article/2023…

10.9K

/MachineLearning@slashML·26 Jan

@_rockt @NetHack_LE There will certainly be ARC-AGI 4+

1.3K

Tim Rocktäschel@_rockt·26 Jan

After ARC-AGI 3 is saturated there will still be @NetHack_LE / balrogai.com left to conquer.

10.1K

/MachineLearning@slashML·21 Jan

@bubbleboi What financial commitments? Everything announced has thus far been optional ("up to X" amount)

/MachineLearning@slashML·20 Jan

@EnoReyes What interface are you wrapping GLM with?

430

Eno Reyes@EnoReyes·19 Jan

The most cost effective combination right now is setting Opus as your plan model and GLM 4.7 or GPT-5.2-Codex as your execution model. Gives you basically the same performance as opus, for a fraction of the tokens.

823

130.6K

/MachineLearning@slashML·19 Jan

Taken directly from: openai.com/index/a-busine…

1.4K

/MachineLearning@slashML·19 Jan

OpenAI plans to claim IP over the tokens sent to users?

1.4K

/MachineLearning retweeted

Jeffrey Emanuel@doodlestein·18 Jan

Would you believe that, far from sponsoring me, @AnthropicAI today started banning several of my (now 22) Max accounts? For the crime of using their models to produce the most useful open-source agent coding tooling on the planet, and then giving it all away for free. And teaching my workflows and methods and prompts to everyone selflessly. Anthropic people who follow me (I know there are dozens of you), please DM me and make this right. I’m not asking for a handout. I’m paying $212 per month with tax for each of those accounts. And I also let you collect info on my usage and use the official harness. The RL from my usage is pure gold. I’ve also been a massive promoter of your company and it’s really messed up to try to ban me like this. Puts a really bad taste in my mouth and makes me never want to promote you guys again. I need to be spending my energy creating, not being made to feel like a criminal for making MIT-licensed tools. You’re also just helping your antagonist, Sam, since I’m now the proud owner of 11 GPT Pro accounts (and counting). I refuse to lose my momentum because of this nonsense. I will not be slowed.

John Thilén@JohnThilen

@doodlestein @AnthropicAI: please sponsor this man.

1.2K

271.7K

/MachineLearning@slashML·12 Jan

@scottastevenson A rule that goes as far back as life itself: the bigger something is, the slower it moves.

446

Scott Stevenson@scottastevenson·12 Jan

Software is about to go through the same transition that stock trading did when algorithmic traders entered the market. AI will not be good for bootstrappers. They will be wrecked like retail traders were. There used to be many crevices of the market that large software companies couldn’t reach. Bootstrappers and small caps built nests there. But with AI, large software companies will start to look like multi-vertical hedge funds. With 1000 AI tentacles, they will suck the alpha out of every crevice. While one crevice may not have been appetizing enough to go after before, 1000 will be. Software will begin to have something like “market makers” who make money on everything. A small number of hedgefund-like software companies may come to own everything.

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️@DanielMiessler

Holy crap. This is the genre of software that's in the most danger: - Kind of mid in quality - Highly niche use-cases - It's been winner takes all for the space in the past - Often involved special formats or protocols And now Claude Code can just reverse engineer it. 🤯

1.3K

338.7K

/MachineLearning retweeted

Tibo@thsottiaux·10 Jan

Codex ❤️ OSS. Over the coming days we are prioritizing working with open source coding agents and tools to support them in the same way as OpenCode, so that codex users can benefit from their account and usage in those combined with using our models in codex directly. We are already talking with OpenHands, RooCode and Pi. Reach out if you build in the open and would benefit from this. Our own work is OSS at github.com/openai/codex

152

158

2.4K

197.8K

/MachineLearning retweeted

Quanta Magazine@QuantaMagazine·7 Jan

As AI models grow more powerful, they appear to be converging on how they internally represent reality. @benbenbrubaker reports: quantamagazine.org/distinct-ai-mo…

141

62.4K

/MachineLearning retweeted

Yifan Zhang@yifan_zhang_·1 Jan

Something REALLY HUGE. github.com/yifanzhang-pro…

236

450.5K

/MachineLearning retweeted

Christopher Manning@chrmanning·1 Jan

Great to see an AI lab doing and publishing science (as well as discussing engineering efficiencies)! Some of the other “frontier” labs should try it! Thx, @deepseek_ai!

alphaXiv@askalphaxiv

DeepSeek just dropped a banger paper to wrap up 2025 "mHC: Manifold-Constrained Hyper-Connections" Hyper-Connections turn the single residual “highway” in transformers into n parallel lanes, and each layer learns how to shuffle and share signal between lanes. But if each layer can arbitrarily amplify or shrink lanes, the product of those shuffles across depth makes signals/gradients blow up or fade out. So they force each shuffle to be mass-conserving: a doubly stochastic matrix (nonnegative, every row/column sums to 1). Each layer can only redistribute signal across lanes, not create or destroy it, so the deep skip-path stays stable while features still mix! with n=4 it adds ~6.7% training time, but cuts final loss by ~0.02, and keeps worst-case backward gain ~1.6 (vs ~3000 without the constraint), with consistent benchmark wins across the board
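
The mass-conservation argument in the post above can be checked numerically. What follows is a minimal sketch, not the paper's implementation: it assumes random positive 4x4 matrices as stand-ins for the per-layer lane-mixing maps, uses Sinkhorn-style row/column rescaling as one possible way to reach a doubly stochastic matrix, and takes the largest row sum of the depth-wise product as a rough proxy for gain. All function names are hypothetical.

```python
import random


def sinkhorn(matrix, iterations=200):
    """Alternately rescale rows and columns of a positive matrix
    until it is (approximately) doubly stochastic."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    for _ in range(iterations):
        for i in range(n):  # make every row sum to 1
            s = sum(m[i])
            m[i] = [v / s for v in m[i]]
        for j in range(n):  # make every column sum to 1
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_row_sum(m):
    """Largest absolute row sum (the induced infinity norm), used here as a gain proxy."""
    return max(sum(abs(v) for v in row) for row in m)


def depth_product(n, depth, constrained, seed=0):
    """Multiply `depth` random positive n x n mixing matrices together.
    With constrained=True each one is first projected with sinkhorn()."""
    rng = random.Random(seed)
    product = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(depth):
        # Assumption: entries drawn from [0.5, 2.0] purely for illustration.
        layer = [[rng.uniform(0.5, 2.0) for _ in range(n)] for _ in range(n)]
        if constrained:
            layer = sinkhorn(layer)
        product = matmul(layer, product)
    return product


if __name__ == "__main__":
    print("constrained:  ", max_row_sum(depth_product(4, 30, constrained=True)))
    print("unconstrained:", max_row_sum(depth_product(4, 30, constrained=False)))
```

Because a product of doubly stochastic matrices is itself doubly stochastic, the constrained product keeps every row sum at about 1 regardless of depth, while the unconstrained product grows by many orders of magnitude over 30 layers. The specific figures quoted in the post (~1.6 vs ~3000) come from the paper's own measurements and are not reproduced by this toy setup.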

1.1K

120K

/MachineLearning retweeted

hardmaru@hardmaru·27 Dec

Especially in such times, hackers and tinkerers tend to fare better at harnessing evolving technology with a high level of uncertainty and ambiguity, compared to traditional well-read professional types.

14.1K

/MachineLearning retweeted

Alec Helbling@alec_helbling·26 Dec

I'm really enjoying the diffusion model speed running literature that seems to have been spurred by REPA. The goal is to figure out how to train a reasonable quality ImageNet generator as fast as possible. It is like the nanoGPT of diffusion.

534

33.6K

/MachineLearning retweeted

Greg Brockman@gdb·23 Dec

exceeding the human baseline on ARC-AGI-2 with gpt-5.2:

Poetiq@poetiq_ai

We finally had a moment to run our system with GPT-5.2 X-High on ARC-AGI-2! Using the same Poetiq harness as before, we saw results as high as 75% at under $8 / problem using GPT-5.2 X-High on the full PUBLIC-EVAL dataset. This beats the previous SOTA by ~15 percentage points.

123

1.6K

235.5K
