qrdl

5.5K posts

qrdl

qrdl

@QRDL

The first career that should be made obsolete is the AI programmer.

Joined December 2009
427 Following 408 Followers
Pinned Tweet
qrdl
qrdl@QRDL·
@VitalikButerin @shayne_coplan @mansourtarek_ @giancarloMKTS @aosipovich @Polymarket @metaculus @kalshi I'm a huge believer in Prediction Markets. They are free speech, but with accountability: messengers from the future, warning us of the consequences of our actions. But the markets have significant negative social utility when they are subjective and opaque. It cannot and must not be just about gambling. Prediction Markets have perverse incentives to spread disinfo and manipulate outcomes, so more work needs to be done to surface authentic signal and increase transparency to stop bad-faith actors.

Polymarket, Metaculus and Kalshi are the leaders in this industry. Polymarket is arguably the leader and the best in terms of transparency and free speech, but has serious problems with bad actors and poorly written subjective markets. Metaculus' lack of trading means it doesn't have bad actors, and its rulesets are very well done, but its poor transparency is devastating and destroys a great deal of potential value; it also has poor accuracy and no emergent signal / breaking news because of its minimal incentives. Kalshi has reasonable rulesets, but too many bad actors, which is a result of its weak transparency; through no fault of its own, DCM status suppresses its ability to embrace free speech.

I am hopeful, and perhaps even a little optimistic, that 2025 will be the year when Prediction Markets grow up: the year they realize their greatest strength is their openness, that they have a profound responsibility to make the world better, and that they are not here just to facilitate the transfer of wealth from the naïve and gullible to the sharp and cunning.
English
1
1
5
560
qrdl
qrdl@QRDL·
@gaoj0017 @grok is the random rotation in TurboQuant considered a standard technique for optimizing models?
English
1
0
0
32
Jianyang Gao
Jianyang Gao@gaoj0017·
We need to publicly clarify serious issues in Google's ICLR 2026 paper TurboQuant. TurboQuant misrepresents RaBitQ in three ways:
1. It avoids acknowledging a key methodological similarity (the JL transform)
2. It calls our theory "suboptimal" with no evidence
3. It reports results under unfair experimental settings
We expressed our concerns to the authors before their submission, but they chose not to fix them in the submitted paper. The paper was accepted at ICLR 2026 and heavily promoted by Google (tens of millions of views). At that scale, uncorrected claims quickly become "consensus."
Facts:
1. RaBitQ already proves asymptotic optimality (the FOCS'17 bound)
2. TurboQuant uses the same random rotation step but does not state the connection
3. Their experiments ran RaBitQ on a single-core CPU vs. TurboQuant on an A100 GPU
None of these is properly disclosed. We've filed a formal complaint and posted on OpenReview (openreview.net/forum?id=tO3AS…). We'll release a detailed technical report on arXiv. Our goal is simple: keep the academic record accurate. Would appreciate people taking a look and sharing.
English
19
96
1.2K
97.5K
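For context on the step both sides say they share, here is a minimal sketch (not taken from either paper) of random-rotation preprocessing before low-bit quantization, in the spirit of the JL transform referenced above. The dimension, bit width, and QR-based construction of the rotation are illustrative assumptions, not details of TurboQuant or RaBitQ.

```python
import numpy as np

# Minimal sketch: rotate a vector by a random orthogonal matrix, scalar-quantize
# to a few bits, then undo the rotation. The rotation spreads energy evenly
# across coordinates, which is what makes very low bit widths tolerable.
rng = np.random.default_rng(0)
d = 128                                              # illustrative dimension
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))     # random orthogonal rotation

def rotate_and_quantize(x, bits=2):
    z = Q @ x                                        # rotate
    scale = np.max(np.abs(z))                        # per-vector scale
    levels = 2 ** bits - 1
    codes = np.round((z / scale + 1.0) / 2.0 * levels)   # integer codes in 0..levels
    z_hat = (codes / levels * 2.0 - 1.0) * scale         # dequantize
    return Q.T @ z_hat                               # rotate back

x = rng.standard_normal(d)
x[:4] *= 10.0                                        # a few dominant coordinates
err = np.linalg.norm(x - rotate_and_quantize(x)) / np.linalg.norm(x)
print(f"relative reconstruction error: {err:.3f}")
```

Running the same sketch with the rotation removed (replace Q with the identity) typically gives a noticeably larger error on vectors with a few dominant coordinates, which is the intuition behind treating the rotation as a load-bearing shared step rather than a minor detail.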
Jianyang Gao
Jianyang Gao@gaoj0017·
The TurboQuant paper (ICLR 2026) contains serious issues in how it describes RaBitQ, including incorrect technical claims and misleading theory/experiment comparisons. We flagged these issues to the authors before submission. They acknowledged them, but chose not to fix them. The paper was later accepted and widely promoted by Google, reaching tens of millions of views. We’re speaking up now because once a misleading narrative spreads, it becomes much harder to correct. We’ve written a public comment on OpenReview (openreview.net/forum?id=tO3AS…). We would greatly appreciate your attention and help in sharing it.
Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English
99
977
6.5K
1M
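As a rough back-of-the-envelope reading of the headline number, assuming the baseline KV cache is stored in 16-bit floats (an assumption; the tweet does not say): a 6x memory reduction works out to about 16 / 6 ≈ 2.7 bits per cached value on average, i.e. sub-3-bit quantization of keys and values, which is why the rotation and quantization details argued over in this thread matter at all.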
qrdl
qrdl@QRDL·
@gaoj0017 the random rotation is critical, and unfortunately they don't alleviate that; however, I don't think it was unique to RaBitQ
English
0
0
0
46
qrdl
qrdl@QRDL·
@fchollet @fchollet We have a tiny world; it's called math. Open math problems are a much superior benchmark to ARC-AGI. Impossible to benchmax (cough), and solving math gets us on the way to solving science.
English
0
0
1
120
François Chollet
François Chollet@fchollet·
Many people expect that current AI is ready to cure cancer and do breakthrough new science. ARC-AGI-3 envs are like a microcosm of the scientific method: you must observe a tiny world, form a theory of how it works, test it, iterate until correct. Over the course of a few minutes. If AI can't do it in an ultra-simple, ultra-small scale setting that is explicitly designed to be as accessible as possible, I expect there are a few steps missing until AI can crack the nature of reality.
François Chollet@fchollet

"2+ people can do it out of an unfiltered pool of 10 people that might well be a below-average sample" is not the sign of a insurmountable challenge. It's not certainly where I would set the bar for "super intelligence". ASI is when AI is better than *every single human* -- for instance we have ASI for chess and Go today.

English
45
57
564
48.3K
qrdl
qrdl@QRDL·
@FakePsyho What I find stupid in all of this is that we already have extremely useful benchmarks: open math problems. Impossible (mostly) to benchmax. And solving them is the beginning of solving science, leading to breakthroughs. Like, wtf, already. Enough with the nonsense.
English
0
0
0
397
Psyho
Psyho@FakePsyho·
At this point, I've played through ALL of the games there. I might do a longer write-up describing all of the flaws. Sadly, this would probably get lost in a flood of "AI is 100x worse than humans!!!111" posts from clueless influencers. While it's clear this whole thing is designed in a way that the models will artificially get as low a score as possible, I also suspect that the game designers / tester team were just not experienced enough to design something fair. I'd expect that the private test games have an order of magnitude more terrible ideas (considering they are expected to be harder).
English
9
11
253
10.6K
Psyho
Psyho@FakePsyho·
AI (or any human) will never get 100% in ARC-AGI-3. Let me introduce you to the worst game mechanic you can find in a puzzle game: fog of war. At the start, if you go right instead of down, you're wasting many moves. Your score on this level literally depends on a coin flip!
Psyho tweet media
English
67
24
532
72K
qrdl
qrdl@QRDL·
@lossfunk Can you post an illuminating example? What is, roughly, the simplest problem it fails on that a human could reasonably do?
English
0
0
1
346
Lossfunk
Lossfunk@lossfunk·
🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵
English
153
287
2.2K
1.2M
Mariusz Kurman
Mariusz Kurman@mkurman88·
@QRDL I haven't tried it, but I use it in VSC, and it's pretty good there
English
1
0
0
258
Mariusz Kurman
Mariusz Kurman@mkurman88·
That's insane, I get more usage quota (or at least can do more) with the $10 GitHub Copilot Pro plan than with the $50 OpenAI Business plan
English
7
0
30
4.2K
qrdl
qrdl@QRDL·
@DavidColetto Trump is not a villain, he's a symptom.
English
0
0
3
42
David Coletto 🇨🇦
David Coletto 🇨🇦@DavidColetto·
What stood out most was what he didn’t do. He rejected Trump’s 51st state rhetoric and trade claims. But he did not frame Trump as the central villain. Instead, he argued Canada’s anxiety reflects something deeper: overreliance and domestic weakness. Wrote about the speech: open.substack.com/pub/davidcolet…
David Coletto 🇨🇦 tweet media
English
53
20
105
7.5K
qrdl
qrdl@QRDL·
@DelongNuna @DavidColetto We gotta stop blaming and start proposing solutions. I respect Pierre for this reset. Hope he keeps it up.
English
0
0
1
11
Prairie Girl 🌾
Prairie Girl 🌾@DelongNuna·
@DavidColetto Pierre NEVER blamed Canadians. He blamed the people responsible for our mess: Trudeau, and now Mark Carney, who is using the same China election propaganda to divide the Canada & USA relationship further, amplifying it for votes.
English
3
0
10
111
David Coletto 🇨🇦
David Coletto 🇨🇦@DavidColetto·
Pierre Poilievre’s latest speech felt like a tonal reset. For two years, the message has been that Canada is broken. Elites failed. Gatekeepers blocked growth. It was sharp and oppositional. This time, it sounded different. Calmer. More reflective. More like statecraft. Wrote about the speech: open.substack.com/pub/davidcolet…
English
139
44
387
47K
Matthew von Hippel
Matthew von Hippel@4gravitons·
@QRDL @martinmbauer @ALupsasca True, but my guess in this case is this isn't a huge player. I don't think there are a lot of OpenAI/scientist collaborations of this type. This one took a personal connection, for example. And the claim seems to be that the twelve-hour reasoning run was done in one shot.
English
2
0
0
31
Martin Bauer
Martin Bauer@martinmbauer·
Yes, this is a significant result and a solid research paper. And it would’ve been much harder to achieve without GPT. While I understand the instinct, I think it is more interesting to evaluate what type of contribution the AI has made as opposed to focussing on how relevant the result is.

ChatGPT generalised a previously derived result for an amplitude that was assumed to vanish for all physical kinematics people care about. These amplitudes are very complicated, lengthy expressions with certain structures and symmetries that are sometimes hidden and difficult to see. This kind of problem is exactly where AI shines! AI is better at detecting breast cancer than a clinician because it has seen millions of scans and detects structures where humans, who are limited by their lifetime exposure, can't. AI that systematically surveys many large amplitudes has a similar advantage.

Similar to the Erdős problems, it was mostly an attention bottleneck that left this problem unsolved. This calculation was considered another elaborate way of arriving at zero, so few people were interested in the result and even fewer were working on it. Most if not all physicists would therefore consider the insight of the human physicists, that there is in fact a kinematic region where these amplitudes are not zero, the most meaningful piece of progress here.

Neither of these points is meant to diminish the result. It is seriously impressive and deserves the publicity. It shows where the strengths of modern models can significantly accelerate science, and I'm convinced there will be even more relevant discoveries in the future.
Noam Brown@polynoamial

There have been fair questions on whether LLM contributions to STEM are overhyped, but I've spoken with physicists about this result and they've told me it is a truly significant research contribution, roughly at the level of a solid journal paper, and GPT-5.2 played a key role.

English
13
41
444
38.1K
qrdl
qrdl@QRDL·
@martinmbauer @4gravitons @ALupsasca But is it survivor / selection bias? We need a detailed analysis of how much time has been wasted analyzing when AI hallucinates and gets it wrong. Stochastically parroting results has real utility, but not if the marginal benefit is eliminated by hallucinations.
English
1
0
0
38
Martin Bauer
Martin Bauer@martinmbauer·
@4gravitons I guess it depends how you count, but it should've seen many terms in its training data. It probably didn't see a labelled set (as for mammograms), but on the other hand there are many consistency conditions and symmetries that help. @ALupsasca probably would know better
English
2
0
0
185
qrdl
qrdl@QRDL·
@nasqret Truth is we are getting very close to that then, at least for me. I'm not sure I can reliably tell which model is the smartest anymore.
English
0
0
1
23
Bartosz Naskręcki
Bartosz Naskręcki@nasqret·
Interesting paradox. With the new Codex-Spark I can generate new content for analysis (documentation, review text, etc.) so fast that I cannot parse and read it efficiently. In the end, we humans are the bottleneck in the information pipelines. Thinking slow is essential, but we might be too slow for many tasks. What then?
Bartosz Naskręcki tweet media
English
9
0
46
3.3K
starosta
starosta@lukaszstarosta·
@AnthropicAI This is terrifying, actually. It used to take people months (and their mental sanity) to build something comparably complex, just one year ago. Now we get this with little to no human effort.
English
3
0
42
10.3K
Anthropic
Anthropic@AnthropicAI·
New Engineering blog: We tasked Opus 4.6 using agent teams to build a C compiler. Then we (mostly) walked away. Two weeks later, it worked on the Linux kernel. Here's what it taught us about the future of autonomous software development. Read more: anthropic.com/engineering/bu…
English
872
2.5K
21.4K
8.5M
qrdl
qrdl@QRDL·
@0xdoug I am very afraid of a world where people are largely redundant due to automation, but resources are insanely expensive. Very very afraid.
English
1
0
0
11
qrdl
qrdl@QRDL·
@0xdoug I think the right answer is that AI will *eventually* lower prices. But when? And Jevons paradox: will that just shift costs to raw resources, which are becoming increasingly expensive to extract from the earth? That is my biggest worry in all of this, tbh.
English
1
0
3
175
Doug Colkitt
Doug Colkitt@0xdoug·
So to recap, AI is expected to create huge amounts of wealth, yet…
Workers lose because they’ll be replaced by robots…
Software corps lose because they’ll be replaced by Claude code…
Tech giants lose because all their free cash flow is now capex…
Frontier labs lose because they get margin compressed by open source models that are at most three months behind…
Nvidia loses because it gets margin compressed by TPUs and Huawei…
So where exactly is the value supposed to accrue?
English
465
133
2.8K
340.5K
qrdl
qrdl@QRDL·
@0xdoug Just look across the categories of every raw material, like copper, silver, and other industrial metals, and you'll see the same story: declining ore grades, and having to dig deeper and harder to get at them. We need massive breakthroughs in materials science, ASAP.
English
0
0
1
75
qrdl
qrdl@QRDL·
@0xdoug People often forget there are 8 billion other people on the planet. They have yet to become consumers like the top 13% or so. When that happens (easier when you can just automate everything), it could put massive strains on raw inputs.
English
0
0
1
34
qrdl
qrdl@QRDL·
@0xdoug What if we just find out the real problem was never labor and rather it was mining?
English
0
0
0
19