Andreas Robinson

8 posts

@andreasorob

ML engineer • ex-CTO @ https://t.co/M1rE6nmVrU → acq. Nearmap • ML research, forecasting, AI safety • https://t.co/gcvhc99Efl

Joined January 2026

214 Following 6 Followers

Andreas Robinson@andreasorob·4d

@rishicomplex @fchollet Yes, the AI is also allowed to reset the level (neither can reset the game): "Competition Mode... Only Level Resets are permitted..." github.com/arcprize/arc-a…

Rishi Mehta@rishicomplex·4d

@fchollet according to your paper: "Participants were limited to a single attempt per environment and could not revisit previously completed levels. However, they were allowed to reset the current level at any time. In some cases, participants reset levels after reaching a solution in order to improve efficiency, though this typically increased total interaction time." So humans could play around with the task a bunch, and then just reset the game when they figured it out to get the optimal trajectory? Is AI allowed to do this?

François Chollet@fchollet

ARC-AGI-3 is out now! We've designed the benchmark to evaluate agentic intelligence via interactive reasoning environments. Beating ARC-AGI-3 will be achieved when an AI system matches or exceeds human-level action efficiency on all environments, upon seeing them for the first time. We've done extensive human testing that shows 100% of these environments are solvable by humans, upon first contact, with no prior training and no instructions. Meanwhile, all frontier AI reasoning models do under 1% at this time.

Andreas Robinson@andreasorob·24 Mar

@littmath @ChaseBrowe32432 Is this case arguably more impressive since it's a small set of open problems, so there is less risk of cherry-picking ones the AI can handle, compared to mining through the large number of Erdos problems (Tao seemed concerned about this kind of cherry-picking in the Erdos case)?

Daniel Litt@littmath·23 Mar

@ChaseBrowe32432 I personally understand this as a comparable accomplishment, but the problem is pretty far from my area.

Chase Brower@ChaseBrowe32432·23 Mar

For the people who had largely dismissed AI Erdos solves as uninteresting (too similar to previous approaches, problems too simple, just didn't receive much attention): What do you think here? Interesting/meaningful step forward, or no? Why?

Epoch AI@EpochAIResearch

AI has solved one of the problems in FrontierMath: Open Problems, our benchmark of real research problems that mathematicians have tried and failed to solve. See thread for more.

Andreas Robinson@andreasorob·16 Mar

@slatestarcodex @ohlennart This argument is really odd: both parents and AI are explicitly pushed towards alignment via an imperfect process, by evolution and researchers respectively, whereas humans-to-chimps aren't, yet your takeaway is that the AI analogy to parents is the obviously weaker one?

Scott Alexander@slatestarcodex·16 Mar

I don't think this is a good article. Yes, humans are actively designing AIs to be nice to us. This is called "the alignment problem". The word "problem" is in there because it's hard and we don't know how to do it with certainty. The chimpanzee analogy is meant to illuminate what would happen if we don't solve that problem. There are very obvious reasons evolution designed parents to be nice to children (it's necessary to pass down selfish genes). These reasons don't exist with humans and AIs, so it's a worse comparison. This is asking us to abandon a normal case and replace it with an extremely unusual special case where we already know the reasons why it doesn't count.

Lennart Heim@ohlennart·16 Mar

Good article by John Halstead arguing that second species arguments are weak and should be retired.

Andreas Robinson@andreasorob·16 Mar

@GaryMarcus @sama He doesn't say that current approaches are "not enough" or that there is "a wall", or that we "need" a breakthrough (e.g. to reach AGI), merely that he predicts there will be further breakthroughs; just because something is likely doesn't mean it's necessary.

Gary Marcus@GaryMarcus·15 Mar

Dear @sama, You owe me an apology. You have relentlessly, publicly and privately, attacked my integrity and wisdom since my 2022 paper “Deep Learning Is Hitting a Wall”. But in your own way you have just come around to conceding *exactly* what I was arguing in that paper: that current architectures are not enough, and that we need something new, research-wise, beyond scaling (a “megabreakthrough” in your words below). That’s all I was trying to say. And I was right. And you should be man enough to admit it. Gary cc @_KarenHao

Rohan Paul@rohanpaul_ai

Sam Altman just said in his new interview, that a new AI architecture is coming that will be a massive upgrade, just like Transformers were over Long Short-Term Memory. And also now the current class of frontier models are powerful enough to have the brainpower needed to help us research these ideas. His advice is to use the current AI to help you find that next giant step forward. --- From 'TreeHacks' YT Channel (link in comment)

Andreas Robinson@andreasorob·16 Mar

@PatrickHeizer @Anirudhtd1D This nonprofit will apparently make a custom targeted cancer vaccine for $83k, though it is long-peptide based rather than mRNA. statnews.com/2023/03/06/fou…

Patrick Heizer@PatrickHeizer·14 Mar

@Anirudhtd1D This is not medical advice, but if you have the sequence of the tumor and a lab that can make mRNA and LNPs, then yes, it is possible.

Patrick Heizer@PatrickHeizer·14 Mar

Sorry to be the downer because this is an impressive story in some senses. But it is ~trivially easy to make a single mRNA vaccine. It's not hard. I cure mice of various cancers with various therapeutics all the time. I've made mice lose more weight in a month than tirzepatide does in a year. What is hard and expensive is proving it's BOTH safe AND effective **in a randomized and controlled study in humans** while ALSO manufacturing it at clinical scale and grade. I am happy for this man and his dog. It is impressive. But y'all are overhyping it.

Séb Krier@sebkrier

This is wild. theaustralian.com.au/business/techn…

Andreas Robinson@andreasorob·27 Jan

Previous papers had shown that transformer models are computationally universal (Turing complete) with infinite precision, but interestingly this paper shows that they can achieve this with finite precision as well: arxiv.org/pdf/2506.12027

Andreas Robinson@andreasorob·26 Jan

I argue that METR is likely substantially underestimating the agentic AI time-horizon trend by focusing on fixed-reliability metrics (e.g. 50%) rather than measuring relative to actual human reliabilities, per this proposed alternative metric: lesswrong.com/posts/kNHxuusz…
