Andreas Robinson

8 posts

@andreasorob

ML engineer • ex-CTO @ https://t.co/M1rE6nmVrU → acq. Nearmap ML research, forecasting, AI safety https://t.co/gcvhc99Efl

Joined January 2026
214 Following · 6 Followers
Andreas Robinson@andreasorob·
@rishicomplex @fchollet Yes, the AI is also allowed to reset the level (neither can reset the game): "Competition Mode... Only Level Resets are permitted..." github.com/arcprize/arc-a…
Rishi Mehta@rishicomplex·
@fchollet according to your paper: "Participants were limited to a single attempt per environment and could not revisit previously completed levels. However, they were allowed to reset the current level at any time. In some cases, participants reset levels after reaching a solution in order to improve efficiency, though this typically increased total interaction time." So humans could play around with the task a bunch, and then just reset the game when they figured it out to get the optimal trajectory? Is AI allowed to do this?
François Chollet@fchollet

ARC-AGI-3 is out now! We've designed the benchmark to evaluate agentic intelligence via interactive reasoning environments. Beating ARC-AGI-3 will be achieved when an AI system matches or exceeds human-level action efficiency on all environments, upon seeing them for the first time. We've done extensive human testing that shows 100% of these environments are solvable by humans, upon first contact, with no prior training and no instructions. Meanwhile, all frontier AI reasoning models do under 1% at this time.

Andreas Robinson@andreasorob·
@littmath @ChaseBrowe32432 Is this case arguably more impressive since it's a small set of open problems, so there is less risk of cherry-picking ones the AI can handle, compared to mining through the large number of erdos problems (Tao seemed concerned about this kind of cherry-picking in the erdos case)?
Daniel Litt@littmath·
@ChaseBrowe32432 I personally understand this as a comparable accomplishment, but the problem is pretty far from my area.
Chase Brower@ChaseBrowe32432·
For the people who had largely dismissed AI Erdos solves as uninteresting (too similar to previous approaches, problems too simple, just didn't receive much attention): What do you think here? Interesting/meaningful step forward, or no? Why?
Epoch AI@EpochAIResearch

AI has solved one of the problems in FrontierMath: Open Problems, our benchmark of real research problems that mathematicians have tried and failed to solve. See thread for more.

Andreas Robinson@andreasorob·
@slatestarcodex @ohlennart This argument is really odd: both parents and AI are explicitly pushed toward alignment via an imperfect process (by evolution and researchers, respectively), whereas humans-to-chimps aren't, yet your takeaway is that the AI analogy to parents is the obviously weaker one?
Scott Alexander@slatestarcodex·
I don't think this is a good article. Yes, humans are actively designing AIs to be nice to us. This is called "the alignment problem". The word "problem" is in there because it's hard and we don't know how to do it with certainty. The chimpanzee analogy is meant to illuminate what would happen if we don't solve that problem. There are very obvious reasons evolution designed parents to be nice to children (it's necessary to pass down selfish genes). These reasons don't exist with humans and AIs, so it's a worse comparison. This is asking us to abandon a normal case and replace it with an extremely unusual special case where we already know the reasons why it doesn't count.
Lennart Heim@ohlennart·
Good article by John Halstead arguing that second species arguments are weak and should be retired.
Andreas Robinson@andreasorob·
@GaryMarcus @sama He doesn't say that current approaches are "not enough" or that there is "a wall", or that we "need" a breakthrough (e.g. to reach AGI), merely that he predicts there will be further breakthroughs; just because something is likely doesn't mean it's necessary.
Gary Marcus@GaryMarcus·
Dear @sama, You owe me an apology. You have relentlessly, publicly and privately, attacked my integrity and wisdom since my 2022 paper "Deep Learning Is Hitting a Wall". But in your own way you have just come around to conceding *exactly* what I was arguing in that paper: that current architectures are not enough, and that we need something new, researchwise, beyond scaling (a "megabreakthrough" in your words below). That's all I was trying to say. And I was right. And you should be man enough to admit it. Gary cc @_KarenHao
Rohan Paul@rohanpaul_ai

Sam Altman just said in his new interview, that a new AI architecture is coming that will be a massive upgrade, just like Transformers were over Long Short-Term Memory. And also now the current class of frontier models are powerful enough to have the brainpower needed to help us research these ideas. His advice is to use the current AI to help you find that next giant step forward. --- From 'TreeHacks' YT Channel (link in comment)

Patrick Heizer@PatrickHeizer·
@Anirudhtd1D This is not medical advice, but if you have the sequence of the tumor and a lab that can make mRNA and LNPs, then yes, it is possible.
Patrick Heizer@PatrickHeizer·
Sorry to be the downer because this is an impressive story in some senses. But it is ~trivially easy to make a single mRNA vaccine. It's not hard. I cure mice of various cancers with various therapeutics all the time. I've made mice lose more weight in a month than tirzepatide does in a year. What is hard and expensive is proving it is BOTH safe AND effective **in a randomized and controlled study in humans** while ALSO manufacturing it at clinical scale and grade. I am happy for this man and his dog. It is impressive. But y'all are overhyping it.
Séb Krier@sebkrier

This is wild. theaustralian.com.au/business/techn…

Andreas Robinson@andreasorob·
Previous papers had shown that transformer models are computationally universal (Turing complete) with infinite precision, but interestingly this paper shows that they can achieve this with finite precision as well arxiv.org/pdf/2506.12027
Andreas Robinson@andreasorob·
I argue that METR is likely substantially underestimating the agentic AI time-horizon trend by focusing on fixed reliability (e.g. 50%) metrics rather than measuring relative to actual human reliabilities per this proposed alternative metric: lesswrong.com/posts/kNHxuusz…
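The distinction the tweet draws can be sketched numerically. This is an illustrative toy model, not METR's actual methodology or the linked post's exact proposal: assume an agent's success probability falls off logistically with (log) human task length, and compare the horizon read off at a fixed reliability (e.g. 80%) against the horizon read off at a lower, human-matched reliability (e.g. 70%). All parameter values here are hypothetical.

```python
import math

def success_prob(t_minutes, horizon_50, slope=1.0):
    """Toy logistic success curve: P(success) = 0.5 when t == horizon_50,
    decreasing for longer tasks."""
    return 1.0 / (1.0 + math.exp(slope * (math.log(t_minutes) - math.log(horizon_50))))

def horizon_at(reliability, horizon_50, slope=1.0):
    """Task length at which the agent succeeds with the given reliability
    (inverts the logistic above)."""
    return horizon_50 * math.exp(-math.log(reliability / (1.0 - reliability)) / slope)

h50 = 60.0  # hypothetical 50%-reliability horizon: 60 minutes

fixed_80 = horizon_at(0.80, h50)   # horizon at a fixed 80% reliability bar
human_rel = horizon_at(0.70, h50)  # horizon if humans themselves only hit ~70%

print(f"50% horizon: {h50:.0f} min")
print(f"fixed 80% horizon: {fixed_80:.1f} min")
print(f"human-relative (70%) horizon: {human_rel:.1f} min")
```

Under these assumed numbers, demanding a fixed reliability that exceeds what humans actually achieve on long tasks shrinks the measured horizon (15 min vs ~25.7 min here), which is the direction of the underestimate the tweet argues for.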