Simon Rogers

4.2K posts

@sdrogers

AI data scientist at NHS National Services Scotland. Hon Lecturer in Comp Sci, University of Glasgow. Author of https://t.co/gUVMPYIpCj Views my own, etc.

iPhone: 57.586338,-4.435981 Joined January 2009
946 Following 601 Followers
Find me on bsky @colin-fraser.net@colin_fraser·
If the missing piece is that you make occasional arithmetic errors in the small multiplication or addition steps then a calculator solves this. But if the missing piece is that you lose track of long-running reasoning processes then you have bigger problems.
Find me on bsky @colin-fraser.net@colin_fraser·
“What’s the point of this? Can’t you just give it a calculator?” The point is that if you have the small times tables memorized, the ability to reliably add, and you can follow a sequence of steps, then you can do this with 100% accuracy. If you can’t, then what’s missing?
Yuntian Deng@yuntiandeng

For those curious about how o3-mini performs on multi-digit multiplication, here's the result. It does much better than o1 but still struggles past 13×13. (Same evaluation setup as before, but with 40 test examples per cell.)
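
The three ingredients named above (a memorized single-digit times table, reliable addition, and following a fixed sequence of steps) can be sketched as schoolbook long multiplication. This is only an illustrative sketch, not the evaluation setup from the quoted post; the names `TIMES_TABLE`, `add_digits`, and `long_multiply` are hypothetical.

```python
# Hypothetical sketch: schoolbook multiplication built from a memorized
# single-digit times table plus digit-wise addition with carries.
TIMES_TABLE = {(a, b): a * b for a in range(10) for b in range(10)}


def add_digits(x, y):
    """Add two little-endian digit lists, carrying as on paper."""
    result, carry = [], 0
    for i in range(max(len(x), len(y))):
        dx = x[i] if i < len(x) else 0
        dy = y[i] if i < len(y) else 0
        total = dx + dy + carry
        result.append(total % 10)
        carry = total // 10
    if carry:
        result.append(carry)
    return result


def long_multiply(a: str, b: str) -> str:
    """Multiply two non-negative decimal strings digit by digit."""
    x = [int(d) for d in reversed(a)]
    y = [int(d) for d in reversed(b)]
    total = [0]
    for shift, dy in enumerate(y):
        # One partial-product row, shifted left by `shift` places.
        row, carry = [0] * shift, 0
        for dx in x:
            prod = TIMES_TABLE[(dx, dy)] + carry
            row.append(prod % 10)
            carry = prod // 10
        if carry:
            row.append(carry)
        total = add_digits(total, row)
    # Drop leading zeros (stored at the end of the little-endian list).
    while len(total) > 1 and total[-1] == 0:
        total.pop()
    return "".join(str(d) for d in reversed(total))


print(long_multiply("1234567890123", "9876543210987"))
```

Every step is a table lookup or a single-column addition, so the procedure is exact at any length as long as no step is skipped or misordered.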

Simon Rogers@sdrogers·
@colin_fraser Each step is to some degree stochastic though, right? So, even if the correct token at some step is really really likely, then as you increase the number of steps, the probability of not picking the correct one becomes very very high.
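
A minimal sketch of that compounding effect, under two simplifying assumptions that are not in the original post: every step has the same probability of being correct, and steps are independent. The function names are hypothetical.

```python
def prob_all_correct(p_step: float, n_steps: int) -> float:
    """Probability that every one of n independent steps is correct."""
    return p_step ** n_steps


def prob_any_error(p_step: float, n_steps: int) -> float:
    """Probability that at least one of n independent steps goes wrong."""
    return 1.0 - prob_all_correct(p_step, n_steps)


# Assumed per-step accuracy of 99.9%, chosen only for illustration.
for n in (10, 100, 1_000, 10_000):
    print(n, round(prob_any_error(0.999, n), 4))
```

With a per-step accuracy of 99.9%, the chance of at least one mistake is roughly 63% by 1,000 steps, which is the sense in which a "really really likely" correct token still fails over long sequences.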
Simon Rogers@sdrogers·
@GaryMarcus @HZoete I guess they'll also have a universally agreed definition of consciousness to determine it's been reached too...
Gary Marcus@GaryMarcus·
@HZoete That’s a silly slide. No science behind those made up numbers.
Henry de Zoete@HZoete·
As I head home from five days in Paris at the AI Action Summit, some thoughts on what we learnt about the international picture. * What worked * What didn’t * What needs to change. A 🧵
Simon Rogers@sdrogers·
@colin_fraser Absolutely. And pretty easy, once that tool exists, to train a model where beating that test is part of the objective.
Simon Rogers@sdrogers·
@thefulltoss @asianick85 It was doing a lot today. I don't think any team would have survived / won. So I would agree that the toss was important. But if they'd blocked they would likely have lost by more.
James Morgan@downatfineleg·
Lose the toss, lose the match. Or play crap, lose the match?
Robert Insall@robinsall·
Re: the test match "Stop! He's already dead!"
Simon Rogers@sdrogers·
@robinsall I'm not sure "all", but plenty. Especially when it's really hot.
Robert Insall@robinsall·
Jeepers. Sport is all in the mind, isn't it?
Simon Rogers@sdrogers·
@hganjoo_153 I guess if DLS wasn't good at predicting outcome, it also wouldn't be good at defining targets.
Himanish Ganjoo@himganj153·
The remarkable simplicity of cricket -- a simple Duckworth-Lewis based prediction model works almost as well as Cricinfo's Forecaster on average. The game state (overs and wickets left) matters the most, and dwarfs all other minutiae.
Simon Rogers@sdrogers·
@GeoffreyPetty @LRB That didn't scan. Basically, its goodness seemed to be that it could produce bigger lists of targets. And humans could feel like they weren't responsible for the choice.
Simon Rogers@sdrogers·
@GeoffreyPetty @LRB That, yes. But more the fact that there's no way of evaluating its accuracy, the justification that it could identify more targets with humans able to delegate any responsibility.
Simon Rogers@sdrogers·
Three articles in the @LRB this month that taken together are quite chilling: the wholesale swallowing of the religion of AI (article on Blair) + the use of AI in generating kill lists + the hollowing of the regulatory landscape (in context of Grenfell)...
Simon Rogers@sdrogers·
@LRB Mind you, could you make the personal ads more entertaining again please?
London Review of Books
We are a good newspaper, not a good news newspaper.
Simon Rogers@sdrogers

Three articles in the @LRB this month that taken together are quite chilling: the wholesale swallowing of the religion of AI (article on Blair) + the use of AI in generating kill lists + the hollowing of the regulatory landscape (in context of Grenfell)...

Simon Rogers@sdrogers·
@rcolvile Can live very comfortably on that, and I'm confident that enough very skilled applicants would see it that way.
Robert Colvile@rcolvile·
I'm sorry, £200k to run the entire Civil Service?