Cipher_Challenge (cipher-challenge on Bluesky)

2.6K posts


@Cipher_Master

Created by the School of Mathematical Sciences @UniSouthampton. Find at https://t.co/OSTFn3Sqpu

Southampton, England Joined October 2008
1.3K Following 2.1K Followers
Geoff Barton
Geoff Barton@RealGeoffBarton·
Poignant. As a few people know, my childhood ambition (before radio DJ) was ventriloquist ....
English
1
0
9
2.1K
Cipher_Challenge (cipher-challenge on Bluesky)
We've a few tickets spare for the National Cipher Challenge event at Bletchley on 4th March, with speakers including Rob Eastaway. If you or your school took part in the competition in the Autumn, email us at cipher@soton.ac.uk with your details so we can see if we can fit you in.
English
0
0
3
165
Charles Arthur
Charles Arthur@charlesarthur·
@graceyldn It is MADDENING. Mine does "one-hour" washes that last for any time between 45 mins and 3 hours. Always with 15 minutes left for any length of time.
English
1
0
1
209
James Titcomb
James Titcomb@jamestitcomb·
The Government’s AI skills website is going well
James Titcomb tweet media
English
3
5
16
3.4K
FoggyCrazyTimes
FoggyCrazyTimes@FoggyCrazyTimes·
@rastokke Ohhh I loved math! Puzzles! Solving! But my oldest thinks abstractly and mathematically but hated formal math. I dropped him down a grade. Still hated it. Pulled out logic and puzzles, love! And then we went story mode, BINGO! Now he loves math
English
1
0
0
234
Anna Stokke
Anna Stokke@rastokke·
1/3💡The problem with "productive" struggle. For kids good at math struggle can be fun because they're successful at the end. For many others, struggle is associated with failure, because they're not successful at the end, and it reinforces for them that they can't do math.
English
18
21
138
26.2K
Cipher_Challenge (cipher-challenge on Bluesky)
@C_Hendrick "The question is what education looks like when the technology underpinning it improves exponentially" - If one horse can pull a heavy cart at 1 mile an hour and two horses can pull it at 2 miles an hour then think how fast 20 horses could pull it!
English
0
0
1
68
James Williams
James Williams@edujdw·
@ReemKhan_07 Thank goodness there is no pomp and circumstance surrounding this post.
Brighton, England 🇬🇧 English
1
0
0
22
Maham Khan
Maham Khan@ReemKhan_07·
ONE word
Maham Khan tweet media
English
819
112
219
15.6K
Susan Elkin
Susan Elkin@SusanElkinJourn·
Really liked Being Mr Wickham at Chichester this pm but omg the journey. Forgot to check trains (won't ever forget again). Engineering works between E Croydon and LGW. Coming back it was train-bus-train- tram - bus. 3.5 hours. I'd have driven if I'd known.
English
1
0
0
201
Sheena
Sheena@Sheena2907·
Can anyone see what's gone wrong here? I can't see why the student keeps getting det 0. Any help would be appreciated!
Sheena tweet media (2 images)
English
3
0
1
430
Cipher_Challenge (cipher-challenge on Bluesky)
Merry Christmas to everyone in the National Cipher Challenge community. Hope you all have a restful day if you are celebrating, and an enjoyable day whether or not you are.
English
0
0
6
193
Cipher_Challenge (cipher-challenge on Bluesky)
@littmath I can't help wondering how we would deal with a world in which the machines can produce and verify enormous quantities of mathematics. Who will decide what is worth reading and how will they decide that? Scarcity of resource has shaped mathematics. What will abundance do to it?
English
0
0
1
228
Daniel Litt
Daniel Litt@littmath·
Unfortunately I don't think formal verification is (yet) a solution to the oncoming wave of AI math slop. Basic objects in many areas have not yet had their definitions formalized, let alone theorems whose English-language proofs run to 100s of pages. Hopefully this changes soon!
English
25
18
552
32.3K
The Times and The Sunday Times
The top schools are preparing pupils to be curious, problem-solving and risk taking, ready for anything that comes their way — as well as academic success
English
42
30
219
1.4M
Cipher_Challenge (cipher-challenge on Bluesky)
@alz_zyd_ Is there any point in reading a novel now that LLMs can be trained on them? Or play chess now that DeepMind can beat the best? How about paint a picture now that we have diffusion models? .... Who will read the papers the machines write if you don't bother learning the basics?
English
0
0
3
37
alz
alz@alz_zyd_·
is there much point in majoring in math, now that the computers are better than us at math
English
143
10
516
197.1K
Cipher_Challenge (cipher-challenge on Bluesky)
@alex_prompter I will have to read this paper, but the list of "brittle" failures reads like a list of weaknesses shared by human scientists. Modern scientific communities have evolved a whole host of social mechanisms to moderate them and I am not sure that LLMs couldn't do the same?
English
0
0
1
69
Alex Prompter
Alex Prompter@alex_prompter·
This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly: Can LLMs actually discover science, or are they just good at talking about it? The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead of asking models trivia questions, it tests something much harder: Can models form hypotheses, design experiments, interpret results, and update beliefs like real scientists? Here’s what the authors did differently 👇 • They evaluate LLMs across the full discovery loop hypothesis → experiment → observation → revision • Tasks span biology, chemistry, and physics, not toy puzzles • Models must work with incomplete data, noisy results, and false leads • Success is measured by scientific progress, not fluency or confidence What they found is sobering. LLMs are decent at suggesting hypotheses, but brittle at everything that follows. ✓ They overfit to surface patterns ✓ They struggle to abandon bad hypotheses even when evidence contradicts them ✓ They confuse correlation for causation ✓ They hallucinate explanations when experiments fail ✓ They optimize for plausibility, not truth Most striking result: `High benchmark scores do not correlate with scientific discovery ability.` Some top models that dominate standard reasoning tests completely fail when forced to run iterative experiments and update theories. Why this matters: Real science is not one-shot reasoning. It’s feedback, failure, revision, and restraint. LLMs today: • Talk like scientists • Write like scientists • But don’t think like scientists yet The paper’s core takeaway: Scientific intelligence is not language intelligence. It requires memory, hypothesis tracking, causal reasoning, and the ability to say “I was wrong.” Until models can reliably do that, claims about “AI scientists” are mostly premature. This paper doesn’t hype AI. It defines the gap we still need to close. And that’s exactly why it’s important.
Alex Prompter tweet media
English
386
2.1K
8.3K
1.2M
Bletchley Park
Bletchley Park@bletchleypark·
This festive season, why not treat yourself to an afternoon tea in the Bletchley Park Mansion? Enjoy a delicious seasonal spread in the historic setting of our dining room, the perfect way to pause, relax, and make your visit even more magical. 🔗 Link in bio to book now. To make a booking using a gift voucher, please call us on 01908 272673. #FestiveTea #BletchleyPark #ChristmasAtBletchleyPark
Bletchley Park tweet media
English
4
11
34
1.9K
Victoria Coren Mitchell
Victoria Coren Mitchell@VictoriaCoren·
I think I must have previously misunderstood the meaning of “wicker”.
Victoria Coren Mitchell tweet media
English
79
26
1.1K
82.2K