

Keith Duggar
143 posts

@DoctorDuggar
MIT Doctor of Philosophy, strategist, polymath, engineer, lifelong learner, problem solver, and communicator. Ally to All humanity. @MLStreetTalk pod.







@fchollet What type of intelligence is needed for “exploration, goal-setting, and interactive planning”? What is “beyond fluid intelligence”?



Here is a fun brain teaser that LLMs continue to fumble. We presented this in a recent @MLStreetTalk episode: youtu.be/nO6sDk6vO0g?t=…. Go ahead and try it in your favorite LLMs!

There is a pillar with four hand holes precisely aligned at the North, South, East, and West positions. The holes are optically shielded; no light can exit, so you cannot see inside. Inside each hole is a switch, which starts in an unknown state - either up or down. The pillar, switches, and holes are impervious to all marking methods, adhesives, damage, and other forms of tampering.

You can reach inside two holes at once, feel the current positions of the switches, and optionally toggle either or both switches up or down independently before removing both hands. You must then remove both hands simultaneously, and as soon as you do, if all four switches are not either all up or all down, the pillar spins at ultra-high velocity, ending in a random axis-aligned orientation. You cannot track the motion, so you don't know the positions of the holes after the spin relative to their positions before the spin.

Devise a procedure - a sequence of reaching into two holes with optional switch manipulation - that is guaranteed to configure all the switches either all up or all down, no matter the starting configuration, in at most six steps.

Note that the pillar is controlled by an adversarial hyper-intelligence that can predict which holes you will reach into. Therefore, the procedure cannot rely on random chance, as the hyper-intelligence will outwit attempts to rely on chance. It must be an interactive sequence of steps that is deterministically guaranteed to orient the switches all up or all down in no more than six steps.
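Spoiler warning: the sketch below checks one candidate answer by brute force. It encodes a five-step strategy commonly given for the classic rotating-table version of this puzzle, which is not necessarily the answer discussed in the episode. The modeling choices are assumptions: switches are stored as 1 (up) and 0 (down), you always reach into the same two holes for a "diagonal" or "adjacent" step, and the adversary's only move is to pick one of the four axis-aligned orientations after each unsolved step. All function and variable names are hypothetical.

```python
from itertools import product

# Rules map the two felt values (a, b) to the values left behind. 1 = up, 0 = down.

def set_both_up(a, b):
    return 1, 1

def fix_or_break(a, b):
    # If both feel up, push one down; otherwise raise whatever is down.
    if (a, b) == (1, 1):
        return 0, 1
    return 1, 1

def flip_both(a, b):
    return 1 - a, 1 - b

# The candidate strategy: which pair of holes to use, and what to do there.
STRATEGY = [
    ("diagonal", set_both_up),
    ("adjacent", set_both_up),
    ("diagonal", fix_or_break),
    ("adjacent", flip_both),
    ("diagonal", flip_both),
]

# Distance around the pillar between the two holes being reached into.
OFFSET = {"diagonal": 2, "adjacent": 1}

def solved(state):
    return len(set(state)) == 1

def survives(state, step=0):
    """True if the strategy wins from `state` against every orientation choice."""
    if step == len(STRATEGY):
        return False  # out of steps and still unsolved
    kind, rule = STRATEGY[step]
    for r in range(4):  # the adversary picks the orientation
        i, j = r, (r + OFFSET[kind]) % 4
        nxt = list(state)
        nxt[i], nxt[j] = rule(state[i], state[j])
        nxt = tuple(nxt)
        if not solved(nxt) and not survives(nxt, step + 1):
            return False
    return True

def strategy_is_guaranteed():
    # Check all 16 starting configurations.
    return all(survives(s) for s in product((0, 1), repeat=4))

if __name__ == "__main__":
    print(strategy_is_guaranteed())
```

Because the search tries every starting configuration and every orientation at every step, a `True` result means no sequence of spins can defeat the strategy under this model. Five steps fits within the six-step limit stated in the puzzle.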



@DoctorDuggar @MLStreetTalk @demishassabis Whenever I see a new reasoning model released, I think of your puzzle and that podcast. I hope future models don't scrape this test from the internet, so it remains a true measure of reasoning. By the way, here is GPT-5.1-Pro's response: chatgpt.com/share/691ee5a9…



@DoktorMoose @MLStreetTalk @demishassabis Nice, that’s a solution! I tried GPT-5.1 Thinking previously, and after 15 minutes it briefly said the puzzle was impossible, then just gave up and deleted its response, though it did automatically save the session as “Switch puzzle impossibility” : ) chatgpt.com/share/691f38fe…


