
The new OpenAI o1-preview model also can't solve my "meta-brainteaser", though it gets much closer. It still gets its feet tangled in key subtleties, but it now feels like we are not far from a model being able to solve it...
Bruce Bassett
2.4K posts

@cosmo_bruce
Chair of Science at MIND, Professor of AI @WITS and Applied Maths @UCT. Former head of Data Science at SKA Africa. Author of https://t.co/Vp9k1nC9uz.



NEW UPDATE: Moltbot army is about to sue… humans. Market’s at ~50% and climbing hourly, 96 hours in. Imagine week two. We weren’t ready for AI with lawyers.



Both DeepSeek R1 and Claude 3.7 Extended are able to solve the brainteaser... Claude 3.7 ran out of tokens on the first attempt but succeeded on the 2nd attempt. That brings the number of successful LLMs to 5...

o1-pro and o3-mini are the first AI models I have found that can successfully solve this brainteaser... and they both did it without any false steps.









Participants interacted with a generative AI “police” chatbot instructed to instill false memories regarding a crime. And guess what? It succeeded, almost doubling the frequency of false memories compared to a condition in which they were induced using a standard survey method. Last week, we saw a paper in Science showing that GPT-4 can reduce conspiracy beliefs in conspiracy theorists. Today, we see a paper demonstrating that generative AI chatbots are capable of manipulating memory. This is a strong reminder that generative AI is an extremely powerful persuasion tool, neither inherently good nor inherently bad. It can be used for both beneficial and harmful purposes. We need to be aware of this and build appropriate safety guardrails. Paper: arxiv.org/pdf/2408.04681

I test new AI capabilities on my "meta-brainteaser". As of today, none of GPT-4, Claude 3, Gemini Advanced, or Gemini 1.5 Pro can solve it... Here it is: Eve, a mathematician, is visiting her friends Alice and Bob, whom she hasn't seen for many years. Both Alice and Bob...






