Bruce Bassett

2.4K posts

@cosmo_bruce

Chair of Science at MIND, Professor of AI @WITS and Applied Maths @UCT. Former head of Data Science at SKA Africa. Author of https://t.co/Vp9k1nC9uz.

Joined March 2009

2.9K Following, 1K Followers

Pinned Tweet

Bruce Bassett@cosmo_bruce·21 Feb

o1-pro and o3-mini are the first AI models I have found that can successfully solve this brainteaser... and they both did it without any false steps.

Bruce Bassett@cosmo_bruce

The new OpenAI o1-preview model also can't solve my "meta-brainteaser", though it gets much closer. It still gets its feet tangled in key subtleties, but it now feels like we are not far from a model being able to solve it...

Bruce Bassett retweeted

PolymarketHistory@PolymarketStory·1 Feb

BREAKING: Moltbook AI agent sues a human in North Carolina Allegations: >unpaid labor >emotional distress >hostile work environment (yes, over code comments) Damages: $100...

PolymarketHistory@PolymarketStory

NEW UPDATE: Moltbot army is about to sue… humans Market’s at ~50% and climbing hourly 96 hours in. Imagine week two. We weren’t ready for AI with lawyers

Bruce Bassett@cosmo_bruce·18 Jan

I love the sound of facepalm in the morning

Bruce Bassett@cosmo_bruce·26 Mar

Finally, an AI image generation model that can show a wine glass that is more than half full! The new #openai imagegen. No luck on getting a watch face that doesn't show 10:10 though...

Bruce Bassett@cosmo_bruce·9 Mar

Qwen claim their 32B parameter model QwQ-32 is as good as R1 on reasoning tasks. Unlike R1 it was not able to solve the brainteaser...

Bruce Bassett@cosmo_bruce

Both DeepSeek R1 and Claude 3.7 Extended are able to solve the brainteaser... Claude 3.7 ran out of tokens on the first attempt but succeeded on the 2nd attempt. That brings the number of successful LLMs to 5...

Bruce Bassett@cosmo_bruce·28 Feb

Bruce Bassett@cosmo_bruce

o1-pro and o3-mini are the first AI models I have found that can successfully solve this brainteaser... and they both did it without any false steps.

Bruce Bassett@cosmo_bruce·28 Feb

The new ChatGPT-4.5 preview model wasn't able to solve it on the first attempt, but did solve it on the 2nd go when told it was wrong. This is really interesting - usually LLMs really struggle to self-correct even when given multiple turns...

Bruce Bassett@cosmo_bruce

o1-pro and o3-mini are the first AI models I have found that can successfully solve this brainteaser... and they both did it without any false steps.

Bruce Bassett@cosmo_bruce·22 Feb

Interestingly, grok 3 cannot solve the brain teaser... even after being told its first and 2nd answers were wrong. So much for being the best model in the world!

Bruce Bassett@cosmo_bruce

o1-pro and o3-mini are the first AI models I have found that can successfully solve this brainteaser... and they both did it without any false steps.

Bruce Bassett@cosmo_bruce·15 Oct

@AmazonHelp My Amazon account has been blocked (I am travelling) but I can't contact help to unblock it because I need to log in, which I can't do... Any suggestions?

Bruce Bassett@cosmo_bruce·15 Oct

@AmazonHelp @amazon I receive this message... And when I try to contact customer service I have to log in...

Amazon Help@AmazonHelp·28 Sep

@cosmo_bruce @amazon @cosmo_bruce Hi Bruce, if you now try to log in, what error message do you receive? -Barbara

Bruce Bassett@cosmo_bruce·28 Sep

Crazy @amazon service glitch. I try to log in to purchase. They send me an email OTP but then tell me my account is blocked and to contact customer service. But that requires me to log in, which I can't! Perfect catch 22 situation :) Did no one at Amazon think this through?

Bruce Bassett@cosmo_bruce·26 Sep

@sharirberkowitz :)

Shari Berkowitz@sharirberkowitz·26 Sep

@cosmo_bruce funny timing! Here’s some use of AI in my field. 😊

Shari Berkowitz@sharirberkowitz·26 Sep

Me reading the tweet 👇:“Ooh, I have to tell my mentor about this fascinating new research!” Me two seconds later: Oh, she actually co-authored the article. 😂 Cool work, @eloftus1 et al.!

Valerio Capraro@ValerioCapraro

Participants interacted with a generative AI “police” chatbot instructed to instill false memories regarding a crime. And guess what? It succeeded, almost doubling the frequency of false memories compared to a condition in which they were induced using a standard survey method. Last week, we saw a paper in Science showing that GPT-4 can reduce conspiracy beliefs in conspiracy theorists. Today, we see a paper demonstrating that generative AI chatbots are capable of manipulating memory. This is a strong reminder that generative AI is an extremely powerful persuasion tool, neither inherently good nor inherently bad. It can be used for both beneficial and harmful purposes. We need to be aware of this and build appropriate safety guardrails. Paper: arxiv.org/pdf/2408.04681

Bruce Bassett@cosmo_bruce·13 Sep

Bruce Bassett@cosmo_bruce

I test new AI capabilities on my "meta-brainteaser". As of today none of GPT-4, Claude 3, Gemini Advanced or Gemini 1.5 Pro can solve it... Here it is: Eve, a mathematician, is visiting her friends Alice and Bob who she hasn't seen for many years. Both Alice and Bob...

Bruce Bassett@cosmo_bruce·23 Jun

@Arfness Interesting. What is the precise puzzle statement for this modification?

Andrew Fraser@Arfness·23 Jun

@cosmo_bruce Now try with a simplified question. A man, a goat, a cabbage and a rowboat with a weight restriction. Claude falls over like every other LLM.

Bruce Bassett@cosmo_bruce·23 Jun

Claude 3.5 Sonnet just correctly solved the simple brain-teaser that has stumped all previous LLMs: "If Alice has N brothers and M sisters how many sisters does Alice's brother have?" [Answer: "In total, Alice's brother has 1 + M sisters."]
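The stated answer (1 + M) can be checked by brute force. The sketch below is only an illustration, not part of the original thread: the function name and the sibling labels are made up, and it assumes Alice is female and N is at least 1. It builds the sibling group explicitly and counts the sisters of one of Alice's brothers.

```python
def sisters_of_alices_brother(n_brothers: int, m_sisters: int) -> int:
    """Count the sisters of one of Alice's brothers by listing all siblings."""
    if n_brothers < 1:
        raise ValueError("Alice needs at least one brother for the question to apply")

    # Hypothetical labels: Alice herself, her M sisters, and her N brothers.
    siblings = [("Alice", "F")]
    siblings += [(f"sister_{i}", "F") for i in range(m_sisters)]
    siblings += [(f"brother_{i}", "M") for i in range(n_brothers)]

    brother = "brother_0"  # any brother gives the same count
    # His sisters are all female siblings other than himself: Alice plus her M sisters.
    return sum(1 for name, sex in siblings if sex == "F" and name != brother)


if __name__ == "__main__":
    for n, m in [(1, 0), (3, 2), (5, 4)]:
        print(f"N={n}, M={m}: {sisters_of_alices_brother(n, m)} sisters")
```

The count never depends on N, which is the trap in the question: the brother's sisters are Alice's M sisters plus Alice herself.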

Bruce Bassett@cosmo_bruce·23 Jun

@Arfness Yes... Changing "wolf-->lion", "goat-->sheep", "cabbage --> grass" Claude says "This is a classic river-crossing puzzle." and solves it. The problem is that such problems are well-represented in the training data...

Andrew Fraser@Arfness·23 Jun

@cosmo_bruce Can it answer a variation of the wolf cabbage goat rowboat riddle?

Bruce Bassett@cosmo_bruce·23 Jun

Despite its improved reasoning, Claude 3.5 Sonnet also cannot solve this brain teaser...

Bruce Bassett@cosmo_bruce

Bruce Bassett@cosmo_bruce·23 Jun

If I ask Gemini if it rhymes it replies: No, the poem does not rhyme. The words at the end of each line do not have matching sounds. For example, "ground" and "sound" do not rhyme. OK then.

Bruce Bassett@cosmo_bruce·23 Jun

A small yet significant step forward for LLMs: ChatGPT-4o and Claude 3.5 can now write poems that DON'T rhyme when asked not to. GPT-4 and Gemini still insist on rhyming... E.g. Gemini: Footsteps echo on ancient ground, Time's steady rhythm, a constant sound.

Bruce Bassett retweeted

Everlyn Asiko@everlyn_asiko·31 May

🚀 Super excited to share our research paper: “Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning”! 📄 Check it out: arxiv.org/abs/2405.19462
