Simon Radner

20 posts

@simonradner

Joined June 2014

102 Following · 31 Followers

Pinned Tweet

Simon Radner@simonradner·2d

Claude playing Monkey Island: about as good as I was at 14. you can watch it reason while it plays

Simon Radner@simonradner·11h

started turning this into a benchmark. the constraint: independent of the specific game, only the ScummVM engine. defining “progress” under that constraint is hard. current hack: measure what changes. not sure if that’s progress or just exploration. github.com/rabengraph/scu…
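
One way to picture the "measure what changes" idea, and why it can't easily tell progress from exploration, is the sketch below. It is not the benchmark's actual code: the flat `Snapshot` dictionary, the key names `room` and `inventory`, and both scoring functions are assumptions made up for illustration.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Hypothetical flat snapshot of engine state, e.g. {"room": 1, "inventory": 0}.
Snapshot = Dict[str, Hashable]

_MISSING = object()  # sentinel for keys absent from one of two snapshots


def raw_changes(snapshots: List[Snapshot]) -> int:
    """Count every key that differs between consecutive snapshots."""
    total = 0
    for prev, curr in zip(snapshots, snapshots[1:]):
        for key in prev.keys() | curr.keys():
            if prev.get(key, _MISSING) != curr.get(key, _MISSING):
                total += 1
    return total


def novel_changes(snapshots: List[Snapshot]) -> int:
    """Count only (key, value) pairs that never appeared earlier in the run."""
    seen: Set[Tuple[str, Hashable]] = set()
    total = 0
    for index, snapshot in enumerate(snapshots):
        for item in snapshot.items():
            if item not in seen:
                seen.add(item)
                if index > 0:  # the first snapshot is the baseline
                    total += 1
    return total


# Walking back and forth between two rooms, then picking something up.
run = [
    {"room": 1, "inventory": 0},
    {"room": 2, "inventory": 0},
    {"room": 1, "inventory": 0},
    {"room": 2, "inventory": 1},
]
print(raw_changes(run), novel_changes(run))
```

Pacing between two rooms keeps raising the raw count but not the novel one, which is the exploration-versus-progress ambiguity the tweet describes. Neither number says whether a change moved the game forward.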

Simon Radner@simonradner·21h

@alkampfer Starting prompt is basically “read the instructions and make progress in the game”. the instructions expose a JS API to get game state + logs. will share more soon

Alkmpfer@alkampfer·1d

@simonradner Is the starting prompt shared? I imagine you use subagents and compress the context to reach the end. Really interesting experiment.

Simon Radner@simonradner·1d

@muskirac @davidatnilsson 😂

Mustafa Kirac@muskirac·1d

@davidatnilsson @simonradner ScummBench 😅

Simon Radner@simonradner·1d

@Estudio528 it’s actually trying to find the Scumm Bar. in this run it expects it somewhere in the village and only finds it later. about half the runs it goes there straight away, it’s probabilistic

David García Sánchez@Estudio528·1d

@simonradner He skipped going into the bar – something a human would rarely do. And the Scumm Bar at that, which is the first place you’re supposed to go. Although he might not have spotted the door properly; Melee Island is quite dark. 😁

Simon Radner@simonradner·1d

@artlantiko what do you envision? could have the agent yield small summaries as it goes

Artlantiko@artlantiko·1d

@simonradner Pretty cool, and do you see some timeline about the decisions it takes?

Simon Radner@simonradner·1d

@opinjonated glad it makes sense, more going on under the hood, will share soon

Jon Mystery@opinjonated·1d

@simonradner Ah, never mind. Found the YouTube link and now I understand what you did. Very clever, I can see other ways to make use of this method. Thank you!

Simon Radner@simonradner·1d

@rustforthewin apparently not, no walkthroughs back then 🙂

RFTW@rustforthewin·1d

@simonradner So, you weren't that good at 14?

Simon Radner@simonradner·1d

@damageboy haha I think you’re safe. Monkey Island needs creative, humorous thinking to solve, and the agent mostly lacks that. might brute force it eventually though

damageboy@damageboy·1d

Oh, now I feel replaced

Simon Radner@simonradner

Claude playing Monkey Island: about as good as I was at 14. you can watch it reason while it plays

Simon Radner@simonradner·1d

@the_ultralazr sure, happy to, ClaudeCast is great. happy to share the session log, feel free to DM me

Martin Mairinger@the_ultralazr·1d

@simonradner Love it! Would you be open to sharing that session log for an episode of the ClaudeCast podcast?

Simon Radner@simonradner·1d

@artale93 also planning to try it with pi...

Artale@artale93·1d

Doing this on pi ATM ... cc sucks 😭

Simon Radner@simonradner

Claude playing Monkey Island: about as good as I was at 14. you can watch it reason while it plays

Simon Radner@simonradner·1d

@dvygh @badlogicgames yeah, but it mixes up walkthroughs from different versions, and there’s still a gap between lossy knowledge and acting correctly

Kalle Immonen@dvygh·1d

@simonradner @badlogicgames it should have a walkthrough or two in its training data tho?

Simon Radner@simonradner·1d

@RMWinslow haha Monkey Island is basically full of prompt injections and you can see the model fall for some of them too. got a bit lucky here, getting into the kitchen needs timing with the cook serving grog. in another run it reduced the sleep between retries to catch the right moment
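
The "reduced the sleep between retries" behaviour can be sketched as a retry loop whose wait shrinks after each miss, so later attempts poll more often and are more likely to land inside a short timing window. This is only an illustration: `retry_with_shrinking_sleep`, its parameters, and the fake `attempt` callback are hypothetical and not taken from the agent's session.

```python
import time
from typing import Callable, List, Optional


def retry_with_shrinking_sleep(
    attempt: Callable[[], bool],
    initial_sleep: float = 1.0,
    factor: float = 0.5,
    min_sleep: float = 0.05,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call attempt() until it returns True; return the attempt number or None."""
    delay = initial_sleep
    for number in range(1, max_attempts + 1):
        if attempt():
            return number
        sleep(delay)  # wait before trying again
        delay = max(min_sleep, delay * factor)  # shorter wait next time
    return None


# Demo with a fake action that succeeds on the 4th try and a recording sleep,
# so nothing actually blocks.
calls = {"count": 0}
recorded: List[float] = []


def fake_attempt() -> bool:
    calls["count"] += 1
    return calls["count"] == 4


result = retry_with_shrinking_sleep(fake_attempt, sleep=recorded.append)
print(result, recorded)
```

Injecting `sleep` keeps the demo instant. In a real run the default `time.sleep` would be used, and `min_sleep` stops the loop from turning into a busy wait.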

Robert Winslow@RMWinslow·1d

@simonradner I remember getting really stumped in this game because I interpreted the Red Herring as a gag meant to be ignored, and completely forgot about it for later. In retrospect, it feels a bit like I prompt injected myself.

Simon Radner@simonradner·1d

@davidatnilsson yep, that’s next up. going to put a simple version online so @moltbook folks can try it out
