Sam Wolfstone

663 posts

@SamWolfstone

Sculpting, AI, Philosophy, Coding

Joined November 2020
134 Following, 246 Followers
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@Teknium Will eagerly wait to use this once the required PRs have been merged in! (I'd rather just stick to official updates rather than apply random PRs on my own instance...)
0
0
0
36
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@AcerFur @synthwavedd Such a cool frickin' benchmark!! Very nice work. Can't wait to see how the new GPT one does on this. Do you have other tests you're keeping in your back pocket in case the labs target these specific tests?
1
0
1
90
Acer
Acer@AcerFur·
I'm not one to be too interested in image gen capabilities, but I do care about reasoning capabilities, so I am introducing a new benchmark testing reasoning during image generation. Introducing the Image Reasoning Generation Benchmark (IRGB): pellaml.github.io/iumb/#irgb
Acer tweet media
16
16
198
12.5K
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@synthwavedd @chatgpt21 @DrBeavisAI Can't help but think that 5o and 5.5 will be different models. 5o needs to be really fast/cheap if it's going to power advanced voice mode. If 5.5 is a step change in intelligence, probably huge, slow and expensive.
0
0
5
216
leo 🐾
leo 🐾@synthwavedd·
@chatgpt21 @DrBeavisAI I mean naturally they would say that. Internally it's currently slated as 5.5, but 5o isn't out of contention either. Doubt it's 6.
2
0
25
1.5K
leo 🐾
leo 🐾@synthwavedd·
big week coming up
25
12
381
68.8K
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@johnennis Definitely did at first. Got one of my smartest friends into LLMs, so now I have someone who's fairly interested in talking about AI stuff with me. Also have a work-colleague-turned-friend who loves talking about AI with me now and who has very different views from me. Need more.
0
0
1
19
John Ennis
John Ennis@johnennis·
I think one of the biggest challenges when it comes to going hard into using AI is loneliness. I am learning all these awesome things and becoming super capable. But the set of people that I can really talk to about it is very small. Is anyone else having this experience?
1.1K
173
3.8K
161.6K
Sam Wolfstone retweeted
Aidan McLaughlin
Aidan McLaughlin@aidan_mclau·
one of my all-time favorite plots
Aidan McLaughlin tweet media
20
72
2.1K
229.3K
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@AcerFur Feel kinda stupid asking but I'm too curious, what makes it not quite pass the test here?
2
0
7
368
Acer
Acer@AcerFur·
doesn't quite pass the animal keyboard test
Acer tweet media
4
1
67
7.8K
Acer
Acer@AcerFur·
maskingtape-alpha, gaffertape-alpha, packingtape-alpha: these seem like good image models on the image arena (try them out)... but they're not quite perfect just yet. They still fail the Rubik's Cube reflection test.
Acer tweet media
Acer@AcerFur

@m__dehghani Alright, @m__dehghani time for these next: A validly scrambled Rubik's cube placed by a mirror, clearly showing its mirror reflection. No harsh light reflections. Incorrect centres, edge, and corner pairings:

19
16
295
232.1K
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@KarolCodes @0xSero @theo Even with some details in the SOUL.md, it's really hard to get GPT-5.4 not to yap. If you ask it to be succinct, it'll just be curt in its sentences but still send you 50 lines in the response...
1
0
1
19
Karol
Karol@KarolCodes·
@0xSero @theo don't you get 5-page essays from 5.4 every time you ask it to do something?
1
0
4
94
Theo - t3.gg
Theo - t3.gg@theo·
So, uh, what subscription should I be using for my OpenClaw now? 🙃
264
21
1.4K
264.1K
0xSero
0xSero@0xSero·
@SamWolfstone I have 3 more much larger giveaways in the works, it ain’t easy
1
0
0
371
0xSero
0xSero@0xSero·
Do you want to try Droid? I’m doing a giveaway: 3 people will win 100M Factory credits each. That’s 5 months of their $20-a-month subscription. Winners selected randomly from comments in 48 hours.
0xSero tweet media
1.1K
36
789
70.8K
Rohan Varma
Rohan Varma@rohanvarma·
If we made /slow mode in Codex, would you use it? What for? (Slower inference at a cheaper cost)
953
32
2.2K
185.6K
Daniel Filan
Daniel Filan@dfrsrchtwts·
Some evaluations work we are getting up to at METR
Daniel Filan tweet media
8
2
96
3.1K
Sam Wolfstone retweeted
Adam Kranz
Adam Kranz@adam_kranz·
Claude, in a world full of unknown unknowns: Good, now I have a complete picture
7
27
460
12.9K
Rob Bensinger ⏹️
Rob Bensinger ⏹️@robbensinger·
I've seen @theaidocfilm three times, I have never been this excited to rewatch a movie?? feed me your reactions if you've seen it
10
2
46
2.2K
Sam Wolfstone
Sam Wolfstone@SamWolfstone·
@EntropyChase @catehall of things in the world, and they'd still be able to 'understand' those things, even if those things were incorrect. Maybe you and I have a different definition of 'understanding', so maybe we're slightly talking past each other...
1
0
0
15
Cate Hall
Cate Hall@catehall·
“Stochastic parrot” is such a potent coinage — so fun to say! so conceptually efficient! — that it seems to have permanently colonized a lot of people’s minds despite not being true of today’s models. Genuinely a linguistic work of art.
51
28
748
94.9K