FateOfMuffins

147 posts

@FateOfMuffins

Joined February 2022
34 Following · 3 Followers
Tibo (@thsottiaux)
What are we obviously not getting right with Codex?
2.3K
20
2.1K
386.3K
FateOfMuffins (@FateOfMuffins)
@AcerFur What if you /goal GPT 5.5 xHigh in codex with subagents to solve some math problems and compare with Pro?
0
0
2
220
Acer (@AcerFur)
damn, I should have gotten hooked on Codex way sooner. It's so cool to see a whole bunch of agents edit different parts of a paper
Acer tweet media
9
6
108
4.9K
Emad (@EMostaque)
Number 1 @OpenAI Codex request: plz let us use GPT Pro in it. Or someone compare GPT Pro to xHigh
62
19
780
75.9K
FateOfMuffins (@FateOfMuffins)
@ashrealite @AcerFur I doubt it'll be a *giant* difference but pretty sure some of the puzzles that ARC says the AI fails at... GPT 5.5 probably could do a couple of the easier ones in codex computer use and it'll be "efficient". It would still probably fail at a lot of the harder ones
0
0
0
37
FateOfMuffins (@FateOfMuffins)
@ashrealite @AcerFur I know it's just a "trust me bro" but I only had generic custom instructions like avoiding emdashes. The reasoning it wrote down at least makes me think it understood the puzzle and was not just entirely clicking at random. It solved it in 24 steps
0
0
0
18
FateOfMuffins (@FateOfMuffins)
@andonlabs The slope increased dramatically for 5.5 near the end. What happened? What happens if you let it run for 2 years instead of 1?
0
0
2
752
Andon Labs (@andonlabs)
We got early access to GPT-5.5. It's 3rd on Vending-Bench 2: better than GPT-5.4 but worse than Opus 4.7. However, it's on par with Opus 4.6 without any of the deception or power-seeking we saw from Opus 4.6 and Mythos. So bad behavior isn't necessary. Why is Claude doing it?
Andon Labs tweet media
OpenAI (@OpenAI)

Introducing GPT-5.5. A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

32
39
647
104.3K
FateOfMuffins (@FateOfMuffins)
@GregHBurnham @nikhilchandak29 Although at the end of the day, if you tossed out 90% of the proposed problems, the final mock exams look decent. I can DM you the problem bank and you can take a quick look through some of the harder problems to see if it made any interesting ones
1
0
1
24
Greg Burnham (@GregHBurnham)
AI systems are mediocre at suggesting problems for FrontierMath: Open Problems. It would be funny if they got good at that around the same time they get good at solving the problems. Some latent "knowing what question to ask" ability...
2
1
30
2.6K
FateOfMuffins (@FateOfMuffins)
@GregHBurnham @nikhilchandak29 I have a project with 650 problems I generated in March for a contest last month. Agents repeatedly ran automated quality checks to narrow them down to about 40 or so, which were used to actually construct 2 mock exams.
1
0
1
38
FateOfMuffins (@FateOfMuffins)
@TheRealAdamG Codex will only be proto-AGI to me if it can port all the features that Codex has on Mac to Windows on the same release day
0
1
3
569
Adam.GPT (@TheRealAdamG)
Codex, and what Codex becomes in the near term, to me is a proto-AGI. We’re on the flight path, we’ve been descending down through the clouds for a while and we can now see visible landmarks — some familiar and some we’ve never seen (or never seen from this POV). What a time to be alive. (This is not a formal OpenAI POV, rather just the musings of some random head of broccoli)
35
33
702
55.5K
FateOfMuffins (@FateOfMuffins)
@AcerFur Correct me if I'm wrong but over the last decade, many of the researchers at OpenAI were scouted in the middle of their PhDs and they left to pursue AI research at OpenAI. Did they advance the field further by staying in academia or at the frontier labs?
0
0
0
63
FateOfMuffins (@FateOfMuffins)
@AcerFur How long would it take to finish a post grad? Where will math be at that point in time? Where can you make the biggest impact in math research? Is it the traditional way where you're not getting a PhD for another 5 years or is it spearheading the AI research for math at OpenAI?
1
0
0
78
Acer (@AcerFur)
1/ Seeking advice on this since it’s been on my mind for the past four months or so. If I were to drop out of my undergrad Cambridge degree and join OpenAI fully, what options would I really have on trying to continue to postgrad maths at some point?
90
6
332
75.3K