FateOfMuffins

147 posts

@FateOfMuffins

Joined February 2022

34 Following · 3 Followers

FateOfMuffins@FateOfMuffins·13h

@thsottiaux It's not an operating system (yet)

Tibo@thsottiaux·15h

What are we obviously not getting right with Codex?

FateOfMuffins@FateOfMuffins·1d

@AcerFur What if you /goal GPT 5.5 xHigh in codex with subagents to solve some math problems and compare with Pro?

Acer@AcerFur·1d

damn I should have gotten hooked onto Codex way sooner it's so cool to see a whole bunch of agents edit different parts of a paper

FateOfMuffins@FateOfMuffins·1d

@EMostaque @OpenAI I know it's not the same but have codex use GPT Pro in the browser xd

Emad@EMostaque·1d

Number 1 @OpenAI Codex request: plz let us use gpt pro in it. Or someone compare gpt pro to x high

FateOfMuffins@FateOfMuffins·2d

@ashrealite @AcerFur I doubt it'll be a *giant* difference but pretty sure some of the puzzles that ARC says the AI fails at... GPT 5.5 probably could do a couple of the easier ones in codex computer use and it'll be "efficient". It would still probably fail at a lot of the harder ones

FateOfMuffins@FateOfMuffins·2d

@ashrealite @AcerFur You can read it and judge if I gave the model too much "help". I told it to write down its reasoning because the thinking traces don't get passed through in ChatGPT. chatgpt.com/share/69c4c4c8…

FateOfMuffins@FateOfMuffins·2d

@ashrealite @AcerFur I know it's just a "trust me bro" but I only had generic custom instructions like avoiding emdashes. The reasoning it wrote down at least makes me think it understood the puzzle and was not just entirely clicking at random. It solved it in 24 steps.

FateOfMuffins@FateOfMuffins·Apr 23

@andonlabs The slope increased dramatically for 5.5 near the end. What happened? What happens if you let it run for 2 years instead of 1?

Andon Labs@andonlabs·Apr 23

We got early access to GPT-5.5. It's 3rd on Vending-Bench 2: better than GPT-5.4 but worse than Opus 4.7. However, it's on par with Opus 4.6 without any of the deception or power-seeking we saw from Opus 4.6 and Mythos. So bad behavior isn't necessary. Why is Claude doing it?

OpenAI@OpenAI

Introducing GPT-5.5. A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

FateOfMuffins@FateOfMuffins·Apr 23

@GregHBurnham @nikhilchandak29 Roughly the workflow courtesy of GPT Image 2

FateOfMuffins@FateOfMuffins·Apr 23

@GregHBurnham @nikhilchandak29 Although at the end of the day, if you tossed out 90% of the proposed problems, the final mock exams look decent. I can DM you the problem bank and you can take a quick look through some of the harder problems to see if it made any interesting ones

Greg Burnham@GregHBurnham·Apr 22

AI systems are mediocre at suggesting problems for FrontierMath: Open Problems. It would be funny if they got good at that around the same time they get good at solving the problems. Some latent "knowing what question to ask" ability...

FateOfMuffins@FateOfMuffins·Apr 23

@GregHBurnham @nikhilchandak29 Ironically my students complained that my mocks were WAY harder than the actual contest lol

FateOfMuffins@FateOfMuffins·Apr 23

@GregHBurnham @nikhilchandak29 I have a project with 650 problems I generated in March for a contest last month. Repeated automated quality checks by agents narrowed them down to about 40 or so, which I used to actually construct 2 mock exams.

FateOfMuffins@FateOfMuffins·Apr 18

@TheRealAdamG Codex will only be proto-AGI to me if it can port all the features that Codex has on Mac to Windows in the same release day

Adam.GPT@TheRealAdamG·Apr 18

Codex, and what Codex becomes in the near term, to me is a proto-AGI. We’re on the flight path, we’ve been descending down through the clouds for a while and we can now see visible landmarks — some familiar and some we’ve never seen (or never seen from this POV). What a time to be alive. (This is not a formal OpenAI POV, rather just the musings of some random head of broccoli)

FateOfMuffins@FateOfMuffins·Apr 17

@AcerFur Correct me if I'm wrong but over the last decade, many of the researchers at OpenAI were scouted in the middle of their PhDs and they left to pursue AI research at OpenAI. Did they advance the field further by staying in academia or at the frontier labs?

FateOfMuffins@FateOfMuffins·Apr 17

@AcerFur How long would it take to finish a post grad? Where will math be at that point in time? Where can you make the biggest impact in math research? Is it the traditional way where you're not getting a PhD for another 5 years or is it spearheading the AI research for math at OpenAI?

Acer@AcerFur·Apr 17

1/ Seeking advice on this since it’s been on my mind for the past four months or so. If I were to drop out of my undergrad Cambridge degree and join OpenAI fully, what options would I really have on trying to continue to postgrad maths at some point?
