William Ritossa

182 posts

@williamritossa

Australia · Joined January 2017
847 Following · 105 Followers
Pinned Tweet
William Ritossa@williamritossa·
I enjoy @TheEconomist but can't read it all. Their "Your Day in Brief" section summarises key articles, but not all of them (& doesn't include the GOATs @matt_levine and @benthompson). A great thing with ChatGPT/the GPT API is how quickly you can make tools that used to take days/weeks
2 replies · 1 retweet · 22 likes · 7.9K views
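A minimal sketch of the kind of quick tool described above: summarising an article via the OpenAI chat API. The model name, prompt wording, and character cap are illustrative assumptions, not the author's actual implementation.

```python
def build_summary_messages(article: str, max_chars: int = 12000) -> list:
    """Construct chat messages asking for a three-bullet summary."""
    return [
        {"role": "system",
         "content": "Summarise this article in three bullet points."},
        # Crude length guard so very long articles still fit in context
        {"role": "user", "content": article[:max_chars]},
    ]

def summarise(article: str) -> str:
    from openai import OpenAI  # pip install openai
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=build_summary_messages(article),
    )
    return resp.choices[0].message.content
```

Loop this over the day's article list and you have a rough "day in brief" generator.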
William Ritossa@williamritossa·
The Nevada Department of Transportation has a live feed of traffic cams, so I quickly Codex’d a page which renders all the cams on the Las Vegas GP F1 track. The UX could be improved, but I'm late for my run. GitHub link in thread; just download and run index.html locally
2 replies · 1 retweet · 3 likes · 1.6K views
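A sketch of how such a page could be generated (the original is plain HTML/JS; this uses Python to emit it). The camera names and URLs below are placeholders, not real NDOT feed endpoints.

```python
# Write an index.html that tiles camera feeds in a simple CSS grid.
# The names/URLs below are placeholders, not real NDOT camera endpoints.
CAMS = [
    ("Turn 1", "https://example.com/cam1.jpg"),
    ("Strip straight", "https://example.com/cam2.jpg"),
    ("Turn 14", "https://example.com/cam3.jpg"),
]

def render_page(cams) -> str:
    tiles = "\n".join(
        f'<figure><img src="{url}" alt="{name}">'
        f"<figcaption>{name}</figcaption></figure>"
        for name, url in cams
    )
    return (
        "<!doctype html>\n<html><head><style>"
        "body{display:grid;grid-template-columns:repeat(3,1fr);gap:8px}"
        "img{width:100%}"
        "</style></head><body>\n"
        f"{tiles}\n</body></html>"
    )

if __name__ == "__main__":
    with open("index.html", "w") as f:  # then open index.html locally
        f.write(render_page(CAMS))
```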
William Ritossa@williamritossa·
Oh gosh, now the joke about having to play a Minecraft parkour reel/TikTok on split screen to hold people’s attention will become a reality
0 replies · 0 retweets · 0 likes · 63 views
William Ritossa@williamritossa·
OpenAI’s pro series reasoning models are now cheaper than GPT-4 32K was when it launched, with the price for the pro models dropping ~90% in less than a year. This opens up so many new product classes as the economic value required per call drops drastically
- Input per 1M: $60 for 4 vs $15 for 5 pro
- Output per 1M: $120 for 4 vs $80 for 5 pro
0 replies · 0 retweets · 0 likes · 25 views
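Working the quoted per-1M-token prices into per-call dollar cost makes the gap concrete; the example token counts are arbitrary.

```python
# Per-call cost at the quoted rates (USD per 1M tokens), from the tweet above.
PRICES = {  # model: (input price, output price)
    "gpt-4-32k": (60.0, 120.0),
    "5-pro": (15.0, 80.0),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at the quoted per-1M-token rates."""
    inp, out = PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out

# Example: a call with 30K input and 2K output tokens
old = call_cost("gpt-4-32k", 30_000, 2_000)  # 1.80 + 0.24 = 2.04
new = call_cost("5-pro", 30_000, 2_000)      # 0.45 + 0.16 = 0.61
```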
William Ritossa@williamritossa·
o3 token velocity dropped by ~40% on the 2nd. gpt-4o and 5 are unaffected 👀
0 replies · 0 retweets · 0 likes · 54 views
OpenAI Developers@OpenAIDevs·
The Slack integration and Codex SDK are available to developers on ChatGPT Plus, Pro, Edu, Business, and Enterprise. Admin tools are available on Edu, Business, and Enterprise. More in the blog: openai.com/index/codex-no…
2 replies · 7 retweets · 95 likes · 23.2K views
OpenAI Developers@OpenAIDevs·
Codex is now GA, along with 3 features that make it more useful for engineering teams:
- @Codex in Slack
- Codex SDK
- New admin tools
25 replies · 64 retweets · 779 likes · 438K views
William Ritossa@williamritossa·
Talk about Codex token caching efficiency! 5.37M tokens used before I ran out of context window
0 replies · 0 retweets · 1 like · 52 views
William Ritossa@williamritossa·
On first impressions, GPT-5-Codex is noticeably more rigorous when it needs to be, and at the same time completes simple tasks faster (fewer reasoning tokens). In practice, this means using /model to toggle reasoning effort up/down, and leaving it on gpt-5-codex-high
2 replies · 0 retweets · 2 likes · 128 views
William Ritossa@williamritossa·
Changing the prompt to be clearer about our desired output fixed this 4/4 🧵
0 replies · 0 retweets · 0 likes · 20 views
William Ritossa@williamritossa·
It turns out it's because we were literally asking GPT-5 to output the date after the item! It was just following our instructions 3/4 🧵
1 reply · 0 retweets · 0 likes · 26 views
William Ritossa@williamritossa·
A good comment on GPT-5's instruction-following ability by @sherwinwu and @oliviergodement made me recall one lesson we learnt the hard way when some of our teams who don't use evals upgraded to GPT-5 without prompt changes... model output often worsened and bugs increased, because GPT-5 is much better at instruction following.

Two examples:
1. If your prompt is a bit unclear, or one instruction contradicts a different part of the prompt, older models would handle this gracefully (do what you meant, not what you said), but GPT-5 will over-index on your instructions and follow them literally
2. You had to beg GPT-4 and o3 to talk in a particular way (e.g. be concise, use this tone). Whereas if you beg GPT-5 to do it, it will do it, and it will often over-index on it

After moving to GPT-5 (or writing any new prompt), regardless of whether you have evals:
- Be pedantic when reviewing the prompt (or copy+paste it to GPT and ask it to be pedantic)
- Manually review the model output <-- this is the key, always, ongoing

1/4 🧵
1 reply · 0 retweets · 1 like · 53 views
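The "paste the prompt into GPT and ask it to be pedantic" step above can be sketched as a second model call; the reviewer wording and model name here are assumptions, not the author's setup.

```python
# Ask a model to pedantically review a production prompt for ambiguities
# and contradictions before shipping it. Wording/model are illustrative.
REVIEWER_SYSTEM = (
    "You are a pedantic prompt reviewer. List every ambiguity, internal "
    "contradiction, and instruction that a literal-minded model could "
    "over-index on or follow in an unintended way."
)

def build_review_messages(prompt_under_review: str) -> list:
    return [
        {"role": "system", "content": REVIEWER_SYSTEM},
        {"role": "user", "content": prompt_under_review},
    ]

def review_prompt(prompt_under_review: str) -> str:
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed; any strong reviewer model works
        messages=build_review_messages(prompt_under_review),
    )
    return resp.choices[0].message.content
```

This automates the review, but the manual output check the thread calls "the key" still has to stay manual.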
William Ritossa@williamritossa·
@sama That’s a really good observation — here’s why you’ve nailed it …
13 replies · 14 retweets · 2.6K likes · 74.1K views
Sam Altman@sama·
i never took the dead internet theory that seriously but it seems like there are really a lot of LLM-run twitter accounts now
3.4K replies · 1.5K retweets · 33.8K likes · 5.8M views
Sam Altman@sama·
really cool to see how much people are loving codex; usage is up ~10x in the past two weeks! lots more improvements to come, but already the momentum is so impressive.
750 replies · 365 retweets · 7.2K likes · 1M views
William Ritossa@williamritossa·
Great take from @btaylor on @AcquiredFM: say you are debugging code that caused a system to shut down. You don’t just restart the system; you find the part of the process that was broken and caused the shutdown.

The same philosophy should apply to Cursor/Codex/Claude. If it produces incorrect code, don’t fix the code, fix the context that Cursor had that produced the bad code.

If you just fix the code you don’t have leverage. If you go back and ask, ‘what context did this coding AI not have that, if it had it, would have produced the correct code?’ it takes longer in the short term, but in the long run it’s the difference between properly leaning into AI vs just using AI
0 replies · 0 retweets · 1 like · 249 views
William Ritossa@williamritossa·
Sharing my GPT-5 tl;dr that I sent internally after watching the livestream at 3am Sydney time:

Cursor/Coding
- GPT-5 is out in Cursor today and is their default model
- Early testers say it’s very good at pair programming, better than Opus 4.1
- Lots of people report it’s very good at very long conversations/pair programming sessions, but say Opus 4/4.1 still wins on agentic coding tasks
- It looks fast - much faster than o3, and than gpt-4 when it first came out

API
- There are 4 models: gpt-5, gpt-5-mini, gpt-5-nano, and gpt-5-chat. I’m particularly keen to see what gpt-5 is like vs gpt-5-chat. When we moved from GPT-3 to GPT-3.5, it got a lot better at chat tasks but lost some abilities in the RLHF process
- It’s available in the API today
- It decides how much to think/reason on a problem. You set this in the API with the reasoning parameter, which has a new minimal reasoning option
- There is a new verbosity parameter (low, medium, or high)
- Structured outputs can now check regex (previously just a JSON schema)

API Pricing
- Input tokens are 50% cheaper than gpt-4o
- Output tokens are the same price as gpt-4o (but with thinking, it would use more tokens)

ChatGPT App
- GPT-5 decides how hard to think on a problem by default. This unlocks reasoning for so many people who didn’t use or have the model picker
- OAI are deprecating all the other models
- Early testers say it’s better than o3 for simple research (where simple = you don’t need to go deep into a problem, not that the problem is hard) but not as good as o3 on deeper research. (I hope that’s not true, since I use o3 to dive deep ~20 times a day)
- They say a big unlock is that it is faster, letting you iterate much faster at a high quality each cycle
0 replies · 0 retweets · 1 like · 277 views
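The two new request parameters mentioned above (reasoning effort and verbosity) can be sketched as a request builder shaped for the OpenAI Responses API; the field names follow OpenAI's published docs, but treat the exact shape as illustrative rather than canonical.

```python
# Build a Responses API request using the new reasoning/verbosity controls.
# Field names per OpenAI docs; treat as a sketch, not canonical.
VALID_EFFORT = {"minimal", "low", "medium", "high"}
VALID_VERBOSITY = {"low", "medium", "high"}

def build_request(prompt: str, effort: str = "minimal",
                  verbosity: str = "low") -> dict:
    if effort not in VALID_EFFORT:
        raise ValueError(f"bad reasoning effort: {effort}")
    if verbosity not in VALID_VERBOSITY:
        raise ValueError(f"bad verbosity: {verbosity}")
    return {
        "model": "gpt-5",
        "input": prompt,
        "reasoning": {"effort": effort},   # how hard to think
        "text": {"verbosity": verbosity},  # how long the answer is
    }

# Usage (not run here): client.responses.create(**build_request("Hi"))
```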
William Ritossa@williamritossa·
Before: { "id": 123, "status": "ok" }
After: {"id":123,"status":"ok"}
0 replies · 0 retweets · 0 likes · 55 views
William Ritossa@williamritossa·
GPT defaults to returning JSON pretty-printed, with indents & newlines - great for humans but redundant for computers. We found asking it to output minified JSON reduced tokens significantly, cutting total response time by ~30%. This saved us 30s on a call that was 70s (TTLT)
1 reply · 0 retweets · 0 likes · 119 views
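The saving above in miniature, using Python's `json` module to show the pretty-printed vs minified forms of the same object:

```python
import json

obj = {"id": 123, "status": "ok"}
pretty = json.dumps(obj, indent=2)                 # what GPT emits by default
minified = json.dumps(obj, separators=(",", ":"))  # what we ask it for
# minified == '{"id":123,"status":"ok"}' - fewer characters, fewer tokens
```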