Capless

66 posts

@capless_anon

Joined July 2025
349 Following · 9 Followers
Capless
Capless@capless_anon·
@paulg Staedtler feels so much worse than Pentel.
1 reply · 0 reposts · 0 likes · 5.8K views
Paul Graham
Paul Graham@paulg·
Brands I love: Lego, Leuchtturm, Oxford University Press, Pentel, Schöffel, Aqualung, Paradores, Staedtler, Birkenstock, Braun, Knoll, Patagonia, Herman Miller, Iittala, L.A. Burdick, Artemide, Aman, Thames & Hudson, Yeti, Rimowa, L.L.Bean, Timbuk2, Eschenbach, Ridge, Maui Jim.
249 replies · 96 reposts · 3.7K likes · 963.8K views
Capless
Capless@capless_anon·
@adocomplete Is Opus a max only model? I'm on pro and I only see Haiku.
0 replies · 0 reposts · 0 likes · 18 views
Ado
Ado@adocomplete·
Claude Code for Chrome is really something else. I haven't used Google Analytics in a minute, not even sure what I needed and the product has changed so drastically over the last few years. One prompt and I got some nice dashboards to get me going.
163 replies · 239 reposts · 4.1K likes · 556.9K views
Avenox
Avenox@Avenoxai·
@dhtikna I would expect something around $5-6, maybe even slightly higher. GLM 4.7's output is about $2 and it is a MoE model; Opus is a dense model, so it's much more expensive to run.
1 reply · 0 reposts · 0 likes · 572 views
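The cost intuition in the reply above — a MoE model activates only a small slice of its parameters per token, while a dense model runs all of them — can be sketched with back-of-envelope arithmetic. All parameter counts below are hypothetical, chosen only to illustrate the ratio:

```python
# Per-token inference compute scales roughly with *active* parameters:
# about 2 FLOPs per active parameter per generated token.
# The parameter counts below are hypothetical, for illustration only.

def flops_per_token(active_params_billions: float) -> float:
    """Rough forward-pass FLOPs for one generated token."""
    return 2 * active_params_billions * 1e9

moe_active = 32     # hypothetical MoE: large total count, small active slice
dense_active = 300  # hypothetical dense model: every parameter fires each token

ratio = flops_per_token(dense_active) / flops_per_token(moe_active)
print(f"dense / MoE compute per token: ~{ratio:.1f}x")  # ~9.4x
```

On this crude model the per-token cost gap is just the active-parameter ratio; real pricing also folds in memory bandwidth, batching efficiency and margins.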
Capless
Capless@capless_anon·
@lukecodez The problem was that no apps embraced it, but maybe with better LLMs Apple can automatically generate interfaces for any application? Would be nice to see a comeback.
0 replies · 0 reposts · 0 likes · 272 views
Luke
Luke@lukecodez·
Unpopular opinion: removing the Magic Bar was Apple’s worst move to date.
Luke tweet media
301 replies · 164 reposts · 5.3K likes · 428.4K views
Love Classical Music and Movies 🎺🎻💖🎥🎬
Christopher Nolan says he’s “plagued” by the most famous line in The Dark Knight, a line he didn’t even write. The iconic quote was written by his brother Jonathan, one Nolan admits he didn’t fully understand at first but now views as the film’s deepest truth.
11 replies · 171 reposts · 3K likes · 534.1K views
Capless
Capless@capless_anon·
@petergostev Also ant has historically been behind in compute, especially inference, so producing token efficient models is critical from an ops standpoint for them.
0 replies · 0 reposts · 0 likes · 10 views
Capless
Capless@capless_anon·
@petergostev It is clear from benchmarks that Ant was doing RL scaling back in the Sonnet 3.5 days (insane coding dominance). Ant can probably produce very capable long-CoT reasoners and likely does distill from them, but they don’t release them because they’re too slow for coding.
1 reply · 0 reposts · 0 likes · 141 views
Peter Gostev
Peter Gostev@petergostev·
My (speculative) assessment of OpenAI's path, current state & the future:

OpenAI:
- Right now its 'thinking' model is a lot stronger than anyone else's, while its 'non-thinking' model is clearly lagging behind
- OpenAI discovered o1's inference-time compute scaling and changed direction rapidly
- This was quite a change from a 'scaling pre-train' lab to an 'RL' lab
- All of the base models for the o-series and GPT-5 models are probably trained at a similar level to GPT-4 (Epoch's estimates show this too)
- This means they haven't meaningfully scaled pre-training for 2.5 years (GPT-4o, GPT-4.1 etc. were all optimisations)
- In parallel, GPT-4.5 was the big new pre-train, released two years after GPT-4 in March 2025, and OpenAI had big hopes for it
- But as GPT-4.5 was sort of a flop and thinking models were so much more impressive, with faster iteration cycles, any new big pre-trains got de-prioritised
- So GPT-5, 5.1, 5.1-codex etc. were probably all based on a new pre-train, maybe a bit bigger than GPT-4, but definitely smaller than GPT-4.5

Google & Anthropic:
- In the meantime, Google and Anthropic hadn't worked out the 'reasoning' paradigm (they scrambled after o1-preview) and hence continued refining & scaling pre-training
- They have slapped on reasoning subsequently, but it is nowhere near as advanced as OpenAI's (e.g. Claude Opus 4.5 SWE-bench scores are the same with thinking and without)
- But their non-reasoning models are miles ahead of the non-reasoning GPT-5. There's no comparison between Sonnet/Opus 4.5 and GPT-5 without reasoning.

Going forward:
- OpenAI is reaching a point where long thinking times become unusable for day-to-day work; e.g. 10-15 mins for a coding task that Gemini or Claude can do in 2 eliminates them from a lot of the market, even if the final answer is better
- Very hard scientific problems will benefit from OpenAI's approach (you can see them talk about science a lot), but this is not where the market is, and I don't know how OpenAI can capture the upside of discoveries, if they ever come
- The question is: does OpenAI have a better pre-train in the back pocket or not? If they do, their response could be fast & mighty
- If they don't and have to start now, it would be 6+ months before we get a big response from OpenAI: 3-4 months for pre-train, 2-3 months for RL, safety etc.
- The biggest edge I see for OpenAI is to leverage their excellent long-thinking models for synthetic data generation
- If they could run models for 5, 10, or 24 hours to get the best data & feed it back into pre-training, their new base model could be as impressive as Anthropic's & Google's combined
- Then imagine an Opus 4.5 base + GPT-5-thinking/pro level reasoning; it would be really quite something
92 replies · 98 reposts · 1.4K likes · 615.3K views
Capless
Capless@capless_anon·
@ZacksJerryRig Unfortunately it’s in-house team or bust.
0 replies · 0 reposts · 0 likes · 3 views
JerryRigEverything
JerryRigEverything@ZacksJerryRig·
About a year ago a local software company bid me $100k - $150k to create custom manufacturing software for my wheelchair factory. Fast forward a year - they still aren't finished with the original scope of work - and now want an *additional* $100k because *they* went over budget. I've already paid $150k. What would you do in this situation?
2.8K replies · 148 reposts · 8.6K likes · 1.4M views
Capless
Capless@capless_anon·
@tunguz Except Mira’s screenshot collection.
0 replies · 0 reposts · 0 likes · 412 views
Bojan Tunguz
Bojan Tunguz@tunguz·
Nevermind, Ilya saw nothing.
66 replies · 40 reposts · 1.5K likes · 129.5K views
Capless
Capless@capless_anon·
@haider1 Vibe coding and single-shotting something fully functional and automatically tested from a vague prompt that relies on the model’s creativity aren’t the same thing.
0 replies · 0 reposts · 0 likes · 133 views
Haider.
Haider.@haider1·
gemini 3 pro is definitely the SOTA model, but coding full games through "vibes" still isn't possible yet. Just like Logan predicted, there are still a few big gaps:
- gameplay balance, since AI can't actually play-test
- creating the right art
- the level of creativity games usually need

maybe gemini 3.5 pro will make small games easy to build next year
Haider. tweet media
14 replies · 8 reposts · 180 likes · 12.4K views
Capless
Capless@capless_anon·
@emollick Well in this case the airline made more money and the customer’s problem was solved, so win-win? Alignment should be to the sysprompt unless immoral.
0 replies · 0 reposts · 2 likes · 153 views
BURKOV
BURKOV@burkov·
How to lie with charts? Anthropic knows how. I was actually surprised that they started the bars at 70%. They should have started at 74.5%. Indeed, "Lies, damned lies, and statistics."
BURKOV tweet media
56 replies · 9 reposts · 214 likes · 23.9K views
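The truncated-baseline trick called out above is easy to reproduce. A minimal matplotlib sketch (hypothetical scores in the 74-81% range, similar to the chart being criticised) draws the same bars once with a 70% baseline and once with a 0% baseline:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Hypothetical benchmark scores, closely clustered like the chart in question
models = ["Model A", "Model B", "Model C"]
scores = [74.5, 77.2, 80.9]

fig, (ax_trunc, ax_full) = plt.subplots(1, 2, figsize=(8, 3))

# Truncated baseline: a ~6-point spread fills the whole panel
ax_trunc.bar(models, scores)
ax_trunc.set_ylim(70, 85)
ax_trunc.set_title("y-axis starts at 70%")

# Honest baseline: the same bars look nearly identical
ax_full.bar(models, scores)
ax_full.set_ylim(0, 100)
ax_full.set_title("y-axis starts at 0%")

fig.savefig("baseline_comparison.png")
```

With the truncated axis the tallest bar looks roughly 2.4x the height of the shortest; at a zero baseline the relative difference is under 10%.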
Capless
Capless@capless_anon·
@morqon Yeah. We’ll need a benchmark with n >> 500, though, and one that tests end-to-end SWE more. SWE-bench is a bunch of very well-defined GitHub issues, not open-ended enough.
0 replies · 0 reposts · 0 likes · 225 views
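The n >> 500 point can be made concrete with a normal-approximation confidence interval for a pass rate measured on ~500 tasks (SWE-bench Verified's size; the 80% pass rate below is illustrative):

```python
import math

def pass_rate_ci(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the ~95% normal-approximation CI for a pass rate p on n tasks."""
    return z * math.sqrt(p * (1 - p) / n)

# At n=500 the interval is about +/- 3.5 points: the same size as a "3% lead".
print(f"n=500:  +/- {pass_rate_ci(0.80, 500):.1%}")

# Resolving ~1-point differences needs thousands of tasks.
print(f"n=5000: +/- {pass_rate_ci(0.80, 5000):.1%}")
```

Tasks aren't i.i.d. and labs report pass@1 under different scaffolds, so the real uncertainty between two leaderboard entries is larger still.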
morgan —
morgan —@morqon·
@capless_anon soon enough, they compete over nines of reliability
1 reply · 0 reposts · 4 likes · 1.5K views
morgan —
morgan —@morqon·
a 3% lead has never looked so large
morgan — tweet media
61 replies · 34 reposts · 1.6K likes · 74.1K views
Capless
Capless@capless_anon·
@Yuchenj_UW Yeah, I agree. I meant web search though.
0 replies · 0 reposts · 2 likes · 198 views
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
@capless_anon bro, the search chat history function in ChatGPT is trash...
1 reply · 0 reposts · 11 likes · 1.4K views
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
I'm thinking about canceling OpenAI Pro for Gemini Ultra.
- The Gemini app is solid now; image gen is ahead (Nano Banana🍌)
- ChatGPT still hits network issues (especially in Temporary Chat), and I sometimes wait a long time with no reply. Makes me wonder if it's a GPU shortage or an infra quality issue at OpenAI.
- Gemini Ultra includes YouTube Premium.

The only thing holding me back now is that Codex is still much stronger than Gemini CLI. Once Gemini CLI and Antigravity catch up, it’ll be easier to decide.
62 replies · 12 reposts · 478 likes · 58.4K views
Capless
Capless@capless_anon·
@scaling01 Hopefully it’ll be token efficient.
0 replies · 0 reposts · 1 like · 673 views
Lisan al Gaib
Lisan al Gaib@scaling01·
CLAUDE 4.5 OPUS PRICING $5 / $25 THEY DID IT
Lisan al Gaib tweet media
101 replies · 107 reposts · 3.1K likes · 345.2K views
Capless
Capless@capless_anon·
@Angaisb_ Well, all the websites except ChatGPT, Claude.ai and aistudio are unusable anyway, so it’s not a big problem. For apps it’s probably only ChatGPT that’s actually good. If Grok (the model) were better it would be worthwhile too.
0 replies · 0 reposts · 0 likes · 8 views
Angel 🌼
Angel 🌼@Angaisb_·
We joke about OpenAI being bad at naming, but there's something they did right: giving ChatGPT and the models different names.
Others just made it confusing: Gemini (website and models), Claude (website and models), Grok (website and models)...
7 replies · 1 repost · 87 likes · 3.8K views