Kyle Coogan

863 posts

@Kylec1215

Joined April 2015
211 Following · 134 Followers
Kyle Coogan retweeted
Ken Ono @KenOno691
I worked on Tier 4. I haven’t seen the error details. The problems I know were solid, but I must add that AI has now solved them. It’s a sobering reality that human evaluation is reaching its limit.
Epoch AI @EpochAIResearch

We are conducting an AI-assisted review of FrontierMath: Tiers 1-4. This has flagged fatal errors in about a third of problems, and we believe most of these flags to be valid. We will release updated scores on a corrected dataset after completing a thorough human review.

10 replies · 56 retweets · 650 likes · 69.3K views
Kyle Coogan @Kylec1215
@mattpocockuk Why not integrate all of the concepts into CLAUDE.md so it can do each skill better? For example, having knowledge of deep modules ingrained in CLAUDE.md could produce better grill-me sessions.
0 replies · 0 retweets · 0 likes · 190 views
Matt Pocock @mattpocockuk
1. /grill-with-docs
2. "Oh, I need to prototype some UI"
3. /handoff to /prototype
4. Create prototype, /handoff back to grilling session
5. /to-prd, /to-issues
6. npm run sandcastle
7. /improve-codebase-architecture

I love this shit
67 replies · 141 retweets · 3.2K likes · 159.6K views
Kyle Coogan @Kylec1215
@btibor91 Sounds like they haven’t fully operationalized the 300 megawatts they’re getting this month
0 replies · 0 retweets · 1 like · 578 views
Matt Pocock @mattpocockuk
I'm about to ship an AI Coding dictionary. But I need help defeating the final boss. So, in your own words... what is AI?
140 replies · 3 retweets · 243 likes · 38.4K views
Emanuele Di Pietro @emanueledpt
.@sama I saw you're active lately. I made a Codex remote control for iOS with the Codex App-Server. I won 6 months of Pro with this and I've been loving using Codex every day. Would love feedback (and a follow back if you want to 🙏)
53 replies · 21 retweets · 730 likes · 149.6K views
Kyle Coogan @Kylec1215
@emanueledpt @sama Thank you. I think it’s not syncing properly: some chats sync but some don’t. Also, some messages from mobile don’t send.
1 reply · 0 retweets · 1 like · 45 views
nic @nicdunz
IM SORRY WHAT?????? YOURE NOT SUPPOSED TO RINSE YOUR MOUTH AFTER YOU BRUSH YOUR TEETH???????
[image]
30 replies · 2 retweets · 123 likes · 21.2K views
Kyle Coogan @Kylec1215
@thsottiaux It literally doesn’t work on Windows. Nothing happens after sending /pet, even though /pet does show up.
0 replies · 0 retweets · 0 likes · 45 views
Alex Ziskind @digitalix
Pasting text should not suddenly become a file upload. This is such a nasty usability anti-pattern.
[image]
128 replies · 11 retweets · 987 likes · 103.9K views
Kyle Coogan retweeted
jem @sheherenow_
bring back skeuomorphism, cowards
[image]
77 replies · 157 retweets · 2.3K likes · 95.4K views
Kyle Coogan retweeted
nic @nicdunz
maybe a hot take, but you can't just test a model on your benchmark and then claim its scores are proof it sucks while literally everyone is astounded by how insanely good it is. making posts like this is basically the same as posting "our benchmark sucks" rather than proving the model sucks.
BridgeMind @bridgemindai

GPT 5.5 just debuted on BridgeBench. It ranks below GPT 5.4. Read that again. The "most intelligent model ever built" scores worse than its predecessor on real world vibe coding. #8 overall. 84.5 quality. GPT 5.4 sits at #6 with 85.1. Behind Claude Opus 4.6. Behind Claude Opus 4.7. Behind Claude Sonnet 4.6. Behind Grok 4.20. Behind Qwen 3.6 Plus. The smartest model in the world is not the best coding model in the world. BridgeBench just proved it. bridgebench.ai

23 replies · 14 retweets · 365 likes · 19.8K views
nic @nicdunz
so basically just use medium most of the time and high some of the time
[image]
34 replies · 24 retweets · 568 likes · 44.3K views
Gurbinder @legionsdev
Introducing Evilcharts: an open-source chart UI website built with shadcn/ui and recharts, beautifully designed and handcrafted.
70 replies · 52 retweets · 930 likes · 73.6K views
Kyle Coogan retweeted
Daniel Litt @littmath
enjoy that sunset while you can—soon a swarm of superintelligent AI agents will be able to appreciate it more rapidly and efficiently than you ever could
43 replies · 167 retweets · 2.7K likes · 62.5K views
Kyle Coogan @Kylec1215
@scaling01 Not many people use these models with no reasoning
0 replies · 0 retweets · 0 likes · 170 views
Lisan al Gaib @scaling01
some independent GPT-5.5 benchmarks from SemiAnalysis
[image]
20 replies · 9 retweets · 170 likes · 15.7K views
Kyle Coogan @Kylec1215
@ChrisHayduk @TheRealAdamG Not new. GPT 5.4, Gemini 3.1 Pro, and even Opus 4.6 would do that all the time while waiting on a subagent to finish something. I see it so often.
1 reply · 0 retweets · 0 likes · 438 views
Chris Hayduk @ChrisHayduk
GPT 5.5 in Codex is actually crazy... I've had it churning for a few hours now on some pretty complex bio data pipelines, and it's proactively making code changes while it waits for data download commands to finish. I've never seen this behavior from Codex before this model
[image]
25 replies · 18 retweets · 403 likes · 23.1K views