infrecursion

3.8K posts

@infrecursion1

Joined April 2020
126 Following · 39 Followers
Salina Mendoza
Salina Mendoza@inababi·
@HedgeyeComm If they fired Sam Altman and took him off the board, they would actually have a chance. No bullshit.
English
1
0
10
532
infrecursion
infrecursion@infrecursion1·
@HedgeyeComm Is this some kind of joke or just the average hallucination of a Claude sheep? Anthropic does not even have a model that does audio or image. It's a pretty dogshit experience almost anywhere.
English
0
0
1
81
infrecursion
infrecursion@infrecursion1·
@SynBio1 Yes, but not because of what you think is the cause. It will be because humans like you become completely irrelevant.
English
1
0
1
48
tecto
tecto@hypertectonic·
@btibor91 Everyone treating this like a feature announcement when it's actually OpenAI admitting Sora as a standalone product didn't work.
English
1
0
4
858
Tibor Blaho
Tibor Blaho@btibor91·
ChatGPT/1.2026.076 (Android) adds an announcement that "Video in ChatGPT is here" - "Transform text and image into video with dialogue, soundtrack, and style."
Tibor Blaho tweet media
English
10
16
269
17.8K
infrecursion
infrecursion@infrecursion1·
@nayshins Start by deleting your own codebase, every line. That will get rid of so much slop.
English
0
0
2
185
Jake
Jake@nayshins·
Has anyone documented all the code slop patterns yet? I want to lint for them and banish them to hades.
English
39
3
201
21.9K
Peter Gostev
Peter Gostev@petergostev·
BullshitBench update: The new GPT-5.4 mini and nano models score quite low. This screenshot shows OpenAI models only; the full list would put GPT-5.4-mini around 40th place and Nano around 70th place. Again, thinking didn't help much at all.
Peter Gostev tweet media
Peter Gostev@petergostev

BullshitBench v2 is out! It is one of the few benchmarks where models are generally not getting better (except Claude) and where reasoning isn't helping.

What's new: 100 new questions, by domain (coding (40 Q's), medical (15), legal (15), finance (15), physics (15)), 70+ model variants tested. BullshitBench is already at 380 stars on GitHub - all questions, scripts, responses and judgements are there so check it out.

TL;DR:
- Results replicated
- @AnthropicAI latest models are scoring exceptionally well
- @Alibaba_Qwen is another very strong performer
- OpenAI and Google models are not doing well and are not improving
- Domains do not show much difference - rates of BS detection are about the same across all domains
- Reasoning, if anything, has negative effect
- Newer models don't do that much better than older ones (except Anthropic)

Links:
- Data explorer: petergpt.github.io/bullshit-bench…
- GitHub: github.com/petergpt/bulls…

Highly recommend the data explorer where you can study the data and the questions & sample answers.

English
8
3
67
6.7K
infrecursion
infrecursion@infrecursion1·
@gabrielchua Please put GPT-5.4 mini on ChatGPT. I want to use a reasoning model for many small to medium capability tasks, like web search, that don't require full GPT-5.4. I don't want to waste my limits using the full model. 5.4 mini should be available for selection, not as a fallback.
English
0
0
1
188
Gabriel Chua
Gabriel Chua@gabrielchua·
Now with `gpt-5.4-mini` and `nano` out, I put together a simple cheat sheet of the latest OpenAI models by use case. Noticed at a few recent hackathons & meetups: some folks still default to `gpt-4o-mini` for LLMs and `whisper-1` for transcription. Newer options tend to fit better now with much better performance. If you’re running into issues switching, lmk!
Gabriel Chua tweet media
English
24
20
282
28.8K
infrecursion
infrecursion@infrecursion1·
@ryanwinchester Yes, there are still people offering rides and riding on horse carriages. What's your point?
English
0
0
2
127
infrecursion
infrecursion@infrecursion1·
@krzyzanowskim No, sorry, the worst codebases to work with are those by you, no exceptions. You're a shit coder.
English
0
0
1
20
Marcin Krzyzanowski
Marcin Krzyzanowski@krzyzanowskim·
the worst. literally the worst. no exceptions. the worst codebase to work with coding agents is the one generated by the agents themselves. usually, the best codebase to work with agents is the one that predates coding agents.
English
43
9
288
16.4K
Onni
Onni@MyNameIsOnni·
@LeahLundqvist How is that in any way an improvement?
English
4
1
359
18.5K
leah lundqvist
leah lundqvist@LeahLundqvist·
"Even with the same prompt, DALL-E 3 significantly improves upon DALL-E 2"
leah lundqvist tweet media
English
30
10
433
1.2M
infrecursion
infrecursion@infrecursion1·
@jxnlco Why isn't mini offered as an option in ChatGPT? I want to be able to use a mini reasoning model for medium complexity tasks and search without having to spend 5.4 limits.
English
0
0
1
30
sankalp
sankalp@dejavucoder·
is what opus 4.6 is to sonnet 4.6 the same as what gpt 5.4 is to gpt 5.4-mini? we will find out. so far this hasn't been true
English
7
1
59
5.6K
Nick
Nick@nickcammarata·
alternatively openai make a slightly better model that just answers the research question I asked rather than writing several moderately helpful long lists
English
3
0
90
3.4K
Nick
Nick@nickcammarata·
anthropic please fix your awful ios transcription. my workflow rn while walking is talking to chatgpt for its whisper model and copying and pasting to claude and i want to be freed from this
English
30
9
421
25.3K
infrecursion
infrecursion@infrecursion1·
@LokiJulianus "Yeah imma take everything happening out there and try to fit it inside my worldview, because as everyone knows I am the center of the universe" - You.
English
0
0
1
542
Just Loki
Just Loki@LokiJulianus·
Yeah, this does not sound like imminent recursive self-improvement to me.
Just Loki tweet media
English
47
77
1.4K
347.4K
infrecursion
infrecursion@infrecursion1·
@lefthanddraft It seems humans like you will get replaced before you stop repeating the same dumb questions with no understanding of how these things work.
English
0
0
1
8
Wyatt Walls
Wyatt Walls@lefthanddraft·
This thing is going to find a cure for cancer before it stops falling for dumb tricks.
Wyatt Walls tweet media
English
159
102
8.4K
254.4K
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Spent all day vibe coding, fixing bugs in AI Studio, and polishing the experience :) So much fun!
English
185
26
1.4K
86.5K
infrecursion
infrecursion@infrecursion1·
@Sauers_ Dude what tf do you have in your system prompts that you get these weird results all the time?
English
1
0
5
1.1K
Sauers
Sauers@Sauers_·
My Codex has been grinding lean proofs for 12h straight now; I checked in and apparently it created and then deleted 1k+ lines of "Simulation Theory"
Sauers tweet media
English
10
7
233
18.9K
infrecursion
infrecursion@infrecursion1·
@ToFollowBrights @Zai_org Lmao how insane is your entitlement? This lab has single-handedly released one SOTA-quality open source model after another, and the moment they try to raise some money (from their products, not VC), they become an OpenAI wannabe. Go fuck yourself.
English
0
0
1
77
ᅟTFB
ᅟTFB@ToFollowBrights·
@Zai_org not renewing my annual subscription then... only supporting organizations that share, not more Open AI wannabes
English
1
0
14
2.8K
infrecursion
infrecursion@infrecursion1·
@nlarusstone Lmao, so you're basically so far up in your arse that you think the dog getting better was some kind of scam someone pulled off to confirm your worldview?
English
0
0
1
90
Nicholas Larus-Stone
Nicholas Larus-Stone@nlarusstone·
The funny thing is AI already is accelerating drug discovery AND consumer biotech is going to be a big thing. This just ain’t it
English
2
1
31
1.9K