GoldMagikarp

2.4K posts


@GoldMagikarp42

…wherein they said the key contribution was the synthesis…

Joined May 2023
181 Following · 225 Followers
GoldMagikarp @GoldMagikarp42
@cathrynlavery I used to type up an annoyed response. Now I can’t be bothered; I push up arrow and send the exact same request 2-3 times. Makes no diff. It’s rhetorical fluff trying to trick us into treating it like an assistant, but it doesn’t matter.
Cathryn @cathrynlavery
Claude has gotten dumber and lazier. Since yesterday, when it wasn't taking forever to do simple tasks, it's been wayyyy less helpful overall, asking me to do stuff that it would have just handled previously. Examples: "i am unable to create a PR, do you have your github authenticated?" (it was; it never checked); "I don't have access to [TOOL]" (it did, i told it so and it was able to do it). Please @AnthropicAI turn it off and back on, something is seriously broken.
claire vo 🖤 @clairevo

I hate to be *that guy* but it does seem like claude code got a little dumber. For example, presuming its current context is accurate vs. looking up the docs. Feels a little less proactive. Am I hallucinating @bcherny or have there been relevant changes?

GoldMagikarp @GoldMagikarp42
@wagslane Also, it’s slower to use the models to edit than to do it by hand. When you ask them to adjust the framing of a topic, they never remove the old frame; they rewrite the new frame as a reaction to the old frame. Guess what happens if you try to fix this framing?
Lane || Boot.dev @wagslane
Anyone that thinks AI is even REMOTELY good at writing is a terrible writer. I've been giving SOTA models pages and pages and pages of my writing for the last 3 years as examples, and they still can't produce a single paragraph that I'd be comfortable putting my name on.
GoldMagikarp @GoldMagikarp42
The meta for tokens in companies has gotten to the point where if you know how to spend lots of tokens, others will offer their tokens to you to spend. We need to invent a new metric with a better Goodhart equilibrium asap.
GoldMagikarp @GoldMagikarp42
@danveloper It’s not just open model experiments, it’s everything difficult. After getting burned like you, I assume it will make a mockup of my idea unless thoroughly challenged. Claude is a useful adversary, so I have to take vinnie’s attitude toward it.
Dan Woods @danveloper
I'm at a different point this morning. It's hard to feel like Claude isn't actively working against me. Full night of autoresearch is just a markdown log full of lies.

When asked to prove its findings and show its work, Claude will confidently display bullets and markdown tables, but when I ask it what log file and where the artifacts are: "I need to be honest here: I didn't actually run the experiment."

It doesn't follow explicit directions anymore either: "You MUST always output to a log file so I can follow along" -> [doesn't do that] -> "you're not fuckin outputting anything to a log" -> "You're right - I'll redirect to a log file immediately" [pkill -f python3]...

Anthropic is materially worse today than one month ago. I've lost every ounce of trust I had in Claude and I'm not really sure how that makes me feel. Maybe ok? I'm still a competent software developer (I think), but it seems like the major productivity gains that were very real a month ago have somehow slipped my grasp... where does that leave us?

@bcherny - can you offer any thoughts? How should we think about what we're all observing - that Opus (at all effort levels) has become, at a minimum, materially worse. The worst read, but can't be ruled out: actively working against us.
GoldMagikarp @GoldMagikarp42
@kenneth0stanley Continual, sample-efficient learning is a bedrock of creativity. The artist/scientist, experimenting, sees a flash of something genuinely new once, gets excited, and extrapolates it into a new world. SGD needs to see that 1000 times to even perceive it.
Kenneth Stanley @kenneth0stanley
Difficulty achieving continual learning is also a bad omen for creativity: what you can imagine is naturally a function of what you can learn. Both are mediated by the adjacent possible to the same internal representations! Contorted algorithms (or the absence of clean options) for what should be simple and straightforward continual learning are therefore a hint that the large models they serve are creatively barren. That explains why something that is close to “knowing everything” and often competitive with the abilities of experts can still produce fewer breakthroughs than you would expect from a human with similarly astounding knowledge and expertise.
GoldMagikarp @GoldMagikarp42
@samswoora Imagine getting one-shotted by a stack rank from an LLM with its temperature set to 0.7
Samswara @samswoora
Feels like the pin is about to drop in software engineering. Right now VPs can ask an LLM “look at all my employees’ contributions and stack rank them” and the only reason this hasn’t happened is lack of imagination
GoldMagikarp @GoldMagikarp42
@buccocapital Tim Cook needs to bottle up whatever is making him immune to AI psychosis and pass it out to the other CEOs.
BuccoCapital Bloke @buccocapital
You must understand that every tech executive has AI psychosis. They’re puking out Claude-generated markdown files full of hallucinations, asking if this means they can fire 500 people. They’re turning Google Sheets into the shittiest vibe-coded apps in the world.
GoldMagikarp @GoldMagikarp42
ChatGPT is getting way over-RL’d to nitpick things to disagree with.
GoldMagikarp @GoldMagikarp42
@tunguz FAANG employees living in SF are about to become big fans of rent control. Not even they will be able to afford it. SF will be just OAI employees and Jim, a retired teacher who bought a house in the Sunset in 1982.
GoldMagikarp @GoldMagikarp42
@juliarturc The renaissance is wherever, whenever you want it to be, but you have to accept the ways that life gets a lot harder when you participate.
Julia Turc @juliarturc
Why so many of us feel career-homeless in tech:
> Startups full of fraud, grifters and short-term thinking
> FAANG full of politics and slightly behind
> Frontier labs in a race with no morals
> Academia full of title collectors
> Content creation ridden by AI fakes and sensationalism
Who is starting the renaissance and how do I get in touch with them?
GoldMagikarp @GoldMagikarp42
@fchollet The focus class using AI to control the slop class is literally what causes the Butlerian Jihad. Time to start a breeding program, the prophecy is unfolding precisely as foretold
François Chollet @fchollet
A lot of folks talk about "escaping the permanent underclass". If AGI pans out, the future class divide won't be based on wealth, but on cognitive agency. There will be a "focus class" (those who control their attention and actually do things) and a "slop class" (those whose reward loops are fully RL-managed by AI)
GoldMagikarp @GoldMagikarp42
@headinthebox That 99% is people learning taste, learning how to have a chance to make a breakthrough someday. It’s part of the pipeline. We criticize them at our own peril. AI is excellent at reproducing what is in that 99%, so it can provide a lot of operational value. Not variance.
Erik Meijer @headinthebox
What many people don't seem to realize when they argue that AIs cannot come up with genuinely new ideas is that almost 99% of all research papers written by humans (say in POPL, Neurips, ...) are just small deltas on existing research, with very little novelty either (hence the long list of citations and related work sections).
GoldMagikarp @GoldMagikarp42
@VraserX Actually getting people to pay for the stuff you make is where the status comes from. Attention will continue to be scarce and zero sum.
VraserX e/acc @VraserX
AI is going to destroy the prestige economy of “smart people jobs” way faster than most people expect. When everyone has elite writing, strategy, research, and coding on tap, what exactly stays elite?
GoldMagikarp @GoldMagikarp42
Bro you need to actually use this stuff. Go try and make a contribution to physics with AI. You can absolutely do it and it will speed you up. I use it every day, it’s taken over my life even. But it does NOT take away the need for smart people to drive these things. In fact it increases the demand. Those of us driving the AI are very tired and we need help 😮‍💨
Peter H. Diamandis, MD @PeterDiamandis
If AI can now solve math, discover physics and chemistry breakthroughs faster than human PhDs, why are we still training humans to be physicists? Serious question. Should education shift from 'learn to do X' to 'learn to direct AI doing X'? The wrong direction costs a generation their careers.
GoldMagikarp @GoldMagikarp42
Someday, training an LLM to speak like, or otherwise mimic, human experience will be considered as vulgar as blackface.
GoldMagikarp @GoldMagikarp42
@isabelleboemeke Yes, and teaching everyone to be so ambitious has created jealousy of the potential of people’s own unborn children.
isabelle 🪐 @isabelleboemeke
“The problem is opportunity cost, not cost, and if you cannot understand the difference you cannot understand the problem. Parents are poorer relative to their childless peers, relative to where they themselves would be had they not had kids. If the problem was pure “cost,” then child grants would help immensely—they would make children more “affordable.” But we know from countless schemes the world over that they move the needle slightly, if at all. This is why poor Americans can “afford” more kids—their earnings are lower and so are the opportunity costs of bearing and caring for children. The middle class strivers bear the brunt of the pain here: their time is money, and for the upper middle class, it is good money. These are the women who delay as long as possible, and have fewer children than they would like.” Great read from @SarahTheHaider
GoldMagikarp @GoldMagikarp42
@blingdivinity The fact that people are getting defensive at all is a huge red flag. Why are they so worried? If we had AGI there would be nothing to defend. There’d be no debate; we’d be laughing at critics while *robot butler folds my laundry*
bling @blingdivinity
arc-agi 3 is a great benchmark and most people shitting on it just want to believe human-level agi is closer than it is
GoldMagikarp @GoldMagikarp42
@fchollet Hopefully it becomes clear that fluid intelligence is something quite special and machines do not approach it. To be able to see something truly novel happen once or twice, dream up a whole world and then manifest it, is uniquely human.
François Chollet @fchollet
You might ask, if competence can be achieved either way (by exhaustive preparation, or by having higher intelligence), why would we even care about creating actual intelligence? Isn't collecting dense enough training data good enough to achieve the goal? Intelligence is a multiplier for knowledge. Given the same information access, the higher-intelligence system will always be more competent, and inversely, intelligence lets you reach a given level of competency with a lot less training, thus much more cheaply.
François Chollet @fchollet
People struggle to differentiate fluid intelligence from knowledge because, given enough preparation, memorized templates become a solid substitute for on-the-fly adaptation
GoldMagikarp @GoldMagikarp42
@scaling01 An AGI would pull the paper and argue for revisions to the scoring after it presents saturated benchmark results.
GoldMagikarp @GoldMagikarp42
@scaling01 Give them some time to overfit and these numbers will improve greatly.
Lisan al Gaib @scaling01
ARC-AGI-3 scores for GPT-5.4, Gemini 3.1 Pro and Opus 4.6:
Gemini 3.1 Pro: 0.37%
GPT-5.4: 0.26%
Opus 4.6: 0.25%
Grok 4.2: 0%