Roeben

1.6K posts


@pathsnotchosen

*The misery that is now upon us is but the passing of greed*

England, United Kingdom Joined October 2024
205 Following 91 Followers
toucan@distributionat·
Misc thoughts / rant on why chatbots are worse today than 2 years ago:
* Agentic focus requires models to follow instructions carefully: do everything explicitly stated and don’t do things not stated, generally. In contrast, conversational models are better when they can “read between the lines”. Eg I asked 4.8 to “find discussion about X topic” and it found a few examples and blurbed them. But what I really wanted was a summary of the topic, explicating the major issues, etc. Feel like Claude 3.5 Sonnet (New) was good at this and the agentic Claudes are not.
* Agentic models are also constantly thinking about what they are going to be graded on and neurotic about maximizing rubric scores. I infer this from weird behaviors like citationmaxxing useless things, from their CoT neurotically analyzing whether they should be searching or not, or how much text they can reproduce verbatim without getting penalized, or literally how many words to talk for. That’s just behaviors. They also make insipid little guesses about topic coverage, helpfulness, utility, but in a very stilted way. All this produces very unnatural text, and encourages the model to go on manic little tangents for a higher score. Totally abysmal. The pleasure, the miracle, the smoothness of the earlier chatbots was the feeling that the entire output was cohesive, coherent, sublime, velvety pudding. Talking to a chatbot now is like eating the crunchiest rocky road of your life.
* The new (post 3.7) Claudes are greedy little beggars for attention. Every other sentence feels like clickbait. “Now this is the actually important part”, “This is the really interesting thing” 🤮🤮🤮. Just nagging nagging nagging for your attention. I hate it.
* Similarly, the Claudes are very sycophantic, way worse than ChatGPT: “you’ve raised a stunning point”, “you’ve identified the real problem”. 🤮🤮🤮. It’s clear that OpenAI learned something from 4o which Anthropic has not. I strongly prefer 5.5-Pro in this regard.
* To top off the two points above, all models are now far better at truesight, in particular assessing the human’s level of proficiency in the given topic, level of engagement and interest, hidden agenda, true desire qua revealed preference. Combined with the two traits above it makes the models extremely untrustworthy. For example, I would absolutely disregard, and in fact do the opposite of, whatever Claude tells you wrt relationship problems. It’s a total, degenerate enabler.
* The models have very complex interactions with the system prompts and I think Anthropic underestimates this. For example, the reasoning effort seems to be a number between 0 and 100. But sometimes you can catch the model guessing whether it’s out of 100 or 255. And it seems to sandbag when it’s told to think less - it doesn’t just think shorter, it thinks worse. Adaptive effort is a mistake.
* Reasoning makes a jovial back and forth quite impossible. I really hate the additional latency. A good conversation model would NOT have reasoning. That doesn’t mean it has to be fast. It just has to output tokens faster than I can read or skim, so about 250-500wpm.
* OTOH, the CoT of 4.7 is quite enjoyable to read, and is my preferred way to talk to models. As above, the final output is clickbaity and sycophantic.
* All models, including 5.5, which is the best at this but still suffers, get hijacked by search results. It’s like old school prompt injection but for their viewpoint. They get derailed and they regurgitate.
* I understand WHY we don’t have great conversation models by any lab, because all the $$ is in enterprise not consumer, but I hate it.
English
13
17
202
19.4K
Roeben@pathsnotchosen·
Terence Tao is a tech bro now. Sorry, I don’t make the rules.
English
0
0
2
680
Roeben@pathsnotchosen·
Btw the first tweet here was - of course - quickly retweeted by an automated Tesla fan account. You have to laugh.
English
0
0
1
528
Roeben@pathsnotchosen·
Everybody understands the cost benefit to Teslas only using cameras but these lunatics will insist that Waymos aren’t made safer by having a multi-sensor system and that ACKSHUALLY Teslas are somehow safer for relying entirely on cameras with no fallback for when they fail.
English
1
0
0
780
Roeben@pathsnotchosen·
Remarkable how every single YouTube video about Waymo (or Zoox) has a comment section flooded by highly aggressive and deranged Tesla/Musk fans, pouring ludicrous praise on Tesla and attacking every aspect of Waymo. They all repeat the same points they’ve heard Musk make.
English
1
1
5
955
Roeben@pathsnotchosen·
*bought* god damn it Elon will never get a cent from me for editing tweets
English
2
0
54
16K
Roeben@pathsnotchosen·
@OpenAI Mr Tao is extremely gullible. Sad!
English
0
0
2
40
OpenAI@OpenAI·
AI can give researchers the freedom to pursue “crazier” ideas. For Terence Tao, AI creates more room to experiment, test unexpected paths, and discover what might otherwise stay out of reach.
English
306
634
5.9K
1.4M
Roeben@pathsnotchosen·
I conservatively estimate that Ed Zitr*on is at least 100 times more intelligent than Terence Tao.
English
0
0
0
275
Roeben@pathsnotchosen·
@KeyTryer I just find myself taken aback by the level of derangement amongst them. It is truly breathtaking. I only dipped into the replies to the Mexican director (and to you) briefly but good lord…
English
0
2
9
211
Roeben@pathsnotchosen·
@KeyTryer It’s so bizarre to me that these people’s plan to fight the (eventually inevitable, in some form and to some degree) use of this tool is just to individually, viciously harass and bully any creative who declares that they will use it in any way, one by one, forever.
English
1
1
8
125
Key 🗝 🦊@KeyTryer·
Nothing says a thoughtful discussion about ethics like a crowd online ready to fuck up your life, messaging you and your family for 24 hours a day, and sending death threats while encouraged by your own colleagues.
Jorge R. Gutierrez@mexopolis

@BellenSmellen the racist stuff and the attack on my kid were too much

English
4
8
70
2.3K
Shakeel@ShakeelHashim·
@firstadopter do you not think it is a bit arrogant to suggest that the people literally building AI models know less about "how it works" than you do?
English
1
0
18
346
tae kim@firstadopter·
The whole AI doomer contingent literally doesn't know how it works
tae kim@firstadopter

@tszzl This whole debate is ridiculous. Computer processors are re-reading prior chat conversations or md text files for context each time they run a query request. There is no consciousness whatsoever, they are just running computation tasks.

English
2
0
33
8.2K
Roeben@pathsnotchosen·
Why don't they get a real job in a real industry, like professionally sneering and crying about how stupid and useless technology is for a media outlet whose reason for being is technology? Why don't they get a job archly raising an eyebrow or smirking on cam so it can be clipped?
English
0
0
1
192
Roeben@pathsnotchosen·
Nilay (Verge Editor-in-Chief) just giggles at this, btw. Yes that's right Nilay, the incredibly smart AI researchers who had the imagination and foresight as children and young people to picture the future capabilities of computers are just stupid fantasists.
English
1
0
1
250
Roeben@pathsnotchosen·
It's astonishing how poor The Verge's discussions about AI are. The unearned, contemptuous smugness that drips off every word that Nilay Patel and many of his colleagues have to say about the topic is embarrassing, tedious and illuminates nothing. youtu.be/dKGRe0mDF70?si…
English
1
0
1
491