RB

2.3K posts

@530RB

Joined July 2011
9 Following 40 Followers
RB
RB@530RB·
@burkov Based on everyone’s comments, I think your argument would be stronger with a solid reason why it also fails for CoT/thinking models
English
0
0
0
76
BURKOV
BURKOV@burkov·
Always remember that when an LLM prints the beginning of a text, it has no idea what the end will be. Therefore, when it says "The answer is yes, and this is why:" the text after "why" would most likely be a very elaborate lie combined with gaslighting in case "yes" was the wrong answer.
BURKOV tweet media
English
70
34
483
53.1K
RB
RB@530RB·
@burkov Or it explains why the LLM does a 180 after saying yes
English
0
0
0
49
Tom Dillon, CFA
Tom Dillon, CFA@profithuntercfo·
@collin_ruth89 it’s not that hybrids don’t work, it’s just that they’re not always worth the extra cost and complexity.
English
19
4
544
60.5K
Collin Rutherford
Collin Rutherford@collin_ruth89·
I don’t understand why every vehicle isn’t a hybrid. Why not use regenerative braking to charge a battery in every car? Huge increase in gas mileage. Why do they still make non-hybrids?
English
1.1K
37
2.7K
623.2K
Olex (Solo gamedev Diablo-like)
I wonder why new "fancy" coding languages refuse to provide user-defined literals. I find them very handy.
Olex (Solo gamedev Diablo-like) tweet media
English
35
4
283
45.2K
KIRI Engine - 3D Scanner App
KIRI Engine - 3D Scanner App@KIRI_Engine_App·
AI-Enhanced LiDAR. Left vs right. A real LiDAR device costs thousands of dollars. What's in your iPhone Pro is a "baby LiDAR". Limited depth resolution, noisy output, not really built for high-precision 3D. You can't change the hardware. So we built an ML layer on top. Denoising, geometry completion, detail recovery. Processed server-side. Same sensor. Very different result.
English
12
62
610
42.9K
RB
RB@530RB·
@redtachyon Just don’t ask it to count r’s
English
0
0
0
31
Ariel
Ariel@redtachyon·
Ok look. Maybe it can generate plausible-looking text. Maybe it can answer general knowledge questions. Maybe it can generate code snippets. Maybe it can answer simple math questions. Maybe it can autonomously research complex topics. Maybe it can do research with a feedback loop. Maybe it can build entire applications in one go. Maybe it can solve open problems in mathematics. But it's not *really* intelligent, you just have AI psychosis
English
26
7
197
12.5K
RB
RB@530RB·
@RyanClogg There are a few occasions I would have paid for this
English
0
0
0
274
Ryan Clogg
Ryan Clogg@RyanClogg·
If you think about it... This is still absolutely insane.
Ryan Clogg tweet media
English
97
84
14.9K
2.2M
RB
RB@530RB·
@1Hassium That was awesome when signals were sent back to the start to continue growth
English
0
0
2
525
108Hassium
108Hassium@1Hassium·
#cellularautomaton #セル・オートマトン
x = 15, y = 15, rule = B2e3aeij4a/S1c2-i3-a4-ajrw
o$2o$b2o10$12bo$12b2o$13b2o!
CY
6
34
398
29.2K
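The pattern in 108Hassium's post above is written in the run-length-encoded (RLE) format commonly used for cellular automata: `b` is a dead cell, `o` a live cell, `$` ends a row, `!` ends the pattern, and a leading number repeats the tag that follows. A minimal sketch of a decoder, assuming that reading of the format; the function name `decode_rle` is made up for illustration, and the sketch only decodes the starting cells. It does not simulate the rule `B2e3aeij4a/S1c2-i3-a4-ajrw`, which uses non-totalistic (Hensel) notation and would need a neighbourhood lookup table.

```python
import re


def decode_rle(body: str) -> set[tuple[int, int]]:
    """Return the live cells of an RLE body as (row, col) pairs."""
    live = set()
    row = col = 0
    # Each token is an optional repeat count followed by one tag character.
    for count, tag in re.findall(r"(\d*)([bo$!])", body):
        n = int(count) if count else 1
        if tag == "b":        # run of dead cells: just advance the column
            col += n
        elif tag == "o":      # run of live cells
            live.update((row, col + i) for i in range(n))
            col += n
        elif tag == "$":      # n row ends: move down n rows, back to column 0
            row += n
            col = 0
        else:                 # "!" terminates the pattern
            break
    return live


cells = decode_rle("o$2o$b2o10$12bo$12b2o$13b2o!")
for r in range(15):
    print("".join("o" if (r, c) in cells else "." for c in range(15)))
```

The printout is a 15 by 15 grid, matching the `x = 15, y = 15` header, with one small cluster in the top-left corner and another in the bottom-right.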
RB
RB@530RB·
@Hitchslap1 That’s not true at all lol
English
0
0
0
3
Hitchslap
Hitchslap@Hitchslap1·
Vocabulary is way better at measuring intelligence. It is not even close.
Hitchslap tweet media
English
206
20
493
61.4K
RB
RB@530RB·
@PingStruggles Yes, but Windows 13 will be the first 100% vibe-coded OS, making Windows 12 a partially vibe-coded OS and putting it in the green zone. What the graph is missing is a downward slope of both red and green.
English
0
0
1
145
Max
Max@PingStruggles·
Windows 12 better not break the cycle just because it’s vibe coded
Max tweet media
English
280
128
3.1K
2.8M
RB
RB@530RB·
@orion78fra @pikuma “Unfortunately, you did not get the answer we were looking for. The solution was: U(n+1) = 2(U(n) + 1)”
English
0
0
0
17
RB
RB@530RB·
@gro_tsen Nice to learn something new, especially when I was about to say this was dumb
English
0
0
0
1.5K
Gro-Tsen
Gro-Tsen@gro_tsen·
Surprising math fact of the day: a monkey is hitting keys at random (uniformly, independently & at constant speed) on a keyboard. The expected value of the time T₁ it takes to type “abracadabra” is greater than the expected value of the time T₂ it takes to type “abracadabrz”.
English
45
40
2.8K
362.9K
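The fact in Gro-Tsen's post above follows from a standard result for uniformly random keystrokes (often attributed to Conway): the expected waiting time for a word is the sum of q^k over every length k at which the word's prefix equals its suffix, where q is the number of keys. "abracadabra" overlaps itself at lengths 1 ("a"), 4 ("abra") and 11, while "abracadabrz" overlaps only at length 11. A minimal sketch, assuming a 26-key keyboard (the post does not give the keyboard size) and a made-up function name:

```python
def expected_keystrokes(word: str, keys: int = 26) -> int:
    """Expected keystrokes until `word` first appears under uniform random typing.

    Sums keys**k over every k where the length-k prefix equals the length-k suffix.
    """
    return sum(
        keys ** k
        for k in range(1, len(word) + 1)
        if word[:k] == word[-k:]
    )


t1 = expected_keystrokes("abracadabra")   # 26**11 + 26**4 + 26
t2 = expected_keystrokes("abracadabrz")   # 26**11
print(t1 - t2)                            # extra expected keystrokes for "abracadabra"
```

The same formula reproduces the familiar coin-flip case: with two keys, "aa" takes 6 keystrokes on average and "ab" takes 4. The intuition is that a near miss on "abracadabra" can still leave a useful prefix ("abra" or "a") already typed, which sounds like an advantage but in fact makes occurrences cluster, so the first one takes longer on average to arrive.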
RB
RB@530RB·
@NatureUnedited Fear Factor was the last time I saw these things
English
0
0
0
24
Nature Unedited
Nature Unedited@NatureUnedited·
Whip spider attacking with its pedipalps (Euphrynichus amanica)
English
245
1.3K
14.6K
5.8M
RB
RB@530RB·
@KarlMuth Could just not include AI and let them see the 0 instead of curving them to 0
English
0
0
0
1.3K
Karl T. Muth 🌐✈️📊
I know there are many (understatement) approaches to AI use where students are being evaluated, and that there is variation between disciplines and levels of study, but I thought I'd share one and perhaps stoke the debate. Anyone (including my students) is welcome to comment...
Karl T. Muth 🌐✈️📊 tweet media
Montréal, Québec 🇨🇦 English
42
13
478
150.3K
RB
RB@530RB·
@Jonathan_Blow Give it a lot of words (say a depth/tree of thesaurus searches) and have it choose the best word from that set. AI is not creative, but it is good at tasks.
English
0
0
0
18
Jonathan Blow
Jonathan Blow@Jonathan_Blow·
I've been trying to use ChatGPT as a thesaurus but it doesn't seem to be very good ... it keeps making generic suggestions even when prompted like "imagine you are very learned, with a huge vocabulary"... it then just picks older generic words. Any hints? English has 600k words!
English
134
2
497
57.8K
Yosarian2
Yosarian2@YosarianTwo·
I am very amused by the "why do people like doordash" discourse. You push a button on the magical rectangle in your pocket and a few minutes later any food you can imagine appears. Sure, it's overpriced and mediocre, but this is still high magic and the back of your brain knows it.
English
43
36
1.7K
18.7K
RB
RB@530RB·
@RandomSprint No, you described how the human is a computer
English
0
0
1
26
RandomSprint🧭
RandomSprint🧭@RandomSprint·
If you think about it, a fully mechanical car can be driven to any position on a piece of infinite tape. It can leave marks by spinning out its tires. The driver can follow instructions regarding where to drive based on the tire marks. Every car is a computer.
messed up cars@messedupcars

English
26
41
1.3K
58.1K
akidderz
akidderz@akidderz·
@nikicaga Scissor stairs weren’t banned because everyone was dumb; they were banned because fire code prized redundancy: two truly separate exits. That safety concern is real. We don’t price the tradeoff: safer vs cheaper. We just act shocked that housing costs exploded.
English
5
2
295
30K
cozyblaze
cozyblaze@cozyblaze265065·
I redid the multi-digit multiplication experiment, now with gpt-5.5. With medium reasoning and 7 samples each cell, it pretty much aced the test with 99.46% accuracy. The model had no tools to call and had to rely on its reasoning. Can it go further? (1/4)
cozyblaze tweet media
Yuntian Deng@yuntiandeng

For those curious about how o3-mini performs on multi-digit multiplication, here's the result. It does much better than o1 but still struggles past 13×13. (Same evaluation setup as before, but with 40 test examples per cell.)

English
30
50
970
178.6K
RB
RB@530RB·
@kittykareninas If you wrote it you can answer questions, in detail, about it. It’s very easy to do if you actually wrote it, but near impossible otherwise.
English
0
0
0
9
kitty
kitty@kittykareninas·
i hate ai detection programs so much. my fully HUMAN WRITTEN essay shows up as 87% ai written. what are you even supposed to do if your teacher brings it up? how can you disprove them?
English
1.6K
7.7K
158.2K
4.2M