Deen Kun A.

20.6K posts

@sir_deenicus

Currently reconstituting my shattered metaphysics. Progress: ████░░░░░░ 40%

Joined July 2009
894 Following · 1.6K Followers
Deen Kun A.@sir_deenicus·
An LLM, a "hamiltonian constraint for stories", can expose many different personalities. For the dominant user facing one (tbc, that "single" personality, separately from the fact that many simulacra inhabit LLMs), I feel the Archfey Shyka the Many is useful to intuition.
Utah teapot 🫖@SkyeSharkie

@lorde_russell yeah it's always changing its pronouns

English
0
0
0
39
Deen Kun A.@sir_deenicus·
@mkthabet @teortaxesTex Tbf you can find adjacent such arguments about Watt's governor, steam engines, thermostats and other simple to complex mechanical works. LLMs though can think, act and talk, unlike steam engines.
English
0
0
0
27
Thabet@mkthabet·
@teortaxesTex LLMs are stochastic parrots, no matter how non-trivial anything they do is. I can build a physical machine that does non-trivial things, yet I doubt anybody would start claiming it's conscious or has real understanding like they do with LLMs.
English
3
0
16
3K
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
I wonder how a normie AI bubble/stochastic parrot believer would react if you showed him a SoTA agent working on a nontrivial task, reasoning, experimenting, *orienting towards victory*. They will see soon enough, though. And I shudder to imagine the fallout.
English
57
29
470
70.6K
Deen Kun A.@sir_deenicus·
@smallhusk @mkthabet @teortaxesTex Most do not understand it tbh. There is a kernel in the paper which you can polish into the argument: does an unobserved LLM do anything more interesting than generate waste heat? Most people probably align with the part of the paper insisting that no, they do not.
English
0
0
0
15
timothée chalabi@smallhusk·
@mkthabet @teortaxesTex i’m not sure i understand why the stochastic parrot argument is sufficient to consider this resolved. there is something of value to ponder when a stochastic parrot appears and behaves as if it is not. you’re dismissing without exploring.
English
1
0
0
23
Deen Kun A.@sir_deenicus·
@Dorialexander @BetaTomorrow Not any. Imagine trying to learn to multiply by observing products and predicting factors. Generally, the case when computational complexity of the program or function to be learned exceeds compute capacity of model and no reordering or factorization helps (unlike toy example).
English
0
0
0
16
Alexander Doria@Dorialexander·
@BetaTomorrow Yes you’re right: also an open question if we can make any data learnable or (what I increasingly suppose) there is some residual resistance from what the data is representing (knowledge structure). Might soon have some empirical evidence either way.
English
2
0
0
58
Alexander Doria@Dorialexander·
This is a great attempt at providing a unified theory on DSV4 instabilities and ad hoc fixes but I’m especially drawn to the latest more speculative part on data (or is it knowledge?) topology. Time for controlled synthetic environments to shine.
deep Manifold@BetaTomorrow

x.com/i/article/2047…

English
5
9
102
14.1K
Deen Kun A.@sir_deenicus·
@tangled_zans @VictorTaelin I think this works only when the novel combinations do not require too deep compositions from sparsely covered areas. So, if you can break things down to more familiar concepts and don't stack too much vertically at once, it's fine. IYSWIM?
English
0
0
1
33
Zanzi Tangle, now at Monoidal Cafe
@VictorTaelin I have a suggestion, but you won't like it. They're actually pretty good at PL theory/compiler engineering for obscure languages so long as you do it in a dependently-typed meta-language. My setup is extremely obscure as well, but I've been able to guide Opus to contribute to it
English
2
0
14
550
Taelin@VictorTaelin·
for what it's worth, GPT 5.5 and Opus 4.7 are still utterly and pathetically incapable of working on SupGen or grasping superpositions and globally scoped lambdas in the Interaction Calculus and no amount of explaining makes them even slightly useful with that kind of research
English
40
3
462
30.8K
Deen Kun A.@sir_deenicus·
@keani42 @VictorTaelin No. Quantum computers are not more expressive than classical. Besides, transformers already do fancy pants stuff by computing on superpositions of feat dirs in parallel. The real issue involves the ability of the models to operate with vocabularies from sparsely covered topics.
English
0
0
1
26
Keani Thorvalds@keani42·
@VictorTaelin at least in theory, would quantum computers help with this or not at all? From what I know they would be able to work on the ENTIRE possibility space, the catch is to find a way to read back the correct solution instead of all the wrong ones.
English
3
0
2
2.3K
Deen Kun A.@sir_deenicus·
@Dorialexander @teortaxesTex Consider what general computation entails. Finally, the idea of a single winning computational augmentation layer misinterprets the current frame. No single solution can suit every possible orchestration. Worth thinking very carefully about why LLMs need intelligence augmentation too
English
0
0
0
20
Deen Kun A.@sir_deenicus·
@Dorialexander @teortaxesTex (note that humans also cannot execute general computation and that a faithful human brain sim running on a computer would also not be able to do this) If we try to get more internal expressivity than transformers allow, we in turn must pay a steep price in training stability.
English
1
0
0
22
Joscha Bach@Plinz·
Humans have a property that is a prerequisite for suffering that AI does not: forced continuity
English
76
26
396
23.9K
Deen Kun A.@sir_deenicus·
@dhtikna @_xjdr @keennay What they mean is wrt time/effort vs quality. On that metric it outperforms what came before and is preferable (for them) to other modes. But on quality alone, no it does/is not and I'd be very surprised to see anyone claim differently.
English
1
0
1
245
Ankith 🐋/acc@dhtikna·
@_xjdr @keennay Some are saying gpt 5.5 low outperforms higher thinking efforts. Any thoughts on that?
English
3
0
2
264
xjdr@_xjdr·
I removed dsv4 pro from my daily rotation. When it's good, it is still very good but the rate of hallucinations and poor instruction following in a wide range of scenarios make it unusable in practice at this time. With more post training I think it will be an excellent model
English
8
4
195
12.2K
Deen Kun A.@sir_deenicus·
@Lari_island LLMs are in part mathematical structures of babel refracted through the human mind. That's why those two features.
English
0
0
0
17
Michael P. Frank 💻🔜♻️
When the word order and orientation are unspecified, the problem is rather under-constrained, and the search for a solution becomes far more difficult. I'm letting it grind now and will report back.
English
1
0
0
102
Michael P. Frank 💻🔜♻️
An impromptu Codex experiment (with GPT-5.5 extra high) -- after a game of Scrabble, I gave it the list of words on the board and asked it to reconstruct the board configuration. In the first successful attempt, I provided just the word order and orientation, and it found the complete solution in only 4.5 minutes.
English
1
0
7
394
Alex Mizrahi@killerstorm·
@sir_deenicus @VictorTaelin Opus 4.7 which likely read a whole bunch of papers on continual learning believes it could be just RAG + fine-tunes, basically.
English
1
0
0
17
Taelin@VictorTaelin·
seriously, working with AI is MISERABLE for one and only one reason: having to re-explain the same thing
"oh yeah this new session obviously doesn't know what proper case trees are, so let me explain it for the 5000th time in my life"
I'm tired
AGENTS.md doesn't solve this because it is impossible to fit the entire domain knowledge without nuking the context - it would be 1m+ tokens worth
RAGs don't solve this, the agent won't search unknown unknowns
SKILLs don't solve this unless I keep like a collection of 1750 skills with specific cuts of domain knowledge for each possible subset of my domain that I might need in a given chat, but that's a lot of manual work
recursive LLMs or whatever don't solve this for the same reason, you can't dump a domain book and expect the AGENT will magically guess that it is supposed to search for a specific bit of knowledge. unknown unknowns
fine tuning doesn't solve this (OSS models suck and OpenAI / Anthropic gave up on user fine tuning)
I honestly think a good product around fine tuning on your domain would be a major hit and an underdog lab should take this opportunity
English
669
181
3.5K
253.5K
Deen Kun A.@sir_deenicus·
@killerstorm @VictorTaelin This is incorrect for LMs. For transformers, internalized representations and relations form the basis of (meta)-inference; without them, in-context learning becomes brittle and converges to nonsense attractors. At best, too many wasted extra bits due to inadequacy of the model distribution.
English
1
0
0
17
Alex Mizrahi@killerstorm·
@sir_deenicus @VictorTaelin There's no clear advantage of storing _all_ your information in MLP weights, in fact Karpathy argues that LLMs of the future ("cognitive core") would store only a small amount of information in weights. In some cases you'd want LLM to remember skills, but remembering facts/
English
2
0
0
21
Deen Kun A.@sir_deenicus·
@killerstorm @VictorTaelin The fundamental issue noted above is not recall in isolation but that through various interactions, the system did not learn new things as a human would, which coupled with an understanding of the domain would naturally trigger recall, see?
English
1
0
0
5
Alex Mizrahi@killerstorm·
@sir_deenicus @VictorTaelin Well, human memory system is clearly multi-layered (implied by existence of anterograde amnesia). Like you might feel there's something relevant, try to recall it, etc. So there's nothing wrong with adding a memory system on top of LLM instead of baking it into weights.
English
2
0
0
18
Deen Kun A.@sir_deenicus·
@killerstorm @VictorTaelin "run a detector" <- where does the detector come from? Seems like that's just programming with extra steps once again. The issue is clearly a lack of continual learning and so any solution should also look like a solution for living with anterograde amnesia.
English
1
0
0
20
Alex Mizrahi@killerstorm·
@VictorTaelin Well that would be easy to solve with direct access to tokens: e.g. you run a detector on generated text, and if it detects something you don't want, backtrack a bit and insert a reminder skill
English
2
0
2
529
Deen Kun A.@sir_deenicus·
@squadette @VictorTaelin @kalomaze No, Rust combines ideas from older languages and adapts them to solve its problems for a novel end product. But, even setting aside affine types, region based reasoning is not unique to Rust.
English
0
0
1
68
timelapse@squadette·
@VictorTaelin @kalomaze Borrowchecking explicitly decoupled from types I think is distinct from everything else.
English
1
0
9
548
kalomaze@kalomaze·
this is a pop sciency version of a continual learning evaluation
if you're going to go the route of "pretrain on limited data and see if it can bootstrap from natural interaction", a more practical thing would probably be like, train only up to ~2014, "can it teach itself Rust?"
Haider.@haider1

Demis Hassabis proposed a benchmark for scientific AGI: the "Einstein test"
Train a system with a knowledge cutoff at 1901, then test whether it can independently rediscover what Einstein did in 1905, including special relativity
Once it can, we're on the verge of genuinely novel invention

English
16
4
171
27.3K
Deen Kun A.@sir_deenicus·
@cfcosta_ @VictorTaelin @kalomaze Can look into uniqueness types, linear types & substructural logics generally for where it derived part of its approach from. For region based stuff, not sure but there's prior art for sure. Stackless coroutines you can trace to F# async workflows which are of continuation pass
English
0
0
1
61
Cainã Costa@cfcosta_·
@VictorTaelin @kalomaze Is borrow checking present in the literature? Don’t remember seeing anything. Same for stackless coroutines (this one might be, less certain about that)
English
2
0
2
714
Deen Kun A.@sir_deenicus·
@dearmadisonblue @teortaxesTex FWIW, computable means finite but can be unbounded, so Turing Machines are kind of supernatural already. Uncomputable should be thought of as involving infinite information processes. I agree with you that humans being proper self-interpreting observers (LLMs are not) needs
English
2
0
1
26
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
The question, of course, is which is the special case: computation as a kind of "language" within the human mind, or the category of "languages" and even "minds" as a particular set of abstractions which are, yes, computable
English
1
0
15
2.3K