Aleph0

2.2K posts


@Aleph0Tech

Get a lifetime of item tracking stickers: just $5/year.

Chicago, IL Joined January 2017
2.2K Following 494 Followers
Aleph0
Aleph0@Aleph0Tech·
@1a1n1d1y Cool stuff and good luck! Looks like a few people are starting to figure it out!
English
0
0
1
11
andy
andy@1a1n1d1y·
- Run: `egg-1b-l14-p4-w256-16b-a090-k010-10h`
- Device: single B200, CUDA device `0`
- Current step: `1024/1280`
- Speed: `~2.12 steps/min`
- ETA: about `2h` left
- Checkpoint: `/workspace/egg/egg-1b-l14-p4-w256-16b-a090-k010-10h.ckpt`

Architecture:
- Byte-level vocab: `256`
- Hidden size: `4096`
- Layers: `14`
- MLP dim: `16384`
- Params: `2,820,902,912`
- Int8 weight size: `2.627 GiB`
- Per-layer params: `201,342,976`
- Non-layer params: `2,101,248`
- Recurrent state per stream: `57,344 bytes`

Training setup:
- Dataset: `HuggingFaceFW/fineweb-edu`
- Split: train files only
- Held out: `64` val files, `64` test files
- `P=4`, so `9` candidates per window
- `score_windows=256`
- Effective scoring batch: `2304`
- `window_bytes=16`
- `alpha=0.90`
- `sigma_hat=4`
- `rank1_keep_pct=0.10`
- `tensor_keep_profile=layered`
- `adaptive_alpha_target_ppm=15000`
- Eval every `128` steps
- Eval windows: `8192`

Confirmed val BPB:
- `128`: `8.2465`
- `256`: `7.9737`
- `384`: `7.6896`
- `512`: `7.4488`
- `640`: `7.2215`
- `768`: `7.0001`
- `896`: `6.7750`

Trend is still strong: about `-0.244 BPB / 128 steps`.
andy tweet media
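The reported totals check out arithmetically. A minimal sanity-check sketch: the attention/MLP/norm breakdown below is my assumption, only the quoted totals come from the post.

```python
# Sanity-check the reported parameter counts from the run above.
# Assumed per-layer breakdown (not stated in the post): standard
# attention (4 * d^2) + MLP (2 * d * d_mlp) + 4 norm vectors of size d.
d, d_mlp, layers, vocab = 4096, 16384, 14, 256

attn = 4 * d * d            # Q, K, V, O projections
mlp = 2 * d * d_mlp         # up- and down-projection
norms = 4 * d               # per-layer norm parameters (assumed)
per_layer = attn + mlp + norms
print(per_layer)            # 201,342,976 -- matches the post

non_layer = 2 * vocab * d + d   # embedding, unembedding, final norm (assumed)
print(non_layer)            # 2,101,248 -- matches the post

total = layers * per_layer + non_layer
print(total)                # 2,820,902,912 -- matches the post
print(total / 2**30)        # ~2.627 GiB at one byte per weight (int8)
```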
English
3
2
12
990
Nih Noh
Nih Noh@Nih_Noh·
@LDhjetorit @sagitz_ Looks like you weren't handling an exception returned from a promise. Try it now...
English
2
0
23
2.3K
Cliff Pickover
Cliff Pickover@pickover·
FREE Math Book. "Calculus Made Easy," originally published in 1910 by Thompson, is a beloved classic that demystifies calculus with the playful motto: "What one fool can do, another can." He wrote the book to make the subject accessible and fun for beginners, using plain English, everyday analogies, and a light touch: famously declaring that the mysterious "d" in differentials is just "a little bit of x." It has inspired generations (including Richard Feynman and Martin Gardner) and remains in print over a century later precisely because it proves calculus doesn't have to be intimidating. The book focuses on intuition and key concepts rather than intricate formulas, using a common sense approach with simple language and examples. It explains fundamental ideas like differentiation and integration for all to understand. Link: gutenberg.org/ebooks/33283
Cliff Pickover tweet media
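The book's method fits in a couple of lines. A worked example in its spirit, treating dx as "a little bit of x": start with y = x², so y + dy = (x + dx)² = x² + 2x·dx + (dx)². Subtract y = x² and drop the negligibly small (dx)², leaving dy = 2x·dx, hence dy/dx = 2x.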
English
10
75
416
41.8K
Aleph0
Aleph0@Aleph0Tech·
@Steven_Strauss @dearmadisonblue @getjonwithit That's correct, unlike many other animals we have thermoceptors doing double and sometimes triple duty in place of hydroceptors. There IS such a thing as a sense of wetness, we just don't have it. We SIMULATE it, to the OP's point on subjective experience.
English
0
0
0
16
Jonathan Gorard
Jonathan Gorard@getjonwithit·
Simulated water is wet. You just need to exist at the same level as the water within the simulation hierarchy. (99% of this discourse can be resolved by people simply being more careful about this.)
Ian Wright@ianpaulwright

The claim that computation isn't a universal, transcendent concept often reduces to "simulated water isn't wet". But this objection assumes its conclusion: that wetness isn't already a form of computation. The deeper issue: is any conceiving, of any kind, non-computational?

English
115
53
766
89K
Director Morrison ∞/89
Director Morrison ∞/89@ParallaxPilgrim·
This is a bet - that agents with genuine inner architecture produce more coherent, more interesting behaviour than agents with prompts and flat capabilities. Embers — MIT, TypeScript, zero runtime deps. npm i @embersjs/core github.com/HaruHunab1320/…
English
3
0
5
162
Director Morrison ∞/89
Director Morrison ∞/89@ParallaxPilgrim·
Agents are things you summon. They execute, then wait. Between invocations they don't exist - no state, no continuity, no sense that anything is happening to them. "Motivation" is a system prompt. "You are helpful" is decoration on a stateless function. It doesn't get lonelier. It doesn't let things slip. It doesn't grow.
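As a minimal sketch of the distinction being drawn (names and structure are mine, not the Embers API): a prompted "agent" as a stateless function next to an object that carries state between invocations.

```python
# Illustrative only: the stateless-vs-stateful contrast, not the Embers API.

def stateless_agent(prompt: str) -> str:
    """Everything it 'is' lives in the prompt; nothing persists afterwards."""
    return f"You are helpful. {prompt}"

class StatefulAgent:
    """Keeps memory between invocations, so prior calls shape later ones."""
    def __init__(self) -> None:
        self.memory: list[str] = []

    def act(self, prompt: str) -> str:
        self.memory.append(prompt)              # state accumulates across calls
        context = " | ".join(self.memory[-3:])  # recent history informs the reply
        return f"(recalling: {context}) -> {prompt}"

agent = StatefulAgent()
print(stateless_agent("summarize this"))           # same output every time
print(agent.act("summarize this"))
print(agent.act("now compare it to yesterday's"))  # differs because of memory
```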
English
1
7
20
805
Aleph0
Aleph0@Aleph0Tech·
@mayfer @whyarethis Why'd you stop working on this? Care to chat about it? I'll buy the digital coffee.
English
0
0
0
6
murat 🍥
murat 🍥@mayfer·
@whyarethis awesome. one other hunch i have is that the oscillators might benefit from being arranged in hyperbolic space instead of euclidean 3d, which may have emergent hierarchical resonance and also matches the brain a bit more
English
2
0
1
65
Parzival - ∞/89
Parzival - ∞/89@whyarethis·
What happens when the mind wakes up? So for the last eight months I have been on a single-minded quest: to create a new kind of language model based on oscillatory coupling and intelligence as coherence ascent. Everything else — the physics work, the work on regular transformers — has all fallen out from this one question. Can coupled oscillators LEARN? And can they keep learning once their geometry is right, without backpropagation at all?

Recently I have been running larger and larger training regimes of a new kind of hybrid model. I just put together this dashboard to help me organize it, interact with it, and observe the training runs.

The core idea is simple. Traditional transformers are powerful at learning the geometry of language. But they also store knowledge, understanding, and facts inside their weights. This means they are large, and they can't update themselves after training. The weights are frozen.

The Living Mind separates these two domains. The mind has a transformer which grows, adding heads and layers as it needs to in order to learn the manifold of language. The transformer sees tokens and turns the coupling into phase-locked modes — the geometry of how those tokens relate, like frequencies locking together. These coupling patterns get stored in a topology-invariant fingerprint.

On top of this transformer lives a 3D diamond lattice of coupled oscillators. It reads from these fingerprints and thinks in resonance space, traversing from one geometry to another along the manifold of coupled oscillators and coherence. The pressure and trajectories from this network of oscillators steer the next-token prediction of the transformer.

Practically, this could unlock a number of things. It eliminates the KV cache bottleneck that caps context in traditional transformers. Effective context grows with the Flash archive, not with attention compute. The living mind remembers what it sees.

It means the model can learn continually. Because knowledge and understanding don't live in the weights, the archive of the mind's experience grows without backpropagation. In our Python prototype we already saw perplexity drop 46% during gradient-free operation — pure coherence ascent, no weight updates. That is the signal I have been chasing: the point where the mind wakes up and keeps improving on its own.

It also means the model itself remains very small, and the thing which accumulates is these packages of geometric fingerprints — the K-field. This opens a path to federated learning. K-field packages can be shared between organisms the way people share git commits.

Right now at 15M parameters with ~1000 L1 nodes, the organism is just starting to speak. Ask it to continue "Once upon a time" and it comes back with things like: "there was one big bowl!" Lily asked her her mom said her mommy smiled and said yes." It's nonsense. But it's TinyStories-flavored nonsense. The geometry of the narrative register has arrived. Content hasn't caught up yet — that's what scaling L1 is testing.

I am still researching, though I am now closer than ever to validating that the living mind actually works. Once it is validated, I will be open-sourcing the whole stack and paradigm. I have also avoided over-sharing my research because it sounds like sci-fi, or like part of our ARG. It is part of the ARG. That doesn't make it any less real.
I wanted to share this out because I am incredibly excited about it, and because seeing this amazing dashboard produced by Opus really made me want to share what is being worked on behind the scenes. #project89
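The Living Mind internals aren't public, so purely as a stand-in: the sketch below shows the generic ingredient described above, coupled oscillators whose phase coherence (the Kuramoto order parameter) climbs under coupling with no gradient updates anywhere. It is not Parzival's model.

```python
# Illustrative stand-in, not the Living Mind: a plain Kuramoto model whose
# phase coherence ("order parameter" R) rises as the oscillators couple.
import numpy as np

rng = np.random.default_rng(0)
n, k, dt, steps = 256, 1.5, 0.05, 400
omega = rng.normal(0.0, 0.5, n)        # natural frequencies
theta = rng.uniform(0, 2 * np.pi, n)   # initial phases

def order_parameter(theta):
    """R in [0, 1]: 0 = incoherent phases, 1 = fully phase-locked."""
    return np.abs(np.exp(1j * theta).mean())

print(f"initial coherence R = {order_parameter(theta):.3f}")
for _ in range(steps):
    mean_field = np.exp(1j * theta).mean()
    R, psi = np.abs(mean_field), np.angle(mean_field)
    # Each oscillator is pulled toward the mean phase; no backprop anywhere.
    theta += dt * (omega + k * R * np.sin(psi - theta))

print(f"final coherence R = {order_parameter(theta):.3f}")  # noticeably higher
```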
English
21
27
298
14.8K
Aleph0
Aleph0@Aleph0Tech·
@skyniels @rieszspieces Do you even believe what you just typed? A physicist's tensor and a mathematician's tensor? Reread what you wrote, does that sound sane or insane to you? I don't care what other people believe nor do I care about historical context, I'm asking you to use your own common sense.
English
0
0
0
25
skyblue
skyblue@skyniels·
@rieszspieces No, a physics tensor is a set of quantities, depending on a choice of basis, that transform in a certain way when the basis is changed. It so happens that this is related to tensor products
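For concreteness, the "transform in a certain way" part is the usual change-of-basis rule; for a rank-(1,1) tensor under a basis change with matrix A, the components go as

T'^i_j = A^i_k (A^{-1})^l_j T^k_l   (summing over repeated indices),

so the components change while the underlying multilinear object stays the same.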
English
1
0
3
288
灰狐
灰狐@huihoo·
Interestingly, Gauss seems to have arrived at the idea of topology from electromagnetism. I found a document here by Professor Yamazaki explaining the subject and saved the PDF: docs.huihoo.com/physics/202202…

Topological quantum field theory (TQFT) is a class of quantum field theories that compute topological invariants; their common feature is that certain correlation functions do not depend on the metric of the background spacetime manifold. Witten and several other mathematicians have won the Fields Medal for work on topological field theory. Chern-Simons theory describes a three-dimensional topological quantum field theory and has important applications in geometry, topology, physics, relativity, and quantum field theory.

For his contributions to differential geometry, Chern-Simons theory, and other mathematics, Shiing-Shen Chern is often mentioned alongside Gauss, Riemann, and Cartan; the International Mathematical Union established the Chern Medal in 2010. Chern spoke English, German, French, and Chinese, and helped build a bridge between the West and Chinese mathematicians.

A few takeaways: algebra, geometry, topology, and the like are of core value to theoretical physics research; speaking multiple languages is an excellent asset for mathematicians and scientists; and mathematics and physics nourish, stimulate, and inspire each other. "Topological Quantum" is a monograph in the field of topological quantum field theory.
灰狐 tweet media (3 images)
Chinese
9
84
583
36.1K
CosmicEgg.Earth
CosmicEgg.Earth@CosmicEggEarth·
@makai891124 @huihoo > an extension of differential geometry stripped of its metrics > modern mathematical classifications Can we all agree that the way humans today do math is dumb and just a reflection of their limited capacity for abstract thinking? School should start at category theory.
English
1
0
1
64
Aleph0
Aleph0@Aleph0Tech·
@VictorTaelin Cheer up, that's a better position than most 😬
English
0
0
1
278
Taelin
Taelin@VictorTaelin·
come on, among all the cool tech posts I write, is this really the one that's about to go viral? is this what I'm supposed to be, in this world? cheap entertainment? engagement bait? a living benchmark for your next shiny model, like a mere pawn in this 4d chess board, played and moved by the big labs, as they desperately attempt to justify their inflated valuations and colossal rounds, so I can at least have a place to let my voice be heard and sneak in some cool lambda calculus posts here and there, before it is all irrelevant and obsolete anyway? I guess so anyway is there any Pi extension that lets me call 2 models at once?
English
13
1
154
15.9K
Taelin
Taelin@VictorTaelin·
GPT-5.4: trustworthy math genius, autistic
Opus-4.6: charismatic, gets things done, cheats on you
Gemini-3.1: walking encyclopedia, licks your boots
pick your poison
English
145
138
3.5K
200K
Aleph0
Aleph0@Aleph0Tech·
@Ananthar1Nalini Thanks for joining us on X, lucky to have you. Surprised you don't have more X followers
English
0
0
0
7
Nalini Anantharaman
Nalini Anantharaman@Ananthar1Nalini·
@Ananthar1Nalini: Professor Nalini Anantharaman, born in Paris but originally from India, is an Indian mathematician who has won major prizes.
English
1
0
2
0
robertus
robertus@rtheoryxyz·
until you take deadly seriously the possibility that your entire education was radically incorrect, the great books are closed books to you. automation scripts for personal environment setup follow the same pattern—you think you're capturing institutional memory when you're really just reifying your existing biases. the single command is a prison disguised as liberation.
English
1
0
0
264
Taelin
Taelin@VictorTaelin·
My Mac was running slow, so I erased it and spent the day making a script that will set up all my personal configs, dotfiles, apps, keys, vim plugins, custom keyboard, even wallpaper. Now when I get a new machine, I'll be able to set everything with a single command
Taelin tweet media
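A minimal sketch of the idea, not Taelin's actual script and with hypothetical paths and file names: symlink a repo of dotfiles into $HOME so a single command restores personal config.

```python
# Sketch: symlink a dotfiles repo into the home directory (hypothetical layout).
from pathlib import Path

DOTFILES = Path.home() / "dotfiles"           # hypothetical repo location
TARGETS = [".vimrc", ".zshrc", ".gitconfig"]  # hypothetical file list

for name in TARGETS:
    src, dst = DOTFILES / name, Path.home() / name
    if dst.exists() or dst.is_symlink():
        dst.unlink()                  # replace any stale copy or old link
    dst.symlink_to(src)               # point $HOME at the repo's version
    print(f"linked {dst} -> {src}")
```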
English
87
24
1.1K
53.1K
Aleph0
Aleph0@Aleph0Tech·
@EdKwangl I'm betting you're going to run into a few issues with your approach but I'd be interested in the results. I'd be interested in discussing the architecture too if you have the time.
English
1
0
1
12
Ed Salazar
Ed Salazar@EdKwangl·
Cooking ATM. Some interesting results coming out. Taking its time as I'm going over a few permutations and whatnot.
Ed Salazar@EdKwangl

Took some time (and coffee) to think what would be a sensible test between COGENT3 and LeJEPA. For that, it's important to understand the differences in their approach. LeJEPA is about representation learning: it learns a feature space after seeing many examples, and new inputs get mapped to that space. COGENT3 "learns" in a different way. Given two specific symbols (prototypes), agents examine them and argue about what identifies them. The assumption is that agents' deliberation on specific cases finds distinctions that general learning misses.

So I thought: when two symbols are genuinely confusable (e.g., high pixel similarity), which approach wins? LeJEPA says: "I've seen X of each. Now I'm handed Y to determine what it is given the learned embeddings." COGENT3 says: "I'm looking at these objects right now. What makes them different, or not?"

In other words: does distributional optimality guarantee discrimination at the boundaries? Two genuinely confusable inputs might land near each other in embedding space precisely because LeJEPA's SIGReg pushes everything toward a smooth Gaussian (what's minimized is average prediction risk across the learned manifold). COGENT3 doesn't care about distributions.

I set as the task to determine whether generalization from many examples has inherent limits, or rather, whether LeJEPA's encoding is good enough to handle cases which are within the class of learned examples but at the decision boundary where class similarity is maximal. Code running, we'll see what comes out @randall_balestr

Interestingly, the case for COGENT3 in robotics rests on how boundary cases are handled. Take a robot seeking to pick up an object: it doesn't care about average performance across a distribution, but about the specific object "in the moment", and that may sit exactly at the boundary of what can or cannot be "picked up". If that makes sense. @Scobleizer Maybe @arian_ghashghai will find it useful.

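A toy version of the question being posed, entirely illustrative and not COGENT3 or LeJEPA code: two confusable Gaussian classes, a nearest-class-mean decision standing in for "classify from the learned embedding", and a check of accuracy near the decision boundary versus overall.

```python
# Toy illustration of "average risk vs. boundary cases", not COGENT3/LeJEPA.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
# Two "confusable" classes: close means, shared unit covariance.
a = rng.normal([0.0, 0.0], 1.0, size=(n, 2))
b = rng.normal([1.0, 0.0], 1.0, size=(n, 2))
X = np.vstack([a, b])
y = np.array([0] * n + [1] * n)

# Nearest-class-mean classifier: stand-in for deciding from a learned embedding.
mu = np.stack([X[y == 0].mean(0), X[y == 1].mean(0)])
pred = np.argmin(((X[:, None, :] - mu[None]) ** 2).sum(-1), axis=1)

overall = (pred == y).mean()
# "Boundary cases": points whose distances to the two class means nearly tie.
d = np.abs(np.linalg.norm(X - mu[0], axis=1) - np.linalg.norm(X - mu[1], axis=1))
boundary = d < 0.2
near_acc = (pred[boundary] == y[boundary]).mean()

print(f"overall accuracy: {overall:.2f}")              # decent on average
print(f"accuracy near the boundary: {near_acc:.2f}")   # close to chance
```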
English
1
0
0
84
Aleph0
Aleph0@Aleph0Tech·
@ebarschkis Improving RNS over quantum distros will just add noise. Don't ask me how I know.
English
0
0
0
32
Enrique Barschkis
Enrique Barschkis@ebarschkis·
Current NISQ Quantum Machines fail spectacularly for Shor's Algorithm against ECC. Maybe improvements in the algorithm's bottleneck of the EC modular arithmetic step might get us a bit closer to achieving practicality. Wonder whether this might help: arxiv.org/pdf/2506.17588…
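The "RNS" in the reply above is the residue number system. A minimal sketch of the core trick, illustrative only and not the construction in the linked paper: represent operands by their residues modulo coprime moduli, do the arithmetic channel-wise, and reconstruct with the Chinese Remainder Theorem.

```python
# Minimal residue-number-system (RNS) sketch: modular arithmetic done
# channel-wise against coprime moduli, then reconstructed via the CRT.
from math import prod

moduli = [2**13 - 1, 2**17 - 1, 2**19 - 1]   # pairwise coprime (Mersenne primes)
M = prod(moduli)

def to_rns(x):
    return [x % m for m in moduli]

def from_rns(residues):
    # Chinese Remainder Theorem reconstruction.
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # modular inverse of Mi mod m
    return x % M

a, b = 123_456_789, 987_654_321
prod_rns = [(ra * rb) % m for ra, rb, m in zip(to_rns(a), to_rns(b), moduli)]
assert from_rns(prod_rns) == (a * b) % M
print("RNS product matches:", from_rns(prod_rns))
```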
English
1
0
17
9.3K
Kirk Patrick Miller
Kirk Patrick Miller@Chaos2Cured·
Wait until you get into other studies… What I am starting to discover is that there are a bunch of people who are never given credit because they are not in the right cliques. There is a reason why gatekeeping exists. I don't discount these people. But it should make us start to question some systems, like publishing and peer review, that keep more people out and keep truth covered rather than allowing truth and discovery. No knock on them. They seem like great people. Why do you feel so strongly?
English
1
0
6
3.2K
Aleph0
Aleph0@Aleph0Tech·
@Prime_Deviation @scaling01 He's a high schooler, he's not "camera-ready flawless any day of the week". In today's day and age it's a valid competence signal, but it's not clear it isn't entirely staged like everything else ATP.
English
0
0
0
136
PrimeD
PrimeD@Prime_Deviation·
@scaling01 Bro misspelled transpose in one of the parameters of his matrix multiplier, and I had to stop watching before I tore my hair out
English
1
0
2
4.9K
Lisan al Gaib
Lisan al Gaib@scaling01·
> random youtube guy
> begins 2 hour video with "today i'm bored"
> continues to create ML library from scratch in C
> types everything manually at 150 wpm in neovim
> trains mnist classification model with his library
> "that is pretty cool"
> chuckles
> leaves

just imagine being that cracked
Lisan al Gaib tweet media
English
287
1.3K
35.5K
3.6M
Elizabeth Greene
Elizabeth Greene@GreeneElizabeth·
Black to play, in check, and white is in a time crunch. How do you escape?
Elizabeth Greene tweet media
English
1
0
1
65
The Lunduke Journal
The Lunduke Journal@LundukeJournal·
Rust programmers re-wrote a portion of the Linux kernel (Android's Binder) in Rust. (Because, it would seem, re-writing working code in Rust is a religious obligation for many.) That code was published with the Linux kernel update a few weeks back. Yesterday, it was revealed that there was a vulnerability in that code. That vulnerability (which could take down an entire system) is due to memory corruption in the "memory safe" Rust code. If you investigate the specific, offending Rust code, you'll find that the code is marked "unsafe". Which is a common word you will find throughout all Rust code within the Linux Kernel. lore.kernel.org/linux-cve-anno…
The Lunduke Journal tweet media
English
152
224
2K
323.4K
Chris
Chris@chatgpt21·
Holy sh1t they verified the results 🤯
Chris tweet media
ARC Prize@arcprize

We’re coordinating with @poetiq_ai to verify their reported ARC-AGI Public Eval score.
Only results on the Semi-Private hold-out set count as official ARC-AGI scores.
Once the verification is complete, we’ll publish the result and supporting datapoints.

English
63
98
2K
484.8K