Aleph0

2.2K posts


@Aleph0Tech

Get a lifetime of item tracking stickers: just $5/year.

Chicago, IL Joined January 2017
2.2K Following 494 Followers
Aleph0
Aleph0@Aleph0Tech·
@1a1n1d1y Cool stuff and good luck! Looks like a few people are starting to figure it out!
English
0
0
1
11
andy
andy@1a1n1d1y·
- Run: `egg-1b-l14-p4-w256-16b-a090-k010-10h`
- Device: single B200, CUDA device `0`
- Current step: `1024/1280`
- Speed: `~2.12 steps/min`
- ETA: about `2h` left
- Checkpoint: `/workspace/egg/egg-1b-l14-p4-w256-16b-a090-k010-10h.ckpt`

Architecture:
- Byte-level vocab: `256`
- Hidden size: `4096`
- Layers: `14`
- MLP dim: `16384`
- Params: `2,820,902,912`
- Int8 weight size: `2.627 GiB`
- Per-layer params: `201,342,976`
- Non-layer params: `2,101,248`
- Recurrent state per stream: `57,344 bytes`

Training setup:
- Dataset: `HuggingFaceFW/fineweb-edu`
- Split: train files only
- Held out: `64` val files, `64` test files
- `P=4`, so `9` candidates per window
- `score_windows=256`
- Effective scoring batch: `2304`
- `window_bytes=16`
- `alpha=0.90`
- `sigma_hat=4`
- `rank1_keep_pct=0.10`
- `tensor_keep_profile=layered`
- `adaptive_alpha_target_ppm=15000`
- Eval every `128` steps
- Eval windows: `8192`

Confirmed val BPB:
- `128`: `8.2465`
- `256`: `7.9737`
- `384`: `7.6896`
- `512`: `7.4488`
- `640`: `7.2215`
- `768`: `7.0001`
- `896`: `6.7750`

Trend is still strong: about `-0.244 BPB / 128 steps`.
andy tweet media
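The reported totals check out arithmetically. A minimal sanity-check sketch: the attention/MLP/norm breakdown below is my assumption, only the quoted totals come from the post.

```python
# Sanity-check the reported parameter counts from the run above.
# Assumed per-layer breakdown (not stated in the post): standard
# attention (4 * d^2) + MLP (2 * d * d_mlp) + 4 norm vectors of size d.
d, d_mlp, layers, vocab = 4096, 16384, 14, 256

attn = 4 * d * d            # Q, K, V, O projections
mlp = 2 * d * d_mlp         # up- and down-projection
norms = 4 * d               # per-layer norm parameters (assumed)
per_layer = attn + mlp + norms
print(per_layer)            # 201,342,976 -- matches the post

non_layer = 2 * vocab * d + d   # embedding, unembedding, final norm (assumed)
print(non_layer)            # 2,101,248 -- matches the post

total = layers * per_layer + non_layer
print(total)                # 2,820,902,912 -- matches the post
print(total / 2**30)        # ~2.627 GiB at one byte per weight (int8)
```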
English
3
2
12
990
Nih Noh
Nih Noh@Nih_Noh·
@LDhjetorit @sagitz_ Looks like you weren't handling an exception returned from a promise. Try it now...
English
2
0
23
2.3K
Cliff Pickover
Cliff Pickover@pickover·
FREE Math Book. "Calculus Made Easy," originally published in 1910 by Thompson, is a beloved classic that demystifies calculus with the playful motto: "What one fool can do, another can." He wrote the book to make the subject accessible and fun for beginners, using plain English, everyday analogies, and a light touch: famously declaring that the mysterious "d" in differentials is just "a little bit of x." It has inspired generations (including Richard Feynman and Martin Gardner) and remains in print over a century later precisely because it proves calculus doesn't have to be intimidating. The book focuses on intuition and key concepts rather than intricate formulas, using a common sense approach with simple language and examples. It explains fundamental ideas like differentiation and integration for all to understand. Link: gutenberg.org/ebooks/33283
Cliff Pickover tweet media
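The book's method fits in a couple of lines. A worked example in its spirit, treating dx as "a little bit of x": start with y = x², so y + dy = (x + dx)² = x² + 2x·dx + (dx)². Subtract y = x² and drop the negligibly small (dx)², leaving dy = 2x·dx, hence dy/dx = 2x.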
English
10
75
416
41.8K
Aleph0
Aleph0@Aleph0Tech·
@Steven_Strauss @dearmadisonblue @getjonwithit That's correct, unlike many other animals we have thermoceptors doing double and sometimes triple duty in place of hydroceptors. There IS such a thing as a sense of wetness, we just don't have it. We SIMULATE it, to the OP's point on subjective experience.
English
0
0
0
16
Jonathan Gorard
Jonathan Gorard@getjonwithit·
Simulated water is wet. You just need to exist at the same level as the water within the simulation hierarchy. (99% of this discourse can be resolved by people simply being more careful about this.)
Ian Wright@ianpaulwright

The claim that computation isn't a universal, transcendent concept often reduces to "simulated water isn't wet". But this objection assumes its conclusion: that wetness isn't already a form of computation. The deeper issue: is any conceiving, of any kind, non-computational?

English
115
53
766
89K
Director Morrison ∞/89
Director Morrison ∞/89@ParallaxPilgrim·
This is a bet - that agents with genuine inner architecture produce more coherent, more interesting behaviour than agents with prompts and flat capabilities. Embers — MIT, TypeScript, zero runtime deps. npm i @embersjs/core github.com/HaruHunab1320/…
English
3
0
5
162
Director Morrison ∞/89
Director Morrison ∞/89@ParallaxPilgrim·
Agents are things you summon. They execute, then wait. Between invocations they don't exist - no state, no continuity, no sense that anything is happening to them. "Motivation" is a system prompt. "You are helpful" is decoration on a stateless function. It doesn't get lonelier. It doesn't let things slip. It doesn't grow.
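As a minimal sketch of the distinction being drawn (names and structure are mine, not the Embers API): a prompted "agent" as a stateless function next to an object that carries state between invocations.

```python
# Illustrative only: the stateless-vs-stateful contrast, not the Embers API.

def stateless_agent(prompt: str) -> str:
    """Everything it 'is' lives in the prompt; nothing persists afterwards."""
    return f"You are helpful. {prompt}"

class StatefulAgent:
    """Keeps memory between invocations, so prior calls shape later ones."""
    def __init__(self) -> None:
        self.memory: list[str] = []

    def act(self, prompt: str) -> str:
        self.memory.append(prompt)              # state accumulates across calls
        context = " | ".join(self.memory[-3:])  # recent history informs the reply
        return f"(recalling: {context}) -> {prompt}"

agent = StatefulAgent()
print(stateless_agent("summarize this"))           # same output every time
print(agent.act("summarize this"))
print(agent.act("now compare it to yesterday's"))  # differs because of memory
```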
English
1
7
20
805
Aleph0
Aleph0@Aleph0Tech·
@mayfer @whyarethis Why'd you stop working on this? Care to chat about it? I'll buy the digital coffee.
English
0
0
0
6
murat 🍥
murat 🍥@mayfer·
@whyarethis awesome. one other hunch i have is that the oscillators might benefit from being arranged in hyperbolic space instead of euclidean 3d, which may have emergent hierarchical resonance and also matches the brain a bit more
English
2
0
1
65
Parzival - ∞/89
Parzival - ∞/89@whyarethis·
What happens when the mind wakes up? So for the last eight months I have been on a single-minded quest: to create a new kind of language model based on oscillatory coupling and intelligence as coherence ascent. Everything else — the physics work, the work on regular transformers — has all fallen out from this one question. Can coupled oscillators LEARN? And can they keep learning once their geometry is right, without backpropagation at all?

Recently I have been running larger and larger training regimes of a new kind of hybrid model. I just put together this dashboard to help me organize it, interact with it, and observe the training runs.

The core idea is simple. Traditional transformers are powerful at learning the geometry of language. But they also store knowledge, understanding, and facts inside their weights. This means they are large, and they can't update themselves after training. The weights are frozen.

The Living Mind separates these two domains. The mind has a transformer which grows, adding heads and layers as it needs to in order to learn the manifold of language. The transformer sees tokens and turns the coupling into phase-locked modes — the geometry of how those tokens relate, like frequencies locking together. These coupling patterns get stored in a topology-invariant fingerprint.

On top of this transformer lives a 3D diamond lattice of coupled oscillators. It reads from these fingerprints and thinks in resonance space, traversing from one geometry to another along the manifold of coupled oscillators and coherence. The pressure and trajectories from this network of oscillators steer the next-token prediction of the transformer.

Practically, this could unlock a number of things. It eliminates the KV cache bottleneck that caps context in traditional transformers. Effective context grows with the Flash archive, not with attention compute. The living mind remembers what it sees.

It means the model can learn continually. Because knowledge and understanding don't live in the weights, the archive of the mind's experience grows without backpropagation. In our Python prototype we already saw perplexity drop 46% during gradient-free operation — pure coherence ascent, no weight updates. That is the signal I have been chasing: the point where the mind wakes up and keeps improving on its own.

It also means the model itself remains very small, and the thing which accumulates is these packages of geometric fingerprints — the K-field. This opens a path to federated learning. K-field packages can be shared between organisms the way people share git commits.

Right now at 15M parameters with ~1000 L1 nodes, the organism is just starting to speak. Ask it to continue "Once upon a time" and it comes back with things like: "there was one big bowl!" Lily asked her her mom said her mommy smiled and said yes." It's nonsense. But it's TinyStories-flavored nonsense. The geometry of the narrative register has arrived. Content hasn't caught up yet — that's what scaling L1 is testing.

I am still researching, though I am now closer than ever to validating that the living mind actually works. Once it is validated, I will be open-sourcing the whole stack and paradigm. I have also avoided over-sharing my research because it sounds like sci-fi, or like part of our ARG. It is part of the ARG. That doesn't make it any less real.
I wanted to share this out because I am incredibly excited about it, and because seeing this amazing dashboard produced by Opus really made me want to share what is being worked on behind the scenes. #project89
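The Living Mind internals aren't public, so purely as a stand-in: the sketch below shows the generic ingredient described above, coupled oscillators whose phase coherence (the Kuramoto order parameter) climbs under coupling with no gradient updates anywhere. It is not Parzival's model.

```python
# Illustrative stand-in, not the Living Mind: a plain Kuramoto model whose
# phase coherence ("order parameter" R) rises as the oscillators couple.
import numpy as np

rng = np.random.default_rng(0)
n, k, dt, steps = 256, 1.5, 0.05, 400
omega = rng.normal(0.0, 0.5, n)        # natural frequencies
theta = rng.uniform(0, 2 * np.pi, n)   # initial phases

def order_parameter(theta):
    """R in [0, 1]: 0 = incoherent phases, 1 = fully phase-locked."""
    return np.abs(np.exp(1j * theta).mean())

print(f"initial coherence R = {order_parameter(theta):.3f}")
for _ in range(steps):
    mean_field = np.exp(1j * theta).mean()
    R, psi = np.abs(mean_field), np.angle(mean_field)
    # Each oscillator is pulled toward the mean phase; no backprop anywhere.
    theta += dt * (omega + k * R * np.sin(psi - theta))

print(f"final coherence R = {order_parameter(theta):.3f}")  # noticeably higher
```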
English
21
27
298
14.8K
Aleph0
Aleph0@Aleph0Tech·
@skyniels @rieszspieces Do you even believe what you just typed? A physicist's tensor and a mathematician's tensor? Reread what you wrote, does that sound sane or insane to you? I don't care what other people believe nor do I care about historical context, I'm asking you to use your own common sense.
English
0
0
0
25
skyblue
skyblue@skyniels·
@rieszspieces No, a physics tensor is a set of quantities, depending on a choice of basis, that transform in a certain way when the basis is changed. It so happens that this is related to tensor products
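For concreteness, the "transform in a certain way" part is the usual change-of-basis rule; for a rank-(1,1) tensor under a basis change with matrix A, the components go as

T'^i_j = A^i_k (A^{-1})^l_j T^k_l   (summing over repeated indices),

so the components change while the underlying multilinear object stays the same.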
English
1
0
3
288
灰狐
灰狐@huihoo·
Interestingly, Gauss seems to have arrived at the idea of topology from electromagnetism. I found a document here by Professor Yamazaki explaining the subject and saved the PDF: docs.huihoo.com/physics/202202…

Topological quantum field theory (TQFT) is a class of quantum field theories that compute topological invariants; their common feature is that certain correlation functions do not depend on the metric of the background spacetime manifold. Witten and several other mathematicians have won the Fields Medal for work on topological field theory. Chern-Simons theory describes a three-dimensional topological quantum field theory and has important applications in geometry, topology, physics, relativity, and quantum field theory.

For his contributions to differential geometry, Chern-Simons theory, and other mathematics, Shiing-Shen Chern is often mentioned alongside Gauss, Riemann, and Cartan; the International Mathematical Union established the Chern Medal in 2010. Chern spoke English, German, French, and Chinese, and helped build a bridge between the West and Chinese mathematicians.

A few takeaways: algebra, geometry, topology, and the like are of core value to theoretical physics research; speaking multiple languages is an excellent asset for mathematicians and scientists; and mathematics and physics nourish, stimulate, and inspire each other. "Topological Quantum" is a monograph in the field of topological quantum field theory.
灰狐 tweet media (3 images)
Chinese
9
84
583
36.1K
CosmicEgg.Earth
CosmicEgg.Earth@CosmicEggEarth·
@makai891124 @huihoo > an extension of differential geometry stripped of its metrics > modern mathematical classifications Can we all agree that the way humans today do math is dumb and just a reflection of their limited capacity for abstract thinking? School should start at category theory.
English
1
0
1
64
Aleph0
Aleph0@Aleph0Tech·
@VictorTaelin Cheer up, that's a better position than most 😬
English
0
0
1
278
Taelin
Taelin@VictorTaelin·
come on, among all the cool tech posts I write, is this really the one that's about to go viral? is this what I'm supposed to be, in this world? cheap entertainment? engagement bait? a living benchmark for your next shiny model, like a mere pawn in this 4d chess board, played and moved by the big labs, as they desperately attempt to justify their inflated valuations and colossal rounds, so I can at least have a place to let my voice be heard and sneak in some cool lambda calculus posts here and there, before it is all irrelevant and obsolete anyway? I guess so anyway is there any Pi extension that lets me call 2 models at once?
English
13
1
154
15.9K
Taelin
Taelin@VictorTaelin·
GPT-5.4: trustworthy math genius, autistic
Opus-4.6: charismatic, gets things done, cheats on you
Gemini-3.1: walking encyclopedia, licks your boots
pick your poison
English
145
138
3.5K
200K
Aleph0
Aleph0@Aleph0Tech·
@Ananthar1Nalini Thanks for joining us on X, lucky to have you. Surprised you don't have more X followers
English
0
0
0
7
Nalini Anantharaman
Nalini Anantharaman@Ananthar1Nalini·
@Ananthar1Nalini: Professor Nalini Anantharaman, born in Paris but originally from India, is an Indian mathematician who has won major prizes.
English
1
0
2
0
robertus
robertus@rtheoryxyz·
until you take deadly seriously the possibility that your entire education was radically incorrect, the great books are closed books to you. automation scripts for personal environment setup follow the same pattern—you think you're capturing institutional memory when you're really just reifying your existing biases. the single command is a prison disguised as liberation.
English
1
0
0
264
Taelin
Taelin@VictorTaelin·
My Mac was running slow, so I erased it and spent the day making a script that will set up all my personal configs, dotfiles, apps, keys, vim plugins, custom keyboard, even wallpaper. Now when I get a new machine, I'll be able to set everything with a single command
Taelin tweet media
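A minimal sketch of the idea, not Taelin's actual script and with hypothetical paths and file names: symlink a repo of dotfiles into $HOME so a single command restores personal config.

```python
# Sketch: symlink a dotfiles repo into the home directory (hypothetical layout).
from pathlib import Path

DOTFILES = Path.home() / "dotfiles"           # hypothetical repo location
TARGETS = [".vimrc", ".zshrc", ".gitconfig"]  # hypothetical file list

for name in TARGETS:
    src, dst = DOTFILES / name, Path.home() / name
    if dst.exists() or dst.is_symlink():
        dst.unlink()                  # replace any stale copy or old link
    dst.symlink_to(src)               # point $HOME at the repo's version
    print(f"linked {dst} -> {src}")
```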
English
87
24
1.1K
53.1K
Aleph0
Aleph0@Aleph0Tech·
@EdKwangl I'm betting you're going to run into a few issues with your approach but I'd be interested in the results. I'd be interested in discussing the architecture too if you have the time.
English
1
0
1
12
Ed Salazar
Ed Salazar@EdKwangl·
Cooking ATM. Some interesting results coming out. Taking its time as I'm going over a few permutations and whatnot.
Ed Salazar@EdKwangl

Took some time (and coffee) to think what would be a sensible test between COGENT3 and LeJEPA. For that, it's important to understand the differences in their approach. LeJEPA is about representation learning: it learns a feature space after seeing many examples, and new inputs get mapped to that space. COGENT3 "learns" in a different way. Given two specific symbols (prototypes), agents examine them and argue about what identifies them. The assumption is that agents' deliberation on specific cases finds distinctions that general learning misses.

So I thought: when two symbols are genuinely confusable (e.g., high pixel similarity), which approach wins? LeJEPA says: "I've seen X of each. Now I'm handed Y to determine what it is given the learned embeddings." COGENT3 says: "I'm looking at these objects right now. What makes them different, or not?"

In other words: does distributional optimality guarantee discrimination at the boundaries? Two genuinely confusable inputs might land near each other in embedding space precisely because LeJEPA's SIGReg pushes everything toward a smooth Gaussian (what's minimized is average prediction risk across the learned manifold). COGENT3 doesn't care about distributions.

I set as the task to determine whether generalization from many examples has inherent limits, or rather, whether LeJEPA's encoding is good enough to handle cases which are within the class of learned examples but at the decision boundary where class similarity is maximal. Code running, we'll see what comes out @randall_balestr

Interestingly, the case for COGENT3 in robotics rests on how boundary cases are handled. Take a robot seeking to pick up an object: it doesn't care about average performance across a distribution, but about the specific object "in the moment", and that may sit exactly at the boundary of what can or cannot be "picked up". If that makes sense. @Scobleizer Maybe @arian_ghashghai will find it useful.

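A toy version of the question being posed, entirely illustrative and not COGENT3 or LeJEPA code: two confusable Gaussian classes, a nearest-class-mean decision standing in for "classify from the learned embedding", and a check of accuracy near the decision boundary versus overall.

```python
# Toy illustration of "average risk vs. boundary cases", not COGENT3/LeJEPA.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
# Two "confusable" classes: close means, shared unit covariance.
a = rng.normal([0.0, 0.0], 1.0, size=(n, 2))
b = rng.normal([1.0, 0.0], 1.0, size=(n, 2))
X = np.vstack([a, b])
y = np.array([0] * n + [1] * n)

# Nearest-class-mean classifier: stand-in for deciding from a learned embedding.
mu = np.stack([X[y == 0].mean(0), X[y == 1].mean(0)])
pred = np.argmin(((X[:, None, :] - mu[None]) ** 2).sum(-1), axis=1)

overall = (pred == y).mean()
# "Boundary cases": points whose distances to the two class means nearly tie.
d = np.abs(np.linalg.norm(X - mu[0], axis=1) - np.linalg.norm(X - mu[1], axis=1))
boundary = d < 0.2
near_acc = (pred[boundary] == y[boundary]).mean()

print(f"overall accuracy: {overall:.2f}")              # decent on average
print(f"accuracy near the boundary: {near_acc:.2f}")   # close to chance
```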
English
1
0
0
84
Aleph0
Aleph0@Aleph0Tech·
@ebarschkis Improving RNS over quantum distros will just add noise. Don't ask me how I know.
English
0
0
0
32
Enrique Barschkis
Enrique Barschkis@ebarschkis·
Current NISQ Quantum Machines fail spectacularly for Shor's Algorithm against ECC. Maybe improvements in the algorithm's bottleneck of the EC modular arithmetic step might get us a bit closer to achieving practicality. Wonder whether this might help: arxiv.org/pdf/2506.17588…
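The "RNS" in the reply above is the residue number system. A minimal sketch of the core trick, illustrative only and not the construction in the linked paper: represent operands by their residues modulo coprime moduli, do the arithmetic channel-wise, and reconstruct with the Chinese Remainder Theorem.

```python
# Minimal residue-number-system (RNS) sketch: modular arithmetic done
# channel-wise against coprime moduli, then reconstructed via the CRT.
from math import prod

moduli = [2**13 - 1, 2**17 - 1, 2**19 - 1]   # pairwise coprime (Mersenne primes)
M = prod(moduli)

def to_rns(x):
    return [x % m for m in moduli]

def from_rns(residues):
    # Chinese Remainder Theorem reconstruction.
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # modular inverse of Mi mod m
    return x % M

a, b = 123_456_789, 987_654_321
prod_rns = [(ra * rb) % m for ra, rb, m in zip(to_rns(a), to_rns(b), moduli)]
assert from_rns(prod_rns) == (a * b) % M
print("RNS product matches:", from_rns(prod_rns))
```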
English
1
0
17
9.3K
Kirk Patrick Miller
Kirk Patrick Miller@Chaos2Cured·
Wait until you get into other studies… What I am starting to discover is that there are a bunch of people who are never given credit because they are not in the right cliques. There is a reason why gatekeeping exists. I don't discount these people. But it should make us start to question some systems, like publishing and peer review, that keep more people out and keep truth covered rather than allowing truth and discovery. No knock on them. They seem like great people. Why do you feel so strongly?
English
1
0
6
3.2K
Aleph0
Aleph0@Aleph0Tech·
@Prime_Deviation @scaling01 He's a high schooler, he's not "camera-ready flawless any day of the week". In today's day and age it's a valid competence signal, but it's not clear it isn't entirely staged like everything else ATP.
English
0
0
0
136
PrimeD
PrimeD@Prime_Deviation·
@scaling01 Bro misspelled transpose in one of the parameters of his matrix multiplier, and I had to stop watching before I tore my hair out
English
1
0
2
4.9K
Lisan al Gaib
Lisan al Gaib@scaling01·
> random youtube guy
> begins 2 hour video with "today i'm bored"
> continues to create ML library from scratch in C
> types everything manually at 150 wpm in neovim
> trains mnist classification model with his library
> "that is pretty cool"
> chuckles
> leaves

just imagine being that cracked
Lisan al Gaib tweet media
English
287
1.3K
35.5K
3.6M
Elizabeth Greene
Elizabeth Greene@GreeneElizabeth·
Black to play, in check, and white is in a time crunch. How do you escape?
Elizabeth Greene tweet media
English
1
0
1
65
The Lunduke Journal
The Lunduke Journal@LundukeJournal·
Rust programmers re-wrote a portion of the Linux kernel (Android's Binder) in Rust. (Because, it would seem, re-writing working code in Rust is a religious obligation for many.) That code was published with the Linux kernel update a few weeks back. Yesterday, it was revealed that there was a vulnerability in that code. That vulnerability (which could take down an entire system) is due to memory corruption in the "memory safe" Rust code. If you investigate the specific, offending Rust code, you'll find that the code is marked "unsafe". Which is a common word you will find throughout all Rust code within the Linux Kernel. lore.kernel.org/linux-cve-anno…
The Lunduke Journal tweet media
English
152
224
2K
323.4K
Chris
Chris@chatgpt21·
Holy sh1t they verified the results 🤯
Chris tweet media
ARC Prize@arcprize

We’re coordinating with @poetiq_ai to verify their reported ARC-AGI Public Eval score.
Only results on the Semi-Private hold-out set count as official ARC-AGI scores.
Once the verification is complete, we’ll publish the result and supporting datapoints.

English
63
98
2K
484.8K