Timoleon (Timos) Moraitis
@timos_m

2.1K posts

Building brain-like AI @noemon_ai Previously @Huawei @IBMResearch @ETH_en @UZH_en @ntua

Zurich, Switzerland · Joined January 2009
1.9K Following · 1.4K Followers
Timoleon (Timos) Moraitis
@kevinmcld "The service said it would use this principle to build the first of its Perceptron thinking machines that will be able to read and write." Admittedly, at least this part worked.
De myth ology @kevinmcld
@timos_m It doesn't really work; it just appears to, fleetingly. The binary was, and still is, just a toy model of reality.
Dane Malenfant @dvnxmvl_hdf5
@kalomaze It feels like everything fun was defined in the 90s and now we are just scaling
Emmett Shear @eshear
Learning is as much about effectively forgetting noise as it is about remembering signal.
Nabil Iqbal @nblqbl
@timos_m @noemon_ai super interesting. is it easy to understand why having a local learning rule would lead to features with these properties? (is there e.g. a toy model?)
Timoleon (Timos) Moraitis
Models trained with certain biologically-inspired local learning rules rather than backpropagation can be MUCH more robust to adversarial attacks, as we have shown in the past. Our view at @noemon_ai is that the locality of Hebbian plasticity in certain architectures leads to more compositional features, which in turn helps address the binding problem and adversarial robustness. In this new framework of "Perceptual Manifold" (PM) by @AleSalvatore00, @stanislavfort and @SuryaGanguli, our models would have measurably smaller PM than the backprop-trained ones.
Surya Ganguli @SuryaGanguli

Our new paper: "Solving adversarial examples requires solving exponential misalignment", expertly led by @AleSalvatore00 w/ @stanislavfort arxiv.org/abs/2603.03507

Key idea: We all want to align AI systems to human values and intentions. We connect adversarial examples to AI alignment by showing they are a prototypical but exponentially severe form of misalignment at the level of perception. The fact that adversarial examples have remained unsolved for over a decade thus serves as a cautionary tale for AI alignment, and provides new impetus for revisiting them.

We shed light on why adversarial examples exist and why they are so hard to remove by asking a basic question: what is the dimensionality of neural network concepts in image space? For ResNets and CLIP models, we show that neural network concepts (the space of images the network confidently labels as a concept) fill up almost the ENTIRE space of images (~135,000 dimensions out of ~150,000 for ImageNet & ~3,000 out of 3,072 for CIFAR10). In contrast, natural image concepts are only ~20 dimensional.

This indicates exponential misalignment between brain and machine perception (neural networks perceive exponentially many images as belonging to a concept that humans never would). This also explains why adversarial examples exist: if a concept fills up almost all of image space, ANY image will be close to that concept manifold.

We further run experiments across >20 networks showing that adversarial robustness inversely relates to concept dimensionality, though even the most robust networks do not completely align machine and human perception.

Overall, the curse of dimensionality raises its ugly head as an impediment to both adversarial examples and alignment: it can be difficult to get AI systems to behave in accordance with human intentions, values, or perceptions over an exponentially large space of inputs.

See @AleSalvatore00's excellent thread for more details: x.com/AleSalvatore00…
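As a rough illustration of the kind of measurement described in the quoted thread (a toy sketch, not the paper's actual procedure), one can probe how much of image space a classifier's concept occupies locally: perturb an image along many random directions and count how often the model still confidently assigns the same label. The model choice, perturbation size eps, and confidence threshold below are placeholder assumptions.

```python
# Toy probe (illustrative only): around a given image, count the fraction of
# random perturbation directions along which the classifier keeps confidently
# predicting the same class. A fraction near 1.0 means the concept region
# locally fills almost every probed direction, i.e. the high-dimensionality
# picture described in the quoted thread. Model, eps, and threshold are
# placeholder assumptions, not values from the paper.
import torch
import torchvision.models as models
from torchvision.transforms import functional as TF

model = models.resnet18(weights="IMAGENET1K_V1").eval()

def logits(x):
    # Standard ImageNet normalization, applied to a (3, H, W) image in [0, 1].
    x = TF.normalize(x, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    return model(x.unsqueeze(0))

def confident_direction_fraction(image, eps=8 / 255, threshold=0.5, n_dirs=200):
    """Fraction of random directions that keep the top-1 class confidently unchanged."""
    with torch.no_grad():
        label = logits(image).softmax(dim=1).argmax(dim=1).item()
        kept = 0
        for _ in range(n_dirs):
            # One random corner of the L-infinity ball of radius eps around the image.
            perturbed = (image + eps * torch.sign(torch.randn_like(image))).clamp(0, 1)
            probs = logits(perturbed).softmax(dim=1)[0]
            if probs.argmax().item() == label and probs[label].item() > threshold:
                kept += 1
    return kept / n_dirs

# Usage (hypothetical): frac = confident_direction_fraction(some_224x224_image_tensor)
```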

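For context on what a "local" learning rule means in the post above: unlike backpropagation, a Hebbian update changes each weight using only the pre- and post-synaptic activity that the weight directly sees, with no error signal propagated back from a loss. Below is a minimal sketch of one classic such rule (Oja's rule) for a single linear layer; it only illustrates locality and is not @noemon_ai's actual rule or architecture.

```python
# Minimal sketch of a local Hebbian learning rule (Oja's rule) for one linear
# layer. Each weight is updated from the pre-synaptic input x and the
# post-synaptic output y that it directly connects, with no backward pass.
# Illustrative only; not the specific rule or architecture used by @noemon_ai.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, lr = 64, 16, 1e-2
W = rng.normal(scale=0.1, size=(n_out, n_in))

def oja_step(W, x, lr):
    """One unsupervised, purely local update: dW_i = lr * (y_i * x - y_i^2 * W_i)."""
    y = W @ x  # post-synaptic activity
    return W + lr * (np.outer(y, x) - (y ** 2)[:, None] * W)

# Train on random vectors standing in for input patterns (placeholder data).
for _ in range(1000):
    W = oja_step(W, rng.normal(size=n_in), lr)

# With this per-unit rule every row of W drifts toward the leading principal
# direction of the inputs; variants such as Sanger's rule decorrelate the units.
```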
Timoleon (Timos) Moraitis
@eshear If all models are overfit, does "overfitting" mean anything, and does it identify how to solve the issue? But you are technically right.
Emmett Shear @eshear
@timos_m It is not a separate problem. It’s literally too many dimensions in the representation space — overfit.
Alessandro Salvatore @AleSalvatore00
Why can't we solve adversarial examples? After a decade of work, neural nets still get fooled by imperceptible noise. We think we finally know the geometric reason why — and it connects to AI alignment. 🧵
Beff (e/acc) @beffjezos
In 3.5 years @extropic:
- reinvented how to use the transistor
- reinvented architectures for probabilistic compute
- reinvented deep learning for thermo compute
- created our CUDA-like THRML
- created our TF-like framework (coming soon)
- scaled our systems 1000x yoy (3 gens of TSUs)
Timoleon (Timos) Moraitis
Reminder: Long Short-Term Memory (LSTM) has been far outperformed. By Synaptic Plasticity, by us. We also showed this enables chips that burn 100x less power than GPUs. As another reminder, we have ~stopped publishing since. But the progress of the team @noemon_ai has been relentless, so feel free to extrapolate. ICML & Nature Comms paper links in the comment below 👇
Rohan Paul @rohanpaul_ai

Sam Altman just said in his new interview that a new AI architecture is coming that will be a massive upgrade, just like Transformers were over Long Short-Term Memory. He also said the current class of frontier models is now powerful enough to have the brainpower needed to help us research these ideas. His advice is to use the current AI to help you find that next giant step forward. --- From 'TreeHacks' YT Channel (link in comment)

Timoleon (Timos) Moraitis
They don't report only median/robust. The more fine-grained analysis looks even more pessimistic to me. You do have a point, but I think what you/we are experiencing is very specific to ML experimentation. It's a rare combination of high-value long horizons on the one hand, and dense verifiability with standard, publicly known recipes on the other.
Dimitris Papailiopoulos @DimitrisPapail
METR and other long-horizon eval orgs are being conservative and moderate in how they measure agent capabilities. That's reasonable, as we already have enough hype and don't need more. But I think we're missing something important by only reporting median/robust performance.

I've had Claude Code and Codex sustain end-to-end ML research tasks for days without intervention. Not robustly across all settings, but it's happening and it's incredible.

We need a shameless, cherry-picked frontier eval. Not to mislead, but because knowing exactly where the ceiling of capabilities lies is just as important as knowing the average.

I keep seeing pessimistic long-horizon results and thinking: am I in a bubble? Are MY 50-hour autonomous tasks a hallucination? I don't think they are!! AI agents can do sustained multi-day research. Not always and not for everyone, but it's real and people should know where the frontier actually is.