Simon Ouellette

55 posts

@SimonOuellette6

I work with robots, lasers and neural networks.

Joined May 2017
59 Following · 113 Followers
Simon Ouellette@SimonOuellette6·
I want an AI orchestration framework that knows LLMs are dumb and mess things up all the time, so I vibe coded my own: github.com/SimonOuellette… I write tickets, specify their dependencies, and once a task completes, the framework waits for human verification before moving on to dependent tasks. It's like managing a team of (mediocre) junior developers.
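The tweet above describes the core mechanic: a ticket DAG where dependent tasks stay blocked until a human signs off on their prerequisites. A minimal sketch of that idea follows; the `Ticket`/`Orchestrator` names are illustrative, not the actual API of the linked repo.

```python
from dataclasses import dataclass, field

# Minimal sketch of a ticket DAG with human verification gates.
# Names (Ticket, Orchestrator) are illustrative, not the repo's real API.

@dataclass
class Ticket:
    name: str
    depends_on: list = field(default_factory=list)
    done: bool = False       # the LLM finished the work
    verified: bool = False   # a human signed off on it

class Orchestrator:
    def __init__(self, tickets):
        self.tickets = {t.name: t for t in tickets}

    def ready(self):
        """Tickets whose dependencies have all been human-verified."""
        return [t for t in self.tickets.values()
                if not t.done
                and all(self.tickets[d].verified for d in t.depends_on)]

tickets = [Ticket("schema"), Ticket("api", depends_on=["schema"]),
           Ticket("ui", depends_on=["api"])]
orch = Orchestrator(tickets)
print([t.name for t in orch.ready()])   # only "schema" is unblocked

orch.tickets["schema"].done = True
orch.tickets["schema"].verified = True  # the human verification gate
print([t.name for t in orch.ready()])   # now "api" is unblocked
```

The key design point is that `done` alone never unblocks dependents; only `verified` does, which is exactly the "wait for human verification" gate described above.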
Simon Ouellette@SimonOuellette6·
@fchollet This is a classic bias/variance trade-off. LLMs are very weakly biased, and therefore data-inefficient compared to humans, who have meta-learned biases over millions of years of evolution.
François Chollet@fchollet·
I keep reading this take (below) every few months, presented as if extremely profound, and it is just offensively dumb. It confuses data and information, it ignores the fact that not all information is equally valuable, and it ignores the importance of retention rate.

As a thought experiment: if this were true, if your retina cell count were 10x greater, you'd be "trained on 10x more tokens" and therefore you'd be way smarter. Same if their firing frequency were 10x greater. With 10x more retina cells firing 10x faster you'd be "trained on 100x more tokens"! Obviously this makes no sense -- the signal coming from these cells is extremely correlated over space and time, so their raw information content (what remains post-compression) is extremely low compared to the "raw bit" encoding.

The human visual system actually processes 40 to 50 bits per second after spatial compression. Much, much less if you add temporal compression over a long time horizon. Latest LLMs get access to approximately 3 to 4 orders of magnitude more information than a human by age 20 (post-compression in both cases). About O(10T) bits vs O(10-100B) bits.

And that's just *raw information* -- but of course not all information is equal, otherwise we wouldn't be spending tens of billions of dollars on training data annotation and generation. Plus, that's only *information intake* -- but of course humans have far lower retention than LLMs (by 3-4 OOM).

You could write a short essay about how incredibly off the mark this take is.
François Chollet tweet media
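The back-of-the-envelope numbers in the tweet above can be checked directly. A quick sanity check, using the tweet's own estimates (~45 bits/s of post-compression visual intake, and an assumed 16 waking hours per day, which is not stated in the tweet):

```python
# Rough check of the post-compression information budget in the tweet.
# Constants are the tweet's own estimates plus one assumption (16 waking h/day).
bits_per_sec = 45                        # "40 to 50 bits per second"
waking_secs_per_year = 16 * 3600 * 365   # assumed 16 waking hours/day
human_bits_by_20 = bits_per_sec * waking_secs_per_year * 20

llm_bits = 10e12                         # "O(10T) bits" of training data

print(f"human intake by age 20: {human_bits_by_20:.2e} bits")  # ~1.9e10
print(f"LLM/human ratio: {llm_bits / human_bits_by_20:.0f}x")  # a few hundred
```

The result lands inside the tweet's stated O(10-100B) bits range, and the ratio comes out at roughly three orders of magnitude, consistent with the "3 to 4 OOM" claim.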
Simon Ouellette@SimonOuellette6·
@fchollet 100%. Said differently: intelligence is an adaptive strategy in response to the non-ergodicity of life.
François Chollet@fchollet·
At its core, fluid intelligence is a survival strategy in novel, adversarial environments.
Simon Ouellette@SimonOuellette6·
@fchollet @MLStreetTalk Is this what is happening with the big frontier models saturating the ARC-AGI-2 public leaderboard? Did they just brute force it?
François Chollet@fchollet·
For benchmarks that target novel tasks, a common form of benchmark hacking that arbitrages this gap is to generate a dense sampling of potential tasks by manually parameterizing the space and then brute-forcing it. Very expensive but it works. There's little you can do to restore benchmark validity here besides increasing the dimensionality of the task space.
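The economics of the attack Chollet describes, and of his proposed fix, come down to one fact: grid-sampling a parameterized task space at resolution k per axis costs k^d samples, so raising the dimensionality d makes dense sampling infeasible. A toy illustration:

```python
# Why raising task-space dimensionality defeats dense-sampling attacks:
# covering a d-dimensional parameterization at k values per axis
# requires k**d generated tasks.

def dense_sample_cost(k: int, d: int) -> int:
    """Number of tasks needed to grid-sample a d-dimensional
    task parameterization at k values per parameter."""
    return k ** d

for d in (3, 6, 12):
    print(f"d={d:2d}: {dense_sample_cost(10, d):,} tasks")
# the cost grows exponentially in d: 1e3, 1e6, 1e12
```

At low dimensionality the brute force is "very expensive but it works"; a few extra independent task dimensions push the required sample count past what anyone can afford to generate and train on.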
Simon Ouellette@SimonOuellette6·
The message is clear: execution-guided neural program synthesis that leverages powerful CoT-trained LLMs is the way to go for ARC-AGI. This lineage of solutions keeps setting new records. The question that remains for those of us working on the Kaggle competition (rather than the public leaderboard) side of things is: how do we package this concept into something tiny and efficient that achieves the same results? Is it even possible to fit all the required core knowledge priors, and this level of coding ability, into a medium-size, efficient model that can run in the allotted time?
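The execution-guided loop mentioned above can be sketched in a few lines: sample candidate programs (in the real systems, from a CoT-trained LLM), actually run each candidate on the training pairs, and keep only those whose executed outputs match. Everything here is a toy stand-in; the sampler and function names are hypothetical.

```python
import random

def execution_guided_search(sample_program, train_pairs, n_candidates=100):
    """Sample candidate programs and keep those that reproduce every
    training input -> output pair when actually executed."""
    survivors = []
    for _ in range(n_candidates):
        prog = sample_program()  # in practice: an LLM-proposed program
        try:
            if all(prog(x) == y for x, y in train_pairs):
                survivors.append(prog)
        except Exception:
            pass  # crashing candidates are filtered out by execution
    return survivors

# Toy demo with a trivial "program sampler" over three candidate programs.
random.seed(0)
ops = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 1]
pairs = [(1, 2), (3, 6)]  # consistent only with x * 2
found = execution_guided_search(lambda: random.choice(ops), pairs)
print(all(p(5) == 10 for p in found))
```

The point of the "execution-guided" part is the filter: candidates are judged by what they actually compute on the examples, not by how plausible they look, which is what makes weak per-sample generators usable.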
Simon Ouellette@SimonOuellette6·
@riesinclair @cremieuxrecueil It's in the DSM-5. Difficulties with abstract reasoning, cognitive flexibility and conceptual thinking are characteristic traits and frequently reported deficits in autism spectrum disorder.
Rie Sinclair@riesinclair·
@SimonOuellette6 @cremieuxrecueil There’s no “lack” of abstracting concepts. There’s a massive communication difference. Think right hemisphere vs left. Explain in a way that makes sense and there’s no problem unless added intellectual limits which can happen to anyone.
Crémieux@cremieuxrecueil·
There might be a biological substrate for this. In autistics, there's significantly reduced cortical synaptic pruning, particularly during childhood and adolescence, leading to an overabundance of synapses in the brain. Autistic brains do ~1/3 the pruning normal brains do!
Crémieux tweet media
critter@BecomingCritter

the red lines represent trains

Simon Ouellette@SimonOuellette6·
One of the disadvantages of using a DSL-based approach for @arcprize is that you have to write the program ground truths for all your training data samples. This is time-consuming, and it means you can't readily leverage all available public training data.

To ramp up my training data generation efforts, I implemented a "Task DB" manager UI and an automatic task augmentation framework. This might be interesting to you as a way of diversifying your training dataset -- my framework can be used to generate task samples and manually (but also "semi-automatically") create new task sample generators.

Code, documentation, and the current task DB (which will grow over time) are available at: github.com/SimonOuellette…
Simon Ouellette tweet media
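One cheap kind of automatic augmentation for ARC-style tasks is recoloring: applying the same color permutation to a sample's input and output grids yields a new, still-valid sample for most tasks. This is a generic sketch of that idea, not code from the linked repo; it assumes the standard ARC encoding of grids as lists of ints 0-9, with 0 as background.

```python
import random

def permute_colors(sample, seed=None):
    """Make a new ARC-style task sample by applying one random color
    permutation to both the input and output grids. Color 0
    (background) is kept fixed, a common ARC convention. Note: this
    is only semantics-preserving for tasks that don't single out
    specific color values."""
    rng = random.Random(seed)
    colors = list(range(1, 10))
    shuffled = colors[:]
    rng.shuffle(shuffled)
    mapping = {0: 0, **dict(zip(colors, shuffled))}

    def remap(grid):
        return [[mapping[c] for c in row] for row in grid]

    inp, out = sample
    return remap(inp), remap(out)

inp = [[0, 1], [2, 2]]
out = [[1, 0], [2, 2]]
new_inp, new_out = permute_colors((inp, out), seed=42)
print(new_inp, new_out)  # same structure, consistently recolored
```

Because the input and output share one mapping, the underlying program ground truth is unchanged, so DSL program annotations can be reused across all recolored variants of a sample.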
Simon Ouellette@SimonOuellette6·
Sometimes it's good to zoom out a bit:
Simon Ouellette tweet media
Simon Ouellette@SimonOuellette6·
I'm skeptical of the theory that this chart is explained by AI taking entry-level programmer jobs. First, when ChatGPT first came out, it was a terrible coder: there is no way it was used to replace programmer jobs right away. Second, at my work, we use AI to increase our code output, not to hire fewer programmers. The demand for code is not static: increased code output == faster scaling and market adaptation. Third, this is likely a case of correlation != causation.
Simon Ouellette tweet media
Simon Ouellette@SimonOuellette6·
Yann LeCun: "No Free Lunch Theorem"
Demis Hassabis: "No, you're wrong. Turing completeness!"

Of course Yann LeCun is correct that human intelligence is specialized to whatever task distribution humanity has been facing over the course of its evolution. We have inductive biases and priors that make us super efficient at computing things that really matter to us (e.g. identifying objects in movement and predicting where they'll be next), and really inefficient at things we don't really care about (e.g. what is the 55th prime number?).

Demis Hassabis is also right that we are Turing-complete learners. However, a brute-force search over the space of all possible Python programs is a Turing-complete solver: so clearly intelligence is about more than just Turing completeness.

One is making a claim about efficiency and performance over task distributions, the other is making a claim about computability. Both claims are correct. In a way this is like the Nature vs Nurture debate spilling over into AI...
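The brute-force point above can be made concrete. Enumerating all programs over even a tiny toy DSL will eventually find any function the DSL can express -- "Turing complete in principle" -- but the number of candidates grows exponentially with program length, which is exactly why completeness says nothing about intelligence or efficiency. A sketch:

```python
from itertools import product

# Toy DSL: programs are sequences of primitive ops applied in order.
PRIMS = {"inc": lambda x: x + 1, "dbl": lambda x: x * 2,
         "neg": lambda x: -x}

def run(prog, x):
    for name in prog:
        x = PRIMS[name](x)
    return x

def brute_force(pairs, max_len=6):
    """Enumerate every op sequence up to max_len; return the first one
    consistent with all input/output pairs. Complete over this DSL,
    but at exponential cost: 3**L candidates at length L."""
    for length in range(1, max_len + 1):
        for prog in product(PRIMS, repeat=length):
            if all(run(prog, x) == y for x, y in pairs):
                return prog
    return None

print(brute_force([(1, 4), (5, 12)]))  # -> ('inc', 'dbl')
```

The enumerator "solves" everything its language can express, yet no one would call it intelligent: the interesting question is which priors let a learner skip almost all of that search.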
Simon Ouellette@SimonOuellette6·
I think that if we build an @arcprize puzzle set based entirely on priors that humans do not have -- priors with high Kolmogorov complexity in human reasoning terms -- we'll struggle on it as much as current AI struggles on ARC-AGI-2. Meanwhile, a neural program synthesis approach that has these "alien priors" built in would find the solutions quite easily, and an NN could be trained to solve the training set.

My point: now that the "static model" paradigm is dead, I think the next critical step is figuring out how to infuse one's solution with the exact human priors that were used to generate the ARC-AGI puzzles. So it's mostly all about the pretraining data (and the DSL, for those who use one) -- because that's how you infuse priors into your model. And that is not trivial.

NVARC beating the ARChitects by 7.5% despite using essentially the same solution but with a better training set is probably a good example of that.
Simon Ouellette@SimonOuellette6·
@MindsAI_Jack I remember that in my essay that won the Lab42 ARC essay competition years ago I was praising the UT like it's the Holy Grail for ARC. The concept is making a comeback it seems!
Simon Ouellette@SimonOuellette6·
I'm thinking of doing an experiment that I'd call "Alien ARC-AGI": we generate puzzles from entirely non-human priors and see how well humans perform on them. Would we be considered dumb by its standards? Would we be forced to rely on a very slow, cognitively expensive brute-force search? How many examples would we need to learn those alien priors? Would we perform better or worse than frontier AI performs on ARC-AGI? How well would our learning generalize to new compositions of the same alien priors?
Simon Ouellette@SimonOuellette6·
AI-generated code has at least two major flaws (note: I use Cursor AI):

1. Tendency towards long, flat, monolithic blocks of code with little to no code reuse. You have to prompt it explicitly to refactor the code in a more modular way, and even then it does so poorly, cutting off at arbitrary places that require method signatures with lots of arguments -- instead of designing the code for modularity from the get-go.

2. Tendency towards highly defensive code, checking every possible outcome (even those that literally cannot happen, or that should break and fail if they do happen) -- resulting in bloated code that fails silently.

From a purely coding-style perspective, AI is the worst programmer I've met in my 20 years of professional experience. The code it generates, unless heavily curated, quickly becomes unmaintainable. Still, somehow, it's a really useful programming tool nonetheless.
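The second flaw above is worth illustrating; the contrast below is a made-up example, not output from any particular tool. The defensive version swallows impossible states and returns a plausible-looking wrong answer; the fail-fast version lets a caller bug surface immediately.

```python
# Defensive style: silently papers over states that cannot legitimately occur.
def mean_defensive(xs):
    if xs is None:
        return 0          # hides a caller bug
    if not isinstance(xs, list):
        return 0          # hides a type error
    if len(xs) == 0:
        return 0          # 0 is not a mean; a silent wrong answer
    return sum(xs) / len(xs)

# Fail-fast style: let impossible inputs break loudly at the call site.
def mean(xs):
    return sum(xs) / len(xs)  # empty input raises, as it should

print(mean([1, 2, 3]))        # 2.0
print(mean_defensive(None))   # 0 -- the upstream bug is now invisible
```

Every `return 0` branch in the defensive version converts a detectable bug into quietly corrupted downstream data, which is exactly the "bloated code that fails silently" failure mode.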
Simon Ouellette@SimonOuellette6·
My Master's thesis was about showing that if a model is aware of its own epistemic uncertainty, neural planning/reasoning can be done in a very sample-efficient way: you search for paths that maximize expected reward while minimizing epistemic uncertainty. So your reasoning doesn't get derailed by hallucinations -- you don't bother with trajectories for which you know you can't make reliable predictions.

I keep coming back to the conclusion that epistemic uncertainty is (probably) important for AGI. When it comes to neurally guided search, RL, etc., the probabilities we use don't actually represent epistemic uncertainty -- instead they reflect the probability distribution found in the training set. The probabilities themselves can be hallucinated when making out-of-distribution predictions. So the model doesn't know that it doesn't know.
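One standard way to approximate the idea in the tweet above is ensemble disagreement: train several models, and treat the spread of their predictions as a proxy for epistemic uncertainty, which is high exactly on out-of-distribution actions. The planner then penalizes that spread. This is a generic sketch of the principle, not the thesis's actual method; the names and the penalty weight are illustrative.

```python
import statistics

def score_action(ensemble, state, action, beta=1.0):
    """Expected reward minus an epistemic-uncertainty penalty.
    Ensemble disagreement (stdev of member predictions) proxies
    for 'the model doesn't know' on out-of-distribution actions."""
    preds = [model(state, action) for model in ensemble]
    return statistics.mean(preds) - beta * statistics.pstdev(preds)

def plan(ensemble, state, actions, beta=1.0):
    """Pick the action with the best uncertainty-penalized score."""
    return max(actions, key=lambda a: score_action(ensemble, state, a, beta))

# Toy ensemble: members agree on action 0, wildly disagree on action 1.
ensemble = [lambda s, a: [1.0, 5.0][a],
            lambda s, a: [1.1, -4.0][a],
            lambda s, a: [0.9, 8.0][a]]
print(plan(ensemble, state=None, actions=[0, 1]))  # picks 0: confident beats lucky
```

With `beta=0` the planner would chase action 1's higher mean prediction, which is exactly the hallucination-chasing behavior the penalty is there to prevent.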
Simon Ouellette@SimonOuellette6·
@MindsAI_Jack The problem is: how do you measure the existence of consciousness? You can't; it's a purely non-physical phenomenon. How do you know I experience qualia? Maybe I'm just a pure automaton with no subjective experience. There is no difference, materially or objectively, between the two.
Jack Cole@MindsAI_Jack·
@SimonOuellette6 Yes, many see it that way. Wonder if Planck's perspective would be falsified if consciousness controllably emerged in NNs and you could turn it on/off (showing computability).
Jack Cole@MindsAI_Jack·
Given that so many species are conscious, assuming it is computable, assuming it helps prediction (has a real effect), then it could flicker briefly in LLMs. It appears to, but researchers should try to disprove it using valid methods. They have failed to so far.
Dileep George@dileeplearning

AI consciousness, qualia, and personhood... my current thoughts.

Can AI systems have consciousness? Yes, I think it is possible to build AI systems to have consciousness. While we haven't pinned down exactly what it means, we will. Consciousness is related to information processing and representation, and is substrate independent.

Is language required for consciousness? No.

Does adding consciousness to an AI system have implications? Adding consciousness will make an AI system more performant. Not all architectures are compatible with implementing a consciousness loop. The performance implications depend more on the kind of world models the AI system has than on the consciousness loop itself. Adding consciousness to simple world models will have almost no effect, whereas adding consciousness to complex human-like world models can make a significant difference in performance.

Will AI systems feel the same 'qualia' as we feel? Qualia is the 'feel' associated with a sensation or perception. Why does the 3D world feel like it does? Why does red feel like red and not like a bell? For some modalities, AI systems will have qualia similar to ours. For example, perception of space could be one where AI systems have the same qualia, if it is implemented similarly to that in animals. However, if spatial perception is implemented using LIDARs, they will have very different qualia compared to ours. For things like the feeling of taste or smell, qualia can be entirely different for an AI system. For pleasure and pain, quite a large chunk of our qualia comes from our physical embodiment and biochemistry. Those will be very different for AI systems.

Is consciousness the same as qualia? While consciousness is required for qualia, not all conscious access might be associated with qualia. Moreover, qualia can be tied to the substrate -- the feeling of how food tastes might be tied closely to our biological implementation.

Do LLMs feel pleasure and pain, and are they like ours? No, LLMs do not feel pleasure and pain in the sense that we do.

Should we assign 'personhood' to AI systems? First let's consider what 'personhood' means to us. 1) We are unique and destructible. You cannot be resurrected if you die, at least not yet. And our lifetimes are finite and we roughly know the maximum amount of time we have. 2) We are people because we have experiences and we remember them -- childhood, growing up, teenage years, falling in love, enjoying sports and nature, getting sick, caring for others, losing loved ones. Our personhood is intricately tied to this experience. AIs do not intrinsically have these properties. They are not mortal. They can have many other characteristics of consciousness we might not have. While it is possible for us to build AIs with some of the human constraints that give us personhood, I don't see why we need to do that. We should lean on the advantages of the artificial system, rather than impose constraints that lead us to attribute personhood to it. We might build robots for role play and amusement (a la Westworld) by faking these constraints to give us an illusion of personhood, but those can be built in a way that doesn't require granting our robots personhood.

Does adding consciousness to an AI system have moral implications? I think consciousness can be decoupled from feelings of pain, pleasure, and suffering. I think it is possible to build systems that are conscious but do not have pain and suffering. Moreover, since AI systems are not mortal, I don't think they need to be considered as persons. So, I think it is possible to build conscious AI systems to serve us without any moral concern that we are exploiting or torturing AIs.

Simon Ouellette@SimonOuellette6·
@fchollet You hypothesize, you experiment, you learn. Surely there is a name for that.
François Chollet@fchollet·
ML research is an engineering discipline, not a philosophy seminar. You build, you test, you learn. Untested ideas are just speculation.
Simon Ouellette@SimonOuellette6·
One of the consequences of living in one of those countries that likes to regulate everything to death: I'm no longer eligible for any cash prizes from #arcprize -- Payoneer no longer offers its services in my area. This won't stop me from continuing my research on ARC-AGI, but needless to say it will affect how I prioritize things.