Luke Darlow (@LearningLukeD) - Twitter Profile

Luke Darlow@LearningLukeD·17h

The full Transformer vs Post-Transformer debate is live. 80 minutes. Seven rounds. No slides. Real disagreement.

@lukaszkaiser came to defend the Transformer. @adrian_pathway, @YesThisIsLion, and @mlech26l made the case for what comes next.

00:00 Contenders enter the ring
06:30 Lukasz Kaiser defends the Transformer
10:08 Adrian Kosowski on BDH and the PageRank Moment for AI
17:35 Llion Jones: Why Transformers aren't the final architecture
29:50 Mathias Lechner on Liquid AI’s approach, Fast Weights, and Self-Replacing AI
40:28 Reasoning Beyond Language
44:15 Scaling Laws: Transformer vs Post Transformer
50:31 Benchmarks, Coding Models, and Perplexity
1:04:00 Continual Learning and Dynamic Weights

This is the ultimate source of truth on the subject.

Luke Darlow@LearningLukeD·11 May

@jeffreyseely All cats found.

Jeffrey Seely@jeffreyseely·11 May

@LearningLukeD Did you find them all?

Jeffrey Seely@jeffreyseely·27 Oct

can you find the cat?

Luke Darlow@LearningLukeD·7 May

@MLStreetTalk Interesting take. Your point on AI creating more complexity resonates with me because of What I Do (build funky ML models). Being able to push the envelope of complexity owing to AI tools really opens up some new doors for me.

Machine Learning Street Talk@MLStreetTalk·7 May

On the a16z piece: In Twitter discourse, you will see three positions on AI; it's a singularity-class event, it's an industrial-revolution-class event, or it's a "normal technology" like cloud computing.

The economy is a complex adaptive system. Contrary to popular belief, it is not a static "menu" of tasks waiting to be automated. There will always be more problems to solve and new areas to explore. Markets evolve, new jobs get created. That much they have right. So a16z groks this.

If every "productivity revolution" requires more workers afterward, we haven't actually made anything more efficient. More activity is not the same thing as more efficiency. More GitHub commits or papers published on arXiv is not a meaningful advancement in our epistemics. Same thing with more software engineers, more PMs, more cloud operators etc etc. At best, new jobs are being created. At worst, productivity in any meaningful sense isn't actually improving.

It's also increasingly difficult to measure what productivity even means with virtual technology. Roughly speaking, it's the output divided by the number of hours worked. But how do you even quantify output when it comes to knowledge work? Increasingly, we are using deflated revenue as a proxy but do we honestly think that the revenue of Anthropic and OpenAI are meaningful indicators of productivity increases?

Cloud, mobile, and spreadsheets were definitely Rubicon events. They redistributed jobs and disrupted markets. However - a spreadsheet doesn't replace accounting. You still actually have to do accounting. It's just an interface that reduces errors, increases efficiency and virtualizes the work. There is no actual intelligence in the spreadsheet software. You still need intelligence to use it, exactly like AI. What spreadsheets actually gave us was a reusable abstraction with a stable interface (that hid complexity) that everyone could adopt, so it was a form of canalisation.

Intelligence is going to a known place faster. A supervisor with partial domain knowledge guides AI to compress the path to the destination, but this is enacted ephemerally in the moment and vanishes after the session ends. For that ephemeral intelligence to be disruptive, it has to be packaged into a stable abstraction/interface and widely adopted. I've not seen any evidence of this yet.

AI is a "meta technology" that can be used by smart people to create domain-specific tools to get their work done, in an increasingly personal, specific and ephemeral way compared to previous technology - but the important thing to grasp is that this is (mostly) just creating more, not less, complexity.

AI is not really even automation technology in any meaningful way. For all but the most trivial of tasks, your hands need to be on the steering wheel every single second for complex/ambiguous tasks, and every day for ones less so. Without wanting to point out the obvious – anything that can be automated can't consistently yield a differential advantage in the marketplace anyway!

Evolution shows clear phase changes, which is to say, moments where intelligence produced a new coarse-graining that compressed everything that came before. So, for example; the cell, language, writing, money, the corporation, ... and even the spreadsheet! Each one let the system as a whole do "more with less". AI is still very much in the "more with more" category, it could potentially be the technology that helps us create the next real revolution, or conversely, it might actually slow us down.

And as for why some people don't think of AI as normal technology, the crux of it seems to be that they think it actually is intelligent and autonomous in a way that it (clearly, imo) isn't.

David George@DavidGeorge83

x.com/i/article/2052…

Luke Darlow retweeted

Sakana AI@SakanaAILabs·2 May

We are honored to be featured in the latest @TwoMinutePapers video! You all can watch the full video here: youtu.be/QzZ4VwDHAT4 Here’s a short clip from it:

Luke Darlow@LearningLukeD·2 May

Thank you so much @twominutepapers for featuring my work. I'm so happy that you had "an almost illegal amount of fun"!

Luke Darlow@LearningLukeD·1 May

@jeffreyseely captain cryptic strikes again

Jeffrey Seely@jeffreyseely·1 May

Sheaves coming to Seoul

Luke Darlow@LearningLukeD·21 Apr

@SakanaAILabs @blaiseaguera your book (whatisintelligence.antikythera.org) was instrumental in guiding my thinking when building this out. I think that there are similarities to "Computational Life" (arxiv.org/abs/2406.19108): self-replication, emergent symbiosis, and robust dynamic equilibria.

Luke Darlow retweeted

Sakana AI@SakanaAILabs·18 Apr

What happens when you put competing neural networks in a Petri Dish and start changing the rules while they adapt?

Last year we released Petri Dish NCA, where neural nets are the organisms that learn during simulation. Today we're releasing Digital Ecosystems: a browser-based platform for interactive artificial life research.

The setup: several small CNNs share a 2D grid, each seeing only a 3x3 neighborhood. No global plan. They compete for territory by attacking neighbours and defending against incoming attacks, learning via gradient descent online while the simulation runs.

What we didn't expect was the role of the learning itself. Gradient descent isn't just optimising each species' strategy. Instead, it acts to stabilize the whole system during simulation. Species that overextend get pushed back by the loss. Species that stagnate get nudged to grow. This means you can push parameters toward edge-of-chaos regimes: a zone characterised by emergent complexity. Letting the neural networks learn acts to hold the complex system together while you explore and interact.

The platform lets you steer all of this interactively. You can draw walls to create niches, erase parts of the system online, and tune 40+ system parameters to explore the most interesting configurations. We find it mesmerizing to watch species carve out territories and reorganise when you perturb them.

Everything runs client-side in your browser, no install needed.

Blog: pub.sakana.ai/digital-ecosys…
Code: github.com/SakanaAI/digit…
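
A minimal sketch of the setup described in the post above: several small models share a 2D grid, each sees only a 3x3 neighbourhood, and each learns by gradient descent while the simulation runs. This is not Sakana AI's implementation. It assumes NumPy, stands in a single linear 3x3 filter for each CNN, collapses attack and defence into one "strength" score per cell, and uses a hypothetical loss (negative mean territory share). The names `neighborhoods`, `step`, and `run` are illustrative only.

```python
import numpy as np


def neighborhoods(state):
    """Stack every cell's 3x3 neighbourhood: (H, W, K) -> (H, W, 9*K).

    The grid wraps around at the edges (an assumption made for brevity).
    """
    shifted = [np.roll(state, (dy, dx), axis=(0, 1))
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return np.concatenate(shifted, axis=-1)


def step(state, weights, lr=0.1):
    """One tick: all species act on the grid, then each takes a gradient step.

    state:   (H, W, K) soft ownership, summing to 1 over the K species per cell
    weights: (K, 9*K) one linear 3x3 filter per species (stand-in for a CNN)
    """
    patches = neighborhoods(state)              # local views only, no global plan
    strength = patches @ weights.T              # (H, W, K) score per species
    strength -= strength.max(axis=-1, keepdims=True)  # numerical stability
    new_state = np.exp(strength)
    new_state /= new_state.sum(axis=-1, keepdims=True)  # softmax competition

    # Hypothetical loss for species k: -mean(new_state[..., k]).
    # Species k's filter only affects its own strength, and
    # d softmax_k / d strength_k = p_k * (1 - p_k).
    slope = new_state * (1.0 - new_state)
    n_cells = state.shape[0] * state.shape[1]
    grads = -np.einsum("hwk,hwf->kf", slope, patches) / n_cells

    # Online learning: weights change while the simulation is running.
    return new_state, weights - lr * grads


def run(steps=50, size=16, species=3, seed=0, lr=0.1):
    """Run the toy simulation from a random start and return the final state."""
    rng = np.random.default_rng(seed)
    state = rng.random((size, size, species))
    state /= state.sum(axis=-1, keepdims=True)
    weights = rng.normal(0.0, 0.1, (species, 9 * species))
    for _ in range(steps):
        state, weights = step(state, weights, lr=lr)
    return state, weights


if __name__ == "__main__":
    final_state, _ = run()
    print("territory share per species:", final_state.mean(axis=(0, 1)))
```

Varying `seed` while holding `lr` fixed, and vice versa, is one rough way to compare seed sensitivity with rule sensitivity in this toy version. It says nothing about how the real platform behaves.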

Luke Darlow retweeted

Uljad@uljadb99·20 Apr

Huge thanks to Hannah Erlebach @hannaherlebach, Lukas Seiers @l_seier, Antonio León Villares @alv314159, Michael Beukman @mcbeukman for their great feedback that improved the paper. @LearningLukeD for the early support and for releasing the amazing substrate pub.sakana.ai/pdnca together with the awesome @_ivyzhang

Luke Darlow@LearningLukeD·19 Apr

@bytecrafter_1 @SakanaAILabs Sensitivity to rules is far higher. Try it for yourself. Hitting "reset" resets the models randomly with a different seed. You'll note that there is some sensitivity to seed, but stability in the system is more rule-dependent.

ByteCrafter@bytecrafter_1·19 Apr

@SakanaAILabs curious how much of the final dynamics is driven by the rule changes vs initialization noise across runs. feels like an easy place for the cooler result to be an artifact of the seed, not the rule set.

Luke Darlow@LearningLukeD·19 Apr

@leo_os8 @SakanaAILabs It certainly works as a tool for intuitively understanding how things like optimisers and learning rates function. There are so many advances and updates you can make to this (for example, pitting optimisers against one another) that I never got to. It's open source, though...

Leo@leo_os8·19 Apr

@LearningLukeD @SakanaAILabs Awesome. So does it teach you generalized learnings about what models or types of models fare better or worse? Any early findings?

Luke Darlow@LearningLukeD·19 Apr

@SakanaAILabs Sometimes the most fascinating science happens by accident. I was changing (and breaking..) something in the simulation a few days ago, accidentally stumbling across a different simulation landscape. In my opinion, one of the most diverse and interesting yet. So mesmerizing!

Luke Darlow@LearningLukeD·19 Apr

@leo_os8 @SakanaAILabs Exactly. But the most powerful thing I learned when building it was that real-time changes to these hyperparameters let you navigate to edge-of-chaos regimes, and that's where the interesting stuff happens. It's impossible to get there with hyperparameter search alone.

Leo@leo_os8·18 Apr

@SakanaAILabs Fascinating. So this lets you test different types of neural nets against each other? And different types of learning mechanisms?

Luke Darlow@LearningLukeD·19 Apr

Super happy to share my new research! I've been building out this web demo purely for the sake of understanding how the system works, but I thought I'd share it with the world. Enjoy!

Luke Darlow@LearningLukeD·15 Apr

This is some very cool work that builds on our original Petri Dish NCA (pub.sakana.ai/pdnca), using population-based meta optimization to enrich the complexity of the system. Their paper and website are both very excellent: pbt-nca.github.io

Luke Darlow@LearningLukeD·20 Mar

Whereas before it wouldn't be unheard of for me to "understand while coding". Several months ago I was convinced that part of my design and articulation process was enveloped in the actual coding. I think I was wrong.

Luke Darlow@LearningLukeD·20 Mar

Conversations about the impact of agentic work on humans are largely doomerist, and I dislike that. An unexpected positive is that I'm now required to design, think deeply, and articulate clearly what I want a system to do. This feels a lot like studying and learning.

Luke Darlow@LearningLukeD·20 Mar

"That feeling" of typing out a plan prompt for Claude that you couldn't fall asleep with last night because it was rattling around in your head. Several paragraphs, many enumerations, verbose instructions. Finally hitting enter and sipping on coffee feels so much like sci-fi.

Luke Darlow@LearningLukeD·19 Mar

@MLStreetTalk I think that another way of phrasing "laziness" is a hunger for bootstrapping. Building a function, particularly one that can compose with other functions, is addictive. It's kinda deep how closely linked this is with program synthesis, or even computational life (DNA bootstraps)

Machine Learning Street Talk@MLStreetTalk·19 Mar

The number one virtue of a programmer is laziness. Larry Wall nailed it in 1991 in Programming Perl. Every dev knows the feeling: you'd rather spend ten hours writing a script than ten minutes doing the boring thing by hand. The other day I pointed Claude Code at my receipt backlog in emails/vendor sites. Used QuickBooks CLI, Google Workspace CLI, browser automation etc. It uploaded 300+ receipts and categorised them all, and the cherry on the cake -- a summary email fired off to my accountant! I actually enjoyed it, it felt like I had dev-ified the task. It still took ages and was an iterative/interactive process (like all good AI actually is and always will be), but it was actually fun! Agentic AI is a developer's wet dream. AI makes miserable tasks genuinely fun because you're solving an interesting novel orchestration problem every time instead of manually clicking the download button on a hundred PDFs. Still a tonne of tacit technical knowledge needed though, accountants are safe for a while 😃
