Richard Sutton

424 posts


@RichardSSutton

Student of mind and nature, libertarian, chess player, cancer survivor. @ Keen, UAlberta, Amii, https://t.co/u8za2Kod54, The Royal Society, Turing Award

Edmonton, Alberta, Canada · Joined October 2010
59 Following · 60.6K Followers
Pinned Tweet
Richard Sutton@RichardSSutton·
AI researchers seek to understand intelligence well enough to create beings of greater intelligence than current humans. Reaching this profound intellectual milestone will enrich our economies and challenge our societal institutions. It will be unprecedented and transformational, but also a continuation of trends that are thousands of years old. People have always created tools and been changed by them; this is what humans do. The next big step is to understand ourselves. This is a quest grand and glorious, and quintessentially human.
English
74
146
979
269.5K
Richard Sutton@RichardSSutton·
If you are interested, you can learn a bit more about me from this video portrait from the Heidelberg Laureates Forum: youtu.be/jRPR6lx-iuw?si…
English
4
18
126
12.5K
Richard Sutton retweeted
Joseph Modayil@JosephModayil·
A recent paper answered a question I had for over twenty years: how does a brain organize the sense of smell? This mouse study shows a 1 dimensional spatial code gives a brain map for ~1000 different smell sensors. This raises so many more questions. cell.com/cell/fulltext/…
English
2
17
62
11.4K
Richard Sutton retweeted
Furkan Gözükara@FurkanGozukara·
Amazing. No words to describe this tune. Emotional and to the point. New Iranian LEGO movie : We Share the Same Pain 💔 😭🥹 via Brick Beat Battalion
English
296
5.2K
12.5K
315.8K
Richard Sutton@RichardSSutton·
RL Ethics has a Predictive Semantics

I would like to try to explain the view of ethics and values that arises from my research in reinforcement learning in simple, layman’s terms that are accessible to all.

Reinforcement learning agents seek to maximize their reward over time, where reward is essentially pleasure minus pain. This is not quite hedonism, because the maximization takes into account all the consequences, long-term as well as short. A reinforcement learning agent might endure pain to get a larger pleasure later, or forgo an immediate pleasure if it stored up later, greater pain.

Formally, reward is a number at each time step, and the reinforcement learning agent seeks to maximize value, the sum of the rewards at future time steps. (This could be defined precisely with some math.) The assignment of rewards to time steps is a free choice that defines the agent’s goal; different agents could have different rewards, and there is no basis (yet) for preferring one set of rewards over another.

Value, though, is a different matter. Given a world and a way of generating rewards, the true values at each time step are fully determined. The rewards are primary, dependent on nothing else, whereas the values are secondary, following from the rewards (and the dynamics of the environment). In decision making, the agent should make the choice that leads to highest immediate value, not highest immediate reward.

If rewards are arbitrary, values follow from the rewards, and correct behavior follows from the values, then all seems straightforward. What about all the complexities and controversies of ethics? Some of these are still present, arising because the values, though well defined, are initially unknown and can be difficult to calculate or learn. If the agent has knowledge of the world, then it may be able to calculate the values, but to do so exactly generally requires too much knowledge, computation, and memory. In practice, in new situations the calculation must be done partly at decision time, and cannot be done to completion without slowing down action selection too much. In the absence of knowledge and computation, but given a generous allocation of memory and time, the agent can alternatively learn the values, again approximately. It is common for the agent to store an approximation to the values of the world’s states, and then to gradually improve these approximations (these predictions of subsequent rewards) through further experience.

The stored approximate values are immediately available estimates of the desirability of situations; they are directly analogous to our intuitive sense of good and bad. They are ready for immediate use, but may only be rough approximations to the true values. They may be made more accurate with calculation (if the agent has knowledge) or learning (with more experience).

This completes the explication of the value system of the individual. Next we will go on to consider the value systems of groups. But the individual forms such an essential foundation, one that is never replaced, so let’s dwell on it a moment longer by reviewing its stark tenets: Each agent wants to get pleasure (reward) from the world. Pleasure is built into the agent and obvious when it happens, but when it will happen depends on the world and must be learned or calculated, and the world is too complex for either of these methods to yield answers that are completely correct. That is, every state of the world has a real, objective value (the amount of pleasure that will follow it), but estimates of its value are subjective. Forming better value estimates is a major cognitive task. They are a key intermediate step towards getting more pleasure from the world. Agents work on this all the time. It determines what they do.

If an agent lived alone, then this would be the end of our discussion of values and ethics. But people are not solo agents. People’s worlds are composed, in part, of other people, and this has many impacts on their attempts to estimate value and obtain reward. They live within groups of agents with whom they interact frequently and who are major determinants of their success in obtaining reward. And thus, to achieve our reward, each of us must take into account, as best we are able, the rewards and values of those around us. …

The most important insight is that it's all right, and perhaps obligatory, for the ultimate value to be hedonic (based on reward), as long as it is not "selfish" (disregarding the impact on others). The ultimate meaning of something being good, or right, or ethical, or moral, is that it will probably have a good outcome for the individual. Whether it will or not is extraordinarily difficult to calculate, so instead we use heuristics: approximations using features of a situation. The mistake is to think that those features are definitional rather than approximately predictive. The real definitional meaning of good is that it turns out well for us on average.
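
As an illustration of the distinction the post draws between rewards, true values, and learned value estimates, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the four-state episode, its reward numbers, the names `REWARDS`, `true_values`, and `td_learn`, and the use of an undiscounted sum. The learning rule is TD(0), one standard temporal-difference method; the post itself does not name a specific algorithm.

```python
# Hypothetical episode: the reward received on leaving each of four states.
# Pain first (-1), a larger pleasure at the end (+5).
REWARDS = [-1.0, 0.0, 0.0, 5.0]


def true_values(rewards):
    """True value of each state: the sum of all rewards from that state onward."""
    values = []
    total = 0.0
    for r in reversed(rewards):
        total += r
        values.append(total)
    return list(reversed(values))


def td_learn(rewards, episodes, step_size=0.5):
    """Learn approximate values from repeated experience with TD(0)."""
    n = len(rewards)
    estimates = [0.0] * (n + 1)  # extra slot: terminal state, value stays 0
    for _ in range(episodes):
        for s in range(n):
            # Target = immediate reward plus the current estimate of the next state.
            target = rewards[s] + estimates[s + 1]
            estimates[s] += step_size * (target - estimates[s])
    return estimates[:n]


if __name__ == "__main__":
    print("true values:       ", true_values(REWARDS))
    print("after 1 episode:   ", td_learn(REWARDS, episodes=1))
    print("after 200 episodes:", td_learn(REWARDS, episodes=200))
```

In this toy case the first state has the lowest immediate reward but a positive true value, which matches the post's point about enduring pain for a larger pleasure later. After one episode the stored estimates are only rough; with more experience they approach the true values.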
English
12
56
310
31.5K
Richard Sutton retweeted
Khurram Javed@kjaved_·
It is good to have a well-funded place that understands that intelligence is not about distilling human knowledge and skills into neural networks. Most of the current AI systems, while amazing, outsource discovery of knowledge to humans. Systems that learn from human knowledge, in the limit, will be able to know and do everything that humans can do currently. These systems are immensely valuable, but they will be continually superseded by humans who can both use these systems and learn from their own experience. Systems that learn from their own experience, in the limit, will be able to do things that no human can do now and no human would even be able to do. The latter systems would be vastly more powerful.
Ineffable Intelligence@IneffableLabs

Introducing Ineffable Intelligence. Led by David Silver, we're assembling the best engineers and researchers in the world to make first contact with superintelligence. We’ll be solving the hardest problems in AI on the way. Come join us. ineffable.ai

English
8
13
149
22.3K
Richard Sutton retweeted
Khurram Javed@kjaved_·
The naive way to look at the future of AI is to look at numbers like network parameter count, transistor density, and benchmarks, and reason about the future solely from these numbers. This way prevails because it doesn't require thinking about the details of computer architectures and algorithms. It leads to reductionist arguments like "any marginal compute given to an adversary would be devastating because is dangerous." The reality, as Jensen points out multiple times, is more complex. Unless there is an unprecedented breakthrough in chip manufacturing, the path to better AI is through better learning algorithms and architectures. This has been the trend for the last five years and will likely continue to be so. I know from my own work that you can get 100x to 1000x gains in computational efficiency by using better learning algorithms. Jensen also correctly argues that an ASIC that runs a specific model efficiently is not a replacement for the CUDA stack. I have been working with non-traditional algorithms over the past few years (sparse event-driven neural networks), and the specialized ASICs (including Nvidia's Tensor Cores) are largely useless. CUDA cores, on the other hand, are extremely good at running these algorithms even though they were not designed for them. In a world where CUDA didn't exist, CPUs would be the best computing platform for discovering better algorithms, not TPUs. You cannot achieve human-like learning at human-like energy consumption simply by improving chip manufacturing. The right algorithms running on the right chip made by the 7 nm process would vastly outperform the current algorithms running on the best chips made by the 2 nm process.
Dwarkesh Patel@dwarkesh_sp

The Jensen Huang episode. 0:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains? 0:16:25 – Will TPUs break Nvidia’s hold on AI compute? 0:41:06 – Why doesn’t Nvidia become a hyperscaler? 0:57:36 – Should we be selling AI chips to China? 1:35:06 – Why doesn’t Nvidia make multiple different chip architectures? Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!

English
3
10
140
21.4K
Richard Sutton retweeted
James Melville 🚜@JamesMelville·
Lebanon is being destroyed. 1.2 million people, including 400,000 children, forced out of their homes by bombardment. Towns and villages flattened. Innocent civilians killed. Parents grieving over their lost children. There is no justification for this.
English
1.2K
10K
16.6K
325.7K
Richard Sutton retweeted
Bernie Sanders@BernieSanders·
This week, I will be forcing a vote to block nearly $500 million in bombs and bulldozers to Israel.  Enough is enough. U.S. taxpayers must not keep funding the Netanyahu government’s mass killing and displacement of civilians in Gaza, Iran and Lebanon.
English
3.5K
12.6K
75.1K
1.2M
Richard Sutton retweeted
Runas Dos Lunas@DosRunas·
🔴 In a gesture of historic firmness that resonates amid the existential chaos of the conflict, Italian Prime Minister Giorgia Meloni has raised her voice from the UN rostrum with a clarity that cuts through the air. “I accuse Israel of having crossed the red line. I condemn without reservation the massacre of Palestinian civilians, and I announce that Italy will support European sanctions against the Israeli state,” she declared with determination. This statement marks a brave and necessary shift in the position of a government that, until now, had maintained close ties with Tel Aviv. Meloni not only questions the proportionality of the military operations but openly denounces the violation of the most basic humanitarian norms, that barbarity that turns entire neighborhoods into graves and childhood dreams into rubble. She speaks of a “strage,” an unacceptable slaughter, that can no longer be ignored under the cloak of “self-defense” when the price is paid with the blood of innocents. Italy's decision to back restrictive measures at the European level, together with the suspension of the automatic renewal of the defense cooperation agreement, reveals a stance that puts human dignity above comfortable strategic alliances. It is an act of ethical coherence in a world where complicit silence has become all too common. Women with spines of steel, yes. Because it takes a backbone forged in the hermeneutics of others' pain, in the phenomenology of a suffering that can no longer be normalized, to look power in the face and say: “This far and no further.”
In these times of imperial darkness, where international law bends like paper before brute force, gestures like this one from Meloni remind us that true sovereignty lies not in obedience to hegemonic blocs but in the uncompromising defense of life, of justice, and of that deep humanism that makes us, even amid the horror, keep believing in the possibility of a less cruel world. May this voice inspire more leaders, whether of the left or the right, to break the silence, and may the song for peace in Palestine not fall silent. Because every murdered civilian is an interrupted symphony, and all of humanity deserves that we sing it again in full. 🔥
Español
144
2K
4.2K
151.5K
Richard Sutton retweeted
François Chollet@fchollet·
One thing about DL researchers that has always been surprising to me, is that a lot of them have never been exposed to forms of learning other than fitting the parameters of a curve via gradient descent, and are even unable to conceive that there might exist other options
English
90
129
1.7K
152.8K
Richard Sutton retweeted
Bernie Sanders@BernieSanders·
While the world focuses on the destruction in Iran, we must not ignore what Israel is doing in Lebanon. 1,461 have been killed. 4,430 have been injured. 1.2 million have been displaced. Israel now occupies 14% of Lebanon. Enough is enough. No more US military aid to Israel.
English
6.2K
44.5K
193.2K
3.2M
Richard Sutton retweeted
Massimo@Rainmaker1973·
This might be one of the most detailed Moon images ever captured. 1000 frames stacked using a Nikon Z8 and Takahashi TSA-120 telescope, producing a stunning 40MP masterpiece.
English
53
462
3.4K
116.8K
Richard Sutton retweeted
Ron Paul@RonPaul·
Trump's Most Unhinged Social Media Post Yet
English
37
146
755
32.9K
Richard Sutton retweeted
Masoud Pezeshkian@drpezeshkian·
To the people of the United States of America
[Four attached images]
English
8.5K
47.7K
167.9K
18.2M