Andrew Bonello
@AndrewBonello
21K posts
AI & Software. Comedy, Voice, Acting & Improv. Film Analysis & Reviews @FilmTagger_com. @Google @DWAnimation @Cinesite @StHughsCollege alum. Love #Golang.
Joined June 2009
3.4K Following · 2.5K Followers
Andrew Bonello retweeted
Patrick McKenzie (@patio11)
This essay is one of the most readable criticisms of US housing finance and policy I've ever seen: medium.com/@byrnehobart/the-30-year-mortgage-is-an-intrinsically-toxic-product-200c901746a
Andrew Bonello retweeted
✨ Jean Yang ✨ (@jeanqasaur)
@polikarn @thatplguy But! You may (correctly) say: people don't like to write specifications OR types. One of my other favorite papers is by Ras Bodik (again), about using machine learning to automatically mine specifications from program behavior. dl.acm.org/doi/10.1145/50… 6/
Andrew Bonello retweeted
François Chollet (@fchollet)
I'm joining forces with @mikeknoop to start Ndea (@ndeainc), a new AI lab. Our focus: deep learning-guided program synthesis. We're betting on a different path to build AI capable of true invention, adaptation, and innovation.
Andrew Bonello retweeted
Burny - Effective Curiosity (@burny_tech)
The Kalman filter is a widely used algorithm for estimating the hidden states of a dynamic system from a series of noisy measurements. It works by recursively predicting the system's state using a dynamic model, and then updating this prediction with new measurement data.

Some key points about the Kalman filter:
- It is an optimal estimator for linear systems with Gaussian noise, minimizing the mean squared error of the estimated state. [2]
- It consists of two main steps: prediction and update. In the prediction step, it estimates the current state based on the previous state and the system dynamics. In the update step, it incorporates a new measurement to correct the prediction. [3]
- It accounts for both process noise (uncertainty in the system dynamics) and measurement noise (errors in the sensor data). [1]
- It requires a mathematical model of the system dynamics (state transition matrix) and the measurement process (measurement matrix). [3]
- The filter is recursive, meaning it only needs the current measurement and the previous state estimate to compute the new state estimate, without requiring storage of the entire measurement history. [2]
- It has found widespread applications in areas like navigation, object tracking, signal processing, and control systems due to its effectiveness and computational efficiency. [2]

The Kalman filter provides an elegant and powerful solution for state estimation problems involving noisy sensor data and uncertain system dynamics, making it a fundamental tool in many engineering and scientific fields. [1][2][3]
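The predict/update cycle described above can be sketched for a scalar state. This is a minimal illustrative example, not code from the tweet; the function name and noise values are my own choices, and a real application would use matrices for multidimensional states.

```python
# Minimal 1D Kalman filter: one predict + update cycle per measurement.
# A: state transition, H: measurement model, Q: process noise, R: measurement noise.
def kalman_step(x, P, z, A=1.0, H=1.0, Q=0.01, R=0.5):
    # Predict: propagate the state estimate and its variance through the dynamics.
    x_pred = A * x
    P_pred = A * P * A + Q
    # Update: blend the prediction with the new measurement via the Kalman gain.
    K = P_pred * H / (H * P_pred * H + R)
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Filter noisy readings of a quantity that is actually about 5.0.
x, P = 0.0, 1.0  # deliberately bad initial guess with high uncertainty
for z in [4.8, 5.2, 5.1, 4.9, 5.0]:
    x, P = kalman_step(x, P, z)
# x moves toward 5.0 and P shrinks as measurements accumulate
```

Note the recursion the tweet emphasizes: each step consumes only the previous `(x, P)` and the new measurement `z`, never the full history.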
Andrew Bonello retweeted
kasra (@kasratweets)
Here are my notes on @fchollet's neat explanation of the differences between deep learning and program synthesis, the advantages and disadvantages of each, and how they'd fit together to build AGI.

In deep learning, your underlying model is a differentiable curve; in program synthesis, your model is a discrete graph of operators – you're picking from a set of operators and structuring them into a program. This has implications for the amount of compute and data each needs:
- In deep learning, your learning engine is gradient descent, which is very compute efficient – you have a very informative feedback signal about where the solution is. But it's very data inefficient – you need a dense sampling of the data distribution.
- In program synthesis, your learning engine is combinatorial search. This is extremely data efficient (I believe because the problem space is inherently more constrained?), but extremely compute inefficient (because the search space is massive).

How does this apply to AGI? Deep learning is great for System 1 thinking; discrete program search is great for System 2 thinking. AGI will likely require a combination of both approaches. Chollet expects that an AGI system would have an outer program that does program synthesis, using deep learning to assist it.
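The "combinatorial search over a set of operators" idea can be made concrete with a toy enumerative synthesizer. This is an illustrative sketch of the general technique, not Chollet's system; the operator set and function names are invented for the example. Note how little data it needs (two input-output pairs) versus how fast the search space grows with program length.

```python
# Toy enumerative program synthesis: search compositions of discrete operators
# for a program consistent with a handful of input-output examples.
import itertools

OPS = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def synthesize(examples, max_len=3):
    """Return the shortest operator sequence matching all examples, or None."""
    for length in range(1, max_len + 1):
        # Search space is |OPS|**length: exponential in program length,
        # which is exactly the compute inefficiency the notes describe.
        for prog in itertools.product(OPS, repeat=length):
            def run(x, prog=prog):
                for op in prog:
                    x = OPS[op](x)
                return x
            if all(run(i) == o for i, o in examples):
                return prog
    return None

# Two examples pin down f(x) = 2x + 1, i.e. double then increment.
print(synthesize([(1, 3), (2, 5)]))  # → ('dbl', 'inc')
```

A deep-learning-guided synthesizer, in the spirit of the notes above, would replace the blind enumeration order with a learned model that proposes likely operators first.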
Andrew Bonello retweeted
François Chollet (@fchollet)
I believe that program synthesis will solve reasoning. And I believe that deep learning will solve program synthesis (by guiding a discrete program search process). But I don't think you can go all that far with just prompting an LLM to generate end-to-end Python programs (even with a verification step and many samples). That won't scale to very long programs.
Andrew Bonello retweeted
Andrej Karpathy (@karpathy)
We have to take the LLMs to school. When you open any textbook, you'll see three major types of information:

1. Background information / exposition. The meat of the textbook that explains concepts. As you attend over it, your brain is training on that data. This is equivalent to pretraining, where the model is reading the internet and accumulating background knowledge.
2. Worked problems with solutions. These are concrete examples of how an expert solves problems. They are demonstrations to be imitated. This is equivalent to supervised finetuning, where the model is finetuning on "ideal responses" for an Assistant, written by humans.
3. Practice problems. These are prompts to the student, usually without the solution, but always with the final answer. There are usually many, many of these at the end of each chapter. They are prompting the student to learn by trial & error - they have to try a bunch of stuff to get to the right answer. This is equivalent to reinforcement learning.

We've subjected LLMs to a ton of 1 and 2, but 3 is a nascent, emerging frontier. When we're creating datasets for LLMs, it's no different from writing textbooks for them, with these 3 types of data. They have to read, and they have to practice.
Andrew Bonello retweeted
wh (@nrehiew_)
How to train a state-of-the-art reasoner. Let's talk about the DeepSeek-R1 paper and how DeepSeek trained a model that is at frontier Sonnet/o1 level.
Andrew Bonello retweeted
evanthebouncy (@evanthebouncy)
Want to get into program synthesis but don't know how to get started? I wrote a minimalist intro to modern program synthesis that can help you -- from problem formulation to generating code by fine-tuning an LLM on huggingface. evanthebouncy.github.io/program-synthe…
Andrew Bonello retweeted
Andrej Karpathy (@karpathy)
@ID_AA_Carmack The question is: will top AIs get better at GUIs faster than all apps add text? I think I have a guess.