Aman

5.6K posts

@arcaman07

Incoming MS CS @GeorgiaTech Scaling RL + Continual Learning @lossfunk

Joined August 2020
224 Following · 975 Followers
Pinned Tweet
Aman@arcaman07·
Can frontier AI models actually read a painting? I tested 4 frontier AI models on 15 artworks worth $1.46B, first from the image alone and then with basic metadata. What I found was not just a performance gap, but a recognition vs commitment gap. Three of the four models could identify the correct artist from pixels alone on essentially every painting. But that did not mean they would commit to the valuation implied by what they saw. Gemini 3.1 Pro was strongest in both settings. GPT-5.4 improved sharply once metadata was added. Blog: arcaman07.github.io/blog/can-llms-…
Aman@arcaman07

x.com/i/article/2044…

0 replies · 1 repost · 7 likes · 737 views
Aman reposted
Matt Mullin@matthewwmullin·
NASA HAS RELEASED OVER 12,000 IMAGES OF THE ARTEMIS II MISSION. Unbelievable perspectives captured by the Crew! The aurora during the eclipse is incredible.
273 replies · 8.7K reposts · 60.4K likes · 1.9M views
Aman reposted
𝗿𝗮𝗺𝗮𝗸𝗿𝘂𝘀𝗵𝗻𝗮— 𝗲/𝗮𝗰𝗰
Stanford's latest seminar is a deep dive into the evolution of world modeling in AI, focusing on the shift from traditional reconstruction methods toward latent-space prediction. Covers topics like:
- Introduction to JEPA & World Models
- Causal JEPA
- LOWER Model
- Practical Applications & Planning
- Future Outlook
21 replies · 164 reposts · 1.5K likes · 198.6K views
Aman reposted
Lawrence Chan@justanotherlaw·
A recent viral paper claims to reverse-engineer the parameter counts of frontier models: GPT-5.5 = 9.7T, Opus 4.7 = 4.0T, o1 = 3.5T, etc. @ben_sturgeon and I investigated and found serious issues in the paper; fixing them gives GPT-5.5 as ~1.5T (90% CI: 256B-8.3T).
29 replies · 96 reposts · 950 likes · 204.5K views
himanshu@retr0sushi_·
always a beginner :) ps : if you have resources or roadmaps don't be shy to share them with me pls!
6 replies · 1 repost · 42 likes · 2.9K views
Aman@arcaman07·
@carlagriffs it's happening quite often. It left me a bit confused: what is the point of rebuttals if these issues still persist?
1 reply · 0 reposts · 2 likes · 518 views
Carla Griffiths@carlagriffs·
@arcaman07 sorry man, i def relate, there's this new persistent trend of goal post moving in tier A conferences
1 reply · 0 reposts · 2 likes · 621 views
Aman@arcaman07·
ML conference timeline:
1) Submit the paper you have been working on for several months.
2) Reviewers ask for additional experiments and clarifications.
3) As the primary author, you run all of those experiments and report the findings.
4) Reviewers are satisfied, but don't increase their scores and ignore you.
5) The PC says those experiments cleared all clarifications, but please add them to an updated paper (you can't revise during rebuttals) and submit to another venue.
3 replies · 6 reposts · 163 likes · 15.1K views
Aman reposted
Vincent Sitzmann@vincesitzmann·
In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision - mapping imagery to intermediate representations (3D, flow, segmentation...) is about to go away. vincentsitzmann.com/blog/bitter_le…
43 replies · 164 reposts · 1K likes · 380.6K views
Aman reposted
Dwarkesh Patel@dwarkesh_sp·
Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served. It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk. It's a bit technical, but I encourage you to hang in there; it's really worth it. There are fewer than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner. It was a real delight to learn from him. Recommend watching this one on YouTube so you can see the chalkboard.
0:00:00 – How batch size affects token cost and speed
0:31:59 – How MoE models are laid out across GPU racks
0:47:02 – How pipeline parallelism spreads model layers across racks
1:03:27 – Why Ilya said, "As we now know, pipelining is not wise."
1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:32:52 – Deducing long context memory costs from API pricing
2:03:52 – Convergent evolution between neural nets and cryptography
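The batch-size/token-cost relationship from the first segment can be sketched with a simple roofline-style model. This is my own illustration, not from the lecture: the hardware numbers, the fp16 model size, the per-sequence KV size, and the assumption of purely memory-bandwidth-bound decoding are all hypothetical.

```python
# Toy model of decode cost vs. batch size for a memory-bandwidth-bound LLM.
# Each decode step must stream all weights once (shared across the batch)
# plus each sequence's KV cache, so larger batches amortize the weight reads.

def cost_per_million_tokens(batch, param_bytes=140e9, kv_bytes_per_seq=2e9,
                            bandwidth=3.35e12, gpu_dollars_per_hour=3.0):
    """All parameters are illustrative: a ~70B-param model in fp16, 2 GB of
    KV cache per sequence, and H100-like 3.35 TB/s memory bandwidth."""
    step_seconds = (param_bytes + kv_bytes_per_seq * batch) / bandwidth
    tokens_per_second = batch / step_seconds
    dollars_per_second = gpu_dollars_per_hour / 3600
    return dollars_per_second / tokens_per_second * 1e6

for b in (1, 8, 64):
    print(b, round(cost_per_million_tokens(b), 2))
```

The qualitative takeaway matches the lecture's framing: at batch 1 you pay the full cost of streaming the weights for every token, while at large batch the per-sequence KV reads start to dominate and cost per token flattens out.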
146 replies · 595 reposts · 6.5K likes · 1.2M views
Aman reposted
David Duvenaud@DavidDuvenaud·
Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵
200 replies · 454 reposts · 3.6K likes · 1.4M views
Aman reposted
Hater Report@HaterReport·
LeBron and Bronny in 2006
LeBron and Bronny in 2026
300 replies · 13K reposts · 110.7K likes · 4.1M views
Aman reposted
Imbue@imbue_ai·
Deep learning works extraordinarily well. And we still largely don't know why. A new paper from @learning_mech, @KuninDaniel, and 12 co-authors argues that a scientific theory of deep learning is emerging, and coins a name for the emerging field: learning mechanics. We sat down with Jamie and Dan on Generally Intelligent to talk about what a physics of deep learning would actually look like, why now, and what's left to figure out.
3:05 Learning mechanics as the physics to mechanistic interpretability's biology
4:13 Why deep learning needs a theory
7:07 Why deep learning is uniquely hard to engineer
12:11 How a week in the woods became a paper
25:59 The barrier to theory isn't opacity, but complexity
36:26 Deep learning's first gas law
47:22 Why more particles makes the problem easier
56:22 The discretization hypothesis
1:01:50 The strongest signal that a compact theory exists
1:05:07 The Platonic Representation Hypothesis
1:15:41 Why learning mechanics and mech interp need each other
1:25:29 Theory as safety infrastructure
1 reply · 30 reposts · 138 likes · 17.6K views
Aman reposted
Luke Bailey@LukeBailey181·
Self-play led to superhuman Go performance; why hasn't it for LLMs? In practice, long-run self-play plateaus like RL. We study why this happens and build a self-play algorithm that scales better. It solves as many problems with a 7B model as the pass@4 of a model 100x bigger.
29 replies · 150 reposts · 998 likes · 135.7K views
Aman reposted
Sakana AI@SakanaAILabs·
What happens when you put competing neural networks in a Petri dish and start changing the rules while they adapt? Last year we released Petri Dish NCA, where neural nets are the organisms that learn during simulation. Today we're releasing Digital Ecosystems: a browser-based platform for interactive artificial life research.
The setup: several small CNNs share a 2D grid, each seeing only a 3x3 neighborhood. No global plan. They compete for territory by attacking neighbours and defending against incoming attacks, learning via gradient descent online while the simulation runs.
What we didn't expect was the role of the learning itself. Gradient descent isn't just optimising each species' strategy; it also acts to stabilize the whole system during simulation. Species that overextend get pushed back by the loss. Species that stagnate get nudged to grow. This means you can push parameters toward edge-of-chaos regimes: a zone characterised by emergent complexity. Letting the neural networks learn holds the complex system together while you explore and interact.
The platform lets you steer all of this interactively. You can draw walls to create niches, erase parts of the system online, and tune 40+ system parameters to explore the most interesting configurations. We find it mesmerizing to watch species carve out territories and reorganise when you perturb them. Everything runs client-side in your browser, no install needed.
Blog: pub.sakana.ai/digital-ecosys…
Code: github.com/SakanaAI/digit…
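The local 3x3 territory competition described above can be caricatured in a few lines. This is not Sakana's actual system: the contest rule, the scalar per-species `strength` (a stand-in for each species' CNN), and the toroidal wrap-around are all my simplifications to show how purely local neighborhood interactions drive a global territory map.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 32
n_species = 3
# owner[i, j] = which species currently holds cell (i, j)
owner = rng.integers(0, n_species, size=(H, W))
# Per-species "attack strength" (stand-in for a learned CNN policy).
strength = np.ones(n_species)

def step(owner, strength):
    """One contest round: each cell goes to whichever species has the
    highest summed presence-times-strength in its 3x3 neighborhood."""
    votes = np.zeros((n_species, H, W))
    for s in range(n_species):
        mask = (owner == s).astype(float) * strength[s]
        acc = np.zeros((H, W))
        # Sum over the 3x3 neighborhood with toroidal wrap-around.
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                acc += np.roll(np.roll(mask, di, axis=0), dj, axis=1)
        votes[s] = acc
    return votes.argmax(axis=0)

for _ in range(10):
    owner = step(owner, strength)
```

In the real platform each species' moves come from a CNN updated by gradient descent while the simulation runs; here the dynamics reduce to a majority-vote rule that coarsens the initial random territories into contiguous regions.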
38 replies · 199 reposts · 1.1K likes · 235.6K views
Aman reposted
Michael Y. Li@michaelyli__·
Can a language model learn, end-to-end, what to keep in its own KV cache and what to throw away? Can it learn to forget while it learns to reason? Deep learning's central lesson: capability emerges from end-to-end optimization, not heuristics/strong inductive biases. But for efficiency, we rely heavily on hand-designed approaches. 🗑️ Introducing Neural Garbage Collection (NGC): we train a language model to jointly reason and manage its own KV cache, using reinforcement learning with outcome-based task reward alone. No SFT, no proxy objectives, no summarization in natural language. New paper with @jubayer_hamid, Emily Fox, and @noahdgoodman!
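The cache-management decision NGC learns end-to-end can be sketched as a score-and-evict step. Everything here is illustrative: the `evict` helper, the standalone `scores` array, and the budget are my stand-ins; in the paper the retention signal is produced by the model itself and trained with RL on task reward rather than supplied by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def evict(keys, values, scores, budget):
    """Keep only the `budget` highest-scoring KV-cache entries.
    `scores` stands in for a learned retention signal; hand-designed
    heuristics (recency, attention mass) would slot in the same way."""
    keep = np.argsort(scores)[-budget:]
    keep.sort()  # preserve the positional order of the survivors
    return keys[keep], values[keep], scores[keep]

# Toy cache: 10 cached tokens, head dimension 4.
keys = rng.standard_normal((10, 4))
values = rng.standard_normal((10, 4))
scores = rng.standard_normal(10)

keys, values, scores = evict(keys, values, scores, budget=6)
print(keys.shape)  # (6, 4)
```

The point of the paper is that the scoring policy is not a fixed function like this one but part of the same end-to-end optimization as the reasoning itself.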
25 replies · 135 reposts · 901 likes · 159.9K views
Aman reposted
Rosinality@rosinality·
The problem generator in self-play tends to hack the reward by producing complex but non-useful problems. This work incorporates a guide model that picks useful problems by how well they relate to the still-unsolved ones.
4 replies · 37 reposts · 266 likes · 15.7K views
Aman reposted
Percy Liang@percyliang·
What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
65 replies · 224 reposts · 1.2K likes · 199.6K views
Aman reposted
Mihir Prabhudesai@mihirp98·
What if AI learned physics the way Newton did – by experiencing it? We built Sim2Reason: train LLMs inside virtual worlds governed by real physics laws, zero human annotation. Result: +5–10% improvement on International Physics Olympiad, zero-shot. 🧵
38 replies · 214 reposts · 1.6K likes · 192.9K views
Aman reposted
Jean Kaddour @ ICLR 2026@jeankaddour·
Introducing Target Policy Optimization (TPO): TPO turns GRPO into supervised learning: build a target distribution over sampled completions, then fit with cross-entropy. The gradient vanishes once the target is matched, making multi-epoch training smooth. 🧵(1/4)
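The TPO recipe as summarized can be sketched in a few lines. The softmax-of-rewards target and its temperature are my assumptions for illustration (the thread presumably specifies the actual target construction), and the "policy" here is a bare logit vector over the K samples rather than a language model:

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

# K = 4 sampled completions with toy scalar rewards.
rewards = np.array([1.0, 0.2, -0.5, 0.8])
# Hypothetical target distribution: softmax of rewards at temperature 0.5.
target = softmax(rewards / 0.5)

# Policy logits over the K completions (stand-in for sequence log-probs).
logits = np.zeros(4)
for _ in range(500):
    p = softmax(logits)
    # Gradient of cross-entropy H(target, p) w.r.t. the logits is (p - target):
    # it vanishes exactly when the policy matches the target, which is the
    # property the thread credits for smooth multi-epoch training.
    logits -= 1.0 * (p - target)

print(np.abs(softmax(logits) - target).max())
```

This is the contrast with a plain policy-gradient objective, whose gradient does not go to zero just because the sampled completions are weighted correctly; fitting a fixed target with cross-entropy gives a natural stopping point.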
11 replies · 66 reposts · 494 likes · 37.1K views