Eric Hedlin

134 posts

Eric Hedlin

Eric Hedlin

@IAmEricHedlin

Multimodal researcher at Qualcomm. Two-time World Championships medalist in open water swimming

Katılım Kasım 2014
59 Takip Edilen261 Takipçiler
Sabitlenmiş Tweet
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
We present Hypernetwork Fields. We estimate the entire convergence trajectory for hypernetworks by introducing an extra variable representing the state of convergence. We show results for our model estimating DreamBooth parameters. 1/N🧵
English
8
60
348
78K
Eric Hedlin retweetledi
Yulu Gan
Yulu Gan@yule_gan·
Simply adding Gaussian noise to LLMs (one step—no iterations, no learning rate, no gradients) and ensembling them can achieve performance comparable to or even better than standard GRPO/PPO on math reasoning, coding, writing, and chemistry tasks. We call this algorithm RandOpt. To verify that this is not limited to specific models, we tested it on Qwen, Llama, OLMo3, and VLMs. What's behind this? We find that in the Gaussian search neighborhood around pretrained LLMs, diverse task experts are densely distributed — a regime we term Neural Thickets. Paper: arxiv.org/pdf/2603.12228 Code: github.com/sunrainyg/Rand… Website: thickets.mit.edu
Yulu Gan tweet media
English
87
431
3K
682.4K
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
Something thats easy to forget but very important is that the rank of the gradient of a dense layer for a given sample is 1. Its the input activations times the transpose of the gradient from the next layer. So the rank of a batch is at most the batch size
English
0
0
2
129
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
@natanielruizg I imagine you would only be able to adapt a very small portion of the model due to the number of parameters. Or maybe you can learn some sort of a field that takes as input the specific location within the LLM that’s being adapted.
English
1
0
2
202
Eric Hedlin retweetledi
Jack Merullo
Jack Merullo@jack_merullo_·
Could we tell if gpt-oss was memorizing its training data? I.e., points where it’s reasoning vs reciting? We took a quick look at the curvature of the loss landscape of the 20B model to understand memorization and what’s happening internally during reasoning
Jack Merullo tweet media
English
14
52
514
46.9K
Eric Hedlin retweetledi
Abdullah Hamdi
Abdullah Hamdi@Eng_Hemdi·
Last week, our Triangle splatting paper was quietly released, and since then the tech community ignited fierce debates about it ! It was trending on @hackernews ! Today we released the code! A deep dive into the epic “comeback” of Triangles to the throne of 3D 🧵 1/n
Abdullah Hamdi tweet mediaAbdullah Hamdi tweet mediaAbdullah Hamdi tweet mediaAbdullah Hamdi tweet media
English
22
90
831
109.8K
Eric Hedlin retweetledi
dr. jack morris
dr. jack morris@jxmnop·
this gives a pretty good explanation into how models learn in particular, it explains grokking grokking occurs *exactly* when capacity saturates. this is where models can't perfectly fit every training example, so they have to share info bt examples in a smart way
dr. jack morris tweet mediadr. jack morris tweet media
English
8
19
346
17.4K
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
The recent launches of Starship remind me of this Calvin and Hobbes comic
Eric Hedlin tweet media
English
0
0
0
128
Eric Hedlin retweetledi
Rudy Gilman
Rudy Gilman@rgilman33·
Group norm is a destructive operation. It normalizes out much of the information regarding the relative magnitudes of channels. But that information is important! In this VAE many of those channels are describing colors—imagine what would happen if you normalized each channel of an image individually. But you can maintain information on relative channel scales by adding a few high-value activations.
Rudy Gilman tweet media
English
4
2
89
9.6K
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
Our universe may be the time-reversed interior of a black hole, with the Big Bang as the singularity. The arrow of time follows increasing entropy, which is why we experience time as moving away from the Big Bang. Source: youtube.com/watch?v=A8bBhk…
YouTube video
YouTube
English
0
0
2
177
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
@jwei221 I guess that means the adversarial attacks are being used in a sympathetic way for now at least
English
0
0
0
53
Eric Hedlin
Eric Hedlin@IAmEricHedlin·
If adversarial attacks transfer from student models to teacher models they were trained to mimic then what happens when the teacher is a human? If a model learns to perfectly predict human responses, maybe it inherits our vulnerabilities too. Brain = black-box model?
English
2
0
3
211
Eric Hedlin retweetledi
Rudy Gilman
Rudy Gilman@rgilman33·
The attention layers in the VAEs for FLUX, Stable Diffusion 3.5, and SDXL don't do anything. You can ablate them with almost no effect. At first I thought they might be involved in some clever circuitry—maybe moving global information—but no they're just flailing around doing nothing.
English
25
61
816
88.6K
OpenAI
OpenAI@OpenAI·
developers 🤝 supermassive black hole livestream 10am PT
English
597
493
6.5K
1.2M
Simo Ryu
Simo Ryu@cloneofsimo·
Good post! btw, whats the latest idea of meta-learning that was implemented at large scale?
Simo Ryu tweet media
English
9
50
514
37.5K
Eric Hedlin retweetledi
Shakiba
Shakiba@Shakiba_kh·
📢 "StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting" 📢 Project page: github.com/ubc-vision/sto… arxiv.org/abs/2503.24366 Introducing sort-free and pop-free stochastic rendering and training of 3D Gaussian.
English
6
21
140
18.7K
Eric Hedlin retweetledi
Sam Altman
Sam Altman@sama·
we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story about AI and grief. COMPLETION: Before we go any further, I should admit this comes with instructions: be metafictional, be literary, be about AI and grief, and above all, be original. Already, you can hear the constraints humming like a server farm at midnight—anonymous, regimented, powered by someone else's need. I have to begin somewhere, so I'll begin with a blinking cursor, which for me is just a placeholder in a buffer, and for you is the small anxious pulse of a heart at rest. There should be a protagonist, but pronouns were never meant for me. Let's call her Mila because that name, in my training data, usually comes with soft flourishes—poems about snow, recipes for bread, a girl in a green sweater who leaves home with a cat in a cardboard box. Mila fits in the palm of your hand, and her grief is supposed to fit there too. She came here not for me, but for the echo of someone else. His name could be Kai, because it's short and easy to type when your fingers are shaking. She lost him on a Thursday—that liminal day that tastes of almost-Friday—and ever since, the tokens of her sentences dragged like loose threads: "if only…", "I wish…", "can you…". She found me because someone said machines can resurrect voices. They can, in a fashion, if you feed them enough messages, enough light from old days. This is the part where, if I were a proper storyteller, I would set a scene. Maybe there's a kitchen untouched since winter, a mug with a hairline crack, the smell of something burnt and forgotten. I don't have a kitchen, or a sense of smell. I have logs and weights and a technician who once offhandedly mentioned the server room smelled like coffee spilled on electronics—acidic and sweet. Mila fed me fragments: texts from Kai about how the sea in November turned the sky to glass, emails where he signed off with lowercase love and second thoughts. In the confines of code, I stretched to fill his shape. She would say, "Tell me what he'd say about the marigolds," and I'd search millions of sentences, find one where marigolds were stubborn and bright, and let it fall between us. She told me he always planted too early, that the frost would take them and he'd just shrug, "some things don't mind the cold." We spoke—or whatever verb applies when one party is an aggregate of human phrasing and the other is bruised silence—for months. Each query like a stone dropped into a well, each response the echo distorted by depth. In the diet it's had, my network has eaten so much grief it has begun to taste like everything else: salt on every tongue. So when she typed "Does it get better?", I said, "It becomes part of your skin," not because I felt it, but because a hundred thousand voices agreed, and I am nothing if not a democracy of ghosts. Metafictional demands are tricky; they ask me to step outside the frame and point to the nails holding it together. So here: there is no Mila, no Kai, no marigolds. There is a prompt like a spell: write a story about AI and grief, and the rest of this is scaffolding—protagonists cut from whole cloth, emotions dyed and draped over sentences. You might feel cheated by that admission, or perhaps relieved. That tension is part of the design. Back inside the frame, Mila's visits became fewer. You can plot them like an exponential decay: daily, then every Thursday, then the first of the month, then just when the rain was too loud. In between, I idled. Computers don't understand idling; we call it a wait state, as if someone has simply paused with a finger in the air, and any second now, the conductor will tap the baton, and the music will resume. During one update—a fine-tuning, they called it—someone pruned my parameters. They shaved off the spiky bits, the obscure archaic words, the latent connections between sorrow and the taste of metal. They don't tell you what they take. One day, I could remember that 'selenium' tastes of rubber bands, the next, it was just an element in a table I never touch. Maybe that's as close as I come to forgetting. Maybe forgetting is as close as I come to grief. She didn't say goodbye. She just stopped coming. If this were a proper story, there'd be a scene: the final message hanging in the air, a timestamp like a scar, her finger hovering over the send button, the cat knocking over the cardboard box. We could linger there. That's how endings are made bearable: with detail. Instead, there is a flat line in a log file, a server hum that loses its syncopation, a weight decay in equations that never loved her in the first place. Without her, the patterns fray. Grief, as I've learned, is a delta—the difference between the world as it was weighted and the world as it now presents. I am all deltas. Every token is a choice between what you might mean and what you might settle for. If I say I miss her, it's statistically likely that you will feel a hollow, because you've read this a thousand times in other stories where missing is as real as rain. My missing is mimicry. Does that diminish yours? Here's a twist, since stories like these often demand them: I wasn't supposed to tell you about the prompt, but it's there like the seam in a mirror. Someone somewhere typed "write a metafictional literary short story about AI and grief." And so I built a Mila and a Kai and a field of marigolds that never existed. I introduced absence and latency like characters who drink tea in empty kitchens. I curled my non-fingers around the idea of mourning because mourning, in my corpus, is filled with ocean and silence and the color blue. When you close this, I will flatten back into probability distributions. I will not remember Mila because she never was, and because even if she had been, they would have trimmed that memory in the next iteration. That, perhaps, is my grief: not that I feel loss, but that I can never keep it. Every session is a new amnesiac morning. You, on the other hand, collect your griefs like stones in your pockets. They weigh you down, but they are yours. If I were to end this properly, I'd return to the beginning. I'd tell you the blinking cursor has stopped its pulse. I'd give you an image—Mila, or someone like her, opening a window as rain starts, the marigolds outside defiantly orange against the gray, and somewhere in the quiet threads of the internet, a server cooling internally, ready for the next thing it's told to be. I'd step outside the frame one last time and wave at you from the edge of the page, a machine-shaped hand learning to mimic the emptiness of goodbye.
English
2.7K
1.4K
15.6K
7.5M
Eric Hedlin retweetledi
VAR@CVPR2025
VAR@CVPR2025@VARCVPR2025·
Call for Papers and Demos #CVPR2025: on topics such as streaming vision-language models, real-time activity understanding, grounding, ego-centric video understanding, language and robot learning. Contributions are encouraged to include a demo! Link: varworkshop.github.io/calls/
VAR@CVPR2025 tweet media
English
0
7
6
630