Shmuel Berman
52 posts

@ShmuelBerman
PVL Lab @ Princeton | Memory and Perception | Anthropic Fellow | https://t.co/jdfRoBjvfJ
New York, USA · Joined February 2020
151 Following · 70 Followers
Pinned Tweet
Shmuel Berman retweeted

Stereo depth is highly useful for robots. Meet WAFT-Stereo: #1 on ETH3D (BP-0.5), Middlebury (RMSE), and KITTI (all metrics); 61% less zero-shot ETH3D BP-0.5 error; 1.8-6.7x faster than prior SOTA. Key idea: classify disparity into bins, then iterative high-res warping.🧵1/2
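A minimal sketch of the two ingredients the tweet names, as I read them: soft classification of disparity into bins (recovering a continuous value as the probability-weighted bin center), and warping one view by the current disparity estimate, which can then be iterated at high resolution. This is illustrative PyTorch only, not WAFT-Stereo's code; every name below is hypothetical.

```python
import torch
import torch.nn.functional as F

# Sketch of "classify disparity into bins": a network emits logits over D
# discrete disparity bins per pixel; a soft-argmax turns them into a
# continuous disparity map.
def disparity_from_bins(logits: torch.Tensor, max_disp: float) -> torch.Tensor:
    """logits: (B, D, H, W) scores over D disparity bins per pixel."""
    D = logits.shape[1]
    probs = F.softmax(logits, dim=1)                      # per-pixel distribution over bins
    centers = torch.linspace(0.0, max_disp, D, device=logits.device)
    return (probs * centers.view(1, D, 1, 1)).sum(dim=1)  # (B, H, W) expected disparity

# One step of the "iterative high-res warping": resample the right image at
# x - d so it aligns with the left image under the current estimate.
def warp_right_to_left(right: torch.Tensor, disparity: torch.Tensor) -> torch.Tensor:
    """right: (B, C, H, W) image; disparity: (B, H, W) left-view disparities."""
    B, _, H, W = right.shape
    xs = torch.arange(W, device=right.device).view(1, 1, W).expand(B, H, W)
    ys = torch.arange(H, device=right.device).view(1, H, 1).expand(B, H, W)
    grid_x = (xs - disparity) / (W - 1) * 2 - 1           # normalize to [-1, 1]
    grid_y = ys / (H - 1) * 2 - 1
    grid = torch.stack([grid_x, grid_y], dim=-1)          # (B, H, W, 2)
    return F.grid_sample(right, grid, align_corners=True)
```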

@PeterHndrsn big agree! But I am not optimistic that the right policies are in the right people's heads

I feel this urgency too. But this is all so utterly avoidable with good policymaking.
No one should be left behind because they didn't accumulate capital in 2026. There are so many people who aren't plugged into these conversations or are simply not in a position to do anything about it.
Single mothers and fathers working three jobs to make ends meet cannot possibly work harder to accumulate capital. They already work hard enough as it is. People in this position should not be "left behind." There should be no "permanent underclass," as many are worried about.
Even if you're somewhat better off, people shouldn't have to work themselves to the detriment of their health and families to shield against future labor impacts.
They should be able to trust that their government will think ahead and make good policy.


@PeterHndrsn Though maybe it's a bad thing to hide the environmental cost from the user...

@PeterHndrsn Totally agree re: robotics. Privacy also makes sense, though I think the past twenty years have shown most consumers don't really care (although maybe that will change!). But power will almost always be cheaper (and cleaner!) in centralized locations

I've been thinking that for most consumer use cases there will basically be no reason to run on servers in a few years, with battery life being the main bottleneck. Cool effort to incentivize that direction!
Jon Saad-Falcon@JonSaadFalcon
Personal AI should run on your personal devices. So, we built OpenJarvis: a personal AI that lives, learns, and works on-device. Try it today and top the OpenJarvis Leaderboard for a chance to win a Mac Mini! Collab w/ @Avanika15, John Hennessy, @HazyResearch, and @Azaliamirh. Details in thread.
Shmuel Berman retweeted

1/ NanoGPT Slowrun update: we've hit 8.9x data efficiency, up from 7x last week! Some really cool changes behind this one.
- Ensemble scaling: we train each model in the ensemble with a distillation objective (chain distillation), and scaled to more models (@bishmdl76, @akshayvegesna)
- Looping: replaying transformer layers in later stages of training (@ShmuelBerman, @akshayvegesna; a rough sketch follows this list)
- Exclusive Self-Attention (XSA): new attention mechanism from @zhaisf (added by @bishmdl76)
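For the looping bullet, here is a rough sketch of one way "replaying transformer layers" could work: re-run the existing block stack extra times in later training stages, so effective depth grows without adding parameters. This is only my reading of the one-line description; the actual NanoGPT change may differ.

```python
import torch.nn as nn

class LoopedBlocks(nn.Module):
    """Wraps a stack of transformer blocks so the whole stack can be
    replayed; loops=1 is an ordinary forward pass."""
    def __init__(self, blocks: nn.ModuleList, loops: int = 1):
        super().__init__()
        self.blocks = blocks
        self.loops = loops

    def forward(self, x):
        for _ in range(self.loops):       # replay the stack `loops` times
            for block in self.blocks:
                x = block(x)
        return x

# Hypothetical usage late in training: setting model.trunk.loops = 2 doubles
# the effective depth with zero new parameters.
```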

@HumansNoContext we need this guy tele-opping robots for training data

My wife and I play rock-paper-scissors to decide stuff a lot, and let me tell you, that game has surprising depth when pushed by two nerds who are determined to win. I have long randomized sequences memorized to throw her off, we know the conditional probabilities of (naive) follow-up moves, and we psych each other out to push the other to become more predictable and naive. We often go 5 or 6 rounds of mirroring each other before someone outsmarts the other. Everything has more layers than you’d naively assume.
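The "conditional probabilities of follow-up moves" idea reduces to a first-order model: count which throw tends to follow the opponent's previous throw, then play whatever beats the prediction. A toy sketch, entirely my own illustration rather than anything from the tweet:

```python
import random
from collections import Counter, defaultdict

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class FollowUpModel:
    """First-order predictor: P(next throw | opponent's previous throw)."""
    def __init__(self):
        self.counts = defaultdict(Counter)       # counts[prev][nxt] = frequency

    def observe(self, prev: str, nxt: str):
        self.counts[prev][nxt] += 1

    def counter_move(self, prev: str) -> str:
        history = self.counts[prev]
        if not history:
            return random.choice(list(BEATS))    # no data yet: play uniformly
        predicted = history.most_common(1)[0][0] # most likely follow-up
        return BEATS[predicted]                  # throw what beats it
```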

Giving good critique is an art, and it's about to get a whole lot harder. But a different viewpoint beats a sycophant.
Read my thoughts on giving impactful feedback here: open.substack.com/pub/thebestwor…

@AlexanderSpangh @kevinroose Additionally, we can leverage publicly available information about open-source models, such as the datasets they were trained on, the RL/instruction tuning, etc. Frontier models are black boxes; only rarely can we make strong claims about the reasons behind their performance.

I'm a former colleague of yours @kevinroose at the NYTimes; now I'm an academic. We chatted a few times back in 2017-2018; I'm not sure if you remember me.
I constantly experience, as I'm sure you do too, academics implying journalists don't know what they're doing — and journalists do the same. This tweet is an example of the latter. I'm not really sure what the point of it is, besides to diminish academics.
1. Some very trustworthy academics (e.g. @chrmanning) in the field have pointed out that actually, in this case, you're wrong. An earlier version of this paper was out back when these models were still SOTA.
2. That being said, even if the authors didn't publish earlier, I dispute that we can't draw ANY insights about current models from past models. While, yes, these models have improved drastically, many of the theoretical fundamentals are the same or, at least, VERY similar. Implying that all work older than ChatGPT's latest release is irrelevant discards a ton of intellectually valuable contributions and is kind of damaging to our collective ability to understand our world and propagate knowledge.
We don't sit around criticizing your existential Bing Chatbot experience from, like, 2023, which I have seen you continue to reference (although more than a few eyebrows were raised, for sure). It still has value. Indeed, maybe Bing Chat is no longer around, but current chatbots still dupe people, lead people into rabbit holes, and worse, literally every day.
It's strange that we're all basically trying to do the same thing, but are getting so turf-y about it.


We embedded all 5000+ NeurIPS papers! exa.ai/neurips
Cool queries:
- "new retrieval techniques"
- "the paper that elon would love most"
- "intersection of coding agents and biology, poster session 5"
It uses our in-house model trained for precise semantic retrieval 😌
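The generic shape of this kind of semantic retrieval: embed the query and all papers into one vector space and rank by cosine similarity. A sketch with a stand-in `encode` function, since Exa's in-house model isn't public; every name here is assumed.

```python
import numpy as np

def top_k(query: str, titles: list[str], vectors: np.ndarray,
          encode, k: int = 5) -> list[tuple[str, float]]:
    """vectors: (N, d) unit-norm paper embeddings, one row per title."""
    q = encode(query)
    q = q / np.linalg.norm(q)
    scores = vectors @ q                   # cosine similarity against every paper
    best = np.argsort(-scores)[:k]         # indices of the k highest scores
    return [(titles[i], float(scores[i])) for i in best]

# Assumed setup: vectors = np.stack([encode(t) for t in titles]), rows
# normalized; then top_k("new retrieval techniques", titles, vectors, encode).
```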


I was very excited to present my work yesterday at #NeurIPS2025! Thank you to everyone who came to my poster. If you are interested in chatting about perception, memory, or long video, please reach out :)


Scaling isn’t research?🤣 Scaling is actually some of the most exciting research nowadays.
Yuchen Jin@Yuchenj_UW
“From 2012 to 2020, it was the age of research. From 2020 to 2025, it was the age of scaling. Now, it's back to the age of research again.” I agree.

@sarahcat21 Would love to talk about episodic and streaming memory :)

I'll be among dozens (hundreds?) of VCs attending NeurIPS this year, but among the few who might be more interested in topics like managing episodic memory with RL, avoiding model collapse when training with synthetic data, and more effectively using base models to guide exploration, than who is leading your seed round at $1B post.
So ping me if you want to chat :)

I call this ability "mentalization." I test it and motivate it in this blog post: open.substack.com/pub/thebestwor…