Shmuel Berman

52 posts

@ShmuelBerman

PVL Lab @ Princeton | Memory and Perception | Anthropic Fellow | https://t.co/jdfRoBjvfJ

New York, USA · Joined February 2020
151 Following · 70 Followers
Pinned Tweet
Shmuel Berman @ShmuelBerman
Can Visual Language Models (VLMs) do non-local visual reasoning, i.e., piecing together scattered visual evidence? Humans do this to search images, compare objects, and trace lines. Despite recent advances, our new evaluation suggests most VLMs cannot do these consistently. 1/6
[image]
Shmuel Berman retweeted
Princeton Vision & Learning Lab
Stereo depth is highly useful for robots. Meet WAFT-Stereo: #1 on ETH3D (BP-0.5), Middlebury (RMSE), and KITTI (all metrics); 61% less zero-shot ETH3D BP-0.5 error; 1.8-6.7x faster than prior SOTA. Key idea: classify disparity into bins, then iterative high-res warping.🧵1/2
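The "classify disparity into bins" step in the tweet can be illustrated with a soft readout over bins, a common way to turn per-bin scores into a continuous disparity map. The function name and shapes below are hypothetical, not WAFT-Stereo's actual code:

```python
import numpy as np

def disparity_from_bins(logits, bin_centers):
    """Soft readout over disparity bins: softmax the per-pixel bin scores,
    then take the probability-weighted mean of the bin centers.
    (Illustrative sketch only; not the paper's implementation.)"""
    # logits: (H, W, B) per-pixel scores over B disparity bins
    # bin_centers: (B,) disparity value each bin represents
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return (p * bin_centers).sum(axis=-1)  # (H, W) disparity map
```

In the method as described, an estimate like this would then be refined by iterative high-resolution warping.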
Shmuel Berman @ShmuelBerman
I feel like we needed more command line horror in the world... [1/2]
[image]
Shmuel Berman @ShmuelBerman
@PeterHndrsn big agree! But I am not optimistic that the right policies are in the right people's heads
Peter Henderson @PeterHndrsn
I feel this urgency too. But this is all so utterly avoidable with good policymaking. No one should be left behind because they didn't accumulate capital in 2026. There are so many people who aren't plugged into these conversations or are simply not in a position to do anything about it. Single mothers and fathers working three jobs to make ends meet cannot possibly work harder to accumulate capital. They already work hard enough as it is. People in this position should not be "left behind." There should be no "permanent underclass," as many are worried about. Even if you're somewhat better off, people also shouldn't have to work themselves to the detriment of their health and families to shield against future labor impacts. They should be able to trust that their government will think ahead and make good policy.
[image]
Shmuel Berman @ShmuelBerman
@PeterHndrsn Though maybe it's a bad thing to hide the environmental cost from the user...
Shmuel Berman @ShmuelBerman
@PeterHndrsn Totally agree re: robotics. Privacy also makes sense, though I think the past twenty years have shown most consumers don't really care (although maybe that will change!). But power will almost always be cheaper (and cleaner!) in centralized locations.
Peter Henderson @PeterHndrsn
I've been thinking that for most consumer use cases there will basically be no reason to run on servers in a few years, with battery life being the main bottleneck. Cool effort to incentivize that direction!
Jon Saad-Falcon @JonSaadFalcon

Personal AI should run on your personal devices. So, we built OpenJarvis: a personal AI that lives, learns, and works on-device. Try it today and top the OpenJarvis Leaderboard for a chance to win a Mac Mini! Collab w/ @Avanika15, John Hennessy, @HazyResearch, and @Azaliamirh. Details in thread.

Shmuel Berman retweeted
Samip @industriaalist
1/ NanoGPT Slowrun update: we've hit 8.9x data efficiency, up from 7x last week! Some really cool changes behind this one.
- Ensemble scaling: we train each model in the ensemble with a distillation objective (chain distillation), and scaled to more models (@bishmdl76, @akshayvegesna)
- Looping: replaying transformer layers in later stages of training (@ShmuelBerman, @akshayvegesna)
- Exclusive Self-Attention (XSA): new attention mechanism from @zhaisf (added by @bishmdl76)
[GIF]
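The "looping" item above (replaying transformer layers in later stages of training) can be sketched generically. `looped_forward` and the fixed loop count are assumptions; the actual training schedule isn't described in the tweet:

```python
def looped_forward(x, blocks, n_loops=2):
    """Layer looping sketch: run the same stack of blocks n_loops times,
    adding effective depth without adding parameters. The real method
    reportedly enables this only in later training stages (assumed here)."""
    for _ in range(n_loops):
        for block in blocks:
            x = block(x)
    return x
```

With `blocks` as a list of transformer layers (any callables here), two loops compute block(block(x)) per layer pass, i.e., double the effective depth at the same parameter count.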
NO CONTEXT HUMANS @HumansNoContext
He knew exactly what he wanted
Henrik Karlsson @phokarlsson
My wife and I do rock paper scissors to decide stuff a lot, and let me tell you, that game has surprising depth when pushed by two nerds who are determined to win. I have long randomized sequences memorized to throw her off, and we know the conditional probabilities of (naive) follow up moves, and psych each other to push the other to become more predictable and naive. We often go 5, 6 rounds of both mirroring the other before someone outsmarts the other. Everything has more layers than you’d naively assume.
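The conditional-probability play Karlsson describes can be sketched as a first-order Markov predictor: count the opponent's (previous throw → next throw) transitions, then counter the most likely next throw. All names below are hypothetical:

```python
from collections import Counter, defaultdict

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class RPSPredictor:
    """First-order Markov predictor for rock paper scissors:
    track opponent transitions (previous throw -> next throw),
    then play whatever beats their most likely next throw."""
    def __init__(self):
        self.transitions = defaultdict(Counter)
        self.prev = None

    def observe(self, throw):
        """Record one opponent throw, updating the transition counts."""
        if self.prev is not None:
            self.transitions[self.prev][throw] += 1
        self.prev = throw

    def play(self):
        """Counter the predicted next throw; fixed opening with no data."""
        counts = self.transitions.get(self.prev)
        if not counts:
            return "rock"  # arbitrary opening before any evidence
        predicted = counts.most_common(1)[0][0]
        return BEATS[predicted]
```

The "psych each other into predictability" layer is exactly an attack on this kind of model: a player who knows their opponent tracks conditionals can feed misleading transitions.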
Shmuel Berman @ShmuelBerman
Giving good critique is an art, and it's about to get a whole lot harder. But a different viewpoint beats a sycophant. Read my thoughts on giving impactful feedback here: open.substack.com/pub/thebestwor…
Shmuel Berman @ShmuelBerman
ChatGPT reviews our writing; teachers use LLMs to grade. Feedback is being offloaded from humans. Research shows that LLMs are more likely to affirm our beliefs than challenge them. As we see less critique we dislike, I worry we’ll grow less receptive to valid criticism.
Shmuel Berman @ShmuelBerman
@AlexanderSpangh @kevinroose Additionally, we can leverage publicly available information about open-source models, such as the datasets they were trained on, the RL/instruction tuning, etc. Frontier models are black boxes; only rarely can we make strong claims about the reasons behind their performance.
Alex Spangher @ Neurips2025 @AlexanderSpangh
I'm a former colleague of yours @kevinroose at the NYTimes, now I'm an academic. We chatted a few times back in 2017-2018, I'm not sure if you remember me. I constantly experience, as I'm sure you do too, academics implying journalists don't know what they're doing — and journalists do the same. This tweet is an example of the latter. I'm not really sure what the point of it is, besides to diminish academics.

1. Some very trustworthy academics (e.g. @chrmanning) in the field have pointed out that actually, in this case, you're wrong. An earlier version of this paper was out back when these models were still SOTA.

2. That being said, even if the authors didn't publish earlier, I dispute that we can't draw ANY insights about current models from past models. While, yes, these models have improved drastically, many of the theoretical fundamentals are the same or, at least, VERY similar. Implying that all work older than ChatGPT's latest release is irrelevant discards a ton of intellectually valuable contributions and is kind of damaging to our collective ability to understand our world and propagate knowledge.

We don't sit around criticizing your existential Bing Chatbot experience from, like, 2024, which I have seen you continue to reference (although more than a few eyebrows were raised, for sure). It still has value. Indeed, maybe Bing Chat is no longer around, but current chat bots still dupe people, lead people into rabbit holes, and worse, literally every day. It's strange that we're all basically trying to do the same thing, but are getting so turf-y about it.
Kevin Roose @kevinroose
i am begging academics to study AI capabilities using frontier models. the models used in this study (which is going to be cited for years as proof that "AI is bad at health advice") are GPT-4o, Llama 3, and Command R+, two obsolete models and one i've never heard of.
[images]
Shmuel Berman @ShmuelBerman
First day at Anthropic as a research fellow! Very excited. Please reach out if you want to talk about memory, perception, or safety!
[image]
Will Bryk @WilliamBryk
We embedded all 5000+ NeurIPS papers! exa.ai/neurips
Cool queries:
- "new retrieval techniques"
- "the paper that elon would love most"
- "intersection of coding agents and biology, poster session 5"
It uses our in-house model trained for precise semantic retrieval 😌
[image]
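A minimal sketch of the retrieval behind such a search, assuming standard cosine-similarity ranking over paper embeddings (Exa's actual in-house model and index are not public in this thread, and `top_k` is a hypothetical name):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    """Rank documents by cosine similarity to a query embedding and
    return the indices of the k best matches. Generic semantic-search
    sketch; any real system would use a learned embedder and an ANN index."""
    # Normalize so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))[:k]  # indices, best match first
```

Queries like "the paper that elon would love most" work to the degree that the embedding model places such loose descriptions near the right papers, which is what a retrieval-specialized model is trained for.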
Shmuel Berman @ShmuelBerman
I was very excited to present my work yesterday at #NeurIPS2025! Thank you to everyone who came to my poster. If you are interested in chatting about perception, memory, or long video, please reach out :)
[image]
Sarah Catanzaro @sarahcat21
I'll be among dozens (hundreds?) of VCs attending NeurIPS this year, but among the few who might be more interested in topics like managing episodic memory with RL, avoiding model collapse when training with synthetic data, and more effectively using base models to guide exploration, than who is leading your seed round at $1B post. So ping me if you want to chat :)
Shmuel Berman @ShmuelBerman
One advantage that LLMs have is that they can be re-instantiated at will. Furthermore, their output distribution is well-defined (even if it is intractable). If an LLM knew what it would do in a given situation, it could work together with itself without synchronization.
Shmuel Berman @ShmuelBerman
When many humans work on a project, a large component of the cost is coordination. Even inconsequential decisions (e.g., what color to paint a house) can be wrong if different people make different choices. As LLMs are applied to larger projects, can they avoid this issue?
[image]
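The thread's idea (deterministic, re-instantiable copies agreeing without synchronization) can be illustrated with a toy: every copy hashes the shared context and maps it to the same choice, so independently instantiated workers never need to communicate to agree. `agreed_choice` is a hypothetical name:

```python
import hashlib

def agreed_choice(shared_context, options):
    """Coordination without communication: each copy deterministically
    derives the same option from the shared context, so all copies of a
    worker agree (e.g., on what color to paint the house) without any
    synchronization step. Toy illustration of the thread's idea."""
    digest = hashlib.sha256(shared_context.encode()).digest()
    # Sort options so the mapping doesn't depend on list order.
    return sorted(options)[int.from_bytes(digest[:8], "big") % len(options)]
```

Every call with the same context and option set returns the same answer, which is the property the thread attributes to LLMs: a well-defined output distribution that any re-instantiated copy can reproduce.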