Zizhao Chen

45 posts

@ch272h

陈梓昭 phding @cornell_cs @cornell_tech undergrad @uoftengineering, and is actually elsewhere

Joined July 2014
164 Following · 123 Followers
Zizhao Chen reposted
Cornell Tech @cornell_tech
Today’s AI models can’t even tie their own shoes. New research—led by @ch272h—tests AI models in a 3D environment, finding they perform well at untangling basic knots but cannot tie knots from simple loops or convert one knot to another. @Cornell_Bowers news.cornell.edu/stories/2025/1…
0
2
6
545
Zizhao Chen reposted
Merriam-Webster @MerriamWebster
Merriam-Webster’s human editors have chosen ‘slop’ as the 2025 Word of the Year.
456
7K
63.3K
3.7M
Zizhao Chen @ch272h
@yoavartzi I'm presenting on Friday. Details below: Fri, Dec 5, 2025, 11:00 AM – 2:00 PM PST, Exhibit Hall C,D,E, #4505. Pic: knots inside the USS Midway Museum near the SD convention center
Zizhao Chen tweet media
0
0
7
388
Zizhao Chen @ch272h
🧩Natural language isn’t all you need. We’re great at evaluating text-based reasoning (MATH, AIME…) but what about long-horizon visual reasoning? Enter 𝗞𝗻𝗼𝘁𝗚𝘆𝗺: a minimalistic testbed for evaluating agents on spatial reasoning along a difficulty ladder
1
13
57
16K
Denis Parra @denisparra
@ch272h Interesting! Which day and session, and which poster number do you have?
1
0
1
60
Zizhao Chen @ch272h
Hi all, I will be at #NeurIPS2025 to present my work on stress-testing looooooong visual reasoning with KnotGym🥨 Let's talk, whether or not your VLM can see 14 million possible futures like Doctor Strange
1
0
4
279
Zizhao Chen reposted
Yair Feldman @yair_feldman
🧵 New paper: "Simple Context Compression" - we show that mean-pooling beats the widely-used compression-tokens method for compressing contexts in LLMs, while being simpler and more efficient! with @yoavartzi (1/7)
Yair Feldman tweet media
3
13
43
25.9K
Zizhao Chen reposted
Yoav Artzi @yoavartzi
Pushed a big update to LM-class (v2025.2) -- this second version is a much more mature resource. Many refinements of lecture slides + significant improvements to the assignments. Many thanks to @ch272h @HuaYilun and @shankarpad8 for their work on the assignments
Yoav Artzi tweet media
1
7
24
1.8K
Zizhao Chen reposted
Tanya Goyal @tanyaagoyal
🚨Modeling Abstention via Selective Help-seeking
LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs. what they don't? @momergul_ introduces MASH, which trains LLMs for search and gets abstentions for free! 💡Key idea: reward accuracy but penalize searches during training. Under the right optimization pressure, LLMs learn to invoke search when their parametric knowledge is lacking. At inference, we simply remove search access and treat any search invocation as a proxy for abstention!
Tanya Goyal tweet media
1
22
39
5.5K
Zizhao Chen reposted
Haochen Shi @HaochenShi74
ToddlerBot 2.0 is released🥳! Now Toddy can also do cartwheels🤸! We have added so many features since our first release in February; see github.com/hshi74/toddler… for more details. Threads🧵(1/n)
14
50
252
29.1K
Yoav Artzi @yoavartzi
@xhluca @giffmana True. But because it was added on top of a thriving language, someone had to decide either to alienate the entire world or make it optional and mild. They correctly chose the latter route. Java was type-safe-first, and that makes for a very different beast
2
0
1
100
Lucas Beyer (bl16) @giffmana
I love codex and claude taking care of all the boilerplate part of coding that wastes time and is booooooring. Come to think of it, maybe Java would in theory be the perfect language for LLM-coding? Extremely verbose boilerplate - very annoying for humans, but good for LLMs?
60
10
386
59K
Yoav Artzi @yoavartzi
@xhluca @giffmana I do have this internal bet that we will see a prog. lang. that is built for LLMs-first coming up at some point. It will be interesting. But then there's the chicken-and-egg problem of data
2
0
2
138
Yoav Artzi @yoavartzi
Me: the new GPU node is online
My students: 💃🕺💃🕺💃
Me: torchrun --standalone --nproc_per_node=8 train.py
My students: 🤬🤬🤬🤬🤬
6
3
277
34.9K
Zizhao Chen reposted
Haochen Shi @HaochenShi74
Time to democratize humanoid robots! Introducing ToddlerBot, a low-cost ($6K), open-source humanoid for robotics and AI research. Watch two ToddlerBots seamlessly chain their loco-manipulation skills to collaborate in tidying up after a toy session. toddlerbot.github.io
30
107
566
113.3K