Bin Ren

374 posts


@renbbin

Exoplanet researcher. Private pilot. Previously @ObsCotedAzur ← @Caltech ← @JohnsHopkins / @STScI ← @xmuchina.

Sunnyvale, CA · Joined April 2010
102 Following · 150 Followers
Bin Ren retweeted
Alex Prompter @alex_prompter
This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly: Can LLMs actually discover science, or are they just good at talking about it?

The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead of asking models trivia questions, it tests something much harder: Can models form hypotheses, design experiments, interpret results, and update beliefs like real scientists?

Here’s what the authors did differently 👇
• They evaluate LLMs across the full discovery loop: hypothesis → experiment → observation → revision
• Tasks span biology, chemistry, and physics, not toy puzzles
• Models must work with incomplete data, noisy results, and false leads
• Success is measured by scientific progress, not fluency or confidence

What they found is sobering. LLMs are decent at suggesting hypotheses, but brittle at everything that follows.
✓ They overfit to surface patterns
✓ They struggle to abandon bad hypotheses even when evidence contradicts them
✓ They confuse correlation for causation
✓ They hallucinate explanations when experiments fail
✓ They optimize for plausibility, not truth

Most striking result: `High benchmark scores do not correlate with scientific discovery ability.` Some top models that dominate standard reasoning tests completely fail when forced to run iterative experiments and update theories.

Why this matters: Real science is not one-shot reasoning. It’s feedback, failure, revision, and restraint.

LLMs today:
• Talk like scientists
• Write like scientists
• But don’t think like scientists yet

The paper’s core takeaway: Scientific intelligence is not language intelligence. It requires memory, hypothesis tracking, causal reasoning, and the ability to say “I was wrong.” Until models can reliably do that, claims about “AI scientists” are mostly premature.

This paper doesn’t hype AI. It defines the gap we still need to close. And that’s exactly why it’s important.
[Image attached]
English
380
2.1K
8.2K
1.2M
Bin Ren retweeted
Phil Armitage @philip_armitage
With Ralph Pudritz, I've been editing the section of the 2nd edition of the Handbook of Exoplanets focused on "Formation and Evolution of Planets and Planetary Systems". This thread collects together the contributions currently available on astro-ph: (1/5)
[Image attached]
English
2
4
14
836
Yinhao WU (吴 寅昊) @YinhaoW
@renbbin @lufthansa I am attending a spring school in a small village, and there is nowhere to buy clothes or a razor. I can only find some basic toiletries.😩
English
1
0
0
71
Yinhao WU (吴 寅昊) @YinhaoW
My delayed luggage has been at Bologna Airport for two days now. When will you be able to deliver it to me? @lufthansa
English
4
0
0
256
Gregory Herczeg @GregHerczeg
When in Rome... had my first 0-dollar shopping experience at @cvspharmacy in California! 10/10, but would prefer to not do it again. [not sure what else to do when there's nobody to ring you up, no self-serve machines, and attempts at ringing myself up failed]
English
1
0
1
223
Bin Ren retweeted
Empire Of Lies @berningman16
This image is so 2025.
[Image attached]
English
166
1.7K
9.6K
316.5K
Bin Ren retweeted
Zeke Hausfather @hausfath
Great (and scary) visualization of 2024 daily temperatures compared to prior years by the BBC today. Evocative of the iconic Joy Division album cover from 1979: bbc.com/news/articles/…
[Image attached]
English
227
1K
2.9K
237.7K
Bin Ren retweeted
Caltech Alumni Association @caltechalumni
In response to this week's devastating Eaton Fire, we have established the Caltech and JPL Together Relief Fund to support our affected colleagues. Please consider making a contribution to this special fund. Thank you from all of us at Caltech. ow.ly/cmTt50UEwFK
[Image attached]
English
1
19
45
19.1K
Bin Ren @renbbin
@GregHerczeg So that we can check if we had the same driver! That’s what made me want to get back my Maryland driver’s license, then convert it to a European one to enjoy die Luft!
English
0
0
0
49
Gregory Herczeg @GregHerczeg
@renbbin Fastest I’ve ever been in a car was taxi from MUC to Garching, I think he hit 210. Probably saved 2 minutes. What’s the point?
English
1
0
0
68
Gregory Herczeg @GregHerczeg
People who praise Chinese HSR for being quiet don’t appreciate the need for background noise to drown out the very loud snoring of the guy a few rows in front of me.
English
2
0
2
303
Bin Ren retweeted
Misha Teplitskiy | Science of Science
People usually think replication attempts in science are rare. Journals don't publish replications, so scientists don't do them. In reality there are countless replication attempts (and failures); it's just that PhD students assume they did something wrong. journals.plos.org/plosone/articl…
[Image attached]
English
31
301
1.5K
434.2K
Bin Ren @renbbin
@EricLagadec @EditionsduSeuil The book is so great, and I didn't know I was one of the first to have the chance to see it before its release!
French
0
0
1
570
Eric Lagadec✨🌍 @EricLagadec
It is with immense pleasure that I present my new book, which comes out today. It's a beautiful book that takes you on a journey through the universe with the sublime images from the James Webb Space Telescope. We'll learn while dreaming. Happy marveling, everyone!
[Image attached]
French
35
171
941
178.6K
Duncan Watts @dncnwtts
Big news! I’ve been awarded an ERC Starting Grant for my project Origins! Thanks to @ERC_Research, I will spend the next five years unveiling the Cosmic Infrared Background, along with my new postdoc and two PhD students. Keep an eye out for job openings at @UniOslo!
English
8
1
37
2.3K
Bin Ren retweeted
Planetary Plot of the Week @PlanetaryPlots
This week, we have a plot demonstrating the spectral diversity of the rocks that the Perseverance rover has encountered in Jezero Crater. These spectra show variable iron mineralogy which indicates strong oxidative water-rock interactions in the Jezero delta deposits. (1/2)
[Image attached]
English
2
27
71
4.3K
Bin Ren @renbbin
5 most difficult (walkable) stairs to walk? #LINO24
[Image attached]
English
0
0
4
233