Datamap
25.7K posts

Datamap
@datamapio
Ask what is necessary. Do it. Mobility & Climate Emergency & Democracy. Maps, Graphs, Apps, Data, ML, Visualization. Incorporated as Datamap AG in Zurich, CH.
Zurich, CH & Berkeley, US · Joined January 2016
4.6K Following · 1.6K Followers
Datamap retweeted

This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly:
Can LLMs actually discover science, or are they just good at talking about it?
The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead of asking models trivia questions, it tests something much harder:
Can models form hypotheses, design experiments, interpret results, and update beliefs like real scientists?
Here’s what the authors did differently 👇
• They evaluate LLMs across the full discovery loop: hypothesis → experiment → observation → revision
• Tasks span biology, chemistry, and physics, not toy puzzles
• Models must work with incomplete data, noisy results, and false leads
• Success is measured by scientific progress, not fluency or confidence
What they found is sobering.
LLMs are decent at suggesting hypotheses, but brittle at everything that follows.
✓ They overfit to surface patterns
✓ They struggle to abandon bad hypotheses even when evidence contradicts them
✓ They mistake correlation for causation
✓ They hallucinate explanations when experiments fail
✓ They optimize for plausibility, not truth
Most striking result:
`High benchmark scores do not correlate with scientific discovery ability.`
Some top models that dominate standard reasoning tests completely fail when forced to run iterative experiments and update theories.
Why this matters:
Real science is not one-shot reasoning.
It’s feedback, failure, revision, and restraint.
LLMs today:
• Talk like scientists
• Write like scientists
• But don’t think like scientists yet
The paper’s core takeaway:
Scientific intelligence is not language intelligence.
It requires memory, hypothesis tracking, causal reasoning, and the ability to say “I was wrong.”
Until models can reliably do that, claims about “AI scientists” are mostly premature.
This paper doesn’t hype AI. It defines the gap we still need to close.
And that’s exactly why it’s important.

Datamap retweeted

You rarely solve hard problems in a flash of insight. It's more typically a slow, careful process of exploring a branching tree of possibilities. You must pause, backtrack, and weigh every alternative.
You can't fully do this in your head, because your working memory is too limited. Writing is the external medium that affords the time and precision necessary.
Serious thinking must be done in writing. And that's why you can't outsource your writing, because then you're outsourcing your thinking.
Datamap retweeted

🧵5/n. 🧪 Critical thinking effects
When answers arrive instantly, people practice evaluation and reasoning less. The paper reports measurable declines in critical-thinking scores among heavy users, explained by offloading behavior.
The punchline is not anti-tool; it is that over-delegation breeds standardized critical thinking, where everyone leans on the same shortcuts.

Datamap retweeted

"As a result of [China's] massive supply, the cost of generating electricity from solar has now fallen to a global average of around $0.04 per kilowatt hour—making it the cheapest energy source in history".
currentaffairs.org/news/china-is-…
Meanwhile, Western officials complain about China's so-called "overcapacity", which is precisely what is making a transition away from fossil fuels possible for the world.
As this physicist writes:
"one thing is clear: while China is making political decisions based on scientific evidence and while it is flooding the market with cheap solar energy, the Western world is sinking in a quagmire of self-righteous debate consisting of right-wing lies and left-wing virtue signaling. We need to get serious about how China is offering a way to combat climate change".
Datamap retweeted

ha! here is something fun and totally random I've been pondering: as Oliver Sacks has beautifully written - "what is the space between two snowflakes?" Language can describe all the things, stuff, and people in intricate detail. But what about the 'space', the 'nothingness' in between all of them? Without this 'nothingness', space doesn't exist, and things can't move. The elegant path a butterfly took from one flower to another is as curious and intriguing as the fact the butterfly has landed on a flower...
Rohan Paul@rohanpaul_ai
Fei-Fei Li (@drfeifei) on limitations of LLMs. "There's no language out there in nature. You don't go out in nature and there's words written in the sky for you.. There is a 3D world that follows laws of physics." Language is purely generated signal.
Datamap retweeted

An international call for action just got louder:
Today, in Le Monde, 7 Nobel Laureates have issued a powerful call for a minimum tax on the ultra-wealthy.
Here’s a quick breakdown of the debate—and where things stand globally
lemonde.fr/en/opinion/art…
🧵
Datamap retweeted

1/ This graph from @JonBruner tells an important story: America's current dominance in science only began after the mid-1930s, when persecuted scientists began fleeing universities in Germany and then elsewhere in occupied Europe.

Datamap retweeted

You've heard of the studies where they give the same dataset/research question to a bunch of researchers and they tend to get different answers, right?
Why is that?
This new working paper shows that it has a lot to do with data cleaning.
This is consistent with Gelman's "garden of forking paths" analogy. Small researcher coding decisions greatly influence results, often without being explicitly acknowledged.
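
A minimal sketch of the forking-paths point above, using made-up numbers rather than anything from the working paper. The dataset, the `effect` helper, and the three cleaning rules are all hypothetical; the only claim is that the same raw rows can yield quite different estimates depending on how one outlier is handled.

```python
from statistics import mean

# Hypothetical toy data: (group, outcome). Values are invented for illustration.
rows = [
    ("control", 10.0), ("control", 12.0), ("control", 11.0), ("control", 13.0),
    ("treated", 12.0), ("treated", 14.0), ("treated", 13.0), ("treated", 40.0),
]

def effect(data):
    """Difference in mean outcome, treated minus control."""
    treated = [y for g, y in data if g == "treated"]
    control = [y for g, y in data if g == "control"]
    return mean(treated) - mean(control)

def keep_all(data):
    """Cleaning rule 1: leave the data untouched."""
    return list(data)

def drop_above(limit):
    """Cleaning rule 2: discard rows whose outcome exceeds `limit`."""
    def rule(data):
        return [(g, y) for g, y in data if y <= limit]
    return rule

def cap_at(limit):
    """Cleaning rule 3: keep every row but cap outcomes at `limit`."""
    def rule(data):
        return [(g, min(y, limit)) for g, y in data]
    return rule

cleaning_rules = {
    "keep all rows": keep_all,
    "drop outcomes above 30": drop_above(30.0),
    "cap outcomes at 20": cap_at(20.0),
}

# Same raw data, same estimator, three defensible cleaning choices.
estimates = {name: effect(rule(rows)) for name, rule in cleaning_rules.items()}
for name, est in estimates.items():
    print(f"{name}: {est:.2f}")
```

With these invented numbers the estimated effect ranges from 1.50 to 8.25, although each analyst could describe their choice as routine data cleaning.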




Datamap retweeted

Mexico's president Claudia Sheinbaum is an energy systems expert. She is positioning Mexico to lead in the global green economy, from EVs & batteries to renewables, critical minerals, and HVAC manufacturing. Her Plan Mexico is at a critical juncture. Our report:
netzeropolicylab.com/mexico-green-o…

Maximiliano Véjares@maxvejares
I'm excited to share our @NZpolicylab analysis of Mexico's industrial policy and its potential in the energy transition. The report examines pathways for green investments and sustainable development. Read the full report: netzeropolicylab.com/mexico-green-o…
Datamap retweeted


@JeffWeniger @Noahpinion I lived in France, Switzerland and the US. I had the highest salary in the US, but it still felt way less than in the other two countries.