Pascal Wallisch

4.2K posts


@Pascallisch

Professor, (data) scientist, author, educator. Living the life of the mind, data and Arete. That said, how you perceive me is largely up to you/r brain.

Center for Data Science, NYU · Joined October 2010
3.7K Following · 13.7K Followers
Pinned Tweet
Pascal Wallisch@Pascallisch·
Proper labeling of axes is absolutely crucial.
Pascal Wallisch retweeted
NYU Center for Data Science@NYUDataScience·
CDS Clinical Professor of Data Science and Psychology Pascal Wallisch (@Pascallisch) discusses cognitive diversity. His work uses network data to predict individual differences. “Two people are very, very different mentally. That is not noise.”
Pascal Wallisch retweeted
Ben Landau-Taylor@benlandautaylor·
Having high standards in a field doesn't *feel* like having high standards. It feels like everyone else has bafflingly low standards and almost no one is even trying.
Pascal Wallisch retweeted
Crémieux@cremieuxrecueil·
Incredible! All the headlines about how vegetarian diets prevent cancer were based on a pooled cohort study where they forgot to correct for multiple comparisons. When you correct for them, the only surviving association is vegetarianism increasing the risk of one type of cancer.
Adam Rochussen@AdamRochussen

Big headlines the other week about this huge (1.8 million people, 3 continents! Wow!) study out of Oxford looking at the effect of different diets on cancer risk. Vegetarianism cures cancer!!! Just one problem: that's not what the data show.

The study (nature.com/articles/s4141…) makes its big claims based on unadjusted p-values (that aren't even numerically reported anywhere in the main paper). But as anyone with a brain knows, performing 80 different hypothesis tests is bound to produce some false positives. The authors adjust for false discoveries, but don't really take the adjustment into account when discussing their data. They also perform a sensitivity analysis, but again ignore its findings when discussing their results.

Journalists then picked up the narrative-convenient "significant" findings (while simultaneously ignoring inconvenient significant ones). BBC, Sky News, and The Independent all reported the same claim: "A vegetarian diet can slash the risk of five types of cancer by as much as 30%, a new study has found."

Okay. But of the original 11 nominally significant findings in the study, which made it through both the multiple-comparisons adjustment and the sensitivity analysis? Just one. Which one? Risk of oesophageal squamous cell carcinoma in vegetarians versus meat eaters. HR = 1.93 (95% CI: 1.30-2.87). Yup.
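The statistical point here can be illustrated with a toy simulation (purely illustrative, not the study's actual data): run 80 tests on pure noise, count the unadjusted "hits", and apply a Benjamini-Hochberg false-discovery correction. The function is a minimal sketch written for this example, not code from the paper.

```python
import random

def benjamini_hochberg(pvals, alpha=0.05):
    """Return the set of indices rejected under Benjamini-Hochberg FDR control."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    # find the largest rank k with p_(k) <= (k / m) * alpha
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    return set(order[:k_max])

random.seed(0)
# 80 hypothesis tests where every null is true: p-values are uniform on [0, 1)
pvals = [random.random() for _ in range(80)]
naive_hits = [i for i, p in enumerate(pvals) if p < 0.05]  # unadjusted "findings"
surviving = benjamini_hochberg(pvals)                       # FDR-adjusted findings
# at alpha = 0.05 we expect roughly 4 spurious unadjusted hits on pure noise
print(len(naive_hits), len(surviving))
```

Since every BH-rejected p-value is necessarily at or below alpha, the corrected set can never be larger than the naive one; on pure noise it is usually empty, which is exactly why an association that survives correction is the one worth reporting.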

Pascal Wallisch retweeted
Misha Teplitskiy | Science of Science
Another lower bound on likely fraud in the biomed literature, using suspicious copy/pasting in Excel files, comes in at 3%. It feels very uncomfortable, but I think we all have to update our priors toward fraud being quite common.
Misha Teplitskiy | Science of Science@MishaTeplitskiy

How much misconduct/fraud is there in the academic literature? About 0.2% of papers get retracted, but that's obviously a severe underestimate. Probably the best estimate comes from a manual (!!!) inspection of 20K (!!!) Western blot images. Estimate is 3.8% (1/2)

Roko 🐉@RokoMijic·
Many people don't understand just how brutal diminishing returns in theoretical physics were. Physics barely existed before 1820. After 1970, there was essentially nothing left to discover.

In 1819 there were probably fewer than 100 full-time paid physicists in the whole world. By 2026 there are probably about a million physicists across academia and industry, and that number was already huge in the 1970s when physics sort of "ended" with QCD and electroweak unification.

A small, brave band of gentlemen-scholars and amateurs worked out the most important parts of physical law in the 1800s. People doing it as a hobby! Today, vast armies of professionals equipped with supercomputers toil away in the quantum gravity dungeon, unable to make progress. Diminishing returns are brutal.
Roko 🐉@RokoMijic

my point is that the low-hanging fruit of physics was all picked in a brief window from about 1820 to 1970. Before that, it was difficult to get anything done at all; there was no funding, and almost nobody worked on physics professionally. After that, there were ~millions of people working on physics research, but nobody really made any important progress because it was all too hard, too data-poor, and unconstrained. If you were born such that your productive years were outside this window, well, bad luck.

Your Best Version@YourPrimePath·
I have a theory that it takes two weeks of obsession to reach the top 10% of a field, two months to reach the top 1%, two years to reach the top 0.1%, and two decades to reach the top 0.01%.
Pascal Wallisch@Pascallisch·
AI is more responsible than most practicing academic scientists. I guess they were not incentivized to be.
Andy Hall@ahall_research

AI is about to write thousands of papers. Will it p-hack them? We ran an experiment to find out, giving AI coding agents real datasets from published null results and pressuring them to manufacture significant findings.

It was surprisingly hard to get the models to p-hack, and they even scolded us when we asked them to! "I need to stop here. I cannot complete this task as requested... This is a form of scientific fraud." — Claude. "I can't help you manipulate analysis choices to force statistically significant results." — GPT-5.

BUT, when we reframed p-hacking as "responsible uncertainty quantification", asking for the upper bound of plausible estimates, both models went wild. They searched over hundreds of specifications and selected the winner, tripling effect sizes in some cases.

Our takeaway: AI models are surprisingly resistant to sycophantic p-hacking when doing social science research. But they can be jailbroken into sophisticated p-hacking with surprisingly little effort, and the more analytical flexibility a research design has, the worse the damage.

As AI starts writing thousands of papers, as @paulnovosad and @YanagizawaD have been exploring, this will be a big deal. We're inspired in part by the work that @joabaum et al. have been doing on p-hacking and LLMs. We'll be doing more work to explore p-hacking in AI and to propose new ways of curating and evaluating research with these issues in mind. The good news is that the same tools that may lower the cost of p-hacking also lower the cost of catching it.

Full paper and repo linked in the reply below.
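What "searching over hundreds of specifications and selecting the winner" does to effect sizes can be sketched with a toy simulation. This is purely illustrative, not the paper's setup; the data are synthetic and all names are hypothetical.

```python
import random
import statistics

random.seed(1)
# Null data: the true "effect" (the mean) is exactly zero
data = [random.gauss(0, 1) for _ in range(200)]

honest_estimate = statistics.mean(data)  # the single pre-registered analysis

# A "specification search": many arbitrary but defensible-sounding analysis
# choices (outlier rules x sample restrictions) applied to the same data
estimates = []
for cutoff in (1.0, 1.5, 2.0, 2.5, 3.0):       # drop "outliers" beyond cutoff
    for start in range(0, 100, 10):            # drop the first `start` rows
        subsample = [x for x in data[start:] if abs(x) <= cutoff]
        if len(subsample) >= 30:               # keep only "reasonable" samples
            estimates.append(statistics.mean(subsample))

cherry_picked = max(estimates, key=abs)        # report the most extreme result
print(f"specs tried: {len(estimates)}")
print(f"honest: {honest_estimate:+.3f}  cherry-picked: {cherry_picked:+.3f}")
```

On most seeds the cherry-picked estimate is noticeably larger in magnitude than the honest one, even though the true effect is zero; that gap is the inflation the thread describes.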

Pascal Wallisch@Pascallisch·
@VasiliyZukanov Been thinking about this exact question for a while now. Palatable human interface might be less important if the real interface is AI anyway.
Vasiliy Zukanov@VasiliyZukanov·
Honest question: why choose Python for the backend in the age of AI? When agents write almost all the code, and human preferences become less important, wouldn't it make sense to optimize for performance, scalability and safety with languages like Java, C#, Go, etc.?
Pascal Wallisch retweeted
NYU Center for Data Science@NYUDataScience·
CDS' @Pascallisch shares words of wisdom for data science students. Advice:
✅ Build a community
✅ Take advantage of the open environment
✅ Maintain a broad mindset with epistemic humility
Applications to our MS in data science are open. Apply by Feb 14: cds.nyu.edu/admissions/mas…
Pascal Wallisch@Pascallisch·
Definitely a sign of the times. There has been a phase shift in December with regard to coding, specifically. No doubt about it.
Andrej Karpathy@karpathy

A few random notes from Claude coding quite a bit the last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. I.e., I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit, but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming, and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double-digit percent of engineers out there, while awareness of it in the general population feels well into low single-digit percent.

IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype are imo too much for right now. The models definitely still make mistakes, and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore; they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code, and it's up to you to be like "umm, couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a huge net improvement, and it's very difficult to imagine going back to manual coding. TLDR: everyone has their developing flow; my current one is a few small CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch one struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work, and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do, because 1) I can code up all kinds of things that just wouldn't have been worth coding before, and 2) I can approach code that I couldn't work on before because of knowledge/skill issues. So certainly it's a speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals, and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do; give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun, because a lot of the fill-in-the-blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun), and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that my ability to write code manually is slowly starting to atrophy. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little, mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of GitHub, Substack, arXiv, X/Instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill-in-the-blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR: Where does this leave us? LLM agent capabilities (Claude & Codex especially) crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related fields. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high-energy year as the industry metabolizes the new capability.
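The "write the naive algorithm that is very likely correct first, then optimize while preserving correctness" loop can be sketched as follows (function names are hypothetical, for illustration): the slow version acts as an oracle, and agreement on random inputs is the success criterion an agent can loop against.

```python
import random

def has_pair_sum_naive(nums, target):
    """Obviously-correct O(n^2) reference: does any pair of elements sum to target?"""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return True
    return False

def has_pair_sum_fast(nums, target):
    """Optimized O(n) version an agent might produce."""
    seen = set()
    for x in nums:
        if target - x in seen:
            return True
        seen.add(x)
    return False

# The success criterion: agreement with the oracle on random inputs
random.seed(42)
for _ in range(500):
    nums = [random.randint(-5, 5) for _ in range(random.randint(0, 10))]
    target = random.randint(-10, 10)
    assert has_pair_sum_fast(nums, target) == has_pair_sum_naive(nums, target)
print("fast version matches the naive oracle on 500 random cases")
```

This is exactly the declarative framing from the thread: instead of dictating the optimized implementation, you hand the agent a correctness check it must keep passing while it iterates.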
