

Mark Zobeck
@MarkZobeck
MD, MPH. Peds Hem/Onc. Data nerd using technology to improve healthcare for kids around the world. Love languages: graphs, references, Bayesian statistics

In 2016 Geoffrey Hinton said “we should stop training radiologists now” since AI would soon be better at their jobs. He was right about the benchmarks: models have outperformed radiologists on them for nearly a decade. Yet radiology jobs are at record highs, with an average salary of $520k. Why?

I still hear people say that LLMs are "just statistical pattern matchers" without grounded understanding. This probably reflects the influential arguments of Bender and Koller, which have a lot of validity. But there are two major reasons we should update these views: 🧵 👇

When you store your knowledge and skills as parametric curves (as all deep learning models do), the only way you can generalize is via interpolation on the curve. The problem is that interpolated points *correlate* with the truth but have no *causal* link to the truth. Hence hallucinations. The fix is to start leveraging causal symbolic graphs as your representation substrate (e.g. computer programs of the kind we write as software engineers). The human-written software stack, with its extremely high degree of reliability despite its massive complexity, is proof of existence of exact truthiness propagation.
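
A quick way to see Chollet's point in miniature (a toy sketch of my own, not his): fit a parametric curve to noisy samples of a simple rule, then compare it with the same rule written as a program. The rule `3*x + 1`, the training range, and the degree-5 polynomial are all illustrative choices, not anything from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

def program(x):
    # Symbolic substrate: an exact program. Truth propagates exactly
    # for every input, inside or outside the training distribution.
    return 3 * x + 1

# Parametric substrate: an overparameterized curve fit to noisy samples
# drawn from a narrow training range.
x_train = rng.uniform(-1.0, 1.0, size=50)
y_train = program(x_train) + rng.normal(0.0, 0.1, size=50)
coeffs = np.polyfit(x_train, y_train, deg=5)

x_test = np.array([0.5, 10.0])  # one interpolation point, one extrapolation point
print("curve  :", np.polyval(coeffs, x_test))  # close at 0.5, far off at 10.0
print("program:", program(x_test))             # exactly [2.5, 31.0]
```

Inside the training range the curve's interpolations correlate with the truth; outside it, the correlation breaks down, while the program stays exact everywhere.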

There is some confusion among readers of #Bookofwhy regarding the impressive "causal understanding" of LLMs, which seems to defy the theoretical prediction of the Ladder of Causation. The Ladder predicts that, regardless of data size, no learning machine could correctly answer queries about interventions and counterfactuals unless supplemented with causal knowledge external to the data. LLMs circumvent this prediction by smuggling causal knowledge into the training data; instead of training on observations obtained directly from the environment, they are trained on linguistic texts written by authors who already have causal models of the world. The programs can simply cite information from the text without attending to any of the underlying data. The result is a sequence of linguistic extrapolations which, in some remote and obscure sense, reflect the causal understanding of those authors. @GaryMarcus @eliasbareinboim @soboleffspaces @geoffreyhinton @DavidDeutschOxf
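
Pearl's rung-2 claim is easy to demonstrate numerically. Below is a small simulation of my own (not from the tweet): two structural causal models with copy mechanisms, chosen purely for illustration, that induce the same observational distribution but disagree on the interventional query P(Y=1 | do(X=1)). No amount of observational data could tell them apart.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# SCM A: X -> Y. X is a fair coin; Y copies X.
def scm_a(do_x=None):
    x = rng.integers(0, 2, N) if do_x is None else np.full(N, do_x)
    y = x.copy()
    return x, y

# SCM B: Y -> X. Y is a fair coin; X copies Y.
def scm_b(do_x=None):
    y = rng.integers(0, 2, N)
    x = y.copy() if do_x is None else np.full(N, do_x)
    return x, y

for name, scm in [("A (X->Y)", scm_a), ("B (Y->X)", scm_b)]:
    x, y = scm()            # observational regime
    _, y_do = scm(do_x=1)   # interventional regime: do(X=1)
    print(name,
          "P(Y=1|X=1) =", round(y[x == 1].mean(), 2),   # 1.0 in both models
          "P(Y=1|do(X=1)) =", round(y_do.mean(), 2))    # 1.0 in A, ~0.5 in B
```

Both models report P(Y=1|X=1) = 1.0, yet do(X=1) yields 1.0 under A and ≈0.5 under B: the interventional answer is not a function of the observational distribution, exactly as the Ladder predicts.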

Still better than the NCI payline!