A must-read: Yudkowsky's gentle, brilliant explanation of Bayesian reasoning and evidence. And why we overestimate breast cancer occurrence 85% of the time.
Happy Fun-with-Evidence Friday. No videos today, just an important lesson told in an entertaining way. Eliezer Yudkowsky is a research fellow at the Singularity Institute for Artificial Intelligence. Boy howdy, can he explain stuff. He's written some great explanations of Bayes' theorem for non-practitioners. [Thanks to @SciData (Mike Will) for linking to this.]
How Bayesian reasoning relies on evidence. Bayes depends on something called 'priors': These are prior probabilities (think of them as 'original' probabilities before additional information becomes available). We use evidence to establish the value of these priors (e.g., the proportion of people with a particular disease or condition). Bayes' theorem is then used to determine revised, or posterior, probabilities given some additional information, such as a test result. (Where do people get priors? "There's a small, cluttered antique shop in a back alley of San Francisco's Chinatown. Don't ask about the bronze rat.")
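For readers who want to see the equation itself, here is one common way to write Bayes' theorem for the disease-and-test setting described above. The symbols are my own shorthand, not notation from this post: A means "has the condition," X means "tested positive," P(A) is the prior, and P(A | X) is the posterior.

```latex
P(A \mid X) = \frac{P(X \mid A)\,P(A)}{P(X \mid A)\,P(A) + P(X \mid \neg A)\,P(\neg A)}
```

The numerator counts the true positives; the denominator counts everyone who tests positive, true or false. The posterior is simply the share of positive results that are real.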
Yudkowsky opens by saying "Your friends and colleagues are talking about something called 'Bayes' Theorem' or 'Bayes' Rule', or something called Bayesian reasoning. They sound really enthusiastic about it, too, so you google and find a webpage about Bayes' Theorem and... It's this equation. That's all. Just one equation. The page you found gives a definition of it, but it doesn't say what it is, or why it's useful, or why your friends would be interested in it. It looks like this random statistics thing." Then he walks through a simple example about calculating breast cancer risk for a woman with a positive mammography result. Very hands-on, including little interactive calculators.
Risk, misunderstood 85% of the time? The scary part is this: "Next, suppose I told you that most doctors get the same wrong answer on this problem - usually, only around 15% of doctors get it right. ('Really? 15%? Is that a real number, or an urban legend based on an Internet poll?' It's a real number. See Casscells, Schoenberger, and Graboys 1978; Eddy 1982; Gigerenzer and Hoffrage 1995; and many other studies. It's a surprising result which is easy to replicate, so it's been extensively replicated.)"
Evidence slides probability up or down. I especially like Yudkowsky's description of how evidence 'slides' probability in one direction or another. For instance, in the breast cancer example, if a woman receives a positive mammography result, the revised probability of cancer slides from 1% to 7.8%, while a negative result slides the revised probability from 1% to 0.22%.
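To check those numbers yourself, here is a minimal Python sketch. It assumes the figures from Yudkowsky's essay, which this post doesn't spell out: a 1% prior, an 80% chance that a woman with breast cancer tests positive, and a 9.6% chance that a woman without it tests positive. The function and variable names are my own, chosen for illustration.

```python
def posterior(prior, p_pos_given_cancer, p_pos_given_healthy, positive=True):
    """Return P(cancer | test result) using Bayes' theorem."""
    if positive:
        like_cancer, like_healthy = p_pos_given_cancer, p_pos_given_healthy
    else:
        # Likelihood of a negative result is the complement of a positive one.
        like_cancer, like_healthy = 1 - p_pos_given_cancer, 1 - p_pos_given_healthy
    numerator = like_cancer * prior
    return numerator / (numerator + like_healthy * (1 - prior))


# Assumed inputs, taken from Yudkowsky's mammography example.
PRIOR = 0.01                 # 1% of women screened have breast cancer
SENSITIVITY = 0.80           # P(positive | cancer)
FALSE_POSITIVE_RATE = 0.096  # P(positive | no cancer)

after_positive = posterior(PRIOR, SENSITIVITY, FALSE_POSITIVE_RATE, positive=True)
after_negative = posterior(PRIOR, SENSITIVITY, FALSE_POSITIVE_RATE, positive=False)

print(f"After a positive result: {after_positive:.1%}")  # 7.8%
print(f"After a negative result: {after_negative:.2%}")  # 0.22%
```

Try changing `PRIOR` to see how strongly the starting point drives the answer: the same positive test means something very different when the condition is rare than when it is common.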
About priors. Yudkowsky reminds us that "priors are true or false just like the final answer - they reflect reality and can be judged by comparing them against reality. For example, if you think that 920 out of 10,000 women in a sample have breast cancer, and the actual number is 100 out of 10,000, then your priors are wrong. For our particular problem, the priors might have been established by three studies - a study on the case histories of women with breast cancer to see how many of them tested positive on a mammography, a study on women without breast cancer to see how many of them test positive on a mammography, and an epidemiological study on the prevalence of breast cancer in some specific demographic."
The Bayesian discussion references the classic Judgment under uncertainty: Heuristics and biases, edited by D. Kahneman, P. Slovic and A. Tversky. "If it seems to you like human thinking often isn't Bayesian... you're not wrong. This terrifying volume catalogues some of the blatant searing hideous gaping errors that pop up in human cognition."
You must read this. Yudkowsky's Bayesian discussion continues with a more in-depth example, eventually leading to "A Technical Explanation of Technical Explanation." I recommend that one, too.