An Intuitive Explanation of Bayesian Reasoning

Bookmark History

Saved by 81 people (-24 private), first by anonymouse user on 2006-03-02


Public Sticky notes

Bayes' Theorem

Highlighted by j8feng

An Intuitive Explanation of Bayesian Reasoning Bayes' Theorem for the curious and bewildered; an excruciatingly gentle introduction. By Eliezer Yudkowsky Your friends and colleagues are talking about something called "Bayes' Theorem" or "Bayes' Rule", or something called Bayesian reasoning. They sound really enthusiastic about it, too, so you google and find a webpage about Bayes' Theorem and...

Highlighted by tzon02

other." In one sense, this is correct - any correlation, no matter how weak, is fair prey for Bayes' Theorem; but Bayes' Theorem distinguishes between weak and strong evidence. That is, Bayes' Theorem not only tells us what is and isn't evidence, it also describes the strength of evidence. Bayes' Theorem not only tells us when to revise our probabilities, but how much to revise our probabilities. A correlation between hope and biological warfare may exist, but it's a lot weaker than the speaker wants it to be; he is revising his probabilities much too far.

Highlighted by benbtg

Or so they claim. Here you will find an attempt to offer an intuitive explanation of Bayesian reasoning - an excruciatingly gentle introduction that invokes all the human ways of grasping numbers, from natural frequencies to spatial visualization. The intent is to convey, not abstract rules for manipulating numbers, but what the numbers mean, and why the rules are what they are (and cannot possibly be anything else). When you are finished reading this page, you will see Bayesian problems in your dreams.

Highlighted by mkoidin

Your friends and colleagues are talking about something called "Bayes' Theorem" or "Bayes' Rule", or something called Bayesian reasoning. They sound really enthusiastic about it, too, so you google and find a webpage about Bayes' Theorem and... It's this equation. That's all. Just one equation. The page you found gives a definition of it, but it doesn't say what it is, or why it's useful, or why your friends would be interested in it. It looks like this random statistics thing. So you came here. Maybe you don't understand what the equation says. Maybe you understand it in theory, but every time you try to apply it in practice you get mixed up trying to remember the difference between p(a|x) and p(x|a), and whether p(a)*p(x|a) belongs in the numerator or the denominator. Maybe you see the theorem, and you understand the theorem, and you can use the theorem, but you can't understand why your friends and/or research colleagues seem to think it's the secret of the universe. Maybe your friends are all wearing Bayes' Theorem T-shirts, and you're feeling left out. Maybe you're a girl looking for a boyfriend, but the boy you're interested in refuses to date anyone who "isn't Bayesian". What matters is that Bayes is cool, and if you don't know Bayes, you aren't cool. Why does a mathematical concept generate this strange enthusiasm in its students? What is the so-called Bayesian Revolution now sweeping through the sciences, which claims to subsume even the experimental method itself as a special case? What is the secret that the adherents of Bayes know? What is the light that they have seen?

Highlighted by parkalewis

Bayes' Theorem for the curious and bewildered; an excruciatingly gentle introduction.

Highlighted by je1954

Your friends and colleagues are talking about something called "Bayes' Theorem" or "Bayes' Rule", or something called Bayesian reasoning. They sound really enthusiastic about it, too, so you google and find a webpage about Bayes' Theorem and... It's this equation. That's all. Just one equation. The page you found gives a definition of it, but it doesn't say what it is, or why it's useful, or why your friends would be interested in it. It looks like this random statistics thing. So you came here. Maybe you don't understand what the equation says. Maybe you understand it in theory, but every time you try to apply it in practice you get mixed up trying to remember the difference between p(a|x) and p(x|a), and whether p(a)*p(x|a) belongs in the numerator or the denominator. Maybe you see the theorem, and you understand the theorem, and you can use the theorem, but you can't understand why your friends and/or research colleagues seem to think it's the secret of the universe. Maybe your friends are all wearing Bayes' Theorem T-shirts, and you're feeling left out. Maybe you're a girl looking for a boyfriend, but the boy you're interested in refuses to date anyone who "isn't Bayesian". What matters is that Bayes is cool, and if you don't know Bayes, you aren't cool.

Highlighted by leadingzero

p(A|X) = p(X|A)*p(A) / [p(X|A)*p(A) + p(X|~A)*p(~A)]

Why wait so long to introduce Bayes' Theorem, instead of just showing it at the beginning? Well... because I've tried that before; and what happens, in my experience, is that people get all tangled up in trying to apply Bayes' Theorem as a set of poorly grounded mental rules; instead of the Theorem helping, it becomes one more thing to juggle mentally, so that in addition to trying to remember how many women with breast cancer have positive mammographies, the reader is also trying to remember whether it's p(X|A) in the numerator or p(A|X), and whether a positive mammography result corresponds to A or X, and which side of p(X|A) is the implication, and what the terms are in the denominator, and so on. In this excruciatingly gentle introduction, I tried to show all the workings of Bayesian reasoning without ever introducing the explicit Theorem as something extra to memorize, hopefully reducing the number of factors the reader needed to mentally juggle.

Highlighted by pradeepjacob
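
As a minimal sketch of the theorem quoted above, the Python below names each term so that it is explicit which conditional sits in the numerator. The function name `posterior` and its parameter names are illustrative only; the sample figures (1% prior, 80% and 9.6% conditionals) are the mammography numbers quoted in other highlights on this page.

```python
def posterior(prior_a, p_x_given_a, p_x_given_not_a):
    """Return p(A|X) from p(A), p(X|A) and p(X|~A) using Bayes' Theorem."""
    numerator = p_x_given_a * prior_a             # p(X|A) * p(A)
    other = p_x_given_not_a * (1.0 - prior_a)     # p(X|~A) * p(~A)
    return numerator / (numerator + other)

# A = "has breast cancer", X = "positive mammography"
p_cancer_given_positive = posterior(0.01, 0.80, 0.096)
print(f"{p_cancer_given_positive:.3f}")  # prints 0.078
```

Passing the three inputs by name keeps A and X from being swapped, which is the mix-up the passage describes.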

Here you will find an attempt to offer an intuitive explanation of Bayesian reasoning - an excruciatingly gentle introduction that invokes all the human ways of grasping numbers, from natural frequencies to spatial visualization. The intent is to convey, not abstract rules for manipulating numbers, but what the numbers mean, and why the rules are what they are (and cannot possibly be anything else). When you are finished reading this page, you will see Bayesian problems in your dreams.

Highlighted by elbitjusticiero

Your friends and colleagues are talking about something called "Bayes' Theorem" or "Bayes' Rule", or something called Bayesian reasoning. They sound really enthusiastic about it, too, so you google and find a webpage about Bayes' Theorem and... It's this equation. That's all. Just one equation. The page you found gives a definition of it, but it doesn't say what it is, or why it's useful, or why your friends would be interested in it. It looks like this random statistics thing.

Highlighted by drkasbd

Highlighted by andrewhuang51

Highlighted by dyokum

People do not employ Bayesian reasoning intuitively, find it very difficult to learn Bayesian reasoning when tutored, and rapidly forget Bayesian methods once the tutoring is over

Highlighted by andrewhuang51

Bayesian problems in your dreams.

Highlighted by andrewhuang51

the vast majority of doctors in these studies seem to have thought that if around 80% of women with breast cancer have positive mammographies, then the probability of a woman with a positive mammography having breast cancer must be around 80%.

Highlighted by andrewhuang51

Even if mammography in this world detects breast cancer in 8 out of 10 cases, while returning a false positive on a woman without breast cancer in only 1 out of 10 cases, there will still be a hundred thousand false positives for every real case of cancer detected.

Highlighted by rahulg

a positive result on the mammography does increase the estimated probability,

Highlighted by andrewhuang51

These two extreme examples help demonstrate that the mammography result doesn't replace your old information about the patient's chance of having cancer; the mammography slides the estimated probability in the direction of the result.  A positive result slides the original probability upward; a negative result slides the probability downward. 

Highlighted by dyokum
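
The sliding described above can be checked numerically. This is a rough sketch, not the essay's own code: the helper `update` is a hypothetical name, the 80% and 9.6% conditionals come from other highlights on this page, and the three priors are chosen only for illustration.

```python
def update(prior, p_pos_given_cancer, p_pos_given_healthy, positive):
    """Posterior probability of cancer after one test result."""
    if positive:
        like_cancer, like_healthy = p_pos_given_cancer, p_pos_given_healthy
    else:
        # A negative result uses the complementary conditionals.
        like_cancer, like_healthy = 1 - p_pos_given_cancer, 1 - p_pos_given_healthy
    cancer = prior * like_cancer
    healthy = (1 - prior) * like_healthy
    return cancer / (cancer + healthy)

for prior in (0.001, 0.01, 0.5):
    up = update(prior, 0.80, 0.096, positive=True)
    down = update(prior, 0.80, 0.096, positive=False)
    print(f"prior {prior:.3f}: positive -> {up:.4f}, negative -> {down:.4f}")
```

For every prior tried, the positive result lands above the prior and the negative result below it; neither result replaces the prior.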

Most people encountering problems of this type for the first time carry out the mental operation of replacing the original 1% probability with the 80% probability that a woman with cancer gets a positive mammography.  It may seem like a good idea, but it just doesn't work. 

Highlighted by dyokum

The chance that a patient with a "positive" result has breast cancer is then the proportion of group A within the combined group A + C, or P*M / [P*M + (1 - P)*M], which, cancelling the common factor M from the numerator and denominator, is P / [P + (1 - P)] or P / 1 or just P.

Highlighted by dyokum

Which is common sense.  Take, for example, the "test" of flipping a coin; if the coin comes up heads, does it tell you anything about whether a patient has breast cancer?  No; the coin has a 50% chance of coming up heads if the patient has breast cancer, and also a 50% chance of coming up heads if the patient does not have breast cancer.  Therefore there is no reason to call either heads or tails a "positive" result.  It's not the probability being "50/50" that makes the coin a bad test; it's that the two probabilities, for "cancer patient turns up heads" and "healthy patient turns up heads", are the same.

Highlighted by dyokum
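
A small sketch of the cancellation above, using exact fractions so that the result is the prior itself and not a rounded approximation. The function name and the sample values of P and M are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(prior, p_pos_given_cancer, p_pos_given_healthy):
    cancer_and_pos = prior * p_pos_given_cancer            # group A
    healthy_and_pos = (1 - prior) * p_pos_given_healthy    # group C
    return cancer_and_pos / (cancer_and_pos + healthy_and_pos)

P = Fraction(1, 100)   # prior chance of cancer
M = Fraction(1, 2)     # chance of "heads" for every patient, sick or healthy
print(posterior(P, M, M))  # prints 1/100
```

Any other value of M gives the same answer, since M cancels from the numerator and the denominator.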

prior probability

Highlighted by dyokum

conditional probabilities

Highlighted by dyokum

the priors

Highlighted by dyokum

if the two conditional probabilities are equal, the posterior probability equals the prior probability

Highlighted by dyokum

revised probability or the posterior probability

Highlighted by dyokum

One thing that's confusing about this notation is that the order of implication is read right-to-left, as in Hebrew or Arabic

Highlighted by dyokum

Reading from left to right, "|" means "given"; reading from right to left, "|" means "implies" or "leads to".

Highlighted by dyokum

Looking at this applet, it's easier to see why the final answer depends on all three probabilities; it's the differential pressure between the two conditional probabilities,  p(blue|pearl) and p(blue|~pearl), that slides the prior probability p(pearl) to the posterior probability p(pearl|blue).

Highlighted by dyokum

Even when the prior probability changes, the differential pressure of the two conditional probabilities always slides the probability in the same direction.  If you learn the egg is painted blue, the probability the egg contains a pearl always goes up - but it goes up from the prior probability, so you need to know the prior probability in order to calculate the final answer.

Highlighted by dyokum
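
To illustrate the point about direction, the sketch below holds the two conditional probabilities fixed and varies only the prior. The values 30% and 10% for p(blue|pearl) and p(blue|~pearl) are assumed for illustration and are not stated in this highlight; the function name is likewise hypothetical.

```python
P_BLUE_GIVEN_PEARL = 0.30   # assumed for illustration
P_BLUE_GIVEN_EMPTY = 0.10   # assumed for illustration

def p_pearl_given_blue(p_pearl):
    """Posterior p(pearl|blue) for a given prior p(pearl)."""
    blue_pearl = p_pearl * P_BLUE_GIVEN_PEARL
    blue_empty = (1 - p_pearl) * P_BLUE_GIVEN_EMPTY
    return blue_pearl / (blue_pearl + blue_empty)

for p_pearl in (0.05, 0.40, 0.90):
    print(f"p(pearl) = {p_pearl:.2f} -> p(pearl|blue) = {p_pearl_given_blue(p_pearl):.3f}")
```

Each posterior is higher than its prior, but the final value differs from prior to prior, which is why the prior is still needed to get the answer.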

A study by Gigerenzer and Hoffrage in 1995 showed that some ways of phrasing story problems are much more evocative of correct Bayesian reasoning.  The least evocative phrasing used probabilities.  A slightly more evocative phrasing used frequencies instead of probabilities

Highlighted by dyokum

The most effective presentation found so far is what's known as natural frequencies

Highlighted by dyokum

the information about the prior probability is included in presenting the conditional probabilities

Highlighted by dyokum
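
A sketch of the natural-frequency style of reasoning, done with whole-number counts of women in place of probabilities. The cohort size of 100,000 is a hypothetical choice that keeps every count an integer; the 1%, 80% and 9.6% rates are the ones quoted in other highlights on this page.

```python
cohort = 100_000                               # hypothetical cohort size

with_cancer = cohort * 1 // 100                # 1,000 women have cancer
without_cancer = cohort - with_cancer          # 99,000 do not

true_positives = with_cancer * 80 // 100       # 800 of the 1,000 test positive
false_positives = without_cancer * 96 // 1000  # 9,504 of the 99,000 test positive

all_positives = true_positives + false_positives
print(f"{true_positives} of {all_positives} women with a positive result have cancer")
print(f"{true_positives / all_positives:.1%}")  # prints 7.8%
```

The prior never appears as a separate step here: it is already built into the counts 800 and 9,504.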

In this case, you might as well just say that 30% of eggs are painted blue, since the probability of an egg being painted blue is independent of whether the egg contains a pearl. 

Highlighted by dyokum

If the bottom bar were renormalized to the same length as the top bar, it would look like the left sector had expanded.  This is why the proportion of "women with breast cancer" in the group "women with positive mammographies" is higher than the proportion of "women with breast cancer" in the general population - although the proportion is still not very high.

Highlighted by dyokum

The evidence of the positive mammography slides the prior probability of 1% to the posterior probability of 7.8%.

Highlighted by dyokum

You might intuit that since the test could have returned positive for health, but didn't, then the failure of the test to return positive must mean that the woman has a higher chance of having breast cancer

Highlighted by dyokum

Law of Conservation of Probability - not a standard term, but the conservation rule is exact.  If you take the revised probability of breast cancer after a positive result, times the probability of a positive result, and add that to the revised probability of breast cancer after a negative result, times the probability of a negative result, then you must always arrive at the prior probability. 

Highlighted by dyokum
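
The conservation rule above can be verified numerically. This is a minimal sketch using the 1%, 80% and 9.6% figures quoted in other highlights; the variable names are illustrative.

```python
p_cancer = 0.01
p_pos_given_cancer = 0.80
p_pos_given_healthy = 0.096

# Overall chance of each test result.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)
p_neg = 1 - p_pos

# Revised probabilities after each result.
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
p_cancer_given_neg = (1 - p_pos_given_cancer) * p_cancer / p_neg

# Weight each revised probability by how likely that result is.
expected = p_cancer_given_pos * p_pos + p_cancer_given_neg * p_neg
print(f"{expected:.6f}")  # prints 0.010000
```

The weighted average of the two posteriors comes back to the 1% prior.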

p(A&B) is the same as p(B&A), but p(A|B) is not the same thing as p(B|A)

Highlighted by dyokum

For example, the two quantities p(cancer) and p(~cancer) have 1 degree of freedom between them, because of the general law p(A) + p(~A) = 1

Highlighted by dyokum

p(positive|cancer) and p(~positive|cancer) also have only one degree of freedom between them; either a woman with breast cancer gets a positive mammography or she doesn't.  On the other hand, p(positive|cancer) and p(positive|~cancer) have two degrees of freedom.  You can have a mammography test that returns positive for 80% of cancerous patients and 9.6% of healthy patients, or that returns positive for 70% of cancerous patients and 2% of healthy patients, or even a health test that returns "positive" for 30% of cancerous patients and 92% of healthy patients. 

Highlighted by dyokum
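
As a rough illustration of the two independent degrees of freedom, the sketch below runs the three tests described above against the same prior. The 1% prior is taken from other highlights on this page; the function name is hypothetical.

```python
def posterior(prior, p_pos_given_cancer, p_pos_given_healthy):
    """p(cancer|positive) from the prior and the two conditionals."""
    hit = prior * p_pos_given_cancer
    false_alarm = (1 - prior) * p_pos_given_healthy
    return hit / (hit + false_alarm)

prior = 0.01
# (p(positive|cancer), p(positive|~cancer)) for the three tests in the text
tests = [(0.80, 0.096), (0.70, 0.02), (0.30, 0.92)]

for p_pos_cancer, p_pos_healthy in tests:
    result = posterior(prior, p_pos_cancer, p_pos_healthy)
    print(f"{p_pos_cancer:.0%} / {p_pos_healthy:.1%} -> {result:.4f}")
```

The third test is the odd one out: because healthy patients test "positive" more often than cancerous ones, a "positive" result there moves the probability below the prior.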

p(positive&cancer) = p(positive|cancer) * p(cancer)

Highlighted by dyokum

You should recognize this operation from the graph; it's the projection of the top bar into the bottom bar.  p(cancer) is the left sector of the top bar, and p(positive|cancer) determines how much of that sector projects into the bottom bar, and the left sector of the bottom bar is p(positive&cancer).

Highlighted by dyokum

Similarly, if we know the number of patients with breast cancer and positive mammographies, and also the number of patients with breast cancer, we can estimate the chance that a woman with breast cancer gets a positive mammography by dividing: p(positive|cancer) = p(positive&cancer) / p(cancer)

Highlighted by dyokum
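
The multiply-then-divide relationship above, as a short round trip. The 1% and 80% figures are the ones quoted in other highlights; the variable names are illustrative.

```python
p_cancer = 0.01
p_positive_given_cancer = 0.80

# Multiply to go from the conditional to the joint probability.
p_positive_and_cancer = p_positive_given_cancer * p_cancer

# Divide by p(cancer) to recover the conditional from the joint.
recovered = p_positive_and_cancer / p_cancer

print(f"p(positive&cancer) = {p_positive_and_cancer:.3f}")  # prints p(positive&cancer) = 0.008
print(f"p(positive|cancer) = {recovered:.2f}")              # prints p(positive|cancer) = 0.80
```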

What about p(positive&cancer), p(positive&~cancer), p(~positive&cancer), and p(~positive&~cancer)?  You might at first be tempted to think that there are only two degrees of freedom for these four quantities - that you can, for example, get p(positive&~cancer) by multiplying p(positive) * p(~cancer), and thus that all four quantities can be found given only the two quantities p(positive) and p(cancer).  This is not the case!  p(positive&~cancer) = p(positive) * p(~cancer) only if the two probabilities are statistically independent - if the chance that a woman has breast cancer has no bearing on whether she has a positive mammography. 

Highlighted by dyokum
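
A sketch of why the shortcut fails: the helper below (a hypothetical name) returns the true joint probability p(positive&~cancer) alongside the product p(positive) * p(~cancer), first for the mammography figures quoted in other highlights and then for an assumed "test" whose result ignores the patient's condition.

```python
def joint_vs_product(p_cancer, p_pos_given_cancer, p_pos_given_healthy):
    """Return (p(positive&~cancer), p(positive) * p(~cancer))."""
    p_healthy = 1 - p_cancer
    joint = p_pos_given_healthy * p_healthy
    p_pos = p_pos_given_cancer * p_cancer + joint
    return joint, p_pos * p_healthy

print(joint_vs_product(0.01, 0.80, 0.096))  # dependent test: the two values differ
print(joint_vs_product(0.01, 0.30, 0.30))   # independent "test": the two values agree
```

Only when the two conditionals are equal does multiplying the marginals give the right joint probability.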

groups A, B, C, and D

Highlighted by dyokum

it follows that the entire set of 16 probabilities contains only three degrees of freedom.  Remember that in our problems we always needed three pieces of information - the prior probability and the two conditional probabilities

Highlighted by dyokum
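
To make the counting concrete, the sketch below derives all 16 probabilities from just the three inputs. The function name `all_sixteen` and the string keys are illustrative choices, and the inputs are the 1%, 80% and 9.6% figures quoted in other highlights.

```python
def all_sixteen(p_cancer, p_pos_given_cancer, p_pos_given_healthy):
    """Derive every probability over cancer/positive from three inputs."""
    p_healthy = 1 - p_cancer
    t = {
        "cancer": p_cancer,
        "~cancer": p_healthy,
        # Joint probabilities: conditional times the matching prior.
        "positive&cancer": p_pos_given_cancer * p_cancer,
        "~positive&cancer": (1 - p_pos_given_cancer) * p_cancer,
        "positive&~cancer": p_pos_given_healthy * p_healthy,
        "~positive&~cancer": (1 - p_pos_given_healthy) * p_healthy,
    }
    t["positive"] = t["positive&cancer"] + t["positive&~cancer"]
    t["~positive"] = t["~positive&cancer"] + t["~positive&~cancer"]
    # Conditionals in both directions: joint divided by what is given.
    for result in ("positive", "~positive"):
        for state in ("cancer", "~cancer"):
            joint = t[f"{result}&{state}"]
            t[f"{result}|{state}"] = joint / t[state]
            t[f"{state}|{result}"] = joint / t[result]
    return t

table = all_sixteen(0.01, 0.80, 0.096)
print(len(table))                          # prints 16
print(f"{table['cancer|positive']:.3f}")   # prints 0.078
```

The table holds 4 marginals, 4 joint probabilities and 8 conditionals, and changing any one of the three inputs changes the rest.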

Highlighted by michalchytil