A pressing need exists to become more evidence-based, and to practice and deliver healthcare accordingly. Given fast-paced technological and regulatory changes, being able to develop a probabilistic outlook as to the range of possible outcomes in decision-making is increasingly important. This material introduces Bayes’ theorem, which is central to reassessing probabilities in light of accumulating evidence.
The insight of Thomas Bayes
Thomas Bayes was a British mathematician and Presbyterian minister, who lived in the early to mid-1700s. He formulated what is known as Bayes’ theorem, published posthumously.
In the early 1700s, probability problems were typically posed in the forward direction, e.g. given a specified number of white and black balls in a bowl, what is the probability of drawing a black ball? This is perhaps how most of us still think when trying to calculate probabilities and figure out odds. Bayes’ work, on the other hand, was motivated by the inverse probability problem: given that one or more balls have been drawn from the bowl, what can be said about the number of white and black balls in the bowl?
Bayes’ theorem expresses the conditional probability, or posterior probability, of a hypothesis H (i.e. its probability after evidence E is observed) in terms of:
- the prior probability of H
- the prior probability of E
- the conditional probability of E given H.
The frequentist approach defines probability as the relative frequency of an outcome over an “infinite” number of trials. For instance, the observed proportion of heads approaches 50% for an unbiased coin and a large enough number of coin tosses. So, frequentists think of random events as proportions of a whole. Many of us are frequentists of sorts, but also tend to believe that a fair coin should turn up heads half the time it is tossed (even on a short run), and that if a run of heads has occurred a run of tails is surely overdue. These are fallacies, as 1) observed frequencies only approximate the underlying probability, and the approximation becomes more reliable the larger the number of trials, and 2) past outcomes have no bearing on future ones when the tosses are fair and independent.
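Both fallacies can be checked with a small simulation. The following is a minimal sketch in Python, assuming a fair coin modelled with the standard library's pseudo-random generator; the function names, the seed, and the run lengths are illustrative choices, not part of the original discussion.

```python
import random


def proportion_of_heads(num_tosses, rng):
    """Toss a fair coin num_tosses times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses


def heads_after_streak(num_tosses, streak_length, rng):
    """Fraction of heads on the toss that immediately follows a run of
    streak_length consecutive heads."""
    tosses = [rng.random() < 0.5 for _ in range(num_tosses)]
    followers = [
        tosses[i]
        for i in range(streak_length, num_tosses)
        if all(tosses[i - streak_length:i])
    ]
    return sum(followers) / len(followers)


rng = random.Random(0)  # fixed seed (arbitrary) so that runs are repeatable

# Fallacy 1: short runs can stray far from 50%; long runs settle close to it.
for num_tosses in (10, 100, 10_000, 100_000):
    print(num_tosses, proportion_of_heads(num_tosses, rng))

# Fallacy 2: after three heads in a row, heads is still about as likely as tails.
print(heads_after_streak(200_000, 3, rng))
```

The short runs typically show proportions well away from 0.5, while the fraction of heads following a streak stays near 0.5, which is what independence of the tosses predicts.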
The Bayesian approach, on the other hand, treats probability as a degree of belief in propositions and hypotheses, which may change with evidence. Bayes’ theorem can be derived as follows:
For events A and B,
P(A | B) = P(A and B) / P(B)
(this is read as: the conditional probability of A given B is equal to the joint probability of A and B divided by the probability of B.)
Similarly, and by symmetry:
P(B | A) = P(A and B) / P(A)
Solving each equation for the joint probability P(A and B) and equating the two results, we get:
P(A | B) P(B) = P(B | A) P(A)
P(A | B) = P(B | A) P(A) / P(B) (this is Bayes’ theorem, which is read as: the probability of A given B is equal to the probability of B given A times the probability of A divided by the probability of B.)
P(A) and P(B) are prior or unconditional probabilities (also known as marginal probabilities).
P(A | B) and P(B | A) are conditional probabilities (i.e. the probability of A given that B occurs, and the probability of B given that A occurs, respectively).
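A marginal probability can be expanded over B and its complement (the law of total probability), which is how the denominator of Bayes’ theorem is usually computed:
P(A) = P(A and B) + P(A and not B)
P(A) = P(A | B) P(B) + P(A | not B) P(not B).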
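The identities above can be verified numerically on a small joint distribution. The sketch below is in Python and uses made-up joint probabilities for two events A and B (the four values are hypothetical and chosen only so that they sum to 1); it checks that Bayes' theorem and the total probability expansion both hold.

```python
import math

# Hypothetical joint probabilities P(A=a and B=b); the keys are (a, b).
joint = {
    (True, True): 0.10,
    (True, False): 0.20,
    (False, True): 0.30,
    (False, False): 0.40,
}

# Marginal (unconditional) probabilities, obtained by summing the joint table.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

# Conditional probabilities from the definition P(X | Y) = P(X and Y) / P(Y).
p_a_given_b = joint[(True, True)] / p_b
p_b_given_a = joint[(True, True)] / p_a
p_a_given_not_b = joint[(True, False)] / (1 - p_b)

# Right-hand side of Bayes' theorem: P(B | A) P(A) / P(B).
bayes_rhs = p_b_given_a * p_a / p_b

# Total probability: P(A) = P(A | B) P(B) + P(A | not B) P(not B).
total_probability = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

print("Bayes' theorem holds:", math.isclose(p_a_given_b, bayes_rhs))
print("Total probability holds:", math.isclose(p_a, total_probability))
```

Comparing with `math.isclose` rather than `==` avoids spurious mismatches from floating-point rounding.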
On to a simple example.
Say a hospital doing medication reconciliation wants to test patients about to be admitted as to whether they are on a specific medication which the hospital just learned may be recalled and discontinued by the manufacturer because of its dangerous side effects. Say 1% of the population are on this med, and say the test is 98.5% accurate in detecting patients that are on it, and 98.5% accurate in correctly identifying a patient as not being on the med (i.e. it has a 1.5% false positive rate). Researchers and administrators at the hospital would like to know what the likelihood is that a patient who is tested as positive is actually on the med. Note the inverse probability question, posed on the basis of evidence.
Given that “M” stands for “patient on the med”, “N” for “patient not on the med”, and “Pos” for “patient has tested positive”:
P(M) = 0.01 (a priori prob.).
P(N) = 1 – P(M) = 0.99 (a priori prob.).
P(Pos | M) = 0.985 (true positive rate, conditional prob.).
P(Pos | N) = 0.015 (false positive rate, conditional prob.).
P(Pos) = P(true positive) + P(false positive) = (0.01 x 0.985) + (0.99 x 0.015)
= 0.00985 + 0.01485
= 0.0247 (marginal prob. of a positive test).
Calculation of a posteriori probability of a patient being actually on the med given a positive test result:
P(M | Pos) = P(Pos | M) P(M) / P(Pos)
= P(Pos | M) P(M) / (P(Pos | M)P(M) + P(Pos | N)P(N))
= (0.985 x 0.01) / ((0.985 x 0.01) + (0.015 x 0.99))
= 0.00985 / 0.0247
= 0.399 (a posteriori prob.)
This says that the probability of a patient who has tested positive being actually on the med is only 39.9%. In other words, about 6 out of 10 positive results (60.1%) are false positives, so the hospital is likely to wrongly identify a patient as being on a dangerous med and perhaps believe patients are insincere when they question the test result. Is this something that one might have expected given the apparently quite high sensitivity and specificity of the test? Tip: with less disparity in the number of people on the med vs. not, the probability of correctly identifying someone as being on the med goes up (e.g. if 30% of the people are on the med and 70% are not, as opposed to 1% vs. 99%, P(M | Pos) is about 96.6%, a result more in line with expectations).
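The worked example, and the effect of prevalence mentioned in the tip, can be reproduced in a few lines. This is a minimal sketch in Python, assuming the values used in the calculation above, P(Pos | M) = 0.985 and P(Pos | N) = 0.015; the function name and the intermediate prevalence values (5% and 10%) are illustrative additions.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(M | Pos) by Bayes' theorem.

    prevalence          -- P(M), prior probability of being on the med
    sensitivity         -- P(Pos | M), true positive rate
    false_positive_rate -- P(Pos | N)
    """
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    # The denominator is P(Pos), expanded by the law of total probability.
    return true_positives / (true_positives + false_positives)


SENSITIVITY = 0.985          # P(Pos | M), from the example
FALSE_POSITIVE_RATE = 0.015  # P(Pos | N), from the example

# 1% and 30% are the prevalences discussed in the text; 5% and 10% are
# extra, illustrative values showing how quickly the posterior rises.
for prevalence in (0.01, 0.05, 0.10, 0.30):
    p = posterior_given_positive(prevalence, SENSITIVITY, FALSE_POSITIVE_RATE)
    print(f"P(M) = {prevalence:.2f}  ->  P(M | Pos) = {p:.3f}")
```

The low posterior at 1% prevalence comes from the prior, not from a poor test: patients not on the med outnumber those on it 99 to 1, so even a 1.5% false positive rate yields more false positives than true positives.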
Note how Bayes’ theorem allows the reverse probability to be computed from the a priori probability updated by the evidence (the test result). Although in this hypothetical instance we discussed screening patients for med usage, a similar test could be considered for inpatients about to be discharged, if one wanted to ensure that their likelihood of readmission in the short term was very low (with similar caveats).
Complex Bayesian networks can be built for probabilistic inference that cascades between network nodes. In a Bayesian network of related variables, the probabilities of subsets of variables (states) are recalculated and updated a posteriori as values of observed/evidence variables become available.
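The updating idea can be illustrated on the smallest possible network: one hidden node (patient on the med) with two test-result nodes as its children. The sketch below is in Python and is not a general Bayesian network engine. It assumes, hypothetically, that the test can be repeated and that the two results are conditionally independent given the patient's true status; the function names are illustrative, and the probabilities are reused from the worked example.

```python
P_POS_GIVEN_M = 0.985  # P(Pos | M), shared by both hypothetical test nodes
P_POS_GIVEN_N = 0.015  # P(Pos | N)


def update(prior, likelihood_if_on_med, likelihood_if_not):
    """One Bayesian update of P(M) given a single observed test result."""
    numerator = likelihood_if_on_med * prior
    return numerator / (numerator + likelihood_if_not * (1 - prior))


def posterior_after(results, prior=0.01):
    """Update P(M) as each result arrives (True = positive, False = negative).

    The posterior after one result serves as the prior for the next, which is
    valid here only because the results are assumed conditionally independent.
    """
    belief = prior
    for positive in results:
        if positive:
            belief = update(belief, P_POS_GIVEN_M, P_POS_GIVEN_N)
        else:
            belief = update(belief, 1 - P_POS_GIVEN_M, 1 - P_POS_GIVEN_N)
    return belief


print("one positive:          ", posterior_after([True]))
print("two positives:         ", posterior_after([True, True]))
print("positive then negative:", posterior_after([True, False]))
```

Under these assumptions a second positive result raises the posterior sharply, while a positive followed by a negative returns the belief to the prior, since the two results carry equal and opposite weight in this symmetric setup.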
Fields of application of Bayesian networks today include bioinformatics (protein structure, gene expression), document retrieval, image processing, decision support, and others.