Recognize the importance of Bayes’ Theorem within a Decision Analysis context
Reproduce Bayes’ Theorem within the context of sensitivity, specificity, and positive & negative predictive values
Solve & revise probabilities using a variety of other methods
Testing is done for:
Screening (primary prevention)
Diagnosis (secondary prevention)
Monitor and guide treatment (tertiary prevention)
Prognosis
Clinicians have a variety of diagnostic information to guide their decision making
Talking to patient (history, symptoms)
Physically examining patient
Screening (cervical cancer) + diagnostic tests (EKGs, Blood tests, X-rays)
Obtaining information can be…
RISKY
EXPENSIVE
ERROR PRONE
ALL THREE
What is the chance that a patient has a disease if a diagnostic test is positive or negative?
In other words, what is the probability of disease conditional on the test result? Pr(D+ | T+); Pr(D+ | T-)








\begin{aligned} Pr(A \& B) &= Pr(A|B) Pr(B)\\ &= Pr(B|A) Pr(A) \end{aligned}

Pr(A|B) = \frac{Pr(A \& B)}{Pr(B)}
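As a short worked step, equating the two expressions for Pr(A \& B) above and dividing by Pr(B) gives Bayes’ Theorem. The second line applies it to a positive test, expanding Pr(T+) over diseased and non-diseased patients; reading A as D+ and B as T+ is a labeling choice made here for illustration.

\begin{aligned} Pr(A|B) &= \frac{Pr(B|A) Pr(A)}{Pr(B)} \\ Pr(D+|T+) &= \frac{Pr(T+|D+) Pr(D+)}{Pr(T+|D+) Pr(D+) + Pr(T+|D-) Pr(D-)} \end{aligned}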



















Case example
You are trying to determine what proportion of the population has already been exposed to a new communicable disease, in hopes of figuring out if herd immunity is possible.
You decide to do an antibody test to measure the level of antibodies in a sample of 500 participants
What is the test’s SENSITIVITY?
What is the test’s SPECIFICITY?
What is the test’s FALSE NEGATIVE RATE?
What is the test’s FALSE POSITIVE RATE?
| | D+ | D- | Total |
|---|---|---|---|
| T+ | a (TP) | b (FP) | a + b |
| T- | c (FN) | d (TN) | c + d |
| Total | a + c | b + d | a + b + c + d |
| | D+ | D- | Total |
|---|---|---|---|
| T+ | 125 (a, TP) | 20 (b, FP) | 145 (a + b) |
| T- | 9 (c, FN) | 346 (d, TN) | 355 (c + d) |
| Total | 134 (a + c) | 366 (b + d) | 500 (a + b + c + d) |
Test Sensitivity among those who have or had the virus, 125/134 = 93% (Interpretation: The probability of the screening test correctly identifying diseased subjects was 93%)
Test Specificity among those without the disease at any point, 346/366 = 95% (Interpretation: The probability of the screening test correctly identifying non-diseased subjects was 95%)
False negative rate (1-sensitivity) is the proportion of diseased people with a negative test: c/(a+c)
False positive rate (1-specificity) is the proportion of non-diseased people with a positive test: b/(b+d)
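The four test characteristics asked about in the case example can be computed directly from the 2x2 counts. The following is a minimal sketch using the antibody-test counts (a = 125, b = 20, c = 9, d = 346); the function name `accuracy_measures` is a hypothetical helper introduced only for illustration.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, FNR and FPR from 2x2 counts."""
    sensitivity = tp / (tp + fn)   # a / (a + c)
    specificity = tn / (tn + fp)   # d / (b + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": fn / (tp + fn),  # c / (a + c) = 1 - sensitivity
        "false_positive_rate": fp / (tn + fp),  # b / (b + d) = 1 - specificity
    }


# Counts from the antibody-test case example (500 participants).
measures = accuracy_measures(tp=125, fp=20, fn=9, tn=346)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that the false negative and false positive rates are simply the complements of sensitivity and specificity, so only two of the four numbers carry independent information.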
…Imagine you are discussing the results of a screening test with a patient
(+) If the patient has an abnormal screening test (i.e. it’s POSITIVE), how likely is it that he really has the disease? [how worried should he be?]
(-) If the test was NEGATIVE, how likely is it that he really does not have the disease? [how reassured should he be?]
\begin{aligned} PPV &=Pr(D+|T+) \\ &= \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \end{aligned}
Heart Attack:
| | Present | Absent | Total |
|---|---|---|---|
| Elevated (+) | 300 (a, TP) | 15 (b, FP) | 315 (a + b) |
| Normal (-) | 35 (c, FN) | 150 (d, TN) | 185 (c + d) |
| Total | 335 (a + c) | 165 (b + d) | 500 (a + b + c + d) |
315 patients in this coronary care unit had “elevated” screening levels; out of those 315, 300 had heart attacks. Of those with “elevated” screening levels, what proportion have had a heart attack? PPV = 300/315 = 95%
\begin{aligned} \text{NPV} &= Pr(D-|T-) \\ &= \frac{\text{True Negatives}}{\text{True Negatives} + \text{False Negatives}} \end{aligned}
185 patients in this coronary care unit had normal screening levels; out of those 185, 150 did not have heart attacks. Of those with “normal” screening levels, what proportion did not have heart attacks? NPV = 150/185 = 81%
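The two predictive values can be read off the rows of the 2x2 table. A minimal sketch using the coronary care unit counts follows; `predictive_values` is a hypothetical helper name, not something defined in the lecture.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from 2x2 counts."""
    ppv = tp / (tp + fp)  # Pr(D+ | T+): true positives among all positive tests
    npv = tn / (tn + fn)  # Pr(D- | T-): true negatives among all negative tests
    return ppv, npv


# Coronary care unit sample: 300 TP, 15 FP, 35 FN, 150 TN.
ppv, npv = predictive_values(tp=300, fp=15, fn=35, tn=150)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```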
Predictive values are highly dependent on the prevalence of disease in a sample (whereas prevalence theoretically should NOT impact sensitivity or specificity)
Previously, we had a population of 500 patients in a coronary care unit, most of whom were having heart attacks.
Now, let’s switch to a different sample
- Around 2,100 patients are coming into the ER with chest pain, but most don’t have heart attacks (slightly over 15% have heart attacks)
| | Present | Absent | Total |
|---|---|---|---|
| Elevated (+) | 300 (a, TP) | 160 (b, FP) | 460 (a + b) |
| Normal (-) | 35 (c, FN) | 1640 (d, TN) | 1675 (c + d) |
| Total | 335 (a + c) | 1800 (b + d) | 2135 (a + b + c + d) |
When we have a different sample, one that has less disease, the PPV falls & the NPV goes up
Before:
| | Present | Absent | Total |
|---|---|---|---|
| Elevated (+) | 300 (a, TP) | 15 (b, FP) | 315 (a + b) |
| Normal (-) | 35 (c, FN) | 150 (d, TN) | 185 (c + d) |
| Total | 335 (a + c) | 165 (b + d) | 500 (a + b + c + d) |
PPV in the ER sample = 300 / 460 = 65% (of the 460 people with “elevated” levels in this sample, only 300 are having heart attacks), versus 95% in the previous sample
NPV in the ER sample = 1,640 / 1,675 = 98% (of those with “normal” levels, most are not having heart attacks), versus 81% in the previous sample
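The dependence of predictive values on prevalence follows from Bayes’ Theorem. The sketch below recomputes PPV and NPV from sensitivity, specificity, and prevalence for the two samples; the function `ppv_npv` and the dictionary layout are assumptions made for illustration. Sensitivity is the same in both tables (300/335), while specificity differs only slightly (150/165 versus 1,640/1,800), so nearly all of the change in PPV and NPV comes from prevalence.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Bayes' Theorem for a positive and a negative test result."""
    tp = sensitivity * prevalence              # Pr(T+ and D+)
    fp = (1 - specificity) * (1 - prevalence)  # Pr(T+ and D-)
    fn = (1 - sensitivity) * prevalence        # Pr(T- and D+)
    tn = specificity * (1 - prevalence)        # Pr(T- and D-)
    return tp / (tp + fp), tn / (tn + fn)


# (sensitivity, specificity, prevalence) taken from the two 2x2 tables.
samples = {
    "coronary care unit": (300 / 335, 150 / 165, 335 / 500),
    "emergency room": (300 / 335, 1640 / 1800, 335 / 2135),
}

for name, (sens, spec, prev) in samples.items():
    ppv, npv = ppv_npv(sens, spec, prev)
    print(f"{name}: prevalence = {prev:.2f}, PPV = {ppv:.2f}, NPV = {npv:.2f}")
```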
On a screening test, a high PPV is acceptable, implying that false positive outcomes are minimized, under a variety of circumstances:
Reference: Trevethan (2017)
A moderate PPV is acceptable when:
Reference: Trevethan (2017)
A high NPV is acceptable, implying that false negatives are minimized, under a different set of circumstances:
A moderate NPV could be acceptable when:
Reference: Trevethan (2017)
Generally:
PPV and NPV are dependent on prevalence (Pre-Test Probability)
SENS + SPEC are usually not dependent on prevalence, although they can vary with the mix of patients (spectrum/case-mix bias)
Spectrum bias – Performance of a test may vary in different clinical settings/different mix of patients
The pretest (or prior) probability of disease in the 2x2 table before any testing
= Probability of the presence of the target disease conditional on the available information prior to performing the test under consideration
In other words, the proportion of the total population with the disease: (a+c)/(a+b+c+d). This is disease prevalence
| | Present | Absent | Total |
|---|---|---|---|
| Elevated (+) | 300 (a, TP) | 160 (b, FP) | 460 (a + b) |
| Normal (-) | 35 (c, FN) | 1640 (d, TN) | 1675 (c + d) |
| Total | 335 (a + c) | 1800 (b + d) | 2135 (a + b + c + d) |
Disease Prevalence
= (a+c) / (a+b+c+d) = 335 / 2,135 = 0.16









Likelihood ratios are useful for situations in which a quick estimate of revised probabilities is needed
A likelihood ratio compares the likelihood that a given test result would be expected in a patient with the target disorder, Pr(test result | D+), with the likelihood that the same result would be expected in a patient without the target disorder, Pr(test result | D-) [A RATIO]
The likelihood ratio (LR) summarizes test sensitivity and specificity into one number:
LR (positive test) = sensitivity / (1 - specificity) (or TPR/FPR)
LR (negative test) = (1 - sensitivity) / specificity (or FNR/TNR)
Post-test odds = Pretest odds x LR
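A minimal sketch of the two likelihood ratio formulas, applied to the coronary care unit counts used earlier (300 TP, 15 FP, 35 FN, 150 TN). The helper name `likelihood_ratios` is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)  # TPR / FPR
    lr_neg = (1 - sensitivity) / specificity  # FNR / TNR
    return lr_pos, lr_neg


sens = 300 / (300 + 35)   # a / (a + c)
spec = 150 / (150 + 15)   # d / (b + d)
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(f"LR(+) = {lr_pos:.2f}, LR(-) = {lr_neg:.2f}")
```

The parentheses matter: sensitivity / (1 - specificity) is not the same as sensitivity / 1 - specificity.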
LR’s are an advance beyond 2x2 tables
To use likelihood ratios, you must be comfortable converting between probabilities of disease and odds of disease
Odds are simply another way of describing the chances that something will (or won’t) happen
Odds of Disease = \frac{\text{Probability}}{\text{1 - Probability}}
Probability = \frac{\text{Odds}}{\text{Odds + 1}}
Odds favoring an event; Odds = p/(1-p) If an event has 0.20 probability of occurrence, the odds favoring the event = 0.2/0.8 = 0.25 (or 1:4)
Odds against (OddA) the event; OddA = (1-p)/p The odds against are 0.8/0.2 = 4 (or 4:1)
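The two conversions can be sketched as a pair of small functions; `prob_to_odds` and `odds_to_prob` are hypothetical names chosen for this example, which reuses the 0.20 probability from the text.

```python
def prob_to_odds(p):
    """Odds favoring an event with probability p (requires 0 <= p < 1)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Probability corresponding to odds favoring an event."""
    return odds / (odds + 1)


p = 0.20
odds_for = prob_to_odds(p)      # 0.2 / 0.8
odds_against = (1 - p) / p      # 0.8 / 0.2
print(f"odds favoring: {odds_for:.2f}, odds against: {odds_against:.2f}")
print(f"back to probability: {odds_to_prob(odds_for):.2f}")
```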
| | Present | Absent | Total |
|---|---|---|---|
| Elevated (+) | 300 (a, TP) | 15 (b, FP) | 315 (a + b) |
| Normal (-) | 35 (c, FN) | 150 (d, TN) | 185 (c + d) |
| Total | 335 (a + c) | 165 (b + d) | 500 (a + b + c + d) |
| Quantity | Calculation |
|---|---|
| Pre-test probability | (a + c) / (a + b + c + d) = 335 / 500 = 0.67 |
| Pre-test odds | 0.67 / (1 - 0.67) |
| Post (+ test) odds of disease | pre-test odds * LR(+) |
| Post (+ test) prob of disease | post-test odds / (post-test odds + 1) |
| Post (- test) odds of disease | pre-test odds * LR(-) |
| Post (- test) prob of disease | 0.22 / 1.22 |
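The steps in the table can be chained in a few lines. The sketch below works from the coronary care unit counts without rounding intermediate values, so its negative-test result (about 0.19) differs slightly from the 0.22 / 1.22 figure above, which appears to come from rounded inputs. The function name `post_test_probability` is an assumption for illustration.

```python
def post_test_probability(pretest_prob, lr):
    """Pre-test probability -> odds -> multiply by LR -> back to probability."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * lr
    return post_odds / (post_odds + 1)


# Coronary care unit counts.
tp, fp, fn, tn = 300, 15, 35, 150
pretest_prob = (tp + fn) / (tp + fp + fn + tn)   # prevalence = 335 / 500
sens = tp / (tp + fn)
spec = tn / (tn + fp)
lr_pos = sens / (1 - spec)
lr_neg = (1 - sens) / spec

post_pos = post_test_probability(pretest_prob, lr_pos)
post_neg = post_test_probability(pretest_prob, lr_neg)
print(f"Pr(D+ | T+) = {post_pos:.2f}")
print(f"Pr(D+ | T-) = {post_neg:.2f}")
```

The positive-test result equals the PPV (300/315), and the negative-test result equals 1 - NPV (35/185), which is a useful check that the odds route and the 2x2 route agree.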
Odds-likelihood ratio form of Bayes’ Theorem:
\frac{\text{Pr(D+ | test result)}}{\text{Pr(D- | test result)}} = \frac{{Pr(D+)}}{Pr(D-)} * \frac{{lr(D+)}}{lr(D-)}
where the likelihood ratio is:
\frac{{lr(D+)}}{lr(D-)} = \frac{\text{Pr(test result | D+)}}{\text{Pr (test result | D-)}}
Pre-test odds favoring disease (the prior):
\frac{{Pr(D+)}}{Pr(D-)}
The post-test odds given the test result:
\frac{\text{Pr(D+ | test result)}}{\text{Pr(D- | test result)}}
\frac{\text{Pr(test result | D+)}}{\text{Pr (test result | D-)}} = \frac{{Pr(D-)}}{{Pr (D+)}} * \frac{{(CTN - CFP)}}{(CTP - CFN)}
The relationship above is used to calculate an optimal “cut-off” for a test with categorical or continuous results, optimized conditional on (1) the prior probability of disease and (2) the consequences of the scenario we are assessing
Next lecture: Positivity Criterion!
| LR (+) | LR (-) | Interpretation |
|---|---|---|
| >10 | <0.1 | Excellent |
| 5-10 | 0.1-0.2 | Good |
| 2-5 | 0.2-0.5 | Fair. May be helpful |
| 1-2 | 0.5-1.0 | Unlikely to be helpful |
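These rules of thumb can be encoded as a simple lookup. This is a sketch only: the source ranges share their endpoints (for example, 5 appears in both "5-10" and "2-5"), so assigning each shared endpoint to the more favorable category is an assumption made here, as are the function names.

```python
def interpret_lr_positive(lr):
    """Rule-of-thumb label for LR(+); shared endpoints go to the better label."""
    if lr > 10:
        return "Excellent"
    if lr >= 5:
        return "Good"
    if lr >= 2:
        return "Fair. May be helpful"
    if lr >= 1:
        return "Unlikely to be helpful"
    return "Outside the tabulated range"


def interpret_lr_negative(lr):
    """Rule-of-thumb label for LR(-); shared endpoints go to the better label."""
    if lr < 0.1:
        return "Excellent"
    if lr <= 0.2:
        return "Good"
    if lr <= 0.5:
        return "Fair. May be helpful"
    if lr <= 1.0:
        return "Unlikely to be helpful"
    return "Outside the tabulated range"


# LRs from the coronary care unit table: about 9.85 and 0.11.
print(interpret_lr_positive(9.85))
print(interpret_lr_negative(0.11))
```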