COVID antibody testing and conditional probability

In a Slack channel that I frequent and he infrequents, podcasting magnate Lex Friedman linked to this CNN article on the effectiveness of COVID-19 antibody tests:

CNN article that says antibody tests may be wrong half the time

The concern is that people who’ve had a positive result from an antibody (serologic) test may be fooling themselves into thinking they’re immune. Unfortunately, there’s a numerical problem with this:

The CDC explains why testing can be wrong so often. A lot has to do with how common the virus is in the population being tested. “For example, in a population where the prevalence is 5%, a test with 90% sensitivity and 95% specificity will yield a positive predictive value of 49%. In other words, less than half of those testing positive will truly have antibodies,” the CDC said.

Although CNN doesn’t link to it, the quote comes from this CDC page.

Let’s go through the numbers and see how 90% turns into 49%.

First, there are a couple of terms of art we need to understand: sensitivity and specificity. Sensitivity is the test's accuracy on people who have the condition; it is the percentage of people who actually have the condition that will test positive for it. Specificity is the test's accuracy on people who don't have the condition; it is the percentage of people who actually don't have the condition that will test negative for it.

With those definitions in mind, we’ll create a 2×2 contingency table for a population of 100,000 people in which the prevalence, sensitivity, and specificity are as given in the quote.

                  Have      Don't have
               antibodies   antibodies
Test positive    4,500        4,750    |   9,250
Test negative      500       90,250    |  90,750
                 5,000       95,000    | 100,000

5,000 people actually have the antibodies (5% of 100,000) and 95,000 do not. Of the 5,000 with antibodies, 4,500 (90% of 5,000) will test positive for them and the remaining 500 will test negative for them. Of the 95,000 without antibodies, 90,250 (95% of 95,000) will test negative for them and the remaining 4,750 will test positive for them.

So of the 9,250 people who test positive for the antibodies, only 4,500 (4,500/9,250 ≈ 49%) actually have them. That’s where the 49% figure in the quote comes from.
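The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration: the function name `contingency_table` and its parameter names are made up here, and the counts are expected values, not real test data.

```python
def contingency_table(population, prevalence, sensitivity, specificity):
    """Expected counts for a 2x2 table (hypothetical helper, for illustration)."""
    have = population * prevalence           # people with antibodies
    dont_have = population - have            # people without antibodies
    true_pos = have * sensitivity            # have antibodies, test positive
    false_neg = have - true_pos              # have antibodies, test negative
    true_neg = dont_have * specificity       # no antibodies, test negative
    false_pos = dont_have - true_neg         # no antibodies, test positive
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
    }


# Numbers from the CDC quote: 5% prevalence, 90% sensitivity, 95% specificity.
table = contingency_table(100_000, 0.05, 0.90, 0.95)
positives = table["true_pos"] + table["false_pos"]
ppv = table["true_pos"] / positives  # positive predictive value

print(f"{table['true_pos']:,.0f} of {positives:,.0f} positives are real: PPV = {ppv:.0%}")
# prints: 4,500 of 9,250 positives are real: PPV = 49%
```

Changing the prevalence argument shows how strongly the predictive value depends on how common the antibodies are in the tested population.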

This is a very common sort of problem to spring on students in a probability class who are learning about conditional probability and Bayes’s Theorem. The key thing is to realize that the probability that you have antibodies given that you had a positive test is definitely not the same as the probability that you had a positive test given that you have antibodies. When you hear about a test’s “accuracy,” it’s usually the latter that’s presented even though you as a patient are only interested in the former. After all, if you already knew you had the condition, you wouldn’t need to have the test.
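Written out with Bayes’s Theorem, the calculation for the CDC example looks like this. The notation is chosen here for illustration: A means “has antibodies,” a bar over A means “doesn’t have antibodies,” and + means “tests positive.” The sensitivity is P(+ | A) = 0.90, and one minus the specificity gives the false positive rate, 1 − 0.95 = 0.05.

```latex
\begin{aligned}
P(A \mid +) &= \frac{P(+ \mid A)\,P(A)}
                    {P(+ \mid A)\,P(A) + P(+ \mid \bar{A})\,P(\bar{A})} \\[4pt]
            &= \frac{0.90 \times 0.05}{0.90 \times 0.05 + 0.05 \times 0.95} \\[4pt]
            &= \frac{0.045}{0.0925} \approx 0.49
\end{aligned}
```

The numerator and denominator are just the 4,500 and 9,250 from the contingency table, each divided by the population of 100,000.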

Although the CNN article makes it sound as if antibody testing is useless, there are ways of using multiple tests (of different types) to achieve a high predictive value:

Algorithms can be designed to maximize overall specificity while retaining maximum sensitivity. For example, in the example above with a population prevalence of 5%, a positive predictive value of 95% can be achieved if samples initially positive are tested with a second different orthogonal assay that also has 90% sensitivity and 95% specificity.

We can see how this works by putting the 9,250 people who tested positive through a second round of independent testing. Here’s the contingency table for the second test:

                  Have      Don't have
               antibodies   antibodies
Test positive    4,050          238    |   4,288
Test negative      450        4,512    |   4,962
                 4,500        4,750    |   9,250

Here we see that of the 4,288 people who test positive in the second test, 4,050 (94%) actually have the antibodies. The second test works so well because it is performed on a population that has a much higher prevalence of people with antibodies. It benefits from the screening done by the first test.
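The two-stage screening above can also be sketched as a repeated update, where the predictive value after one positive test becomes the prevalence for the next. This is a minimal sketch with made-up names (`positive_predictive_value`, `ppvs`), and it assumes the two tests are independent of one another given a person’s true antibody status, which is what the “orthogonal assay” in the CDC quote is meant to provide.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of having antibodies given a positive test (hypothetical helper)."""
    true_pos = prevalence * sensitivity                # fraction that is truly positive
    false_pos = (1 - prevalence) * (1 - specificity)   # fraction that is falsely positive
    return true_pos / (true_pos + false_pos)


prevalence = 0.05  # starting prevalence from the CDC example
ppvs = []
for test_number in (1, 2):
    # Each positive result raises the prevalence among those still being tested.
    prevalence = positive_predictive_value(prevalence, 0.90, 0.95)
    ppvs.append(prevalence)
    print(f"After positive test {test_number}: {prevalence:.1%}")
# prints:
# After positive test 1: 48.6%
# After positive test 2: 94.5%
```

The second figure differs slightly from 4,050/4,288 because the table rounds 237.5 false positives up to 238 whole people.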