Monday, October 24, 2011

HIV tests are better than 99% accurate...

by Gos Blank 

...BUT there's a catch:  If you test positive, "accuracy" is a meaningless term.  No competent and responsible medical professional should EVER use the term "accuracy" to describe a positive HIV test result.  (And yet, 99% of HIV testing professionals do this regularly, suggesting that they are neither competent nor responsible.)

Here's why:


If you tested a large group of randomly-selected Americans for HIV, more than 99% of them would test negative.

Now, just for the sake of argument, let's say, hypothetically speaking, that there were no such thing as HIV.  In such a scenario, all positive results would be false-positives, meaning that a positive test would have ZERO accuracy, since there would be no virus for you to be "positive" for in the first place.  And yet, the overall accuracy of the test would still be >99%, because >99% of the test results would be true-negatives.

Thus, it would be mathematically impossible for an HIV test to be less than 99% accurate, even if there were no such thing as HIV.

Now, imagine if you lived in my little fantasy world where HIV does not exist, and you tested positive, and your doctor told you that the test is more than 99% accurate.  Wouldn't you tend to assume that this meant that your positive test was >99% accurate, indicating a probability of >99% that you actually have HIV?

Of course you would.  You wouldn't think to ask how much of that 99% is made up of people who test negative for this nonexistent virus; you'd simply assume that your doctor knows what he's talking about, and you'd leave his office >99% sure that you were infected.

The lesson to be learned here is that there is a HUGE difference between the accuracy of a positive test and the accuracy of a negative test, and that neither should EVER be confused with overall accuracy.

And yet, (in)competent and (ir)responsible medical professionals make this exact mistake every single day.

This is why, when competent and responsible medical professionals want to discuss the accuracy of a given test, they never refer to "accuracy".  Instead, they use the terms "sensitivity", "specificity", "Negative Predictive Value (NPV)", and "Positive Predictive Value (PPV)".

If you should test positive, the most important of these terms are specificity and PPV.  Here are the definitions:

PPV - The accuracy (or predictive value) of a positive test.  Example:  If an HIV test has a PPV of 75% and you test positive, that means that there's a 75% chance that you actually have HIV.

NPV - The accuracy of a negative test.  Example:  If an HIV test has an NPV of 75% and you test negative, that means that there's a 75% chance that you don't have HIV.

Specificity - The ability of the test (expressed as a percentage) to correctly identify a negative condition.  Example:  If you tested 1,000 people who don't have HIV, using a test that has a specificity of 99%, then 990 of them (99%) would get a true-negative result, leaving 10 (1%) who would get a false-positive result.

Sensitivity - The ability of the test to correctly identify a positive condition.  Example:  If you tested 1,000 people who do have HIV, using a test with 99% sensitivity, then 990 of them (99%) would get a true-positive result, leaving 10 (1%) who would get a false-negative result.
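
If you like seeing these spelled out, here's a quick Python sketch of how all four quantities fall out of the raw counts of true and false results (the function and variable names are mine, purely for illustration):

    def test_metrics(tp, fp, tn, fn):
        """Compute the four quantities from raw counts of test outcomes."""
        sensitivity = tp / (tp + fn)   # how well the test catches a positive condition
        specificity = tn / (tn + fp)   # how well the test clears a negative condition
        ppv = tp / (tp + fp)           # the accuracy of a positive result
        npv = tn / (tn + fn)           # the accuracy of a negative result
        return sensitivity, specificity, ppv, npv

    # Combining the two 1,000-person examples above (99% specificity, 99% sensitivity):
    # 990 true positives, 10 false negatives, 990 true negatives, 10 false positives.
    print(test_metrics(tp=990, fp=10, tn=990, fn=10))   # -> (0.99, 0.99, 0.99, 0.99)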

In order to determine the PPV, NPV, and overall accuracy of a given test (not just HIV tests, but drug tests, pregnancy tests, or any sort of test that returns a yes-or-no result), we use what's called "Bayes' Theorem".

Don't let the name intimidate you.  You DO NOT have to be a rocket scientist to understand Bayes' Theorem.  It seems a little hard at first, but once you get the knack of it, you can do a Bayesian computation standing on your head.

It works like this:  In order to do a Bayesian computation, you first need to know the sensitivity and specificity of the test, and the actual prevalence of the condition that you're testing for.
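
If it helps, the whole computation fits in a few lines of Python.  This is just a sketch (the function name is mine), but every worked example below is exactly this recipe with different numbers plugged in:

    def bayes_screen(prevalence, sensitivity, specificity, population=100_000):
        """Work out PPV, NPV, and overall accuracy for any yes-or-no test."""
        has_condition = population * prevalence      # people who actually have the condition
        no_condition = population - has_condition    # people who don't

        tp = has_condition * sensitivity             # true positives
        fn = has_condition - tp                      # false negatives
        tn = no_condition * specificity              # true negatives
        fp = no_condition - tn                       # false positives

        ppv = tp / (tp + fp) if (tp + fp) else 0.0   # accuracy of a positive result
        npv = tn / (tn + fn) if (tn + fn) else 0.0   # accuracy of a negative result
        accuracy = (tp + tn) / population            # overall accuracy
        return ppv, npv, accuracy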

Let's say, for example, that you are the president of a company that has 100,000 employees, and 15% of them (15,000) smoke marijuana.  Your accountant has told you that labor costs are eating up your profits, and you need to reduce your workforce by 15%.  This gives you the brilliant idea to drug-test your entire staff and fire anyone who tests positive for marijuana.

Starting with the actual prevalence of 15%, let's test them all with a test that has sensitivity of 98% and specificity of 99%.

First, we break the group (100,000) into stoners (15,000) and straights (85,000).

85,000 straights X 99% specificity = 84,150 true negatives, leaving 850 false positives.

15,000 stoners X 98% sensitivity = 14,700 true positives, leaving 300 false negatives.

850 false positives + 14,700 true positives = 15,550 positive results.

14,700 true positives / 15,550 total positives = 94.5% PPV

300 false negatives + 84,150 true negatives = 84,450 negative results
84,150 true negatives / 84,450 total negatives = 99.6% NPV

84,150 true negatives + 14,700 true positives = 98,850 accurate results
98,850 / 100,000 tests = 98.85% overall accuracy.

As you can see in this example, PPV, NPV, and overall accuracy are impressively high (94.5%, 99.6%, and 98.85% respectively).
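
If you'd rather let Python do the multiplication, the first example looks like this (same numbers as above, nothing new):

    stoners, straights = 15_000, 85_000

    tp = stoners * 0.98           # 14,700 true positives
    fn = stoners - tp             # 300 false negatives
    tn = straights * 0.99         # 84,150 true negatives
    fp = straights - tn           # 850 false positives

    print(tp / (tp + fp))         # PPV              ~ 0.945
    print(tn / (tn + fn))         # NPV              ~ 0.996
    print((tp + tn) / 100_000)    # overall accuracy ~ 0.9885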

However, one aspect of Bayesian mathematics is that PPV, NPV, and overall accuracy are affected by actual prevalence.  If actual prevalence is low, the PPV of the test drops considerably.

Now let's assume that, rather than 15% of the employees being stoners, only 0.4% of them use marijuana.

Again, we first break it down:  Stoners = 400 (0.4%), straights = 99,600 (99.6%).  Total group size is again 100,000 of course.

Using the exact same test, with the same sensitivity and specificity:

99,600 straights X 99% specificity = 98,604 true negatives, leaving 996 false positives

400 stoners X 98% sensitivity = 392 true positives, leaving 8 false negatives.

996 false positives + 392 true positives = 1,388 positive results
392 true positives / 1,388 total positives = 28.2% PPV.

8 false negatives + 98,604 true negatives = 98,612 negative results
98,604 true negatives / 98,612 total negatives = 99.9919% NPV.
Note that in this example, there are roughly two and a half times as many false positives as true positives.  If you are one of the 1,388 people who test positive, there is a 71.8% probability that you DO NOT smoke marijuana.

Now, watch this:
98,604 true negatives + 392 true positives = 98,996 accurate results
98,996 / 100,000 tests = 98.996% overall accuracy.

Did you see that?  Even though a positive result was far less accurate in the second example, the overall accuracy of the test actually improved from 98.85% to 98.996%.  The less accurate the positive results are, the more accurate the test tends to be overall, because the overall accuracy is driven far more by the NPV than by the PPV.  In the second example, someone who tested positive was roughly two and a half times more likely to NOT be a marijuana user than to be a stoner, and yet the test was more accurate overall.
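
Or, in Python, using the totals we just worked out:

    ppv      = 392 / (392 + 996)          # ~ 0.282  -- the accuracy of a positive result
    npv      = 98_604 / (98_604 + 8)      # ~ 0.9999
    accuracy = (392 + 98_604) / 100_000   # ~ 0.98996 -- higher than in the first example
    print(ppv, npv, accuracy)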

Suddenly, >99% accuracy doesn't sound all that impressive, does it?

What if no one in the group actually used marijuana?  What if, in fact, marijuana didn't exist, and we tested them all using the exact same test?

Again, we break the group (100,000) into stoners (0) and straights (100,000).

100,000 straights X 99% specificity = 99,000 true negatives, leaving 1,000 false positives.

0 stoners X 98% sensitivity = 0 true positives, leaving 0 false negatives.

1,000 false positives + 0 true positives = 1,000 positive results.
0 true positives / 1,000 total positives = 0% PPV

0 false negatives + 99,000 true negatives = 99,000 negative results
99,000 true negatives / 99,000 total negatives = 100% NPV

99,000 true negatives + 0 true positives = 99,000 accurate results
99,000 / 100,000 tests = 99% overall accuracy.

Again, the PPV was even lower (0% this time, indicating that if you test positive, there's zero chance that it's an accurate result), but the overall accuracy was even higher this time!

And again, this is because the overall accuracy is made up of the high percentage (99%) of accurate negative tests.

So as actual prevalence goes down, PPV plummets, even though overall accuracy actually improves.

Thus, in the third scenario, the test is 99% accurate, but if you test positive, there's zero chance that it's an accurate result.
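
If you want to watch the PPV collapse while the "accuracy" creeps upward, here's a little Python loop (my own illustration, using the same 98%-sensitive, 99%-specific test) that sweeps the prevalence from 15% down to zero:

    sensitivity, specificity = 0.98, 0.99

    for prevalence in (0.15, 0.05, 0.01, 0.004, 0.0):
        tp = prevalence * sensitivity                # true-positive rate per person tested
        fp = (1 - prevalence) * (1 - specificity)    # false-positive rate
        tn = (1 - prevalence) * specificity          # true-negative rate
        ppv = tp / (tp + fp) if (tp + fp) else 0.0
        accuracy = tp + tn
        print(f"prevalence {prevalence:6.1%}   PPV {ppv:6.1%}   overall accuracy {accuracy:6.2%}")

    # The PPV falls from ~94.5% all the way to 0%, while the overall accuracy
    # climbs from ~98.85% to 99%.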

HIV seroprevalence in the US is estimated at 0.4%.  Worldwide, it is estimated at 0.5%.  This is comparable to our second example, where prevalence of marijuana use was 0.4%.

Now, over the history of the AIDS epidemic, both US seroprevalence and global seroprevalence have been consistently overestimated.  As a result, the estimates have been revised downwards over the years, resulting in a rather strange phenomenon:  Since the turn of the millennium, the size of the global HIV pandemic has increased from 42 million infected people to 33 million.

Allow me to repeat that:  Since 2000, the number of HIV positives in the world has increased from 42 million to 33 million.

No, that's not a mistake.  That's HIV statistics for you.  It's amazing what happens when global health authorities make up the numbers as they go along, and they always, always, ALWAYS overestimate, because bigger numbers mean more funding.  Then, when the actual data on the ground proves conclusively that their numbers are too high, they are forced to revise them downwards over and over again, so that even though they report increases every single year, the actual numbers always "increase" to lower and lower figures.

Long story short, if you think that actual HIV prevalence is anywhere near the estimates (0.4% US, 0.5% global), there's a bridge in Alaska I'd like to sell you, and you look like just the sucker... er, smart investor... to take advantage of ...ummmm... this golden investment opportunity.

Which means that, according to Bayesian mathematics, even if the prevalence estimates were correct and the specificity were high, the PPV of HIV tests would be dismally low while the overall accuracy remained high.  And since the actual prevalence is almost certainly even lower than the estimates, the PPV of HIV tests would be lower still (and the overall accuracy even higher).

Which means, ultimately, that if a randomly-selected person -- you, for example -- tested positive on an HIV test, it is quite likely that you don't actually have HIV, even if the test is highly accurate.

Now, what if HIV doesn't exist in the first place?

Once again, I invite you to join me in my little fantasy world where HIV doesn't exist.  I'm not asking you to believe it, I'm merely asking you to imagine it, so that we can together explore what such a world would look like.

If HIV didn't exist, and we tested people using an HIV test with a specificity of 99.6%, the test would be 99.6% accurate because 99.6% of all people tested would get a true negative test.

The PPV of the test would be zero, indicating that if you test positive, there is zero chance that you actually have HIV, but if you tested positive, your doctor could tell you truthfully that the test is 99.6% accurate, and you'd leave his office 99.6% sure that you actually have this nonexistent virus.

And since 0.4% of all people tested would test positive (all of them, of course, being false positives), this would tend to create the illusion of an HIV seroprevalence of 0.4%.

In other words, the HIV numbers in my fantasy world would look more or less exactly like they do in the real world, with 0.4% seroprevalence established by HIV tests that are 99.6% accurate.
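
And the fantasy-world numbers one last time, in Python (prevalence of zero, specificity of 99.6%; sensitivity doesn't matter here, because there's nobody infected for the test to detect):

    population  = 100_000
    specificity = 0.996                          # the figure used above
    infected    = 0                              # the fantasy world: HIV does not exist

    clean           = population - infected
    true_negatives  = clean * specificity        # 99,600 people
    false_positives = clean - true_negatives     # ~400 people test positive anyway

    print(f"apparent seroprevalence: {false_positives / population:.1%}")   # 0.4%
    print(f"overall accuracy:        {true_negatives / population:.1%}")    # 99.6%
    # PPV = 0 / 400 = 0%: not one of those positive results can possibly be correct.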

It's enough to make you wonder if we aren't actually living in my fantasy world where HIV does not exist... 
