by Gos Blank

...BUT there's a catch: If you test positive, "accuracy" is a meaningless term. No competent and responsible medical professional should **EVER** use the term "accuracy" to describe a positive HIV test result. (And yet, 99% of HIV testing professionals do this regularly, suggesting that they are neither competent nor responsible.)

Here's why:

If you tested a large group of randomly-selected Americans for HIV, more than 99% of them would test negative.

Now, just for the sake of argument, let's say, hypothetically speaking, that there were no such thing as HIV. In such a scenario, **all** positive results would be false-positives, meaning that a positive test would have ZERO accuracy, since there would be no virus for you to be "positive" for in the first place. And yet, the **overall** accuracy of the test would still be >99%, because >99% of the test results would be true-negatives.

Thus, it would be mathematically impossible for an HIV test to be less than 99% accurate, **even if there were no such thing as HIV**.

Now, imagine if you lived in my little fantasy world where HIV does not exist, and you tested positive, and your doctor told you that the test is more than 99% accurate. Wouldn't you tend to assume that this meant that your positive test was >99% accurate, indicating a probability of >99% that you actually have HIV?

Of course you would. You wouldn't think to ask how much of that 99% is made up of people who test negative for this nonexistent virus; you'd simply assume that your doctor knows what he's talking about, and you'd leave his office >99% sure that you were infected.

The lesson to be learned here is that there is a HUGE difference between the accuracy of a positive test and the accuracy of a negative test, and that neither should **EVER** be confused with **overall** accuracy.

And yet, (in)competent and (ir)responsible medical professionals make this exact mistake **every single day**.

This is why, when **competent and responsible** medical professionals want to discuss the accuracy of a given test, they never refer to "accuracy". Instead, they use the terms "**sensitivity**", "**specificity**", "**Negative Predictive Value (NPV)**", and "**Positive Predictive Value (PPV)**".

If you should test positive, the most important of these terms are **specificity** and **PPV**. Here are the definitions:

**PPV** - The accuracy (or predictive value) of a positive test. Example: If an HIV test has a PPV of 75% and you test positive, that means that there's a 75% chance that you actually have HIV.

**NPV** - The accuracy of a negative test. Example: If an HIV test has an NPV of 75% and you test negative, that means that there's a 75% chance that you don't have HIV.

**Specificity** - The ability of the test (expressed as a percentage) to correctly identify a **negative** condition. Example: If you tested 1,000 people who **don't** have HIV, using a test that has a specificity of 99%, then 990 of them (99%) would get a **true-negative** result, leaving 10 (1%) who would get a **false-positive** result.

**Sensitivity** - The ability of the test to correctly identify a **positive** condition. Example: If you tested 1,000 people who **do** have HIV, using a test with 99% sensitivity, then 990 of them (99%) would get a **true-positive** result, leaving 10 (1%) who would get a **false-negative** result.

In order to determine the PPV, NPV, and overall accuracy of a given test (not just HIV tests, but drug tests, pregnancy tests, or any sort of test that returns a yes-or-no result), we use what's called "Bayes' Theorem".

Don't let the name intimidate you. You DO NOT have to be a rocket scientist to understand Bayes' Theorem. It seems a little hard at first, but once you get the knack of it, you can do a Bayesian computation standing on your head.

It works like this: In order to do a Bayesian computation, you first need to know the sensitivity and specificity of the test, and the actual prevalence of the condition that you're testing for.
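For readers who prefer a formula, the computation described above can be written compactly. This is only a sketch, in notation that is not part of the original post: $p$ stands for the actual prevalence, $\text{sens}$ for sensitivity, and $\text{spec}$ for specificity.

$$
\mathrm{PPV} = \frac{\text{sens}\cdot p}{\text{sens}\cdot p + (1-\text{spec})(1-p)},
\qquad
\mathrm{NPV} = \frac{\text{spec}\,(1-p)}{\text{spec}\,(1-p) + (1-\text{sens})\,p}
$$

$$
\text{overall accuracy} = \text{sens}\cdot p + \text{spec}\,(1-p)
$$

The numerator of the PPV is the share of everyone tested who are true positives; the denominator is the share who test positive at all, true or false.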

Let's say, for example, that you are the president of a company that has 100,000 employees, and 15% of them (15,000) smoke marijuana. Your accountant has told you that labor costs are eating up your profits, and you need to reduce your workforce by 15%. This gives you the brilliant idea to drug-test your entire staff and fire anyone who tests positive for marijuana.

Starting with the actual prevalence of 15%, let's test them all with a test that has sensitivity of 98% and specificity of 99%.

First, we break the group (100,000) into stoners (15,000) and straights (85,000).

85,000 straights X 99% specificity = **84,150 true negatives**, leaving **850 false positives**.

15,000 stoners X 98% sensitivity = **14,700 true positives**, leaving **300 false negatives**.

850 false positives + 14,700 true positives = 15,550 positive results.

14,700 true positives / 15,550 total positives = **94.5% PPV**

300 false negatives + 84,150 true negatives = 84,450 negative results

84,150 true negatives / 84,450 total negatives = **99.6% NPV**

84,150 true negatives + 14,700 true positives = **98,850 accurate results**

98,850 / 100,000 tests = **98.85% overall accuracy.**

As you can see in this example, PPV, NPV, and overall accuracy are impressively high (94.5%, 99.6%, and 98.85% respectively).
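The same arithmetic can be checked in a few lines of Python. This is a minimal sketch, not anything from the original post: the function names `confusion_counts` and `summarize` are made up for illustration, and the inputs are the 100,000 employees, 15% prevalence, 98% sensitivity, and 99% specificity used above.

```python
def confusion_counts(population, prevalence, sensitivity, specificity):
    """Split a tested population into the four possible outcomes."""
    with_condition = population * prevalence        # e.g. the stoners
    without_condition = population - with_condition  # e.g. the straights
    tp = with_condition * sensitivity      # true positives
    fn = with_condition - tp               # false negatives
    tn = without_condition * specificity   # true negatives
    fp = without_condition - tn            # false positives
    return tp, fp, tn, fn


def summarize(tp, fp, tn, fn):
    """Return (PPV, NPV, overall accuracy) from the four counts."""
    ppv = tp / (tp + fp)                        # accuracy of a positive result
    npv = tn / (tn + fn)                        # accuracy of a negative result
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all results that are correct
    return ppv, npv, accuracy


tp, fp, tn, fn = confusion_counts(100_000, 0.15, 0.98, 0.99)
ppv, npv, accuracy = summarize(tp, fp, tn, fn)
print(f"TP={tp:.0f} FP={fp:.0f} TN={tn:.0f} FN={fn:.0f}")
print(f"PPV={ppv:.1%} NPV={npv:.1%} accuracy={accuracy:.2%}")
```

Changing only the `prevalence` argument is enough to rerun the computation for a different group of people with the same test.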

However, one aspect of Bayesian mathematics is that PPV, NPV, and overall accuracy are affected by actual prevalence. If actual prevalence is low, the PPV of the test drops considerably.

Suppose, for example, that rather than 15% of the employees being stoners as in the above example, only 0.4% of them use marijuana.

Again, we first break it down: Stoners = 400 (0.4%), straights = 99,600 (99.6%). Total group size is again 100,000 of course.

Using the exact same test, with the same sensitivity and specificity:

99,600 straights X 99% specificity = **98,604 true negatives**, leaving **996 false positives**.

400 stoners X 98% sensitivity = **392 true positives**, leaving **8 false negatives**.

996 false positives + 392 true positives = 1,388 positive results

392 true positives / 1,388 total positives = **28.2% PPV**.

8 false negatives + 98,604 true negatives = 98,612 negative results

98,604 true negatives / 98,612 total negatives = **99.9919% NPV**.

**Note that in this example, there are about two and a half times as many false positives as true positives. If you are one of the 1,388 people who test positive, there is a 71.8% probability that you DO NOT smoke marijuana.**
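The PPV does not actually depend on the size of the group, only on the three rates. As a rough sketch (the function name `positive_predictive_value` is an illustrative assumption, not something from the original post), Bayes' Theorem can be applied directly to the 0.4% prevalence, 98% sensitivity, and 99% specificity from this example:

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(has the condition | tested positive), by Bayes' Theorem."""
    # Share of everyone tested who have the condition AND test positive.
    p_true_positive = prevalence * sensitivity
    # Share of everyone tested who lack the condition BUT test positive.
    p_false_positive = (1 - prevalence) * (1 - specificity)
    return p_true_positive / (p_true_positive + p_false_positive)


ppv_low = positive_predictive_value(0.004, 0.98, 0.99)
print(f"P(user | positive)     = {ppv_low:.1%}")      # 28.2%
print(f"P(non-user | positive) = {1 - ppv_low:.1%}")  # 71.8%
```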

Now, watch this:

98,604 true negatives + 392 true positives = **98,996 accurate results**

98,996 / 100,000 tests = **98.996% overall accuracy**.

Did you see that? Even though a positive result was **far** less accurate in the second example, the **overall** accuracy of the test actually **improved** from 98.85% to 98.996%. The less accurate a positive test is, the more accurate the test is overall, because most of the results are negative, so the overall accuracy is determined far more by NPV than by PPV. In the second example, someone who tested positive was about 2.5 times as likely to NOT be a marijuana user as to be a stoner, and yet the test was **more** accurate.

Suddenly, >99% accuracy doesn't sound all that impressive, does it?

What if **no one** in the group actually used marijuana? What if, in fact, marijuana didn't exist, and we tested them all using the exact same test?

Again, we break the group (100,000) into stoners (0) and straights (100,000).

100,000 straights X 99% specificity = **99,000 true negatives**, leaving **1,000 false positives**.

0 stoners X 98% sensitivity = **0 true positives**, leaving **0 false negatives**.

1,000 false positives + 0 true positives = 1,000 positive results.

0 true positives / 1,000 total positives = **0% PPV**

0 false negatives + 99,000 true negatives = 99,000 negative results

99,000 true negatives / 99,000 total negatives = **100% NPV**

99,000 true negatives + 0 true positives = **99,000 accurate results**

99,000 / 100,000 tests = **99% overall accuracy**.

Again, the PPV was even lower (0% this time, indicating that if you test positive, there's zero chance that it's an accurate result), but the overall accuracy was **even higher** this time!

And again, this is because the overall accuracy is made up of the high percentage (99%) of accurate **negative** tests.

So as actual prevalence goes down, PPV plummets, even though overall accuracy actually improves.
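To see the three scenarios side by side, here is a small sketch that sweeps the prevalence while holding the test fixed at 98% sensitivity and 99% specificity. The helper name `ppv_and_accuracy` is hypothetical, chosen only for this illustration.

```python
def ppv_and_accuracy(prevalence, sensitivity=0.98, specificity=0.99):
    """Return (PPV, overall accuracy) for a test at a given prevalence."""
    p_tp = prevalence * sensitivity              # true positives, as a share of all tests
    p_fp = (1 - prevalence) * (1 - specificity)  # false positives
    p_tn = (1 - prevalence) * specificity        # true negatives
    p_positive = p_tp + p_fp                     # everyone who tests positive
    # PPV is undefined if nobody tests positive at all.
    ppv = p_tp / p_positive if p_positive > 0 else float("nan")
    accuracy = p_tp + p_tn                       # all correct results
    return ppv, accuracy


for prevalence in (0.15, 0.004, 0.0):
    ppv, accuracy = ppv_and_accuracy(prevalence)
    print(f"prevalence={prevalence:.1%}  PPV={ppv:.1%}  accuracy={accuracy:.3%}")
```

Note that overall accuracy works out to prevalence X sensitivity + (1 - prevalence) X specificity, so as prevalence falls it moves toward the specificity. It rises in these examples because the specificity (99%) is higher than the sensitivity (98%).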

Thus, in the third scenario, the test is 99% accurate, but if you test positive, there's zero chance that it's an accurate result.

HIV seroprevalence in the US is estimated at 0.4%. Worldwide, it is estimated at 0.5%. This is comparable to our second example, where prevalence of marijuana use was 0.4%.

Now, over the history of the AIDS epidemic, both US seroprevalence and global seroprevalence have historically been overestimated. As a result, the estimates have been revised downwards over the years, resulting in a rather strange phenomenon: Since the turn of the millennium, the size of the global HIV pandemic has increased from 42 million infected people to 33 million.

Allow me to repeat that: Since 2000, the number of HIV positives in the world has **increased** from 42 million to 33 million.

No, that's not a mistake. That's HIV statistics for you. It's amazing what happens when global health authorities make up the numbers as they go along, and they always, *always,* overestimate, because bigger numbers mean more funding. Then, when the actual data on the ground proves conclusively that their numbers are too high, they are forced to revise them downwards over and over again, so that even though they report increases every single year, the actual numbers always "increase" to lower and lower figures. **ALWAYS**.

Long story short, if you think that actual HIV prevalence is anywhere **near** the estimates (0.4% US, 0.5% global), there's a bridge in Alaska I'd like to sell you, and you look like just the ~~sucker~~ **smart investor** to take advantage of ...ummmm... this golden investment opportunity.

Which means that, according to Bayesian mathematics, the PPV of HIV tests would be dismally low despite high overall accuracy, even if the specificity were high and even if the prevalence estimates were correct. But since the actual prevalence is almost certainly even lower, the PPV of HIV tests would be even lower too (and overall accuracy would be **even higher**).

Which means, ultimately, that if a randomly-selected person -- you, for example -- tested positive on an HIV test, that result would still indicate that you **probably don't** have HIV, **even if the test is highly accurate**.

Now, what if HIV doesn't exist in the first place?

Once again, I invite you to join me in my little fantasy world where HIV doesn't exist. I'm not asking you to **believe** it, I'm merely asking you to **imagine** it, so that we can together explore what such a world would look like.

If HIV didn't exist, and we tested people using an HIV test with a specificity of 99.6%, the test would be 99.6% accurate because 99.6% of all people tested would get a true negative test.

The PPV of the test would be zero, indicating that if you test positive, there is zero chance that you actually have HIV, but if you tested positive, your doctor could tell you truthfully that the test is 99.6% accurate, and you'd leave his office 99.6% sure that you actually have this nonexistent virus.

And since 0.4% of all people tested would test positive, (all of them, of course, being false positives), this would tend to create the illusion of the existence of an HIV seroprevalence of 0.4%.
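The arithmetic behind this thought experiment can be sketched as follows. The function name `apparent_prevalence` is an assumption made for illustration, and the 98% sensitivity is borrowed from the marijuana example; it has no effect when the true prevalence is zero.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Share of all tests that come back positive, true or false."""
    true_positives = true_prevalence * sensitivity
    false_positives = (1 - true_prevalence) * (1 - specificity)
    return true_positives + false_positives


# Hypothetical world: nobody has the condition, specificity is 99.6%.
observed = apparent_prevalence(0.0, sensitivity=0.98, specificity=0.996)
overall_accuracy = 1 - observed  # every negative is true, every positive is false
print(f"share testing positive = {observed:.1%}")          # 0.4%
print(f"overall accuracy       = {overall_accuracy:.1%}")  # 99.6%
```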

In other words, the HIV numbers in my fantasy world would look more or less **exactly** like they do in the real world, with 0.4% seroprevalence established by HIV tests that are 99.6% accurate.

It's enough to make you wonder if we aren't actually **living** in my fantasy world where HIV does not exist...
