
How Sensitivity and Specificity Affect Predictive Values

Presumably you're here because you looked at what happened to the predictive values as the sensitivity of the MDQ changed, and what you saw was not what you expected.  Let me walk you through some calculations I did while trying to convince myself I hadn't made a math error.  This is primarily a set of notes in case I ever need to explain this logic to someone -- e.g. to check whether it is correct!

Let's take a hypothetical lab test and vary the sensitivity and specificity to extremes, just to see what's going on.  Remember for the MDQ we saw this pair of graphs for two different studies: 

[Graphs: 1. MDQ Hirschfeld 2000 (sens 0.73, spec 0.90)     2. MDQ Hirschfeld 2003 (sens 0.28, spec 0.97)]

Imagine what you'd see if you did the same analysis for two extreme cases: sensitivity 0.5 with specificity 1.0, and sensitivity 1.0 with specificity 0.5.

Wouldn't you expect to see mirror image effects on positive and negative predictive value?  That's what I expected.  But that's not what happens: 

[Graphs: Sensitivity 0.5, Specificity 1.0     |     Sensitivity 1.0, Specificity 0.5]

Why does negative predictive value seem to be so minimally affected by cutting sensitivity in half?  Wasn't "sensitivity" supposed to govern the false negative rate?  Wouldn't that strongly affect negative predictive value?

And conversely, why does positive predictive value go down so dramatically when specificity is cut in half?  Why is it affected so much more than negative predictive value?

To answer these questions, I found it useful to look at the 2x2 tables from which the graphs were generated.  Holding prevalence at a middle value of 0.3, compare what happens to the predictive values as the sensitivity and specificity are radically varied.  Scroll down so that you can see examples A and B below on the same screen.  If you need a refresher on a/(a+c) versus d/(c+d) and all that, here's a review of those relationships.
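If you'd rather see that arithmetic as code, here is a minimal Python sketch of how each table below can be built from sensitivity, specificity, and prevalence.  The function name and the 1,000-patient population are my own assumptions for illustration, not figures from the MDQ studies.

    def make_2x2(sensitivity, specificity, prevalence, n=1000):
        """Build a 2x2 table for a hypothetical population of n patients and
        return the four cell counts plus the predictive values."""
        diseased = prevalence * n            # gold standard positive
        healthy = n - diseased               # gold standard negative

        tp = sensitivity * diseased          # true positives   (cell a)
        fn = diseased - tp                   # false negatives  (cell c)
        tn = specificity * healthy           # true negatives   (cell d)
        fp = healthy - tn                    # false positives  (cell b)

        ppv = tp / (tp + fp)                 # a / (a + b)
        npv = tn / (tn + fn)                 # d / (c + d)
        return tp, fp, fn, tn, ppv, npv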

A. Sensitivity 0.5, Specificity 1.0

                    gold standard positive    gold standard negative    predictive value
test positive                150                         0             PPV = 1.00
test negative                150                       700             NPV = 0.82
sum                          300                       700

B. Sensitivity 1.0, Specificity 0.5

                    gold standard positive    gold standard negative    predictive value
test positive                300                       350             PPV = 0.46
test negative                  0                       350             NPV = 1.00
sum                          300                       700
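Using the make_2x2 sketch above, both examples can be reproduced directly (again, the 1,000-patient population is just an assumption chosen so the counts come out as whole numbers):

    # Example A: sensitivity 0.5, specificity 1.0, prevalence 0.3
    print(make_2x2(0.5, 1.0, 0.3))   # (150.0, 0.0, 150.0, 700.0, 1.0, 0.82...)

    # Example B: sensitivity 1.0, specificity 0.5, prevalence 0.3
    print(make_2x2(1.0, 0.5, 0.3))   # (300.0, 350.0, 0.0, 350.0, 0.46..., 1.0)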

Just to make sure we're on the same page: the question we're trying to answer here is why positive predictive value is so much more dramatically affected (falling to 0.46 in B) than negative predictive value (falling only to 0.82 in A).

You've probably already spotted the big difference, though it is hard to explain in words.  In case you can't just "see" the answer to our question here, I'll try to help: although the prevalence is the same in each example (see that? -- in the "sum" row?), the way the patients get split up among the four cells is very different in A versus B.  I'll let you puzzle it out from there, to see why the graphs end up looking like this (it took me a while):

[Graphs: Sensitivity 0.5, Specificity 1.0     |     Sensitivity 1.0, Specificity 0.5]
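If the tables alone don't do it for you, the same comparison can be written out with Bayes' rule.  The variable names below are mine, and the numbers are just example A's values at a prevalence of 0.3:

    sens, spec, prev = 0.5, 1.0, 0.3     # swap in 1.0, 0.5 for example B

    # In PPV, the false-positive term is scaled by the healthy group (1 - prev = 0.7);
    # in NPV, the false-negative term is scaled by the smaller diseased group (prev = 0.3).
    ppv = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    npv = (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)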

At least by doing this I was able to satisfy myself that I'd not made some sort of dumb* error.  (If you think I have, let me know.)  Instead, it looks like:

Had enough? 

 

*(At one point I thought: ah, you dummy, it's a linear mirror relationship, you're just looking at one side of the graph, from 0.1 to 0.5.  But it's not linear, nor is it quite a mirror image: here's the rest of that curve.  One of the reasons this all looks so odd, above, is that we don't do tests for illnesses that have a prevalence near 1.0 -- i.e. we wouldn't have a reason to look at that end of this relationship.

Some math type, who's having a good laugh right now, can tell me later why it's not linear or mirrored.)
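For anyone who wants to see the rest of that curve numerically, here is a rough sketch that reuses the make_2x2 function above and sweeps prevalence across its whole range (the grid of prevalence values is an arbitrary choice of mine):

    for prev in [0.1, 0.3, 0.5, 0.7, 0.9]:
        *_, ppv_a, npv_a = make_2x2(0.5, 1.0, prev)   # example A: sens 0.5, spec 1.0
        *_, ppv_b, npv_b = make_2x2(1.0, 0.5, prev)   # example B: sens 1.0, spec 0.5
        print(f"prevalence {prev:.1f}:  A ppv={ppv_a:.2f} npv={npv_a:.2f}   "
              f"B ppv={ppv_b:.2f} npv={npv_b:.2f}")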