Autism: The Truth Plus Sensitivity, Specificity and All That Is Decent to Reveal About Predictive Values
The Times is tremendously pleased with itself: Autism: the truth. In rather a classy way, they manage to include all of the flaws in the Observer's recent coverage of a leaked, unpublished report from the Autism Research Centre; they do all of this while refraining from criticism of either its rival or the benighted journalist responsible for the story. Nonetheless, they land some telling blows:
Baron-Cohen says the news story is alarmist and wrong. He does not believe that MMR has anything to do with autism. “We are gobsmacked, really, at how this draft report has got out,” Baron-Cohen says. “It was only in the hands of the authors – about half a dozen people. There are three professors listed, including me, and none of us was contacted. It was also seen by two PhD students for whom I have the utmost respect because they are very careful scientists...”

Autism Diva and Kevin Leitch have both expressed their thoughts on this article.
The draft report was leaked a week ahead of [the] GMC appearance [of Dr Andrew Wakefield and Profs Walker-Smith and Murch]. Baron-Cohen puts it like this: “We think it [the report] has been used. They’ve picked out the one figure that looks most alarmist.”
New health fears over big surge in autism was the original source for the 1 in 58 figure. (Edited July 24: the original story has since been removed from the Observer archive.)
I did note that:
It is possible that the one-in-58 figure comes from ARC’s use of the Childhood Asperger’s Syndrome Test (CAST), a questionnaire that parents can use to assess whether their child may have autism. The ARC team has used it on Cambridgeshire children in mainstream schools. However, it does not provide a diagnosis and is known to result in a high number of false positives. Around half the children flagged up by CAST as possibly having autism turn out not to.

It is obvious that a test that flags that number of false positives is of limited utility for specific purposes such as diagnosis but may (possibly) have some value if you are trawling for children who might benefit from follow-up screening and more sophisticated evaluation techniques.
Some time later today, I may expand this post to include an explanation of testing and the importance of two (or maybe more) of sensitivity, specificity, the positive predictive value and the negative predictive value. I will only do this if nobody else comes up with a better explanation (virtual chocolate biscuits are on offer; Jaffa Cakes are not out of the question). For anyone who wants an explanation without my dubious intervention, the BMJ offers Understanding sensitivity and specificity with the right side of the brain. Beware: there are several errors in the text and some misleading sub-editing; however, some people find the diagrams helpful (less so if you are red-green colour-blind).
I should state upfront that a diagnosis of autism involves input from many people and several specialist assessments; I'm not aware of anybody who would suggest that a questionnaire might be the primary contribution to the diagnosis of a complex condition. Typically, there are parental, school and specialist assessments, and the behaviours should be present in more than one environment (not just home or school), alongside a slew of other factors. From the little that we know of the ARC report, it seems as if CAST is one of six assessment methods that were used. If the false positive rate is reported correctly, then that would be in line with other estimates of around 1 in 116 for children on the autistic spectrum (although this crude arithmetic overlooks the possibility of false negatives).
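For what it's worth, the crude arithmetic is easy to lay out. The sketch below just restates the reported figures (1 in 58 flagged, roughly half of those false positives); it is an illustration, not anything taken from the ARC report itself.

```python
# Rough arithmetic behind the "1 in 58" and "1 in 116" figures.
# Assumes, as reported, that roughly half of the children flagged by
# CAST turn out to be false positives, and ignores false negatives.
flagged_rate = 1 / 58          # proportion of children flagged by CAST
false_positive_fraction = 0.5  # reported fraction of flagged children who are not on the spectrum

true_positive_rate = flagged_rate * (1 - false_positive_fraction)
print(f"Implied prevalence: about 1 in {1 / true_positive_rate:.0f}")
# Implied prevalence: about 1 in 116
```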
The word spectrum is also a flag that the researchers were not looking at a simple diagnostic test where the results of an investigation can classify the children into two clear-cut groups on the basis of the presence or absence of symptoms, test scores or behaviours.
True positive is someone who has ASD and tests positive by a test or instrument.
True negative is someone who is non-ASD (for the purposes of this example) and tests negative by a test or instrument.
False positive is someone who tests as positive but who is actually negative.
False negative is someone who tests as negative but who is actually positive.
Sensitivity is the proportion of true positives that are correctly identified by a test or instrument.
Specificity is the proportion of true negatives that are correctly identified by the test or instrument.
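To put some numbers on those definitions, here is a minimal sketch; the counts are invented purely for illustration and have nothing to do with CAST or the ARC report.

```python
# Hypothetical counts for a screening instrument - invented purely to
# illustrate the definitions above, not taken from CAST or the ARC report.
true_positives = 45    # have ASD, screen positive
false_negatives = 5    # have ASD, screen negative
true_negatives = 900   # non-ASD, screen negative
false_positives = 50   # non-ASD, screen positive

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(f"Sensitivity: {sensitivity:.2f}")  # 0.90 - proportion of ASD children correctly flagged
print(f"Specificity: {specificity:.2f}")  # 0.95 - proportion of non-ASD children correctly cleared
```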
Without knowing anything more about the test, its results and the specific population on which it was used, there are several possibilities. The test may seem over-sensitive, flagging up too many children as needing further, more sophisticated testing to confirm or rule out a possible diagnosis of ASD. Conversely, although it flags up many children who are later determined to be false positives, it may nonetheless have a very high specificity and very few false negatives, which would make it a very good first-pass instrument for ruling children who screen negative out of further consideration.
Sensitivity and specificity are both proportions, so it is practical for researchers to calculate confidence intervals for them that quantify the uncertainty in the estimates. For any test or instrument, however, that is rarely sufficient; in order for a result to be truly useful, the researchers need to know how accurate the test/instrument is at predicting a true abnormality. What is the predictive value of the test?
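As an illustration of the first point, here is a minimal sketch of a 95% confidence interval for one such proportion, using the Wilson score interval and the same invented counts as above.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion
    such as a sensitivity or a specificity estimate."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    halfwidth = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - halfwidth, centre + halfwidth

# The same invented counts as above: a sensitivity of 45/50.
low, high = wilson_interval(45, 50)
print(f"Sensitivity 0.90, 95% CI roughly {low:.2f} to {high:.2f}")
# Sensitivity 0.90, 95% CI roughly 0.79 to 0.96
```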
Positive predictive value is the proportion of children with positive test/instrument results who are correctly identified as ASD.
Negative predictive value is the proportion of children with negative test results who are correctly identified as non-ASD.
The predictive values of a test/instrument vary according to the population on which it is run; they are not universal. In particular, the predictive value depends on how likely the tested condition is to be present in that population. So, in a cohort of children with identified but not otherwise classified emotional and behavioural disorders, one might expect a high proportion of positive test results. In a mainstream school, however, one would expect that proportion to be lower; and the rarer a positive result is, the more certain a researcher can be that a negative test is a true negative, and the less sure a researcher can be that a positive result really indicates a true positive.
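A minimal sketch of that dependence is below; the sensitivity, specificity and prevalences are invented for illustration and are not taken from the CAST literature.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values of a test at a given prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# The same (invented) instrument run in two very different populations.
for label, prevalence in [("specialist clinic", 0.30), ("mainstream school", 0.01)]:
    ppv, npv = predictive_values(sensitivity=0.95, specificity=0.95, prevalence=prevalence)
    print(f"{label}: PPV {ppv:.2f}, NPV {npv:.3f}")
# specialist clinic: PPV 0.89, NPV 0.978
# mainstream school: PPV 0.16, NPV 0.999
```

Even with high sensitivity and specificity, in the low-prevalence (mainstream school) population most of the positives are false positives, while a negative result is very reassuring; that is essentially the pattern described above.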
If the facts are as reported in the Times, then it seems as if the CAST was run in a mainstream school, in which case one might have some confidence that the negatives are true negatives, but less confidence that the positives are true positives. Prof. Baron-Cohen seems to indicate that this may be the case.
I may return to the topic of test/instrument results in a separate post. I will only return to the topic when I have fortified myself with the chocolate biscuits that were on offer to anyone who was willing to offer an earlier explanation.
Orac has, of course, posted an excellent discussion of test sensitivity and its impact on diagnosis and assessing whether or not a finding is clinically relevant. Early detection of cancer, part 1: More complex than you think
Superb discussion of screening, its role in early diagnosis and its distortion of results that report efficacy of treatment. Early detection of cancer, part 2: Breast cancer and the never-ending confusion over screening
HIV testing example deals nicely with the changing value of a test, depending on whether you are running it in a general population or a population where you expect the result to be positive or negative. Sensitivity, Specificity and Positive Predictive Value
This example of an infection in different clinical populations is also a good demonstration of how sensitivity and specificity can change and of their impact on predictive value.