Thursday, May 17, 2007

crimes and misdestimation

There's a story on an NYT blog today about a woman in the Netherlands who may be serving a life sentence because of an error in statistical reasoning. The case is also an example of sampling on the dependent variable.

Although the physicist explaining the error makes an error of his own:
"[S]uppose that police pick up a suspect and match his or her DNA to evidence collected at a crime scene. Suppose that the likelihood of a match, purely by chance, is only 1 in 10,000. Is this also the chance that they are innocent? It’s easy to make this leap, but you shouldn’t.

"Here’s why. Suppose the city in which the person lives has 500,000 adult inhabitants. Given the 1 in 10,000 likelihood of a random DNA match, you’d expect that about 50 people in the city would have DNA that also matches the sample. So the suspect is only 1 of 50 people who could have been at the crime scene. Based on the DNA evidence only, the person is almost certainly innocent, not certainly guilty."
The error is this: if the police had picked the person up completely at random and then found that their DNA matched the sample from the crime scene, with a 1 in 10,000 chance of a coincidental match, then, yes, the person is most likely innocent. But police tend to pick up suspects for nonrandom reasons, and the more closely the nonrandom reason is related to the actual probability that the person is the culprit, the less relevant the 1 in 50 calculation is and the more relevant the 1 in 10,000 probability is. Because there isn't a neat way of synthesizing this into a new probability estimate, people jump from one bad way of reasoning about the problem to another.
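
To make the dependence on how the suspect was chosen concrete, here is a rough sketch using Bayes' rule. It reuses the quote's numbers (500,000 adults, a 1 in 10,000 chance match) and adds assumptions that are not in the original story: that the true culprit always matches, that exactly one adult in the city is guilty, and that the non-random priors (1 in 100 and 1 in 2) are invented purely for illustration. The function name `posterior_guilt` is made up as well. The prior is exactly the part nobody can pin down neatly, so read this as a picture of the sensitivity, not as an estimate.

```python
def posterior_guilt(prior, match_if_innocent, match_if_guilty=1.0):
    """P(guilty | DNA match) by Bayes' rule.

    prior             -- P(guilty) before the DNA evidence (an assumption)
    match_if_innocent -- chance of a coincidental match
    match_if_guilty   -- chance the true culprit matches (assumed to be 1)
    """
    guilty = prior * match_if_guilty
    innocent = (1.0 - prior) * match_if_innocent
    return guilty / (guilty + innocent)


POPULATION = 500_000       # adult inhabitants, from the quoted example
RANDOM_MATCH = 1 / 10_000  # chance of a coincidental match, from the quote

# The priors below are hypothetical, chosen only to show the spread.
scenarios = [
    ("picked completely at random", 1 / POPULATION),
    ("weak non-DNA reason to suspect (assumed 1 in 100)", 1 / 100),
    ("strong non-DNA reason to suspect (assumed 1 in 2)", 1 / 2),
]

for label, prior in scenarios:
    p = posterior_guilt(prior, RANDOM_MATCH)
    print(f"{label}: P(guilty | match) = {p:.4f}")
```

With the random-pick prior the result is roughly 2%, which is the physicist's "1 of 50" reasoning (strictly about 1 in 51, once the culprit is counted alongside the 50 or so chance matches). As the prior moves away from "picked at random," the answer climbs quickly toward the figure the 1 in 10,000 number suggests.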
