What do many pieces of weak evidence add up to?
18 June 2000
A critic of the Warren Commission who believes that distrusting the
government is enough to invalidate its evidence and conclusions (see essay "Trust
is not enough")
sent me more than twenty examples of how the FBI had allegedly mishandled
evidence in the JFK case. He considered that list sufficient to allow him to
reject any and all evidence coming from the FBI. (Fallacious?) This is but one example of
writers on both sides of the Kennedy case marshalling long lists of evidence in
support of their causes. How should we view such aggregations?
At first glance, long lists of evidence appear impressive. But we must not be automatically swayed by
them, because the probative value of a suite of evidence (its overall strength) depends on much more
than numbers. Remembering that writers usually invoke long lists of
evidence when they lack one or two pieces of truly decisive evidence, we must
immediately become suspicious of such lists and go out of our way to evaluate
the individual and collective strengths of their pieces cautiously, skeptically, and critically. The longer the
list, the more we should be wary of it.
Fallacy?
The key to understanding the joint worth of multiple pieces
of evidence is whether you know their individual reliabilities.
In the case mentioned above, all the pieces of evidence were worse than
weak—they were irrelevant because the critic was fallaciously trying to use
them to show that because the FBI had mishandled evidence in the past, the
current pieces of evidence from the FBI that we were discussing should
automatically be considered invalid. It was easy to formulate a reply to
him—the worth of many pieces of irrelevant evidence is no more than the worth
of a single piece of irrelevant evidence—a large number multiplied by zero is
still zero.
Typical lists are different, however, in that their items are
relevant and weak. What is the joint worth of many such
pieces? A simple example is provided by Mark Lane near the beginning of
his classic book Rush to Judgment, where he describes the recollections
of six witnesses who
heard shots coming from the grassy knoll. Lane proclaims that when so many independent
observers agree, their common denominator must be the
truth. In other words, because these six witnesses all recall hearing shots from
the knoll, shots must have come from there. Lane has an argument of sorts, because observations of witnesses like these
six do possess a certain worth, especially when they all agree. Perhaps Lane had
some simple, intuitive mathematics in mind, such as: if the chance of each
witness being wrong was 50% (0.5), the chance that they were all independently
wrong was (0.5)^6, or 1/64, or 1.6%. In other words, there was
less than a 2% chance that they were all wrong. With more witnesses agreeing, this
reasoning gives a chance that decreases to exceedingly low numbers. For example,
if 20 witnesses had all heard shots from the knoll, this approach would
give a probability of their all being independently wrong of (0.5)^20,
or about one chance in a million (0.0001%). Because this number is vanishingly small,
the witnesses could not all have been wrong, and therefore shots must absolutely have come
from the knoll.
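The arithmetic behind this naive reasoning can be sketched in a few lines of Python. The function name is hypothetical, the 50% error rate is simply the figure assumed in the example above, and the independence of the witnesses is taken for granted rather than established.

```python
def chance_all_wrong(n_witnesses, p_wrong=0.5):
    """Chance that every one of n independent witnesses is wrong.

    Assumes each witness errs with the same probability p_wrong and that
    the errors are independent, which is exactly what the naive argument
    takes for granted.
    """
    return p_wrong ** n_witnesses


six = chance_all_wrong(6)      # (0.5)^6 = 1/64
twenty = chance_all_wrong(20)  # (0.5)^20, roughly one in a million

print(f"6 witnesses all wrong:  {six:.1%}")
print(f"20 witnesses all wrong: {twenty:.7f}")
```

The numbers shrink quickly, which is what makes the argument look persuasive. Everything depends on the assumed error rate, though, and nothing in the testimony itself supplies it.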
Let us examine this kind of argument both narrowly and
broadly. From the narrow perspective, at least two things are wrong with it. The
first, and most obvious, is that Lane chose his witnesses selectively. The
mathematician would say that Lane did not sample the population randomly,
for there were many witnesses who heard the opposite—shots
from the depository. Applying Lane's reasoning to
six of these witnesses means that their common denominator must be the truth
that shots came from the depository. This scenario of "dueling
witnesses" is obviously the product of a flawed selection process. So Lane's
conclusion must be rejected.
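The contradiction can be made concrete with a small sketch. The function below is a hypothetical rendering of the "common denominator" rule, not anything Lane wrote, and the two lists stand in for six hand-picked witnesses on each side.

```python
def common_denominator(reports):
    """Hypothetical version of the rule: if every selected witness
    reports the same source, declare that source to be 'the truth'."""
    claims = set(reports)
    return claims.pop() if len(claims) == 1 else None


# Two selectively chosen groups of six witnesses (illustrative data only).
knoll_group = ["knoll"] * 6
depository_group = ["depository"] * 6

truth_a = common_denominator(knoll_group)
truth_b = common_denominator(depository_group)

# The same rule, applied to two different selections, yields two
# incompatible "truths".
contradiction = truth_a is not None and truth_b is not None and truth_a != truth_b
print(truth_a, truth_b, contradiction)
```

Whichever group is selected, the rule returns that group's claim, so the result reflects the selection and not the event.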
The second thing wrong with Lane's approach and our
calculations, which are a rough guide to the mentality behind his approach, is
the matter of the reliability of each of the witnesses. We refer to
such testimonial evidence as "weak" because it cannot be validated in
and of itself. We contrast it with "strong" evidence, which can be
validated. A simple example may clarify the difference. The stretcher bullet, CE
399, may be considered to have been fired by the rifle found in the depository.
This claim can be checked from the bullet itself, by comparing its ballistic
markings with those of other bullets fired from that rifle. They match, and
validate the claim about CE 399. But when a witness from Dealey Plaza claims to
have heard shots from the grassy knoll, there is no way to check that statement
because there is no tangible, physical record of what the witness heard. Perhaps
we should replace "strong" and "weak" with
"testable" and "untestable." (Followers of Karl Popper would
go a step further and use "falsifiable" and "unfalsifiable"
instead.)
Consider further the terms "weak" and
"untestable." They really mean "unknowable," whatever
"knowing" means. They imply several things, the first being that from
testimony alone we cannot know whether that witness is right or wrong. Worse, we
cannot know how close the witness was to being right—we can know nothing about
the validity of the testimony. Moreover, we cannot even know whether the witness
is in general a good observer or a poor one, that is, we cannot assign a
reliability to the witness. That prevents us from using any formal probability
analysis in conjunction with that witness such as we tried above. It is hard to
overemphasize how little we are left with if we cannot work with the
probabilities of various witnesses being correct. It means that we are left with
nothing—we can say nothing about the meaning of witnesses' observations
individually or collectively, and that is a dire situation indeed.
To see just how dire the situation is, let us return to the question in the
title to this little essay, "What do many pieces of weak evidence add up
to?" and rephrase it more concretely as "Are multiple pieces of weak
evidence stronger than single pieces?" If a piece of weak evidence is
untestable, and if nothing can be known about its validity, then multiple pieces
of such evidence are no different from single pieces—if you can't know
anything about them individually, you can't know anything about them collectively. Since you
can't conclude anything definite from one such piece, you can't do it from
multiple pieces, either. Lane's six witnesses are then no better than any one of
them.
This conclusion has profound ramifications for the worth of
witness testimony in the JFK assassination. For example, we have all heard the
argument that there are so many hints of conspiracy throughout the evidence that
it cannot be denied. (That's like saying that you can prove conspiracy either
with one conclusive piece of evidence or with many inconclusive ones.) This argument
is fallacious because the joint effect of all those little hints (weak evidence) is
no more or less than of one alone—nothing proved. You can apply the same
reasoning to knowing who the alleged conspirators were. For example, people
often say something like "The fingerprints of the CIA were all over the
assassination," implying that this is proof that the CIA did it. This
reasoning is fallacious—multiple "fingerprints" carry no more weight
than one because none are worth anything. The negative sense of the same
argument is often advanced, as in "We have so many reasons to doubt the
evidence from the autopsy that we must disregard it." This argument is just
as fallacious as its positive sense.
We stress that the qualitative nature of the conclusion about
multiple weak evidence is unaffected by quantitative considerations such as the
number of pieces of weak evidence. To state it most bluntly, no matter
how many pieces of weak evidence one has, their net "worth," or
probative value, is the same as any one of them—little or nothing. One million
pieces of weak evidence are no better than one piece. So even if conspiracists
bring forward all the weak evidence they can find, all the doubts and
uncertainties they can muster, their case will be no stronger than before.
The true case is built solely on validated evidence.
Anyone who doubts the correctness of this idea need look no
further than the last 36 years of debate. In spite of the vast amounts of weak
evidence advanced in the cause of conspiracy, that case is just where it started
so long ago—much talk and nothing proven. If anything, there is more
dissension in the ranks of the conspiracists than before, as their best efforts
continue to bear no fruit. Students of this case who truly want to get the
answer—and I am not convinced that most of the contemporary conspiracists
really do—must restrict their efforts to testable evidence. Working with
untestable evidence is a waste of time. To the extent that it misleads the
researchers into thinking that they are getting somewhere, it is even worse.
It is important to distinguish between the inherent worth of
a collection of weak evidence and our ability to know that worth. Each piece of
weak evidence bears a well-defined relation to its claim; it's just that we
can't know that relation. For example, consider one of Mark Lane's witnesses who
claimed to hear shots from the knoll. In actuality, there may have been
shots from the knoll, there may not have been shots from the knoll, or there may
have been shots from somewhere close to the knoll. In those cases, the witness
would have been right, wrong, or nearly right. But without something tangible to
evaluate—the equivalent of the markings on the side of the bullet that trace
it to some rifle—we have no way of knowing whether the witness is remembering
correctly. This is why many witnesses are no better than one—we don't know
anything about how accurately any of them observed. But there are two other
cases of multiple witnesses that we can consider: (a) we know each one's accuracy (the opposite of the first case), and (b) we know the general
reliability of each witness (the probability that a given witness will observe a
given kind of event properly). In case (b), we can use the conventional formulas
of probability to understand the validity of their collective observations. For
example, if each of Lane's witnesses could correctly sense the direction of
unexpected gunshots 70% of the time and they all said shots came from the knoll,
the chance that all six were right (that shots did come from the knoll) would be
(0.7)^6, or only 0.118 (11.8%). [This kind of calculation is very
interesting in that it shows that in order to conclude much from a group of
witnesses who all agree, they each have to be very accurate observers; even 90%
reliability for each gives only a 53% probability that they are all right.] In
case (a), the collective strength of the witnesses is just that of the best
observer if you are willing to disregard all the other, weaker, observations,
and the average of all the reliabilities if you are not willing to discard any.
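The figures for case (b), and the two options for case (a), can be checked with a short sketch. It follows the product rule used in the text, which treats the witnesses as independent. The function name and the list of known accuracies are hypothetical values chosen only for illustration.

```python
from statistics import mean


def chance_all_right(reliability, n_witnesses=6):
    """Case (b): chance that all n independent witnesses, each with the
    same general reliability, observed correctly."""
    return reliability ** n_witnesses


for r in (0.7, 0.9):
    print(f"reliability {r:.0%}: all six right with probability "
          f"{chance_all_right(r):.3f}")

# Case (a): each witness's accuracy is known (hypothetical values).
known_accuracies = [0.9, 0.7, 0.6]
best_only = max(known_accuracies)   # keep only the best observer
all_kept = mean(known_accuracies)   # keep every observer, take the average
print(f"best observer: {best_only:.2f}, average of all: {all_kept:.2f}")
```

Even at 90% individual reliability, the joint figure is barely better than a coin toss, which is the point of the bracketed remark above.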
In closing we note that collections of weak evidence contrast
starkly with the webs of circumstantial evidence that usually bring convictions
in major crimes where no direct evidence is available. We must be careful not to
confuse indirect evidence with weak evidence, for they are very different.
Indirect evidence is strong evidence that bears directly on some aspect of the
crime other than who committed it. Although this evidence is perfectly
validated, it is often colloquially considered to be weak because it doesn't
tell who did it, but this is a great misperception. It is strong but in a
different direction from the desired one. In great contrast is weak evidence,
which in and of itself is ill-defined. No matter which direction it is
considered from, it is uncertain and untestable. In short, we must never equate
weak and indirect evidence. Indirect evidence is testable and validated, but
applies to something other than the core of the crime. Weak evidence is
untestable and unvalidated, no matter what it applies to. The two are hugely
different.
Perhaps the easiest way to see the flaw in Lane’s basic argument is to note that Dealey Plaza also contained at least six witnesses who reported the opposite—no shots from the knoll. By Lane’s own argument, the common denominator of these six witnesses would also be the truth. But this second truth would be incompatible with Lane’s first truth—that from his six witnesses. Obviously, we cannot have two "truths" that are the opposite of each other. Something is very wrong in Lane’s way of determining the truth.