What do many pieces of weak evidence add up to?
18 June 2000

    A critic of the Warren Commission who believes that distrusting the government is enough to invalidate its evidence and conclusions (see essay "Trust is not enough") sent me more than twenty examples of how the FBI had allegedly mishandled evidence in the JFK case. He considered that list sufficient to allow him to reject any and all evidence coming from the FBI. (Fallacious?) This is but one example of writers on both sides of the Kennedy case marshalling long lists of evidence in support of their causes. How should we view such aggregations?
    At first glance, long lists of evidence appear impressive. But we must not be automatically swayed by them because the probative value of a suite of evidence (its overall strength) depends on much more than numbers. Remembering that writers usually invoke long lists of evidence when they lack one or two pieces of truly decisive evidence, we must immediately become suspicious of such lists and go out of our way to evaluate the individual and collective strengths of their pieces cautiously, skeptically, and critically. The longer the list, the more we should be wary of it.
    Fallacy?
    The key to understanding the joint worth of multiple pieces of evidence is whether you know their individual reliabilities. 
    In the case mentioned above, all the pieces of evidence were worse than weak—they were irrelevant because the critic was fallaciously trying to use them to show that because the FBI had mishandled evidence in the past, the current pieces of evidence from the FBI that we were discussing should automatically be considered invalid. It was easy to formulate a reply to him—the worth of many pieces of irrelevant evidence is no more than the worth of a single piece of irrelevant evidence—a large number multiplied by zero is still zero.
    Typical lists are different, however, in that their items are relevant and weak. What is the joint worth of many such pieces? A simple example is provided by Mark Lane near the beginning of his classic book Rush to Judgment, where he describes the recollections of six witnesses who heard shots coming from the grassy knoll. Lane proclaims that when so many independent observers agree, their common denominator must be the truth. In other words, because these six witnesses all recall hearing shots from the knoll, shots must have come from there. Lane has an argument of sorts, because observations of witnesses like these six do possess a certain worth, especially when they all agree. Perhaps Lane had some simple, intuitive mathematics in mind, such as: if the chance of each witness being wrong was 50% (0.5), the chance that they were all independently wrong was (0.5)^6, or 1/64, or 1.6%. In other words, there was less than a 2% chance that they were all wrong. With more witnesses agreeing, this reasoning gives a chance that decreases to exceedingly low numbers. For example, if 20 witnesses had all heard shots from the knoll, this approach would give a probability of their all being independently wrong of (0.5)^20, or about one chance in a million (0.0001%). Because this number is vanishingly small, the reasoning goes, the witnesses could not all have been wrong, and therefore shots absolutely came from the knoll.
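    The arithmetic behind this intuition can be checked in a few lines. The sketch below is only an illustration of the naive reasoning attributed to Lane above; the function name and the default 50% error rate are assumptions made for the example, not anything Lane wrote.

```python
def chance_all_wrong(n_witnesses, p_wrong=0.5):
    """Naive chance that n witnesses are all independently wrong.

    Assumes every witness has the same error rate p_wrong and that
    their errors are independent, exactly as the naive argument does.
    """
    return p_wrong ** n_witnesses


# Six witnesses, then twenty, each assumed wrong half the time.
for n in (6, 20):
    p = chance_all_wrong(n)
    print(f"{n} witnesses: {p * 100:.5f}% (about 1 in {round(1 / p):,})")
```

    The result for six witnesses is 1/64, and for twenty it is roughly one in a million, which is why the argument looks so persuasive at first glance.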
    Let us examine this kind of argument both narrowly and broadly. From the narrow perspective, at least two things are wrong with it. The first, and most obvious, is that Lane chose his witnesses selectively. The mathematician would say that Lane did not sample the population randomly, for there were many witnesses who heard the opposite: shots from the depository. Applying Lane's reasoning to six of these witnesses means that their common denominator must be the truth that shots came from the depository. This scenario of "dueling witnesses" is obviously undermined by the biased selection process. So Lane's conclusion must be rejected.
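    The "dueling witnesses" problem can be made concrete with the same naive rule. The sketch below is a hypothetical illustration: it applies the 50% assumption from the earlier calculation to six witnesses on each side and treats the two reports as contradictory, as the essay does. The function and variable names are invented for the example.

```python
def naive_confidence(n_agreeing, p_wrong=0.5):
    """Naive confidence that a claim is true because n witnesses agree.

    Computed as 1 minus the chance that all n are independently wrong.
    """
    return 1 - p_wrong ** n_agreeing


# Six hand-picked witnesses for the knoll, six for the opposite report.
knoll = naive_confidence(6)
opposite = naive_confidence(6)

print(f"Confidence in 'shots from the knoll': {knoll:.4f}")
print(f"Confidence in the opposite report:    {opposite:.4f}")
# Two contradictory claims cannot both be nearly certain, so their
# confidences should not add up to more than 1.
print(f"Sum of the two confidences:           {knoll + opposite:.4f}")
```

    The rule hands the same high confidence to both sides, and the two figures sum to well over 1, which shows that the selection of witnesses, not the evidence, is doing the work.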
    The second thing wrong with Lane's approach and our calculations, which are a rough guide to the mentality behind his approach, is the matter of the reliability of each of the witnesses. We refer to such testimonial evidence as "weak" because it cannot be validated in and of itself. We contrast it with "strong" evidence, which can be validated. A simple example may clarify the difference. The stretcher bullet, CE 399, may be considered to have been fired by the rifle found in the depository. This claim can be checked from the bullet itself, by comparing its ballistic markings with those of other bullets fired from that rifle. They match, and validate the claim about CE 399. But when a witness from Dealey Plaza claims to have heard shots from the grassy knoll, there is no way to check that statement because there is no tangible, physical record of what the witness heard. Perhaps we should replace "strong" and "weak" with "testable" and "untestable." (Followers of Karl Popper would go a step further and use "falsifiable" and "unfalsifiable" instead.)
    Consider further the terms "weak" and "untestable." They really mean "unknowable," whatever "knowing" means. They imply several things, the first being that from testimony alone we cannot know whether that witness is right or wrong. Worse, we cannot know how close the witness was to being right; we can know nothing about the validity of the testimony. Moreover, we cannot even know whether the witness is in general a good observer or a poor one, that is, we cannot assign a reliability to the witness. That prevents us from using any formal probability analysis in conjunction with that witness such as we tried above. It is hard to overemphasize how little we are left with if we cannot work with the probabilities of various witnesses being correct. It means that we are left with nothing: we can say nothing about the meaning of witnesses' observations individually or collectively, and that is a dire situation indeed.
    To see just how dire the situation is, let us return to the question in the title to this little essay, "What do many pieces of weak evidence add up to?" and rephrase it more concretely as "Are multiple pieces of weak evidence stronger than single pieces?" If a piece of weak evidence is untestable, and if nothing can be known about its validity, then multiple pieces of such evidence are no different from single pieces—if you can't know anything about them individually, you can't know anything about them collectively. Since you can't conclude anything definite from one such piece, you can't do it from multiple pieces, either. Lane's six witnesses are then no better than any one of them.
    This conclusion has profound ramifications for the worth of witness testimony in the JFK assassination. For example, we have all heard the argument that there are so many hints of conspiracy throughout the evidence that it cannot be denied. (That's like saying that you can prove conspiracy either with one conclusive piece of evidence or with many inconclusive ones.) This argument is fallacious because the joint effect of all those little hints (weak evidence) is no more or less than that of one alone: nothing proved. You can apply the same reasoning to knowing who the alleged conspirators were. For example, people often say something like "The fingerprints of the CIA were all over the assassination," implying that this is proof that the CIA did it. This reasoning is fallacious; multiple "fingerprints" carry no more weight than one because none are worth anything. The negative sense of the same argument is often advanced, as in "We have so many reasons to doubt the evidence from the autopsy that we must disregard it." This argument is just as fallacious as its positive sense.
    We stress that the qualitative nature of the conclusion about multiple pieces of weak evidence is unaffected by quantitative considerations such as the number of pieces. To state it most bluntly, no matter how many pieces of weak evidence one has, their net "worth," or probative value, is the same as that of any one of them: little or nothing. One million pieces of weak evidence are no better than one piece. So even if conspiracists bring forward all the weak evidence they can find, all the doubts and uncertainties they can muster, their case will be no stronger than before. The true case is built solely on validated evidence.
    Anyone who doubts the correctness of this idea need look no further than the last 36 years of debate. In spite of the vast amounts of weak evidence advanced in the cause of conspiracy, that case is just where it started so long ago: much talk and nothing proven. If anything, there is more dissension in the ranks of the conspiracists than before, as their best efforts continue to bear no fruit. Students of this case who truly want to get the answer (and I am not convinced that most of the contemporary conspiracists really do) must restrict their efforts to testable evidence. Working with untestable evidence is a waste of time. To the extent that it misleads the researchers into thinking that they are getting somewhere, it is even worse.
    It is important to distinguish between the inherent worth of a collection of weak evidence and our ability to know that worth. Each piece of weak evidence bears a well-defined relation to its claim; it's just that we can't know that relation. For example, consider one of Mark Lane's witnesses who claimed to hear shots from the knoll. In actuality, there may have been shots from the knoll, there may not have been shots from the knoll, or there may have been shots from somewhere close to the knoll. In those cases, the witness would have been right, wrong, or nearly right. But without something tangible to evaluate, the equivalent of the markings on the side of the bullet that trace it to some rifle, we have no way of knowing whether the witness is remembering correctly. This is why many witnesses are no better than one: we don't know anything about how accurately any of them observed. But there are two other cases of multiple witnesses that we can consider: (a) we know each one's accuracy (the opposite of the first case), and (b) we know the general reliability of each witness (the probability that a given witness will observe a given kind of event properly). In case (b), we can use the conventional formulas of probability to understand the validity of their collective observations. For example, if each of Lane's witnesses could correctly sense the direction of unexpected gunshots 70% of the time and they all said shots came from the knoll, the chance that all six were right (that shots did come from the knoll) would be (0.7)^6, or only 0.118 (11.8%). [This kind of calculation is very interesting in that it shows that in order to conclude much from a group of witnesses who all agree, they each have to be very accurate observers; even 90% reliability for each gives only a 53% probability that they are all right.]
    In case (a), the collective strength of the witnesses is just that of the best observer if you are willing to disregard all the other, weaker, observations, and the average of all the reliabilities if you are not willing to discard any.
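    The figures for cases (a) and (b) can be reproduced with a short sketch. It follows the essay's own calculation for case (b), multiplying the individual reliabilities together. The function name and the three reliabilities used for case (a) are hypothetical values chosen only for illustration.

```python
from math import prod
from statistics import mean


def chance_all_right(reliabilities):
    """Chance that every witness observed correctly, assuming independence."""
    return prod(reliabilities)


# Case (b): six witnesses with a known general reliability.
for r in (0.7, 0.9):
    p = chance_all_right([r] * 6)
    print(f"Six witnesses at {r:.0%} reliability: all right {p:.1%} of the time")

# Case (a): hypothetical known accuracies for three witnesses.
accuracies = [0.9, 0.7, 0.6]  # illustrative values only
best = max(accuracies)        # keep only the best observer
average = mean(accuracies)    # or refuse to discard anyone
print(f"Best observer: {best:.2f}, average of all: {average:.2f}")
```

    With 70% reliability the product is about 0.118, and with 90% it is about 0.53, matching the figures quoted above.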
    In closing we note that collections of weak evidence contrast starkly with the webs of circumstantial evidence that usually bring convictions in major crimes where no direct evidence is available. We must be careful not to confuse indirect evidence with weak evidence, for they are very different. Indirect evidence is strong evidence that bears directly on some aspect of the crime other than who committed it. Although this evidence is perfectly validated, it is often colloquially considered to be weak because it doesn't tell who did it, but this is a great misperception. It is strong but in a different direction from the desired one. In great contrast is weak evidence, which in and of itself is ill-defined. No matter which direction it is considered from, it is uncertain and untestable. In short, we must never equate weak and indirect evidence. Indirect evidence is testable and validated, but applies to something other than the core of the crime. Weak evidence is untestable and unvalidated, no matter what it applies to. The two are hugely different.

    Perhaps the easiest way to see the flaw in Lane’s basic argument is to note that Dealey Plaza also contained at least six witnesses who reported the opposite—no shots from the knoll. By Lane’s own argument, the common denominator of these six witnesses would also be the truth. But this second truth would be incompatible with Lane’s first truth—that from his six witnesses. Obviously, we cannot have two "truths" that are the opposite of each other. Something is very wrong in Lane’s way of determining the truth.