It’s Defensible To Think Claims Aren’t Evidence
Bro, They’re Just Sentences
If you’ve been on philosophy/religion/epistemology social media, you’ll have noticed that a few weeks ago there was a kerfuffle between atheist YouTuber Matt Dillahunty and Princeton PhD candidate Joe Schmid. The dispute centered on whether claims counted as evidence. I would say it was an interesting back and forth, but I would be lying.
I agree more with Dillahunty on the substance, but I don’t like the way he argued for it. Instead, I want to recommend other good posts published on Substack about the issue, even though I disagree with the overall claim of many of them:
This post is going to be somewhat nitpicky. I am clearly using the term “claim” differently from the above Substackers and Joe Schmid. I want to be clear from the outset that I agree that testimony is evidential, even if substantially weaker than other data points. I find the nuances and disagreements about testimony more interesting, and I’m going to talk about them in a future post (my next post? Bro, my life is weird right now; this post was sitting in my drafts for a week, mostly done).
So, in this post, I’m not focusing on testimony, but claims. Again, that may seem like a pedantic hair split, but the reason I’m doing this is because I want to outline psychologically how people assess claims, testimony, evidence, and so on.
I want to disentangle what I call vague Bayesianism (i.e., Bayesianism that pretends it accounts for causality and relevance) from good reasoning, because there are way too many undergraduates and philosophy novices annoying me (and name-calling?) in my note comments, saying that the Schmid-ish account of evidence is mere Bayesianism. It isn’t.
What you should take away from this post is that a lot of people use terms like claim, testimony, evidence, and Bayesianism in a messy way to the point that you should not take them seriously (to be clear! I’m not talking about anyone I’m criticizing by name). A proper accounting of these terms will have us updating our beliefs to where we (almost obviously) don’t assess claims (declarative sentences) as evidence.
The more interesting conversation is about the nature of testimony, and I’m going to write about that soon. Let’s dive in.
Why Claims Can Be Evidence
First, let’s look at things in a vague, a priori Bayesian sense. In that sense, yes, claims can be evidence.
Specifically:
“A piece of data E is evidence for a hypothesis H if learning1 E raises the probability for H.”
So, if Jesus was resurrected from the dead, hearing testimony of him rising from the dead or hearing someone out in public utter “Jesus rose from the dead” are both evidential. In a world where Jesus rose from the dead, you’d expect people to testify to it, and you’d expect the claim “Jesus rose from the dead” to circulate in an oral culture.
The reason why is that, again, those data points are expected given the truth of the hypothesis. They raise the probability of the hypothesis.
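As a toy sketch of this vague Bayesian point (all numbers are entirely made up for illustration): as long as the data is more probable under the hypothesis than it is overall, observing the data raises the hypothesis's probability, however slightly.

```python
# Toy Bayes-rule calculation with invented numbers.
# E = "the claim 'Jesus rose from the dead' circulates"
# H = "Jesus rose from the dead"

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H | E) via Bayes' rule over H and not-H."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

prior = 0.01            # hypothetical prior for H
p_e_given_h = 0.99      # the claim would almost surely circulate if H were true
p_e_given_not_h = 0.50  # but claims like this circulate anyway

print(posterior(prior, p_e_given_h, p_e_given_not_h))  # ~0.0196, above the prior
```

The posterior edges above the prior, so in this thin sense the circulating claim "counts" as evidence, which is exactly the sense at issue in this section.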
So yes, in a vague (notice I didn’t say strict) Bayesian sense, the statement “claims are evidence” is correct. But I won’t go so far as to say this is correct in a general sense. Because it’s not. There’s more to assessing evidence than mere Bayesianism! What’s more, there seems to be more evidence against the idea that claims are evidence.
Why Claims Aren’t Evidence
Bayesian Probability Is One Tool Among Many
Bayesian probability is one reasoning tool in a suite of tools. It’s not sufficient on its own! It helps us organize and update our beliefs given new data, but it doesn’t tell us what evidence is relevant or explanatory (i.e., it doesn’t have an adequate account of causality).2
Bayesian probability can’t tell us which evidence should update our beliefs significantly, or how to update our beliefs, given new evidence that we subjectively assess.
I look at Bayesian probability through a Humean lens. It works as a matter of fact, not logical necessity. It could be the case that updating your beliefs using this formula leads to less reliable beliefs or models about the world. Still, it is one tool among many tools. Bayesian inference is not the totality of evaluating evidence, philosophy of science, or (I would imagine) conditional probabilities.
Not everyone who uses Bayesian inference agrees on all the terms and concepts, so it’s really annoying when undergrads who saw one Huemer video harass me for disagreeing with Joe Schmid. BUT ANYWAY!
Do Churches Cause Car Crashes?
This is important because sometimes variables or conditions in a model are indicators of other, more causally relevant variables and conditions. How we assess which variables and conditions are more relevant is often subjective.
My favorite example from this came from my research methods class in grad school:
If we look at various data points when comparing two cities of different sizes, we may find that there are more traffic accidents in the city with more churches. If we put “number of churches” into some model, that model will be somewhat more accurate at predicting where more crashes are going to happen than your intuition alone.
So churches obviously cause more car crashes, right? No! Both the number of churches and the number of car crashes will increase as a function of the population, as well as other factors such as wealth. Though using churches as a variable or data point when trying to predict car crashes isn’t strictly false, it’s an inferior model to models informed by better causal explanations.
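A quick simulation makes the confounding concrete (all parameters invented for illustration): population drives both church counts and crash counts, and the two end up strongly correlated with no causal link between them.

```python
import random

random.seed(0)

# Simulate cities: population is a common cause of both
# church count and crash count.
churches, crashes = [], []
for _ in range(1000):
    population = random.uniform(10, 500)  # thousands of people (made up)
    churches.append(population * 0.2 + random.gauss(0, 5))
    crashes.append(population * 1.5 + random.gauss(0, 30))

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(corr(churches, crashes))  # strong positive correlation, zero causation
```

A model that regresses crashes on churches here would genuinely predict better than nothing, which is exactly why correlation alone can't tell you which variable is the causally relevant one.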
Sure, after the fact, one can update models and choose the better variable for something like population increases causing more car accidents. But for many of the points of philosophical disputes, we don’t have the same level of rigorous data to derive precise measurements as we do for car accidents.
Bayesian probability helps us mathematically update our models of the world, given what those models say about causal relationships within the world. But it does not give us a science of causality on its own. For matters of dispute where we don’t have strict measurements (like the existence of God), we’re close to hopeless at resolving disagreements about evidence without an independent philosophy of causality.
You Should Actually Hypothesis Test
What’s more, this vague understanding of evidence seems to forget about the hypothesis testing aspect of Bayesian inference. Namely, absent a theory of causality and competing hypotheses as a point of comparison, this vague Bayesian framework (i.e., “A piece of data E is evidence for a hypothesis H if learning E raises the probability for H.”) is just confirmation bias, where any trivial data point can appear as evidence.
Let’s take the hypothesis “My grandmother painted the sky blue.” The fact that the sky is blue, under the vague Bayesian framework, is evidence that my grandmother painted the sky. The problem with this formulation is that across all hypotheses about why the sky is blue, all of them are going to incorporate the data point that the sky is blue!
If we’re having an argument for why the sky is blue, it would be silly to say that the mere fact that the sky is blue is evidence for your theory. Indeed, it would feel like you have very little convincing evidence that your grandmother painted the sky blue if you pointed to something so trivial.
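The problem shows up cleanly in toy numbers (priors entirely made up): when every competing hypothesis predicts the blue sky equally well, conditioning on the blue sky changes nothing.

```python
# Toy example: every hypothesis about the sky "predicts" a blue sky.
priors = {
    "Rayleigh scattering": 0.89,
    "Grandma painted it": 0.10,
    "Sky-painting elves": 0.01,
}

def update(priors, likelihoods):
    """Bayes: multiply each prior by its likelihood, then renormalize."""
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# E = "the sky is blue" is equally likely (here, certain) under every hypothesis.
likelihoods = {h: 1.0 for h in priors}
posteriors = update(priors, likelihoods)

print(posteriors)  # identical to the priors: the data discriminated nothing
```

The blue sky "supports" grandma's painting in the thin sense that it doesn't lower its probability, but because the likelihood is the same across all rivals, it can never raise one hypothesis over another.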
What’s more, if you started listing random data points and attributed them to supporting your hypotheses without explanation, that would also appear ad hoc and desperate, simulating philosophical rigor rather than demonstrating it.
In this way, when you have a bad or non-existent theory of causality and no alternative hypotheses, using a Bayesian framework just looks like confirmation bias. Any data point can support a theory, or be seen as substantial evidence merely because it’s observed. It’s cold outside. Abraham Lincoln was once president. Grass is green. All of these support my hypothesis!
The line between “evidence” and “all other data” becomes blurred to the point of non-existence. And at that point, we’re not doing philosophy, science, or any sort of rational discernment anymore, but assuming all of our observations are naively what we believe or hope them to be.
We Need Better Terms
To overcome this problem, we have to have a screening mechanism to separate good evidence from mere irrelevant data. We need to have an account for at least three different kinds of data points:
Data that are compatible with multiple conflicting hypotheses (to answer the hypothesis testing vulnerability). I call this data dormant evidence. Some may call this circumstantial evidence, but I prefer the dormant label because I feel “circumstantial” is more about the circumstances of a specific crime case, not about the relation the data has relative to a hypothesis. Your mileage may vary!
Data that have no causal implication on a hypothesis (to answer the vulnerability of lacking causality/relevance). I call this data irrelevant data.
Data that move the needle significantly in terms of proving hypotheses. I call this data good or compelling evidence.
Notice, this screening process does not supplant or contradict Bayesian inference so much as it complements or enhances it. In the same way that the rules of logic by themselves do not tell us which propositions are correct, Bayesian probability doesn’t tell us by itself what is true or likely to be true. The nature of the evidence, as well as the subjective evaluation of that evidence, does that.
Dormant/Circumstantial Evidence
To illustrate what this looks like, let’s look at a crime scenario.
Let’s say that Sarah has been murdered. We have four suspects: Jimmy, Ronald, George, and Bill. Forensic investigation shows that Sarah was killed by a specific kind of gun that all four of the suspects possess.
In a vague Bayesian way, yes, the fact that all four of them own this firearm is evidence that each of them committed the crime. But it’s not good or compelling evidence, because there are mutually exclusive candidate hypotheses that accommodate the data point of owning that firearm. In other words, the fact that Jimmy owns the firearm is evidence that Jimmy committed the crime, the fact that Ronald owns the firearm is evidence that Ronald committed the crime, and so on.
When good or compelling evidence is found (motive, forensics, location of the suspects, etc.), the gun ownership data point will become good evidence to support a hypothesis. But until that happens, this data point is circumstantial or dormant. That doesn’t mean it’s bad evidence or irrelevant, just that there are multiple causal theories that account for it, and so we shouldn’t treat this data as anything special.
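Here's the suspect scenario in toy Bayesian terms (all likelihoods are invented for illustration): the shared gun-ownership data leaves the four hypotheses tied, and only discriminating evidence breaks the tie.

```python
# Four mutually exclusive suspect hypotheses, equal priors.
suspects = ["Jimmy", "Ronald", "George", "Bill"]
priors = {s: 0.25 for s in suspects}

def update(priors, likelihoods):
    """Bayes: multiply priors by likelihoods, then renormalize."""
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Dormant evidence: all four own the murder-weapon gun type,
# so the data is equally likely under every hypothesis.
owns_gun = {s: 1.0 for s in suspects}
after_gun = update(priors, owns_gun)
print(after_gun)  # still 0.25 each: the tie is unbroken

# Compelling evidence: video of Jimmy shooting at Sarah.
# Hypothetical likelihoods: very probable if Jimmy did it, not otherwise.
video = {"Jimmy": 0.95, "Ronald": 0.01, "George": 0.01, "Bill": 0.01}
after_video = update(after_gun, video)
print(after_video["Jimmy"])  # roughly 0.97: the gun data now "activates"
```

The gun ownership multiplies through every hypothesis equally, so it sits dormant; once the video evidence concentrates probability on Jimmy, the dormant data fits into the now-dominant causal story.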
Irrelevant Data
Irrelevant data is data that is causally irrelevant to a hypothesis. Going back to the church example, the number of churches in a city is causally irrelevant to car crashes.3 There are many places that have lots of car crashes and very few churches. The reason we have to make a distinction about relevance is that, without a causal theory for why the world works, you can (basically) hack Bayesian calculations with irrelevant data points just by vaguely gesturing that they’re expected under a hypothesis.
This also goes back to why hypothesis testing (or comparison) is important. If we are evaluating just one hypothesis, we can basically mine for any data points, say that they’re expected under the hypothesis, and artificially inflate our belief accordingly. Going back to the “Grandma painted the sky” hypothesis: such a hypothesis seems really compelling if it’s the only hypothesis you have. But when others come into the fold, the grandma hypothesis seems much less convincing.
Without screening for irrelevant data that isn’t causally accounted for, or without hypothesis testing, we can use “show trial” logic. That is, we can come to conclusions using methods that aren’t oriented toward the truth. In the murder example, we can convict Ronald (he probably had it coming, if you’re smart enough to spot the irrelevant pattern in my naming of the suspects) by refusing to look at other suspects or to give a rigorous explanation for why data points support a hypothesis: I had tea this morning. The sky is blue. Duke lost. My wife is beautiful. All of these are expected under the “Ronald killed Sarah” hypothesis. He’s guilty.
To repeat: Bayesian probability by itself cannot tell us that this is bad reasoning. It just tells us how to update our beliefs, given our understanding of relevance and causality.
Good/Compelling Evidence
The difference between good/compelling evidence on the one hand, and dormant/circumstantial evidence and irrelevant data on the other, is that (1) it’s data that is causally relevant to explaining a phenomenon and (2) it’s unaccounted for by competing hypotheses.
If I have video evidence of Jimmy shooting his gun at Sarah, that is good evidence that he at least tried to murder her. It activates dormant evidence (that he owned a gun similar to the murder weapon, that he was close to her on the night of the murder, that he disliked her personally, etc.) as good evidence. That is, the dormant data that was compatible with other hypotheses is now more relevant, given the existence of good/compelling evidence.
Together, Sarah being shot dead, the murder weapon being the same gun that Jimmy owns, Jimmy’s gun not having any ammo, and video surveillance of him shooting at Sarah are all data points that vary from dormant to compelling evidence. But the compelling evidence (the video surveillance) is of a different nature and quality than the other dormant evidence because of how clearly (with high probability) it supports the “Jimmy is guilty” hypothesis relative to others.
What Of Claims And Testimony?
So let’s take this framework and apply it to both testimony and claims. What kind of data are they?
A claim is just a declarative sentence. I don’t want to say “a sentence asserting a fact” because that’s redundant. Claims are clearly irrelevant data points. The reason is that the human imagination can come up with any combination of words for a declarative sentence that has no relationship with reality. There are also many instances of mistaken people coming up with false declarative sentences.
Many declarative sentences lack causal explanation within their structure. Let’s take the claim “Draymond Green is the best philosopher in New York.” Though the nouns in this sentence, as far as we know, are real (Draymond Green, philosopher, New York), there is no causal explanation embedded in the sentence that connects them to reality. This is not evidence. It’s just a sentence I made up.
A claim can be upgraded to evidence when it embeds itself in a chain of custody of causality. At that point, it becomes testimony. Testimony is a form of claim that is embedded within an additional form of evidence, specifically sense perception.
For example, the claim “I read in the New York Times that Draymond Green is the best philosopher in New York” is evidence because it’s a form of testimony. Someone perceived something, reported it, and it was reported to other people through a publication. I can draw a line of perception from me reading the claim to the person who perceived it.
Admittedly, this example is not the best because it conflates subjective assessments with intersubjective perception (it’s harder, if not a category error, to “perceive” someone as the best philosopher than to perceive them as a professional basketball player). In this way, testimony itself can be low quality because of the quality of the data within the testimony or the limitations of the one doing the testimony. “Better” testimony would be to make a claim and support it with additional data other than a single perceptual data point.
For example: “I read in the New York Times that Draymond Green is the best philosopher in New York because of his countless articles published in the best philosophy journals about aesthetics.”
This kind of testimony is of higher quality because it tells you the perceptual chain of custody, shows that the claim was perceived (not just made up), and points to evidence external to the perception that could itself be falsified.
Admittedly, not all testimony will be of this quality, and not all of it needs a falsification criterion. Indeed, many testimonial accounts are just false. My point here is that claims are different from testimony, and that testimony varies in quality. In a future post, I’ll talk about testimony in depth, but for now, that’s all we need to know.
So Are Claims Evidence?
Given the above, I feel justified saying that claims aren’t evidence. They are not evidential because they are just as likely to be irrelevant data points in relation to a hypothesis as they are to be dormant or good evidence.
Put in hypothesis testing terms: The existence of claims is compatible with both the hypothesis that they are truth-tracking and the hypothesis that they are not. However, the truth-tracking hypothesis does not accommodate the data that humans have wild imaginations and that many claims (declarative sentences) are simply false.
Thus, if the purpose of using Bayesian reasoning, logic, etc. is to shape more accurate reasoning in order to accomplish goals (i.e., “to make us more rational”), those who treat claims as mere declarative sentences of no evidential quality will achieve that purpose better than those who treat claims as evidence.
This is an a posteriori assessment of claims, testimony, evidence, etc. I take the vague Bayesian assessment of data, evidence, and so on, point it back at Bayesianism itself, and then update my beliefs accordingly.
This assessment could be wrong, but if it is, it’s not wrong as a matter of a priori logic, but in the facts that inform and update my assessment. If you want to prove it wrong, simply show that a declarative sentence comes close to being true merely by being a declarative sentence. It is logically possible to make this argument, but most philosophical thinkers, being shaped by experience, are probably not going to do so.
Though I consider myself a Humean, this argument is still compatible with Bayesianism. A true Bayesian, after all, updates their credence when new evidence arises, and that includes best methods of assessing evidence.
To Wrap Up
For most people outside of a philosophy classroom, claims are obviously not evidence. It may be the case that, in a vague, broad a priori Bayesian formulation, claims are evidential, but when we update our beliefs about the world and refine our definition and use of terms, claims are not evidence.
One need only look at how we use these terms where people have “skin in the game” about the truth of a matter. In courtrooms and criminal investigations, claims (again, not to be confused with testimony) are not evidence. Every claim you make has to be supported with evidence. Using substantiated claims to form a hypothesis, you persuade a jury that your hypothesis is correct instead of the other party’s.
In the education system, when you’re teaching children to write persuasively, claims are a separate species of sentence from evidence. Usually the first sentence of a paragraph is the claim, with the following sentences providing the evidence and the analysis of that evidence. You’ll fail a high school English class if you treat claims as evidence.
Again, it may be the case that the vague Bayesian account of claims is correct a priori, but the a posteriori factual experience of a society discerning truth says that this assessment is either wrong or inferior to the a posteriori model. Put more succinctly: as a society, we colloquially understand claims and evidence as separate from each other for good reason!
I prefer this to online philosophy of religion discourse, which operates in a weird zone where hypotheses are vaguely outlined, competing hypotheses are strawmanned, and relevance is rarely discussed. Some of us are taken aback at how low the standards of evidence are for theistic and theistic-sympathetic thinkers, and how they arbitrarily pull irrelevant data points to support their hypotheses while vaguely gesturing at how naturalistic explanations don’t account for the data (usually for false reasons they invented).
Anyone who has had to use these truth-tracking tools in the real world (with consequences on the line) for a prolonged amount of time realizes how cheap these tactics are. They’re not really using Bayesian tools correctly within a suite of philosophical tools; they’re neglecting those other tools and using Bayesian reasoning poorly, all while speaking the vague language of Bayesianism.
To be fair to those who disagree with me, there’s a serious philosophical disagreement about the nature of testimony, trusting your intuitions, and so on. In my opinion, this is the real disagreement between people like Schmid and people like Dillahunty. I’ll talk about that in a future post.
The word “learning” in this sentence is doing a lot of heavy lifting!
This was pointed out by the philosopher Gilbert Harman.
In fact, what causes there to be more churches often causes there to be more car wrecks. People!




