sotto voce

The Inherited Tools

What’s new isn’t the question—it’s the machine.

The question of when to trust a source of information has been worked on for a long time. Epistemology—the study of knowledge, what it is, how we get it, when our beliefs are justified—has spent decades building frameworks for understanding how we come to know things through processes we don’t fully control, through the words of others, through tools and instruments. These frameworks weren’t designed with AI in mind. But they offer resources.

Some philosophers have begun applying them: Floridi on distributed knowledge, Coeckelbergh on AI testimony, Goldberg on extended epistemic agency. But the work remains scattered across different literatures. The frameworks are rarely brought into direct conversation with each other, and even more rarely applied systematically to see where they converge, where they conflict, and what picture emerges when they’re used together.

This post maps four of them: reliabilism, testimony theory, inferentialism, and model epistemology. These four aren’t exhaustive—virtue epistemology, social epistemology, and Bayesian approaches all have resources to offer—but they cover the core questions: process reliability, source testimony, conceptual content, and artifact trust. Think of this as an inventory of inherited tools before we try to use them.

One dimension this selection sets aside: collective and institutional effects. Social epistemology asks what happens when AI becomes shared epistemic infrastructure—how it affects the distribution of cognitive labor, whether it concentrates or diversifies epistemic sources, what it means for AI to mediate knowledge at population scale. Those are real questions, but they’re downstream of the one this project starts with: whether trust in AI outputs is warranted at all. The collective-level analysis comes later, once the individual-level foundations are in place.


Reliabilism: The Process That Produces the Belief

The most influential framework in contemporary epistemology is reliabilism, developed primarily by Alvin Goldman beginning in the 1970s. Its core claim is simple: what makes a belief justified is the reliability of the process that produced it.

This departs from older views that focused on the believer’s reasons. Traditional epistemology asked: can you give good reasons for what you believe? Do you have evidence? Can you justify it? Reliabilism asks something different: was your belief produced by a process that tends to get things right?

Consider perception. When I look out my window and form the belief that it’s raining, I typically don’t reason my way to that belief. I just see the rain. The belief forms automatically. On traditional views, this raised puzzles: where’s the inference? Where’s the evidence? Reliabilism dissolves the puzzle. What matters isn’t whether I reasoned well, but whether perception—the process that produced the belief—is reliable. And it is. Perception, under normal conditions, tends to produce true beliefs about the environment. That’s what justifies my belief that it’s raining.

The framework extends naturally to other cognitive processes: memory, logical inference, even certain forms of intuition. If a process tends to produce true beliefs, then beliefs produced by that process are justified, at least to that extent.

What Reliabilism Offers

Reliabilism provides a way to evaluate belief-forming processes without requiring that the believer have conscious access to evidence or reasons. This matters because much of our knowledge comes through processes we don’t fully understand or control. I don’t know how my visual system works. I can’t give you the algorithm. But I can still have justified beliefs through vision because vision is reliable.

For AI outputs, the relevant question becomes: is the process that produced this output reliable? We don’t need to ask whether the AI “understood” or “reasoned” or had “evidence.” We just need to ask whether the process tends to produce truth.

The Problems It Faces

The most persistent challenge to reliabilism is the generality problem. Any belief-forming process can be described at multiple levels of generality. When I see rain through my window, I’m using: perception; visual perception; visual perception in daylight; visual perception through glass; visual perception through this particular window—and so on.

These descriptions pick out different process types with different reliability profiles. Perception in general is highly reliable. Perception through dirty glass in fog is less so. Which description is the right one for assessing the belief?

The problem is that reliabilism needs to pick out a process type at a specific level of generality, but the theory itself doesn’t tell us which level. Without a principled answer, reliability assessments become arbitrary.

For AI, this problem becomes acute. Is the relevant process “language model inference”? “LLM inference”? “LLM inference on medical questions”? “LLM inference on cardiology questions asked in English by users who provide relevant history”? The reliability of these process types varies dramatically, and reliabilism alone doesn’t tell us which level matters.
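
A toy calculation makes the point concrete. The sketch below is purely illustrative: the evaluation log, its field names, and the reliability helper are all invented for this post, not drawn from any real benchmark. It scores the same small set of outputs under four descriptions of "the process" and gets a different reliability figure each time.

```python
from collections import namedtuple

# Hypothetical evaluation log: every record here is invented for illustration.
Record = namedtuple("Record", ["domain", "specialty", "has_history", "correct"])

LOG = [
    Record("medical", "cardiology", True, True),
    Record("medical", "cardiology", True, True),
    Record("medical", "cardiology", False, False),
    Record("medical", "oncology", True, False),
    Record("medical", "oncology", False, False),
    Record("legal", None, False, True),
    Record("legal", None, False, True),
    Record("trivia", None, False, True),
]


def reliability(records, **description):
    """Fraction of correct outputs among records matching a process description."""
    matching = [
        r for r in records
        if all(getattr(r, key) == value for key, value in description.items())
    ]
    if not matching:
        return None  # no track record at this level of description
    return sum(r.correct for r in matching) / len(matching)


# One and the same output (a cardiology answer given with patient history)
# falls under every one of these process descriptions.
LEVELS = {
    "LLM inference": {},
    "... on medical questions": {"domain": "medical"},
    "... on cardiology questions": {"domain": "medical", "specialty": "cardiology"},
    "... on cardiology questions with history": {
        "domain": "medical", "specialty": "cardiology", "has_history": True,
    },
}

for name, description in LEVELS.items():
    print(f"{name}: {reliability(LOG, **description)}")
```

Nothing in the numbers says which row is the right one for judging a given answer. That choice has to come from outside the calculation, which is exactly the gap the generality problem identifies.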


Testimony: Learning from Words

Testimony theory asks how we gain knowledge from the words of others. Most of what any individual knows comes from testimony—from teachers, books, news sources, conversations.

The classic debate concerns the role of reasons. Reductionists hold that testimony reduces to other sources: we’re justified in believing what others tell us only when we have independent reasons to think they’re reliable. Anti-reductionists hold that testimony is a basic source: we’re entitled to believe what we’re told unless we have specific reasons for doubt.

More recent work, particularly by Jennifer Lackey, has moved beyond this debate to ask more structural questions about testimony itself—what it requires from both speaker and hearer.

The Statement View

Lackey argues that what we learn from testimony is tied to the statement made, not the speaker’s internal states. On this view, a speaker can transmit knowledge even if they don’t believe what they’re saying, as long as their statement is reliably connected to the truth.

Her example is a creationist science teacher. Suppose a teacher personally believes in young-earth creationism but reliably teaches evolutionary biology because she’s a professional who follows the curriculum. Her students can gain knowledge of evolution from her testimony, even though she doesn’t believe what she’s teaching. What matters is that her statements are reliable, not that she personally endorses them.

Lackey’s full view is dualist, imposing conditions on both speaker and hearer—norms of reasonableness and epistemic propriety that go beyond mere statement reliability. But the statement view is the component most relevant for AI, since AI straightforwardly fails the mental-state conditions traditional views require. The interesting question is whether statement reliability can suffice when the source lacks belief entirely.

The Hearer’s Conditions

Lackey also emphasizes what’s required of the hearer. Testimony doesn’t produce knowledge automatically. The hearer needs to be paying attention, needs to understand what’s being said, needs to not have defeaters—reasons to doubt the testimony. Mere exposure to true statements isn’t enough.

This raises questions about epistemic responsibility: how much work must the hearer do to assess testimony? Some views require active evaluation. Others require only the absence of specific reasons for doubt. The difference matters for how we should approach AI outputs.

What Testimony Theory Offers

For AI outputs, testimony theory offers a framework that focuses on the reliability of statements rather than the mental states of the source. If what matters is whether the output is reliably connected to the truth—rather than whether the AI “knows” or “believes” or “understands”—then testimony theory provides resources for thinking about AI outputs as testimony-like, even if AI lacks the mental states we associate with human testifiers.

The creationist teacher is suggestive. Perhaps AI is relevantly similar: producing reliable statements without the internal endorsement that characterizes human belief.

The Problems It Faces

The central question is whether AI outputs count as testimony at all. Testimony, on most accounts, involves assertion—an act performed by an agent who takes responsibility for what they say. Does AI assert? Does it take responsibility?

Some philosophers argue that testimony requires interpersonal relations—trust, accountability, the possibility of being held responsible for falsehood. If testimony is essentially social in this way, AI outputs might not qualify, no matter how reliable they are. The framework may illuminate some aspects of AI outputs while going silent on others.


Inferentialism: Content as Inferential Role

Inferentialism, developed most fully by Robert Brandom, takes a different approach. Where reliabilism asks about processes and testimony theory asks about statements, inferentialism asks about conceptual content—what it means to grasp a concept at all.

The core claim: conceptual content is constituted by inferential role. What a concept means is determined by what follows from it and what it follows from. To grasp a concept is to have practical mastery of these inferential relations.

What does it mean to grasp the concept “red”? On one view, it means being able to apply the word to red things—having a reliable disposition to say “red” in the presence of red objects. Brandom argues this isn’t enough. A parrot trained to say “red” in the presence of red things doesn’t grasp the concept red.

What’s missing? The inferential connections. Someone who grasps the concept red understands that red things are colored, that red is incompatible with green in the same place at the same time, that red is more specific than colored, that calling something red commits you to certain consequences and is licensed by certain conditions. These inferential relations are what make it a concept at all.

Material Inference

Brandom distinguishes formal inference (logic) from material inference (content). The inference from “it’s raining” to “the streets will be wet” is materially valid—its validity depends on the content of the concepts involved, not on logical form. Material inferences are what give concepts their content.

This matters because it means conceptual understanding can’t be reduced to pattern-matching or reliable response. Producing outputs that follow the right patterns isn’t the same as grasping the concepts involved.

The Space of Reasons

Brandom draws on Sellars’s idea of “the space of reasons”—the domain of normative assessment, where claims are evaluated as justified or unjustified, where giving a reason commits you to its consequences. To have a concept is to be in the space of reasons: to be able to give and ask for reasons, to recognize when one claim supports or undermines another.

This is a high bar. It’s not enough to produce outputs that match the patterns of correct inference. One must be in the game of giving and asking for reasons—participating in a practice governed by norms.

What Inferentialism Offers

Inferentialism provides a framework for asking whether AI outputs constitute genuine conceptual activity or sophisticated mimicry. The question isn’t whether outputs are reliable or whether statements track truth. It’s whether the system has genuine mastery of inferential relations—whether it’s in the space of reasons at all.

This is not about the quality of outputs but about what kind of thing is producing them.

If AI lacks genuine inferential mastery, then its outputs aren’t assertions in Brandom’s sense—they’re sophisticated emissions. That matters for whether AI can be held accountable, for whether its outputs can serve as reasons in our own reasoning, for whether we’re in a discourse with AI or just receiving patterns. These aren’t academic distinctions. They affect how we should relate to AI outputs epistemically.

The Problems It Faces

The challenge is determining whether AI systems have inferential mastery. Language models often produce outputs that look inferentially connected—they note contradictions, draw consequences, respond to reasons given by users. But is this genuine inferential mastery or pattern-matching that mimics it?

This is hard to settle because the surface behavior might be identical. A parrot’s “red” sounds like a human’s “red.” A language model’s inference might be structurally identical to a human’s inference while being produced by a fundamentally different process. Inferentialism raises the question sharply but may not provide the tools to answer it.


Model Epistemology: Learning from Artifacts

The fourth framework comes from philosophy of science, particularly work on computer simulation and epistemic artifacts. Two bodies of work anchor it. Eric Winsberg’s work on simulation epistemology addresses how we gain knowledge from computational models. C. Thi Nguyen’s work on trust and epistemic artifacts addresses what it means to trust the tools we use for inquiry.

Between Theory and Experiment

Winsberg’s starting observation: simulations are everywhere in contemporary science. Climate models, economic forecasts, epidemiological projections—major decisions rest on simulation outputs. But simulations are strange epistemic objects. They’re not experiments (they don’t manipulate physical reality). They’re not pure theory (they involve discretization, approximation, and choices that go beyond the underlying equations). They’re something else.

Winsberg argues that simulations occupy a distinctive epistemic position. Like theory, they start from mathematical frameworks. Like experiment, they produce “data” that requires analysis and interpretation. But they’re not reducible to either.

The reliability of a simulation can’t be assessed by purely mathematical means (too complex) or by direct comparison with reality (often we’re simulating what we can’t observe directly). Assessment requires holistic judgment based on background knowledge, track record, robustness checks, and domain expertise.

Fictions in Models

One of Winsberg’s striking claims is that simulations often involve fictions—components known to be false and not intended to represent reality, but that contribute to overall reliability.

In multi-scale simulations where different physical theories apply at different scales, the transition regions sometimes use fictional hybrid entities—atoms with impossible properties, say, or forces that don’t correspond to anything physical—that make the mathematics tractable while representing nothing real. These fictions improve overall reliability while being locally unreliable. A simulation’s accuracy can’t be assessed component by component.

This challenges simple truth-tracking views. A model can be reliable overall while containing parts that aren’t reliable guides to anything.

Trust in Artifacts

Nguyen addresses what it means to trust epistemic artifacts—tools we use for inquiry. He distinguishes trust from mere reliance. I rely on my bookshelf to hold my books; I don’t feel betrayed when it collapses, just annoyed. But I trust my climbing rope in a deeper way—I adopt an unquestioning attitude toward it, letting its reliability drop out of conscious monitoring. When trusted things fail, we feel something sharper than disappointment.

The unquestioning attitude is Nguyen’s key concept. To trust something is to suspend deliberation about its reliability—to give it a direct pipeline into cognition and action. This creates efficiency (we can’t question everything) but also vulnerability (we’re exposed when trusted resources fail).

Functional Integration

Nguyen argues that we can trust objects, not just agents. What grounds the reaction of betrayal isn’t that the object made a commitment; it’s that we had integrated it into our functioning. The ground, the climbing rope, our memory—these become part of how we operate. When they fail, we feel alienated, not just disappointed.

This integration has costs. Nguyen describes agential gullibility—integrating external resources into our agency too hastily, without adequate consideration of whether they deserve the trust we’re extending.

What Model Epistemology Offers

For AI, model epistemology offers a frame that doesn’t require treating AI as a testifier or asking whether it has genuine conceptual mastery. AI becomes an epistemic artifact—a tool we use for inquiry, whose reliability must be assessed holistically, and toward which we may develop something like trust.

This frame might fit AI better than testimony. We don’t typically ask whether our simulations “believe” their outputs or whether our instruments “know” what they’re measuring. We ask whether they’re reliable in the relevant domain. Perhaps AI is more like a simulation than a speaker.

The Problems It Faces

The main challenge is calibration. Winsberg’s framework assumes domain experts can assess simulation reliability through background knowledge and holistic judgment. But most users of AI aren’t domain experts. They can’t assess whether a language model’s medical output is reliable the way a climate scientist can assess a climate model.

Nguyen’s framework raises the worry directly. If we’re integrating AI into our cognitive functioning—letting its outputs flow into our beliefs and actions—but lack the expertise to assess its reliability, we may be extending trust that isn’t warranted. Agential gullibility becomes a structural risk.


What These Frameworks Share

Reading across these four approaches, one theme dominates: all of them take seriously that we are epistemically dependent on processes, sources, and tools beyond ourselves, and most of them allow that this dependence doesn’t require the knower to have direct access to evidence or reasons.

And all of them face versions of the same challenge when applied to AI: specifying what kind of source AI is. Is it a process (reliabilism)? A testifier (testimony)? A participant in the space of reasons (inferentialism)? An epistemic artifact (model epistemology)? The frameworks offer different resources, but each requires deciding how AI fits its categories—and that decision isn’t obvious.


What Differs

The frameworks differ in what they ask.

Reliabilism focuses on truth-conduciveness. The question is whether the process tends to produce true outputs. Understanding, meaning, and conceptual content are beside the point.

Testimony theory focuses on the transmission of knowledge through statements. It raises questions about what’s required of sources and hearers, and whether AI can play the relevant roles.

Inferentialism focuses on what it means to have conceptual content at all. The question isn’t whether outputs are true but whether they’re the product of genuine conceptual activity.

Model epistemology focuses on epistemic tools and our relationship to them. The question isn’t about the artifact’s internal states but about how we should assess and relate to artifacts epistemically.

A system could be reliable without having conceptual mastery. It could have the surface features of testimony without being a genuine testifier. It could be an artifact we trust without our trust being well-calibrated. These tensions are part of what the subsequent posts will explore.


What Comes Next

The next four posts apply each framework to AI outputs and human-machine interaction. The question for each: what does this framework illuminate, and where does it go silent?

After that comes integration: bringing findings together to see where frameworks converge, where they conflict, and what the conflicts reveal. The first framework to be tested, though, is reliabilism: what happens when we ask whether a process we didn’t design and don’t fully understand can be “reliable” in the sense that matters?