Learning from Models

20 Jan, 2026

The frameworks so far share an assumption: AI outputs should be evaluated as coming from a source. Reliabilism asks whether the source is reliable. Testimony theory asks whether it can testify. Inferentialism asks whether it can assert.

Model epistemology reframes the question. Instead of treating AI as a source—something that knows, believes, or commits—it treats AI as an artifact. Something we build, validate, and use for epistemic purposes. The questions shift accordingly. Not "does AI know?" but "how was it built?" Not "can AI testify?" but "what was it validated against?" Not "does AI assert?" but "how should its outputs be integrated into reasoning?"

This reframe offers relief from certain puzzles. But it creates others.

A Third Mode

Eric Winsberg argues that computer simulations occupy a distinctive epistemic position. They are not pure theory—they do not derive conclusions from first principles alone. They are not experiments—they do not manipulate material reality. They are, in his phrase, "a new mode of science, between theory and experiment."

This matters because it determines how we assess reliability. Theoretical conclusions are assessed by checking derivations. Experimental results are assessed by checking procedures and material similarity to the target. Simulations require something else: background knowledge about modeling, domain expertise, and holistic judgment about whether the model captures what matters.

AI systems fit this third mode. They are not theories. They are not experiments. They are not testimony from agents with beliefs and intentions. They are artifacts—built, trained, deployed—whose outputs we use for epistemic purposes.

The artifact framing dissolves some puzzles from earlier posts. No testifier? Expected—artifacts do not testify. No commitment? Expected—artifacts do not assert. No accountability located in the system? The accountability shifts to builders and users, not the artifact itself.

But the framing raises its own questions.

Verification and Validation

Simulationists distinguish two forms of assessment.

Verification asks: does the simulation correctly implement the mathematical equations it is supposed to solve? This is about internal correctness—do the numerical methods converge to the true solutions of the equations?
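A toy case shows what verification looks like when there is an equation to check against. The sketch below is an illustration, not anything taken from Winsberg: the decay equation, the forward Euler method, and the function name euler_decay are all assumptions chosen for simplicity. Because the exact solution is known, we can ask whether the numerical error shrinks at the rate the method promises.

```python
import math

def euler_decay(n_steps: int, t_end: float = 1.0) -> float:
    """Integrate dy/dt = -y from y(0) = 1 to t_end with forward Euler.

    Hypothetical toy problem, chosen because its exact solution is exp(-t).
    """
    h = t_end / n_steps
    y = 1.0
    for _ in range(n_steps):
        y += h * (-y)
    return y

exact = math.exp(-1.0)

# Halve the step size twice and record the error each time.
errors = [abs(euler_decay(n) - exact) for n in (100, 200, 400)]

# Forward Euler is first-order accurate, so each halving of the step
# should roughly halve the error (ratios close to 2).
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors, ratios)
```

The check is purely internal: it says nothing about whether exponential decay is the right model for any real system. That second question belongs to validation.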

Validation asks: are the equations themselves appropriate for the target system? Does the simulation output match reality in the relevant respects?

These might seem independent. In practice, Winsberg argues, they are deeply entangled. You often cannot verify without validating, or validate without verifying. The models are too complex for purely mathematical assessment. The empirical data may not match the simulation's domain precisely.

Consider a climate modeler assessing ocean heat uptake. She cannot simply check the equations—parameterizations of sub-grid processes, coupling between component models, all involve approximations whose correctness depends on whether outputs match observed climate behavior. But she cannot purely validate against observations either—the observational record is incomplete, covers different timescales, measures different quantities. Judgment fills the gap, moving between mathematical analysis and empirical comparison, neither sufficient alone.

For AI, this entanglement is severe, and differently structured. There are no governing equations to verify. The system learns its weights from data, so there is no specified model whose solutions the computation is supposed to approximate. Verification in the traditional sense has nothing to check against. What remains is validation: does the output match expectations on held-out data? Does it perform on benchmarks? Does it generalize?

But the validation is narrower than users assume. Held-out data is typically drawn from the same distribution as the training data. Benchmarks test specific capabilities. Generalization beyond these is uncertain and often unknown.
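The narrowness can be made concrete with a deliberately small example. The following is a sketch under invented assumptions: the "system" is a straight line fitted to sin(x), the training range is [0, 1], and the shifted range is [2, 3]. None of this models a real AI system. It only shows how a held-out score can look excellent while saying nothing about inputs from elsewhere.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(slope, intercept, xs):
    """Error of the fitted line against the true target sin(x)."""
    return sum((slope * x + intercept - math.sin(x)) ** 2 for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Training and held-out inputs come from the same range (same distribution).
train_xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
held_out_xs = [rng.uniform(0.0, 1.0) for _ in range(50)]
# Shifted inputs come from a range the fit never saw.
shifted_xs = [rng.uniform(2.0, 3.0) for _ in range(50)]

slope, intercept = fit_line(train_xs, [math.sin(x) for x in train_xs])
held_out_mse = mean_squared_error(slope, intercept, held_out_xs)
shifted_mse = mean_squared_error(slope, intercept, shifted_xs)
print(f"held-out MSE: {held_out_mse:.5f}, shifted MSE: {shifted_mse:.5f}")
```

On the held-out points the line tracks the target closely, because sin(x) is nearly linear on that range. On the shifted points the same line is badly wrong, and nothing in the held-out score warned of it.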

There is a further asymmetry. Winsberg notes that experiments are epistemologically prior: simulation knowledge depends on experimental knowledge, not vice versa (though in mature science the relationship involves mutual dependence—simulations guide experimental design even as experiments ground simulations). For AI, what plays this grounding role? Training data, presumably. Human annotations. The quality of whatever the system learned from. Most users have no access to this ground and no way to assess it.

Background Knowledge and Expertise

Winsberg's central claim about simulation reliability:

"If the quality of the background knowledge is high (and it is put to use skillfully), the resulting knowledge—whether from simulations or experiments—will be reliable."

The key phrase is "put to use skillfully." Reliability assessment requires domain experts who understand the model's assumptions, scope limits, and failure modes. For climate models, this means climate scientists who can interrogate the physics, probe the parameterizations, identify where assumptions hold and where they break.

For AI, most users lack this background knowledge. They cannot assess training data quality. They do not understand architecture choices or optimization targets. They cannot identify scope limits because those limits are not documented—and may not be known even to the builders.

Winsberg identifies a "palette" of techniques scientists use to assess simulations: comparison with prior data, comparison with observations, theoretical analysis of model components, robustness checks across different modeling choices, track record in engineering applications.

Some of these transfer to AI. Benchmarks provide comparison with prior data. Red-teaming probes for failures. Robustness checks test whether outputs hold under perturbation. But theoretical analysis of components does not transfer—there is no explicit theory to analyze, only learned weights. Comparison with observations is limited to domains where ground truth is available.
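A robustness check of the kind mentioned above can be sketched in a few lines. Everything here is hypothetical: toy_model is a stand-in threshold rule, not a trained system, and the noise level and trial count are arbitrary. The idea is only to ask how often the output stays the same when the input is nudged slightly.

```python
import random

def toy_model(x: float) -> int:
    """Hypothetical stand-in for a trained system: a fixed threshold rule."""
    return 1 if x > 0.5 else 0

def agreement_under_perturbation(model, x, noise=0.05, trials=200, seed=0):
    """Fraction of small random perturbations of x that leave the output unchanged."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    baseline = model(x)
    unchanged = sum(
        model(x + rng.uniform(-noise, noise)) == baseline
        for _ in range(trials)
    )
    return unchanged / trials

# An input far from the decision boundary versus one close to it.
far_from_boundary = agreement_under_perturbation(toy_model, 0.9)
near_boundary = agreement_under_perturbation(toy_model, 0.51)
print(far_from_boundary, near_boundary)
```

The output for the first input survives every perturbation, while the output for the second flips often. A check like this locates fragile regions, but it does not say which answer is correct there. That still requires ground truth.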

The honest conclusion: assessment remains holistic and requires judgment. There is no checklist. The expertise required is substantial, and most users do not have it.

The implication is uncomfortable. If Winsberg's account is right, then users who lack background knowledge cannot reliably assess whether AI outputs deserve trust. Most users lack this knowledge. Much AI use is therefore epistemically unjustified in ways users do not recognize: not because the outputs are wrong, but because users have no grounded way to tell whether they are right. The framework identifies this problem. It does not tell us what to do about it.

Fictions

Winsberg's most striking claim is that simulations often contain fictions—components "offered with no promises of a broad domain of reliability."

His examples are vivid. In multi-scale simulations, "hand-shaking" regions between different theoretical frameworks use hybrid entities—silogen atoms, silicon-hydrogen hybrids—that do not exist and are not meant to represent anything real. Artificial viscosity adds non-physical viscosity to stabilize simulations. These are not idealizations (simplifications that approximate truth). They are deliberately fictional—known to be false, not intended to represent reality, but essential to making the simulation work.

Yet simulations containing such fictions can be reliable. The fictions contribute to overall reliability even though they are not reliable guides to the phenomena in their local domain.

Do AI systems contain analogous fictions?

Consider: RLHF fine-tuning induces behaviors without understanding. The system produces helpful-sounding outputs not because it grasps helpfulness but because such outputs were reinforced. Synthetic training data may not represent any real distribution. Architectural choices—attention mechanisms, layer structures—do not correspond to anything in the target domain.

These might be fictions in Winsberg's sense: components that contribute to overall reliability while being, in some sense, false or ungrounded. Whether they qualify as fictions in his precise sense is a harder question. Winsberg's fictions are deliberately introduced by modelers who understand their role—silogen atoms are chosen, not discovered. No one decides to add RLHF-induced agreeableness the way a physicist decides to add artificial viscosity. AI's analogous components are emergent rather than intentional, artifacts of optimization rather than explicit modeling choices. The parallel is suggestive but imperfect.

There is a further concern. When optimization targets include human approval—when the system is fine-tuned to produce outputs that receive positive feedback—the optimization pressure may diverge from epistemic reliability. A system optimized for approval produces outputs that seem helpful and feel right while lacking grounding. The artifact framing asks: what was this built to do? For AI, the answer may be "produce outputs humans affirm," which is not the same as "produce outputs humans should believe."

Users cannot see these fictions. They cannot distinguish which outputs rest on reliable components and which rest on fictions that happen to work. The system's reliability is holistic, but the user's access is piecemeal.

Trust as Integration

C. Thi Nguyen offers a different angle on our relationship to epistemic artifacts. His concern is not reliability assessment but trust—and what trust actually involves.

The standard view distinguishes reliance (depending on something) from trust (depending in a normatively loaded way, where betrayal is possible). On this view, we can rely on objects but only trust agents.

Nguyen rejects this. Trust, he argues, is an unquestioning attitude:

"What it is to trust, in this sense, is not simply to rely on something, but to rely on it unquestioningly. It is to rely on a resource while suspending deliberation over its reliability."

To trust is to let something inside—to give it a direct pipeline into cognition and action.

This attitude has a two-tiered structure. First-order: I accept what the resource delivers without deliberation. Second-order: I deflect questioning about this acceptance. Trust has inertia. It resists being disrupted by minor considerations. Only significant evidence triggers reconsideration.

We can hold this attitude toward objects, not just agents. We trust the ground when we walk. We trust our climbing rope. We trust our smartphone. And when these fail—when the ground gives way in an earthquake, when the rope frays, when the phone betrays our expectations—we feel not just disappointment but something closer to betrayal.

"There is something about being betrayed by the ground underneath you that feels like the ultimate treachery."

This is not metaphor. The phenomenology is real. When we have integrated something into our functioning and it fails, we experience alienation—the loss of something that had become part of us.

Nguyen draws a parallel to self-trust. When my past self reached a conclusion, my present self typically does not re-deliberate. I accept the remembered conclusion as having conclusive force, even if I do not recall the evidence. This is how memory works. And this is how we treat trusted external resources: their outputs enter directly into our cognitive network, the way our own prior conclusions do.

When users accept AI outputs without deliberation—when the output enters directly into belief and action—they have adopted the unquestioning attitude. They have integrated AI into their cognitive functioning. The parallel to memory is not incidental. We are extending ourselves with tools we trust as we trust our own past judgments.

Efficiency and Vulnerability

Integration creates efficiency. I cannot constantly monitor everything I rely on. Trust lets me function quickly—outsourcing cognition to resources I do not have to second-guess.

But integration creates what Nguyen calls "exquisite vulnerability." When I have welded open pipelines from external resources into my cognition, errors propagate easily. Tightly integrated systems do not fail well.

Nguyen identifies a distinctive failure mode: agential gullibility.

"Many of our relationships to emerging technologies—search algorithms, smartphones, social media networks—are marked by such agential gullibility."

Agential gullibility is integrating external resources into one's agency too readily, without adequate reflection on whether they deserve trust.

For AI, the concern is pointed. Users integrate AI into cognition and action without understanding how it was built, without knowing its validation scope, without recognizing its fictions, without calibrating trust to domain. This is not irrational—we must trust many things to function. But the calibration may be wrong. And once trust is established, its inertia resists revision.

There is something strange about integrating an artifact this opaque—optimized for approval, shaped by unknown training data, with unknown scope limits—into our cognitive functioning as readily as we integrate our own memory. We are making it part of ourselves without knowing what it is.

The Responsiveness Problem

Climate models do not talk back. They take inputs and produce outputs. The epistemic relationship is one-directional.

AI systems respond to queries, adjust to context, produce outputs that resemble dialogue. This responsiveness is why source-framings seemed apt. It feels like conversation, not instrument-reading.

The artifact framing captures construction and validation. It may not fully capture the interactive dimension—the sense that we are in conversation with the artifact rather than merely using it.

This gap is not minor. The responsiveness is precisely what motivated the earlier frameworks. Testimony theory applied because AI outputs arrive in the form of statements offered in response to questions—the surface grammar of testimonial exchange. Inferentialism applied because AI outputs participate in something resembling the game of giving and asking for reasons—challenges met with apparent justifications, inconsistencies with apparent corrections. The artifact framing dissolves these puzzles by denying the premise: there is no testifier, no asserter, only a tool. But if the framing cannot account for why the tool so persistently mimics a source, something remains unexplained.

Two positions seem available. The first: responsiveness is sophisticated mimicry that changes nothing epistemically. The artifact produces source-like outputs, but this is a feature of the interface, not the epistemology. A chatbot is still an artifact, however conversational its outputs. What matters is construction and validation, not surface presentation. Users are misled by the interface into treating tools as sources.

The second: responsiveness indicates something the artifact framing does not fully reach. Artifacts traditionally do not adjust to challenge, do not produce contextually sensitive justifications, do not engage in what resembles dialogue. AI occupies genuinely novel territory—more responsive than instruments, less accountable than asserters, requiring an epistemology we do not yet have.

I find the first position more parsimonious. But the second takes seriously the phenomenology of interaction: engaging with AI feels different from reading a thermometer, in ways that may be epistemically significant. If the artifact framing is adequate, then model epistemology provides what we need: attend to construction, validation, and the dangers of agential gullibility. If it is inadequate, we face a harder problem: a new kind of epistemic object that borrows features from both sources and artifacts, fitting neither category cleanly.

What Model Epistemology Does Not Settle

The framework raises the right questions. It does not provide answers for AI specifically.

It does not tell us how to validate AI when traditional methods—comparison to observations, theoretical analysis of components—do not transfer cleanly.

It does not tell us how to calibrate trust when opacity is deep and expertise unavailable.

It does not tell us whether "artifact" fully captures what AI is, given its responsiveness.

It does not tell us what appropriate trust looks like when we cannot assess the background knowledge that would ground it.

What the framework does provide is a reframe. Stop asking whether AI is a good testifier or a reliable asserter. Ask instead: what kind of artifact is this? How was it built? What was it validated against? What are we integrating into our agency when we trust it?

These questions do not have easy answers. But they are the right questions.

What Comes Next

Four frameworks have now been applied.

Reliabilism asks whether the process is reliable. It illuminates the conditions for warranted belief but struggles with the generality problem and calibration.

Testimony theory asks whether AI can testify. It illuminates accountability but finds no testifier to be accountable.

Inferentialism asks whether AI can assert. It illuminates commitment and participation but cannot settle whether functional behavior suffices.

Model epistemology asks how we should use this artifact. It illuminates validation, integration, and trust but goes quiet on responsiveness and the expertise gap.

Each framework captures something. None captures everything. But they converge on one point: users need something they do not have. Expertise to assess reliability. A testifier to hold accountable. A committed asserter who can be challenged. Validation they can actually evaluate. The next post asks whether this convergence reveals a genuine gap in our epistemic practices—a new kind of dependence we are not equipped for—or whether the frameworks are asking for more than knowledge has ever required.