🌻 A simple measure of the goodness of fit of a causal theory to a text corpus

Abstract#

Suppose an evaluation team has a corpus of interviews and progress reports, plus (at least) two candidate theories of change (ToCs): an original one and a revised one. A practical question is: which ToC better fits the narrative evidence?

With almost-automated causal coding as described in (Powell & Cabral, 2025; Powell et al., 2025), we can turn that into a simple set of coverage-style diagnostics: how much of the coded causal evidence can be expressed in the vocabulary of each ToC.

See also: Intro; Minimalist coding for causal mapping; Magnetisation.

Intended audience: evaluators and applied researchers comparing candidate ToCs (or other causal frameworks) against narrative evidence, who want a transparent “fit” diagnostic that does not pretend to be causal inference.

Unique contribution (what this paper adds): a definition of coverage over coded causal evidence rather than over text volume; three concrete measures (link, citation, and source coverage); and a simple protocol for comparing candidate ToCs against the same coded corpus.

1. The core idea: “coverage” of evidence by a codebook#

In ordinary QDA (thematic coding), researchers often look at how widely a codebook or set of themes is instantiated across a dataset: which codes appear, how frequently, and whether adding more data still yields new codes (saturation). Counting is not the whole of qualitative analysis, but it is a common, explicitly discussed support for judgement and transparency (Saldaña, 2015). Critiques of turning saturation into a mechanical rule-of-thumb are also well known (Braun & Clarke, 2019).

Our twist is: because we are coding causal links (not just themes), we can define coverage over causal evidence rather than over text volume.

2. Minimal definitions#

A source is one document in the corpus (an interview, progress report, etc.). A coded causal link is a directed claim, cause label → effect label, extracted from a source, optionally carrying a citation count (how many quotes support it). L is the set of links produced by open coding; C is a candidate ToC codebook, i.e. the set of factor labels the theory uses. A recode maps raw labels into C; labels that map to nothing are evidence the ToC cannot name.
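To make these definitions concrete, here is a minimal Python sketch. All names in it (Link, toc_codebook, label_map) are illustrative assumptions of ours, not part of any published tooling; a hard recode is shown, with None marking labels the ToC cannot express:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    """One coded causal claim: a cause -> effect pair found in a source."""
    cause: str           # raw factor label for the cause
    effect: str          # raw factor label for the effect
    source_id: str       # which interview / report the link came from
    citations: int = 1   # number of quotes supporting this link (Citation_Count)

# A candidate ToC codebook C: the factor vocabulary of the theory.
toc_codebook = {"training", "knowledge", "practice change", "income"}

# A hard recode from raw open-coding labels to ToC labels.
# None (or absence from the dict) marks a label the ToC cannot express.
label_map = {
    "attended workshop": "training",
    "learned new methods": "knowledge",
    "changed farming practice": "practice change",
    "higher income": "income",
    "drought": None,   # evidence the ToC cannot name
}
```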

3. Coverage measures you can compute#

Assume we have a baseline set of coded links L (from open coding), and a ToC codebook C (as magnets / targets).

3.1 Link coverage#

Link coverage = proportion of coded links whose endpoints can be expressed in the ToC vocabulary.

Two variants (pick one and state it explicitly):

  - Both-ends: a link counts as covered only if both its cause and its effect map into C (strict).
  - One-end: a link counts as covered if at least one endpoint maps into C (lenient).
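Here is a sketch of both variants, reusing the Link and label_map structures from the definitions above; link_coverage and its variant argument are our own names:

```python
def link_coverage(links, label_map, variant="both-ends"):
    """Proportion of coded links expressible in the ToC vocabulary.

    variant="both-ends": cause AND effect both map into the codebook.
    variant="one-end":   at least one endpoint maps.
    """
    if not links:
        return 0.0

    def covered(link):
        cause_ok = label_map.get(link.cause) is not None
        effect_ok = label_map.get(link.effect) is not None
        if variant == "both-ends":
            return cause_ok and effect_ok
        return cause_ok or effect_ok

    return sum(covered(l) for l in links) / len(links)
```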

3.2 Citation coverage (volume)#

If your dataset has multiple citations per link bundle (or you have a Citation_Count column), compute coverage over citations, not just distinct links: citation coverage = the proportion of total citation count attached to covered links.

This answers: “what proportion of the evidence volume is expressible in this ToC?”
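Continuing the same sketch, a citation-weighted version (shown with the both-ends rule; swap in the one-end test if that is the variant you declared):

```python
def citation_coverage(links, label_map):
    """Proportion of citation volume attached to covered links.

    Each link is weighted by its citation count, so a link supported by
    ten quotes counts ten times as much as a one-off mention.
    """
    total = sum(l.citations for l in links)
    if total == 0:
        return 0.0
    covered = sum(
        l.citations for l in links
        if label_map.get(l.cause) is not None
        and label_map.get(l.effect) is not None
    )
    return covered / total
```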

3.3 Source coverage (breadth)#

Source coverage = number (or proportion) of sources for which at least k links are covered by the ToC vocabulary.

This answers: “does this ToC vocabulary work across many sources, or only a small subset?”
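Continuing the sketch once more, with the threshold k as an explicit parameter:

```python
from collections import Counter

def source_coverage(links, label_map, k=1):
    """Proportion of sources with at least k covered links."""
    covered_per_source = Counter()
    all_sources = set()
    for l in links:
        all_sources.add(l.source_id)
        if (label_map.get(l.cause) is not None
                and label_map.get(l.effect) is not None):
            covered_per_source[l.source_id] += 1
    if not all_sources:
        return 0.0
    return sum(covered_per_source[s] >= k for s in all_sources) / len(all_sources)
```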

4. Protocol (how to use it)#

For each candidate ToC:

  1. Build a ToC codebook C (ideally keep candidate codebooks similar in size and specificity; otherwise you are partly measuring codebook granularity rather than fit).
  2. Map raw labels to C (hard recode or soft recode).
  3. Compute:
     - link coverage (both-ends and/or one-end),
     - citation coverage (if available),
     - source coverage (with an explicit k).
  4. Inspect the leftovers (uncovered labels and links): what important evidence is the ToC not even able to name? A minimal end-to-end sketch of steps 3 and 4 follows below.
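As promised after the list above, a hypothetical end-to-end driver that runs the three measures for each candidate recode and prints the leftovers, reusing the toy structures from section 2:

```python
def compare_tocs(links, candidate_maps, k=1):
    """Print the coverage diagnostics for each candidate ToC recode."""
    for name, lmap in candidate_maps.items():
        # Step 4: the leftovers -- raw labels this ToC cannot name.
        leftovers = sorted({lab for l in links for lab in (l.cause, l.effect)
                            if lmap.get(lab) is None})
        print(f"--- {name} ---")
        print("  link coverage (both-ends):", round(link_coverage(links, lmap, "both-ends"), 2))
        print("  link coverage (one-end):  ", round(link_coverage(links, lmap, "one-end"), 2))
        print("  citation coverage:        ", round(citation_coverage(links, lmap), 2))
        print(f"  source coverage (k={k}):   ", round(source_coverage(links, lmap, k), 2))
        print("  uncovered labels:", leftovers)

links = [
    Link("attended workshop", "learned new methods", "s1", citations=3),
    Link("learned new methods", "changed farming practice", "s1"),
    Link("drought", "crop failure", "s2"),
]
compare_tocs(links, {"original ToC": label_map})
```

On this toy data, the candidate covers 2 of 3 links both-ends (0.67), 4 of 5 citations (0.8), and 1 of 2 sources at k = 1 (0.5), with drought and crop failure as the uncovered leftovers.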

5. How this relates to “coverage” in mainstream qualitative methods#

The word “coverage” is used in a few nearby ways in qualitative methodology: the proportion of a source's text that has been coded at all (the per-document “coverage” statistic reported by QDA packages such as NVivo); how widely a codebook is instantiated across a dataset (which codes appear, and how often); and saturation, i.e. whether adding more data still yields new codes (Braun & Clarke, 2019).

What we are doing here is closer to: how much of the coded evidence can be expressed in the language of a candidate theory, which is a “fit” diagnostic rather than a claim about truth.

6. Caveats#

Coverage is a fit diagnostic, not causal inference: high coverage means the ToC can name the coded evidence, not that the ToC is true. It also rewards large, vague codebooks, which is why the protocol asks you to keep candidate codebooks comparable in size and specificity. And as in ordinary QDA, counting supports judgement rather than replacing it (Saldaña, 2015): the uncovered leftovers are often more informative than the headline percentage.

References

Braun & Clarke (2019). To Saturate or Not to Saturate? Questioning Data Saturation as a Useful Concept for Thematic Analysis and Sample-Size Rationales. Qualitative Research in Sport, Exercise and Health. https://doi.org/10.1080/2159676X.2019.1704846.

Powell, Cabral & Mishan (2025). A Workflow for Collecting and Understanding Stories at Scale, Supported by Artificial Intelligence. Evaluation. Sage. https://doi.org/10.1177/13563890251328640.

Powell & Cabral (2025). AI-assisted Causal Mapping: A Validation Study. International Journal of Social Research Methodology. https://www.tandfonline.com/doi/abs/10.1080/13645579.2025.2591157.

Saldaña (2015). The Coding Manual for Qualitative Researchers. Sage.