
Evidence for Doubting the Evidence?
Brent M. Kious

Clinical research is difficult. It confronts massive heterogeneity in its participants, who are real people bumping around the world in complex ways. Clinical research in psychology is doubly difficult, since it tries systematically to study conditions that are inherently difficult to systematize. In their thoughtful and closely argued article, Truijens et al. (2021) emphasize these difficulties, and describe a novel challenge to psychotherapy research: that the support for many evidence-based therapies (EBTs) is weaker than previously recognized because it relies on patient-reported outcome measures (PROMs) that often yield invalid results. Despite the initial plausibility of their position, however, their conclusion is overstated. It is not, itself, rooted in the evidence.

Here is what I take to be their argument. The evidence supporting many psychotherapeutic interventions is based partly on PROMs. But providing responses to PROMs requires interpretation: study participants have first to interpret what question is being asked, and then to interpret their experiences to make them responsive to that question. To further complicate matters, study participants may have reactions to the process of data collection itself, which may skew their responses one way or another. These "hermeneutic" processes—or more simply, "measurement effects"—can undermine the validity of data collection, where "validity" is "the legitimacy of conclusions based on premises" (p. 124). To bolster this point, Truijens et al. provide three cases, each illustrating a different measurement effect that seems to make the study participant's responses invalid. They conclude that, insofar as this is a general problem in psychotherapy research, we should question the evidence for EBTs.

Truijens et al. should be commended for identifying an intriguing complication in psychotherapy trials, which they rightly suggest is something to which researchers should attend. They have also indirectly illustrated an under-recognized problem in mental-health research generally: People know they are the subjects of research, and this knowledge, arguably more than in other fields, can change how they conceptualize their symptoms in ways that may have dramatic effects on study results.

There are, though, at least two important reasons to doubt their premise that measurement effects typically invalidate individual PROMs data. First, they give us little reason to think their three cases are representative; it remains possible that the measurement effects exemplified are rare. Second, it is not clear that the cases show what the authors think they do. True, the participants themselves did not believe their responses to the study instruments were valid, and Truijens et al. accept this. But how do the participants know? It is possible that, despite their thoughts about how their ratings were biased, they just had poor insight, and their ratings still reflected exactly what they were intended to measure. Indeed, Truijens et al. allude to something like this in the case of John, who seemed to show improvement even though he did not think he was improving (p. 121).

Even granting that the cases do exhibit measurement effects that problematized the individual participants' data, Truijens et al. still do not succeed in casting doubt on EBTs. This, too, is for two reasons. First, as the authors note, PROMs are generally validated measures. This means researchers have previously analyzed their results and been satisfied that, in aggregate—and despite some individual variation—they measure what they purport to measure. Even if PROMs are frequently beset by measurement effects, their validation equally involved interactions with real patients in complex settings where measurement effects could have occurred (Frost et al., 2007). In short, PROMs are validated in spite of possibly idiosyncratic responses from patients.

Truijens et al. seem aware of this objection: "[T]he validation process of measures did cover administration processes. … However, the measure as such is not capable of distinguishing between 'accurate' reporting of experienced symptoms and the effect that measurement has on these experiences, nor to indicate empirically whether it regards random or systematic noise. … [T]he validation of a measure is no guarantee that it accurately measures the intended construct in a concrete situation" (p. 125). This response does not address the objection, however. It is true that PROMs, though validated, are usually not able...
