Elsevier

Consciousness and Cognition

Volume 28, August 2014, Pages 126-140
A comparison between a visual analogue scale and a four point scale as measures of conscious experience of motion

https://doi.org/10.1016/j.concog.2014.06.012

Highlights

  • A visual analogue scale is examined as a measure of motion experience.

  • The visual analogue scale and a four-point scale had the same type 2 sensitivity.

  • The visual analogue scale recorded a greater amount of information.

  • The visual analogue scale and the four-point scale were both internally consistent.

Abstract

Can participants make use of the large number of response alternatives of visual analogue scales (VAS) when reporting their subjective experience of motion? In a new paradigm, participants adjusted a comparison according to random dot kinematograms with the direction of motion varying between 0° and 360°. After each discrimination response, they reported how clearly they experienced the global motion either using a VAS or a discrete scale with four scale steps. We observed that both scales were internally consistent and were used gradually. The visual analogue scale was more efficient in predicting discrimination error but this effect was mediated by longer report times and was no longer observed when the VAS was discretized into four bins. These observations are consistent with the interpretation that VAS and discrete scales are associated with a comparable degree of metacognitive sensitivity, although the VAS provides a greater amount of information.

Introduction

The lack of an established measurement for conscious experience is a key challenge to the progress of an empirical science of consciousness (Chalmers, 1998). The choice of an adequate measure is delicate because different theoretical perspectives on consciousness can imply different measurements. Some theorists are critical of the use of subjective reports because they assume participants might have conscious experiences that they are unable to report (Block, 2005) or that they do not report because their criterion is too conservative (Hannula, Simons, & Cohen, 2005). In contrast, proponents of higher-order thought theories often argue that subjective reports are more valid than objective measures because unconscious processes might drive objective performance as well (Dienes, 2004, Lau, 2008). However, as subjective experiences cannot be observed from the third-person point of view (Jackson, 1982, Nagel, 1974), it is impossible to test empirically whether subjective measures of consciousness leave out conscious experiences that observers are unable to report, or whether objective measures suggest falsely that performance in a task is conscious. Nevertheless, some researchers decide a priori to adopt a perspective that requires the use of subjective reports, either because they endorse a higher-order perspective on consciousness (Cleeremans, 2011, Lau and Rosenthal, 2011), or because they consider subjective reports themselves as the subject of their scientific investigations (Dennett, 2003, Dennett, 2007). If they do so, the empirical question arises of how a scale needs to be designed, given the metacognitive abilities of humans, to obtain as much information from participants as possible.

Subjective scales designed to measure conscious experience are composed of at least two components: (i) the question participants are instructed to answer and (ii) the way participants deliver their subjective report. Concerning the question, we proposed a classification of subjective scales based on the event in the world that subjective reports refer to, specifically whether subjective reports refer to the stimulus or to the discrimination response (Zehetleitner & Rausch, 2013). Examples of stimulus-related scales would be to ask participants how visible the stimulus was (Sergent & Dehaene, 2004), to rate clarity of the response defining feature (Zehetleitner & Rausch, 2013), or to report both the experience of specific features as well as feelings of something being shown (Ramsøy & Overgaard, 2004, p. 12). Decision-related scales may ask participants to report how confident they are about the preceding objective task response (Peirce & Jastrow, 1884), whether they attribute their objective task response to guessing, intuition, memory, or knowledge (Dienes & Scott, 2005), how much money they would wager on the accuracy of the objective task response (Persaud, McLeod, & Cowey, 2007), or whether they experienced a “feeling-of-warmth” with respect to the previous task response (Wierzchoń, Asanowicz, Paulewicz, & Cleeremans, 2012).

Several studies compared subjective scales with different questions participants were asked to respond to: Dienes and Seth (2010) reported that wagering was biased by the participants’ risk-aversion, but there were no differences between confidence and wagering after the possibility of loss had been eliminated from wagering. Sandberg, Timmermans, Overgaard, and Cleeremans (2010) observed in a masked object identification task that the perceptual awareness scale (PAS) predicted task performance more efficiently than confidence and wagering did. In an artificial grammar task, it was reported that confidence ratings predicted objective performance more efficiently than ratings of awareness of the artificial grammar rule (Wierzchoń et al., 2012). Szczepanowski, Traczyk, Wierzchoń, and Cleeremans (2013) reported that confidence ratings were more closely correlated with performance than ratings of subjective awareness and wagering, although a recent reanalysis of the data found no significant differences between subjective awareness and confidence (Sandberg, Bibby, & Overgaard, 2013). Finally, subjective reports of visual experience were less strongly correlated with objective performance in masked orientation discrimination tasks or random motion discrimination tasks, but no substantial differences were observed in a masked form discrimination task. In addition, confidence ratings were associated with more liberal thresholds than reports of visual experience across all three visual tasks, and confidence and wagering were more strongly correlated with each other than with reports of visual experience (Zehetleitner & Rausch, 2013).

Four different lines of interpretation for empirical differences between subjective scales with different questions have been suggested: First, it has been assumed (at least for the purpose of a comparison between measurements) that different kinds of subjective reports are equal except for the sensitivity (Dienes & Seth, 2010) and the exhaustiveness of the scale (Sandberg et al., 2010). The second suggestion was that different scales might encourage participants to access their conscious contents in different ways: In introspective judgments, participants just directly report their conscious experiences as they have them; in metacognitive judgments however, participants use their conscious experiences to make more complex cognitive judgments about processes engaged in the objective task (Overgaard & Sandberg, 2012). Third, it has been proposed that different subjective scales might alter the quality of conscious experience itself: Some scales such as wagering might be more motivating for the participants, making them more attentive, and thus cause participants to experience the stimulus more distinctively (Szczepanowski et al., 2013). Finally, it was suggested that different questions may relate to different processes during the task: Stimulus-related reports may be informed by processes involved in stimulus representation, and decision-related reports by processes involved in decision making (Zehetleitner & Rausch, 2013).

The present study investigated the response format as the second component of subjective scales, specifically whether responses to the same question are more conveniently recorded by a discrete scale or a visual analogue scale (VAS). From the viewpoint of information theory (Shannon, 1948), subjective reports should be collected with a maximum number of scale steps because the maximal amount of information recorded by one report is bounded by the number of options provided to the participant. Specifically, as the maximum information is computed as the binary logarithm of the number of options, a binary scale records 1 bit of information in one trial, 4 scale points 2 bits, 8 scale points 3 bits, etc. The information conveyed by a VAS, where the response is selected along a continuum, would theoretically depend on the number of scale positions differentiated by the equipment (between 2^8 and 2^16 with custom joysticks), but is in practice limited by the number of positions that participants can differentiate on the continuum, which classical studies estimated to be at least 10 positions (Hake & Garner, 1951).
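The bound described above can be made concrete with a minimal sketch. The function name `max_bits` is introduced here for illustration only, and the calculation assumes that all response options are equally usable, which is the best case for any scale.

```python
import math


def max_bits(n_options: int) -> float:
    """Upper bound on the information (in bits) carried by a single report
    when the participant chooses among n_options response alternatives."""
    if n_options < 1:
        raise ValueError("n_options must be a positive integer")
    return math.log2(n_options)


# Scale sizes mentioned in the text: binary, 4-point, 8-point, and the
# roughly 10 positions that observers can differentiate on a continuum.
for n in (2, 4, 8, 10):
    print(f"{n:>2} options -> {max_bits(n):.2f} bits")
```

On this bound, ten distinguishable positions correspond to somewhat more than 3 bits per report, so the practical advantage of a VAS over a four-point scale would be at best a modest number of additional bits per trial.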

From the viewpoint of signal detection theory (SDT) (Green and Swets, 1966, Macmillan and Creelman, 2005, Wickens, 2002) however, the use of a high number of scale steps is only feasible if two requirements are met: (i) participants need to be able to maintain a sufficient number of criteria, and (ii) participants’ type 2 sensitivity (Galvin, Podd, Drga, & Whitmore, 2003), i.e. their degree of access to their own task performance, should not be impaired by a great number of options. The recent literature has raised doubts about both requirements for high-precision usage of VASs: Overgaard, Rote, Mouridsen, and Ramsøy (2006) proposed that VASs tend to be used like binary judgments: As only the extreme ends of the scale are labelled, reports may be dragged towards the extremes, reducing the number of criteria participants effectively use to two. In addition, they argued that, as there are no definitions for each experience along the continuum of the VAS, a VAS could confuse participants and result in less accurate reports.

Only one study so far has empirically compared a VAS and a discrete scale: Wierzchoń et al. (2012) compared subjective reports of rule awareness with four scale steps against a VAS of rule awareness in a 2AFC artificial grammar classification task and observed a tendency that the four-point scale predicted performance more efficiently than the VAS (irrespective of whether the VAS was binned into four scale steps or not), although the statistics were not significant. Wierzchoń et al. (2012) also found that rule awareness measured by a VAS was worse than wagering and feeling-of-warmth both measured by a discrete scale, although there was no significant difference between discrete rule awareness and these two scales; however, these findings are hard to interpret because the content of the scales and the response format are confounded in these comparisons. In domains other than awareness, VASs have been demonstrated to be adequate measurements for state anxiety (Davey, Barret, Butow, & Deeks, 2007), vertigo (Dannenbaum, Chilingaryan, & Fung, 2011), quality of life (de Boer et al., 2004), group cohesiveness (Hornsey, Olsen, Barlow, & Oei, 2012), mood (Kontou, Thomas, & Lincoln, 2012), thermal perception (Leon, Koscheyev, & Stone, 2008), and depression (Rampling et al., 2012), indicated by a strong correlation with an established multi-item questionnaire or by a high reliability of VASs, suggesting that participants are in principle able to make meaningful reports using VASs (although it should be noted that these studies did not compare VASs and discrete scales directly). As VASs were shown to be adequate measurements for a considerable number of different psychological constructs, it is reasonable to hypothesize that a VAS might be a convenient measurement of visual experience as well. Apart from that, it was argued that a VAS may induce more careful responses because it signals to the participant that an exact response is important, while a discrete scale might convey the message that a rough answer is sufficient (Funke & Reips, 2012).

In summary, although VASs are in principle suited to record a large amount of information, it is an open empirical question whether participants are able to use a VAS with a sufficient number of criteria and without loss of type 2 sensitivity, so that employing a VAS is feasible.

While the study by Wierzchoń et al. (2012) contrasted subjective reports and objective performance in a 2AFC discrimination task, the recent development of continuous discrimination tasks (Bays and Husain, 2008, Zhang and Luck, 2008, Zokaei et al., 2011) offers the opportunity to conduct a more powerful test of the amount of information recorded by a VAS. For example, in a typical 2AFC task, participants might be instructed to report whether a previously presented bar is tilted towards left or right. The set of possible stimulus features is two (left or right) and so is the set of possible responses. This paradigm can be changed into a continuous discrimination task by allowing the bar to have any possible orientation and asking the participant to indicate the orientation of the bar via a response set of the same cardinality. Errors, defined as the deviation between stimulus and response, are binary in a 2AFC paradigm: either the response corresponds to the stimulus (i.e., is “correct”), or it does not (i.e., is “incorrect”). For continuous tasks however, the deviation between stimulus and response is a continuous variable: When for instance the stimulus consists of a vertical bar, the response may deviate from the true orientation by any angle between 0° and 90°.

The number of task response alternatives is relevant for comparing different scales because the information recorded by a scale depends on the entropy of metacognition, which in turn depends on the entropy of discrimination performance: When there are only two levels of accuracy, i.e. “correct” and “incorrect”, there will be a comparably small number of metacognitive states, and consequently, a smaller number of scale steps might perform well to categorize these states. In contrast, when participants are required to adjust a comparison continuously according to a specific stimulus feature, there is a large number of different possibilities how accurate discrimination performance can be, and thus a large number of possible metacognitive states. Consequently, a scale with a larger number of scale steps might perform better than a discrete scale when the number of task response alternatives is large.

In general, performance in a continuous adjustment task can be described mathematically by a combination of a von Mises and a uniform distribution (Bays et al., 2009, Zokaei et al., 2011): If participants had to rely completely on guessing, their responses should be evenly distributed across the whole range of possible responses. However, if performance is better than chance, their responses would form a bell-shaped distribution centred at the correct response, with the spread of the distribution indicating the precision of the response. A continuous task for the purpose of the current study would be characterized by a continuous relationship between task difficulty and the precision parameter as well as the guessing parameter. Previous studies suggested that subjective reports are associated with both the precision parameter as well as the probability of guessing in working memory tasks (Rademaker, Tredway, & Tong, 2012), but to our knowledge, no study has so far introduced continuous tasks in the study of visual consciousness.
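The mixture described above can be illustrated with a short sketch. It assumes response errors expressed in radians on the full circle and centred on zero; the names `guess_rate` and `kappa` (the von Mises concentration, standing in for the precision parameter) are chosen here for illustration, and the code is not the fitting procedure used in the study.

```python
import math
import random


def bessel_i0(x: float, terms: int = 100) -> float:
    """Modified Bessel function of the first kind, order 0, via its power
    series sum_k (x^2/4)^k / (k!)^2. Adequate for moderate x only."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def mixture_pdf(error: float, guess_rate: float, kappa: float) -> float:
    """Density of a response error (radians) under a uniform/von Mises mixture."""
    uniform = 1.0 / (2.0 * math.pi)
    von_mises = math.exp(kappa * math.cos(error)) / (2.0 * math.pi * bessel_i0(kappa))
    return guess_rate * uniform + (1.0 - guess_rate) * von_mises


def sample_error(guess_rate: float, kappa: float, rng: random.Random) -> float:
    """Draw one response error in radians, wrapped to [-pi, pi)."""
    if rng.random() < guess_rate:
        return rng.uniform(-math.pi, math.pi)  # guessing trial
    angle = rng.vonmisesvariate(0.0, kappa)    # value in [0, 2*pi)
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


rng = random.Random(1)
errors = [sample_error(guess_rate=0.3, kappa=8.0, rng=rng) for _ in range(1000)]
print(f"mean absolute error: {math.degrees(sum(abs(e) for e in errors) / len(errors)):.1f} deg")
```

In this parameterization, a higher `guess_rate` flattens the distribution towards uniform, whereas a higher `kappa` narrows the bell-shaped component around the correct response.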

As the current experiment entails a comparison between scales with a different number of scale steps, special attention should be paid to the choice of operationally defined criteria to evaluate the scales. We propose to employ three criteria of comparison: (i) the correlation with discrimination performance, (ii) the internal consistency, and (iii) the distribution of ratings.

The correlation with discrimination performance as well as internal consistency come with two very different interpretations depending on whether the amount of information collected with one report is controlled or not. When VAS judgements are binned into the same number of scale steps as the discrete scale and thus the amount of information recorded by the two scales is balanced, the correlation of subjective reports with discrimination performance is indicative of type 2 sensitivity (Galvin et al., 2003), the ability to discriminate between correct and incorrect trials. This is the rationale of numerous previous studies (Dienes and Seth, 2010, Sandberg et al., 2010, Szczepanowski et al., 2013, Wierzchoń et al., 2012) and is analogous to the term resolution in the confidence literature (Baranski & Petrusic, 1994). In contrast, under the assumption that the type 2 sensitivity of participants is comparable, a comparison between the association of the full VAS and objective performance on the one hand and the association between the discrete scale and performance on the other hand shows whether the VAS is able to differentiate between levels of performance that fall equally on the same scale step with the discrete scale and is thus indicative of the amount of information recorded by the scale.
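The logic of comparing the full and the discretized VAS can be sketched as follows. The sketch assumes VAS reports rescaled to the interval [0, 1], equal-width bins, and a Spearman rank correlation as a simple stand-in for the association measure; the statistical model actually used in the study is not described in this excerpt, and the simulated data are purely hypothetical.

```python
import random


def bin_vas(value: float, n_bins: int = 4) -> int:
    """Map a VAS report in [0, 1] to an equal-width bin index 1..n_bins."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("VAS value must lie in [0, 1]")
    return min(int(value * n_bins), n_bins - 1) + 1


def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: clearer experience goes along with smaller errors.
rng = random.Random(0)
vas = [rng.random() for _ in range(500)]
errors = [max(0.0, min(180.0, 90.0 * (1.0 - v) + rng.gauss(0.0, 15.0))) for v in vas]
binned = [bin_vas(v) for v in vas]
print("full VAS:  ", round(spearman(vas, errors), 3))
print("binned VAS:", round(spearman(binned, errors), 3))
```

Comparing the binned VAS with a genuine four-point scale holds the number of scale steps constant, whereas comparing the full VAS with the four-point scale shows what the additional scale positions contribute.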

The second criterion we took into account was the internal consistency of subjective reports within experimental conditions: A scale should provide maximally stable estimates of averages of the subjective reports across a number of data points. Again, the comparison between the discretized VAS and a discrete scale shows whether one scale is corrupted by noise unrelated to the number of scale steps, while a comparison between the internal consistency of full VAS and discrete scales shows whether participants can make use of the additional resolution provided by the VAS, i.e. it examines whether VAS reports differentiate between trials that fall on the same scale step at the discrete scale.
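One common index of internal consistency is coefficient alpha (Cronbach, 1951). The following sketch is an assumption about how such an index could be computed and not a description of the analysis in the study: it treats each participant as a row and each repeated report within a condition as an "item", a layout chosen here purely for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a table of scores.

    scores: list of rows (one per participant), each a list of k item values.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    """
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two rows and two items")
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical ratings: four participants, two reports each.
ratings = [[1, 2], [2, 4], [3, 3], [4, 5]]
print(round(cronbach_alpha(ratings), 3))
```

Values closer to 1 indicate that repeated reports rank participants in a similar way, i.e. that averages based on the reports are stable.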

Third, another characteristic of subjective scales that has been extensively discussed is the distribution of subjective reports when collected with different scales: Are subjective scales of consciousness used gradually or are they used in a binary fashion? While some scales might be designed in a way that all scale steps are used with relatively equal probability, other scales might induce binary responses (Overgaard et al., 2006). This empirical question is related to the theoretical proposals that consciousness is either dichotomous (Dehaene & Changeux, 2011) or a gradual phenomenon (Cleeremans, 2011). If stimulus consciousness varies binarily (i.e. stimuli are always either conscious or unconscious), an observer would only use the ends of the scale, resulting in a U-shaped distribution of ratings. If stimuli however can be more or less conscious, all points of the scale are potentially used when stimulus strength increases, resulting in a uniform distribution when averaged across stimulus strength. However, in order to investigate the issue whether consciousness varies gradually or binarily, a scale is required where participants in principle use the intermediate scale steps as well; otherwise a U-shaped distribution would be observed no matter whether consciousness in a specific task is in fact gradual or dichotomous (Sergent & Dehaene, 2004).

The aim of the present study was to investigate whether participants can make use of the high resolution offered by VASs when measuring visual experience of motion. To address this issue, we compared a VAS and a discrete scale with respect to the criteria discussed above. As stimuli, we presented random dot kinematograms (RDKs), because RDKs allow for a fine-grained manipulation of task difficulty on a metric scale (by manipulating the percentage of coherently moving dots). For the objective task, we assessed objective performance as a continuous variable rather than just as correct or incorrect, a procedure that ensured that a binary use of subjective reports was not due to binary task performance. To obtain a continuous measurement of task performance, we asked participants to report the orientation of motion by adjusting a clock hand to point in the direction of the perceived motion, and measured the discrimination error as the angle between the clock hand and the direction of motion. For the subjective scales, we asked participants always to report their degree of experience of the coherent motion, which was the same instruction as we used in a previous study (Zehetleitner & Rausch, 2013), and different from the established Perceptual Awareness Scale (PAS, Ramsøy & Overgaard, 2004) in that no instruction to report feelings of something being shown was given.
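The discrimination error defined above is a distance on a circle, so it has to respect the wrap-around at 360°. A minimal sketch, assuming that both the clock-hand position and the motion direction are given in degrees (the function name is chosen here for illustration):

```python
def discrimination_error(response_deg: float, direction_deg: float) -> float:
    """Smallest absolute angle (0 to 180 degrees) between the adjusted
    clock hand and the true direction of motion."""
    diff = (response_deg - direction_deg) % 360.0
    return min(diff, 360.0 - diff)


# A response at 350 degrees to motion at 10 degrees is off by 20 degrees, not 340.
print(discrimination_error(350.0, 10.0))
```

Under this definition, responses that are unrelated to the stimulus yield errors spread evenly between 0° and 180°, i.e. an expected error of 90°, which offers a reference point for the mean discrimination errors reported in the results.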

The experiment was designed to investigate the following three hypotheses:

  • i.

    If the participants are able to make use of the additional resolution provided by VASs, the full VAS should predict the discrimination error more efficiently than the discrete scale. In addition, the internal consistency of the full VAS should be better, because the larger amount of data transmitted by each single subjective report would allow for more reproducible statistics based on the same number of trials.

  • ii.

    If VAS reduced the type 2 sensitivity of subjective reports, we would expect that the discrete scale would be more efficient in predicting discrimination error and would produce more consistent estimates than the discretized VAS.

  • iii.

    If participants are biased by the anchors of the VAS in a way that reports are given binarily, the ratings on the VAS but not on the discrete scale should form a U-shaped distribution. In addition, the discrete scale should outperform both the full and the discretized VAS in predicting discrimination error.

Section snippets

Participants

Twenty participants (5 male, 1 left-handed) took part in the experiment. The age of the participants ranged between 19 and 32 years, with a median age of 24. All participants reported having normal or corrected-to-normal vision, confirmed that they did not suffer from epilepsy or seizures and gave written informed consent

Apparatus and stimuli

The experiment was performed with a Mac with OS X 10.7 as operating system and a Diamond Pro 2070 SB (Mitsubishi) monitor with 24 in. screen size. Stimuli were presented at

Discrimination performance

The mean discrimination error was 55.6° (SEM = 2.2) when participants were using the VAS and 56.3° (SEM = 2.2) when the discrete scale was used and ranged from 87.7° (SEM = 1.7) for the lowest to 13.7° (SEM = 1.6) for the highest level of coherence. The relative frequencies of orientation responses and the estimated distributions are shown in Fig. 2. The estimated parameters as well as bootstrapped confidence intervals are shown in Fig. 3. The probability of guessing trials ranged between .94 at the

Discussion

The present experiment investigated whether participants are able to use the high number of response alternatives provided by visual analogue scales appropriately when reporting visual experience of motion. We hypothesized that if a VAS allowed a larger amount of information to be retrieved from participants’ reports than discrete scales, the full VAS should be more efficient in predicting the discrimination error, and should be more internally consistent. Second, if a VAS reduced the type 2

Conclusion

We present data showing that both visual analogue scales and discrete scales are reliable measures of subjective reports of global motion experience. We found no evidence that the type 2 sensitivity is decreased or the pattern of reports is binary when participants are provided with a large number of scale steps. The data is consistent with the interpretation that participants are able to maintain a sufficiently large number of meaningful criteria so that a VAS retrieves a larger amount of

Authors’ note

This research was supported by the German-Israeli Foundation for Scientific Research and Development (GIF) grant 1130-158 and the Deutsche Forschungsgemeinschaft (DFG, i.e. German Research Council) grant ZE 887/3-1 (both to M.Z.). The funders had no role in study design, data collection, analysis, decision to publish, or preparation of the manuscript. Correspondence concerning this article can be addressed to Manuel Rausch or to Michael Zehetleitner.

References (57)

  • P.M. Bays et al.

    Dynamic shifts of limited working memory resources in human vision

    Science

    (2008)
  • D.H. Brainard

    The psychophysics toolbox

    Spatial Vision

    (1997)
  • D. Chalmers

    On the search of neural correlates of consciousness

  • Christensen, R. B. (2013). Analysis of ordinal data with cumulative link models – estimation with the R-package ordinal....
  • A. Cleeremans

    The radical plasticity thesis: How the brain learns to be conscious

    Frontiers in Psychology

    (2011)
  • L.J. Cronbach

    Coefficient alpha and the internal structure of tests

    Psychometrika

    (1951)
  • E. Dannenbaum et al.

    Visual vertigo analogue scale: An assessment questionnaire for visual vertigo

    Journal of Vestibular Research

    (2011)
  • H.M. Davey et al.

    A one-item question with a Likert or Visual Analog Scale adequately measured current anxiety

    Journal of Clinical Epidemiology

    (2007)
  • A.G.E.M. de Boer et al.

    Is a single-item visual analogue scale as valid, reliable and responsive as multi-item scales in measuring quality of life?

    Quality of Life Research

    (2004)
  • D.C. Dennett

    Who’s on first? Heterophenomenology explained

    Journal of Consciousness Studies

    (2003)
  • D.C. Dennett

    Heterophenomenology reconsidered

    Phenomenology and the Cognitive Sciences

    (2007)
  • Z. Dienes

    Assumptions of subjective measures of unconscious mental states: Higher order thoughts and bias

    Journal of Consciousness Studies

    (2004)
  • Z. Dienes et al.

    Measuring unconscious knowledge: Distinguishing structural knowledge and judgment knowledge

    Psychological Research

    (2005)
  • F. Funke et al.

    Why semantic differentials in web-based research should be made from visual analogue scales and not from 5-point scales

    Field Methods

    (2012)
  • S.J. Galvin et al.

    Type 2 tasks in the theory of signal detectability: Discrimination between correct and incorrect decisions

    Psychonomic Bulletin & Review

    (2003)
  • D. Green et al.

    Signal detection theory and psychophysics

    (1966)
  • H.W. Hake et al.

    The effect of presenting various numbers of discrete steps on scale reading accuracy

    Journal of Experimental Psychology

    (1951)
  • D.E. Hannula et al.

    Imaging implicit perception: Promise and pitfalls

    Nature Reviews Neuroscience

    (2005)