BY 4.0 license Open Access Published by De Gruyter November 28, 2020

A Two-Stage Joint Modeling Method for Causal Mediation Analysis in the Presence of Treatment Noncompliance

  • Soojin Park and Esra Kürüm

Abstract

Estimating the effect of a randomized treatment and the effect that is transmitted through a mediator is often complicated by treatment noncompliance. In the literature, an instrumental variable (IV)-based method has been developed to study causal mediation effects in the presence of treatment noncompliance. Existing studies based on the IV-based method focus on identifying the mediated portion of the intention-to-treat effect, which relies on several identification assumptions. However, little attention has been given to assessing the sensitivity of results to these identification assumptions or to mitigating the impact of violating them. This study proposes a two-stage joint modeling method for conducting causal mediation analysis in the presence of treatment noncompliance, in which modeling assumptions can be employed to decrease the sensitivity to violation of some identification assumptions. The use of a joint modeling method is also conducive to conducting sensitivity analyses to the violation of identification assumptions. We demonstrate our approach using the JOBS II data, in which the effect of job training on job seekers’ mental health is examined.

MSC 2010: 62D20

1 Introduction

In randomized experiments, the interest is often not only in the effect of a randomized treatment but also in the effect transmitted through a mediator. This is because investigating mediating mechanisms provides a complete explanation of the effect of the treatment. Recently, there have been many studies on causal mediation analysis, which focuses on how to identify and estimate the average effect of a treatment transmitted through a mediator (See, e.g., [1, 2, 3, 4]). One complication that arises when conducting this type of analysis is non-ignorability of a mediator because mediators are seldom randomized, even in randomized experiments.

Another complication in randomized experiments is that some participants do not adhere to the assigned treatment. In this article, we refer to this non-adherence to the assigned treatment as treatment noncompliance. In the presence of treatment noncompliance, the treatment receipt status is no longer random even when the treatment is assigned randomly, because participants self-select whether or not to adhere to the treatment. One analytical option to address this issue is to focus on the effect of the assigned treatment, namely, the intention-to-treat (ITT) effect. Under a randomized treatment and the stable unit treatment value assumption (SUTVA), [1] the ITT effect is identified as the difference in the average outcome value between those who are assigned to the treatment and those who are not. ITT analysis avoids the problem of treatment noncompliance because inference relies only on the randomization of the treatment [5].

A further challenge arises when identifying the mediated portion of this ITT effect. Simply employing the mediation formula [6] with the assigned treatment (as if the assigned treatment were the actual receipt of the treatment) does not provide a valid result [7] because this approach violates an important assumption of causal mediation analysis: no treatment-induced mediator and outcome confounding [1, 3, 6]. In the presence of treatment noncompliance, the actual treatment receipt status impacts both mediator and outcome and is influenced by the assigned treatment (i.e., treatment-induced mediator and outcome confounding). One way of circumventing this issue is to identify this mediated portion of the ITT effect on the basis of the average causal mediation effect (ACME) among compliers. Among compliers, the assigned treatment always coincides with the treatment received, and thus the ACME can be estimated without this issue of treatment-induced mediator and outcome confounding. Yamamoto [7] proposed this way of identifying the mediated portion of the ITT effect using the instrumental variables (IV) approach.

While the IV approach successfully addresses the issue of identifying the mediated ITT effect, a concern remains. Estimating the mediated ITT effect on the basis of the ACME among compliers requires multiple identification assumptions. Due to these multiple identification assumptions, validating results in the IV approach is often challenging. Previous research by Yamamoto [7] left assessing the validity of results to the violation of identification assumptions to future study. Another study by Park and Kürüm [8] assessed the validity of results by assuming a worst case scenario but failed to assess the sensitivity of the results systematically to all possible scenarios. Therefore, it is necessary to develop an approach that can mitigate the impact of violations of identification assumptions and/or be more conducive to conducting sensitivity analyses.

In this article, we propose a two-stage joint modeling method to estimate the mediated ITT effect because of its potential benefit of employing modeling assumptions such as distributional assumptions [9] and additional covariates [10, 11] that can mitigate the impact of the violation of some identification assumptions. Another benefit of this method is that it provides a relatively convenient setting to conduct sensitivity analyses to the violation of identification assumptions, compared to the IV-based method. These benefits are demonstrated using the JOBS II data, in which the effect of job training on job-seekers’ mental health is examined.

The rest of the article is organized as follows. We introduce our motivating example in Section 2. In Section 3, we present the identification result of the mediated and unmediated ITT effects. In Section 4, we propose a two-stage joint modeling estimation method, which is followed by a simulation study that examines the role of modeling assumptions when identification assumptions are violated (Section 5). In Section 6, we propose sensitivity analyses based on the proposed joint modeling method. In Section 7, we show how sensitivity analyses to the violation of identification assumptions can be conducted in the context of our example. We conclude with a discussion.

2 JOBS II Intervention Project

This study is motivated by the JOBS Search Intervention Study (JOBS II) [12]. Job loss can lead to harmful effects on a worker’s mental, physical, and social health [13, 14, 15]. The JOBS II study was designed as a randomized trial to examine the effects of a job training intervention on unemployed individuals’ mental health. The goal of this intervention was to prevent the negative effects of job loss by equipping job seekers with efficient job search strategies. The randomized treatment group was assigned to five half-day job searching seminars. Both treatment and control groups received a booklet describing job searching skills. In the JOBS II study, the job training intervention seminars were only available to subjects in the treatment group; subjects in the control group had no way of participating in the seminars. In line with many previous studies [16, 17, 18], we define the treatment receipt status as attending at least one out of five job-searching seminars. Forty-eight percent of those who were assigned to the treatment did not attend any job searching seminars.

Project recruitment consisted of a short screening questionnaire (T0) to determine eligibility, resulting in 1,801 participants. The pre-treatment survey was mailed (T1), and follow-up surveys were mailed two months (T2), six months (T3), and two years (T4) after the week of job training seminars. Data collected in this study included demographic variables such as age, gender, race, and marital status, as well as measures of depression, self-esteem, job-search efficacy, internal control orientation, and reemployment status. Descriptive statistics for the variables used in our analysis are presented in Table 1.

Table 1

Descriptive Statistics for JOBS II data

Variables                  Mean     Variance   Min.¹   Max.²
Depression (post)³         1.755    0.442      1       4.9
Sense of Mastery           2.591    1.211      1       4
Sex (X1)                   0.531⁴   -          0       1
Motivation (X2)            5.228    0.732      1       6.5
Nonwhite (X3)              0.762⁵   -          0       1
Marital (X4)               1.142    1.265      0       4
Education (X5)             1.845    1.171      0       4
Assertiveness (X6)         3.453    0.838      1       5
Age (X7)                   36.1     218.8      16.5    76.9
Depression (pre)⁶ (X8)     1.863    0.331      1       3.5
Economic Hardship (X9)     3.092    0.979      1       5

Note. 1. Minimum value; 2. maximum value; 3. depression levels measured after (T3) the training; 4. proportion of males in our data; 5. proportion of nonwhite subjects in our data; 6. depression levels measured before the training.

Previous analysis of JOBS II data showed that the job-training intervention produced beneficial effects, including increased reemployment rates and improved mental health [8, 19, 20]. More specifically, Price et al. [21] showed that the intervention had beneficial effects on those who were identified as being at high risk for experiencing mental health setbacks such as episodes of depression. They also identified sense of mastery as a mediator for the relationship between the intervention and depression. Our analysis will differ from these analyses in that we will investigate the association between job-training seminars and depression in the presence of the mediator, sense of mastery, and by addressing treatment noncompliance using a two-stage joint method, which provides a convenient setting for systematic analyses of sensitivity to the violation of identification assumptions. The outcome variable, depression, was measured using responses to an 11-item list based on the Hopkins Symptom Checklist [22]. The mediator variable, sense of mastery, was computed as the mean score of job-search efficacy, self-esteem, and internal control orientation.

3 Identification

In order to precisely define the effects of interest, consider an experimental setting that mimics the JOBS II project, where some subjects did not comply with the assigned treatment. Let Zi represent the assigned treatment, where Zi = 0 if individual i is assigned to the control condition and Zi = 1 otherwise; let Ti represent the actual treatment received, where Ti = 0 if individual i did not receive the treatment and Ti = 1 if individual i attended at least one job training seminar; let Mi and Yi represent the mediator and outcome, respectively; and let Xi be a vector of observed pre-treatment covariates. The supports of the distributions of Xi, Mi, and Yi are represented as 𝒳, 𝓜, and 𝒴, respectively. Under the SUTVA, Ti(z) represents the treatment receipt status if individual i was assigned to Zi = z; Mi(z) represents the potential mediator under Zi = z; and Yi(z, m) represents the potential outcome under Zi = z and Mi = m for individual i, for z ∈ {0, 1} and m ∈ 𝓜. Pi is an indicator for compliance type that includes compliers (Pi = c) and never takers (Pi = n).

Throughout the paper, we assume the randomization of the treatment assignment.

Assumption 1

Randomization. Treatment assignment is random.

Effects of Interest. Our primary effects of interest are the mediated and unmediated portions of the ITT effect. These are the average effects of offering the treatment on the outcome that are transmitted through a mediator (mediated ITT) or not through it (unmediated ITT). Since the decomposition is based on the average effect of offering the treatment, we include both those who did and did not comply with the assigned treatment in the analysis. In other words, ITT analysis tests the effectiveness of a randomized intervention regardless of whether the subjects actually received the treatment or not. Therefore, the mediated and unmediated ITT effects are of interest for those who want to evaluate the overall effect of an intervention and investigate underlying mechanisms of the effect in a usual setting, in which not every subject complied with the treatment. Throughout this paper, we focus on the mediated and unmediated portions of the ITT effect that include both compliers and noncompliers.

Following Yamamoto [7], the mediated and unmediated portion of the ITT effect will be identified and estimated on the basis of the ACME and average natural direct effect among compliers, respectively. Therefore, we first define the complier average causal mediation effect (CACME) and complier average natural direct effect (CANDE), as

$$\delta_c(z) \equiv E[Y_i(z, M_i(1)) - Y_i(z, M_i(0)) \mid P_i = c] \quad \text{and} \quad \zeta_c(z) \equiv E[Y_i(1, M_i(z)) - Y_i(0, M_i(z)) \mid P_i = c], \tag{1}$$

where z ∈ {0, 1}. In our example, δc(1) indicates, among compliers, the degree to which the level of depressive symptoms changes in response to a change in the sense of mastery (from the value that would have resulted under the control to the value that would have resulted under the training) under the job training condition. Likewise, ζc(1) indicates, among compliers, the average change in the level of depressive symptoms in response to a change in treatment assignment (that is, from no training to job training), while holding the mediator at the value it would take under the job training condition.

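In order to obtain the CACME and CANDE, distributions of mediator and outcome need to be modeled. We use the likelihood to model the distribution of Y, M, and T, given X and Z. For t ∈ {0, 1} and z ∈ {0, 1}, let $S_{tz}^{TZ}$ denote the set of observations with T = t and Z = z. Under assumption 1, the likelihood is

$$
\begin{aligned}
L(\alpha,\beta,\lambda \mid \text{data}) ={}& \prod_{t,z} \prod_{i \in S_{tz}^{TZ}} f\{Y(z_i, M(z_i)) = y_i, M(z_i) = m_i, T(z_i) = t_i \mid Z = z_i, X = x_i; \beta_{tz}, \alpha_{tz}, \lambda\} \\
={}& \prod_{t,z} \prod_{i \in S_{tz}^{TZ}} f\{Y(z_i, m_i) = y_i \mid M(z_i) = m_i, T(z_i) = t_i, Z = z_i, X = x_i; \beta_{tz}\} \\
&\times f\{M(z_i) = m_i \mid T = t_i, Z = z_i, X = x_i; \alpha_{tz}\}\, f\{T(z_i) = t_i \mid X = x_i; \lambda\},
\end{aligned}
\tag{2}
$$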

where f (·|·) is a conditional probability density function of a random variable of M and Y; αtz and βtz are the vectors of coefficients in the mediator and outcome models, respectively, when T = t and Z = z; and λ is the vector of coefficients for treatment receipt status.

From this likelihood, however, it is not possible to model the distributions within the subpopulation of compliers because compliance type is unknown. According to Angrist et al. [23], an individual compliance type can be expressed as the difference in the actual treatment receipt status that would have been observed under the treatment and control conditions. For example, compliers are those who adhere to their assigned treatment (that is, Ti(1) − Ti(0) = 1). Always takers are those who receive the treatment regardless of assignment, and never takers are those who do not receive the treatment regardless of assignment (that is, Ti(1) − Ti(0) = 0). Defiers are those who do not comply with the treatment protocol and do the opposite of what they are assigned to (that is, Ti(1) − Ti(0) = −1). The compliance type for each individual is unknown because subjects are assigned to either the treatment or control condition but not to both (that is, Ti(1) or Ti(0)). Therefore, we need to invoke more assumptions to identify the distributions of mediator and outcome by compliance type, which are strong monotonicity and exclusion restriction for never takers.

Assumption 2

Strong Monotonicity [23]. This assumption states that there are no defiers or always takers. Formally, Ti(0) = 0 for all i.

In a study where program protocol prohibits subjects in the control group from having access to the intervention, Ti(0) = 0 for all i. This implies that we can rule out the possibility of defiers and always takers. After excluding defiers, those who are assigned to the training but did not attend (Ti(1) = 0) are uniquely identified as never takers. After excluding always takers, those who are assigned to the training and attended (Ti(1) = 1) are uniquely identified as compliers. However, the compliance type for those who are assigned to the control group is still not identified. Therefore, we make the exclusion restriction assumption for never takers.
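The identification logic in this paragraph can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not code from the study; the function name `label_compliance` and the labels `"c"`, `"n"`, and `None` are hypothetical choices.

```python
from typing import Optional


def label_compliance(z: int, t: int) -> Optional[str]:
    """Label one unit from its assignment z and receipt t under strong monotonicity.

    Returns "c" (complier), "n" (never taker), or None when the type
    cannot be identified from the observed pair alone.
    """
    if z == 0:
        if t != 0:
            # T_i(0) = 0 for all i, so a treated control unit contradicts the assumption.
            raise ValueError("control unit received treatment; strong monotonicity fails")
        return None  # control arm: mixture of compliers and never takers
    return "c" if t == 1 else "n"


# Assigned and attended, assigned but did not attend, and a control unit.
labels = [label_compliance(z, t) for z, t in [(1, 1), (1, 0), (0, 0)]]
```

The `None` case is exactly the group for which the exclusion restriction is needed.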

Assumption 3

Exclusion restriction (ER) for never takers. This assumption was discussed by Little and Yau [16] in the absence of a mediator, and we extend it to a mediation setting. This assumption states that the never-taker distribution in terms of the mediator (or the outcome) is the same under either assignment, given covariates. In formal expression,

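$$
\begin{aligned}
f(M(z) \mid P = n, X = x; \alpha_{nz}) &= f(M(z') \mid P = n, X = x; \alpha_{nz'}), \text{ and} \\
f(Y(z, m) \mid M = m, P = n, X = x; \beta_{nz}) &= f(Y(z', m) \mid M = m, P = n, X = x; \beta_{nz'}),
\end{aligned}
\tag{3}
$$

for z ∈ {0, 1}, z′ = 1 − z, m ∈ 𝓜, and x ∈ 𝒳, where αpz and βpz are the vectors of coefficients in the mediator and outcome models, respectively, when P = p and Z = z.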

This assumption implies that the direct and indirect effects are allowed only for compliers (but not for never takers), given baseline covariates. This assumption enables us to identify the complier distributions of the mediator and the outcome by fixing the parameters for never-taker distributions at the same value under either assignment, given covariates.

The plausibility of this assumption is often questionable due to psychological effects unless a double-blind design was used to prevent these effects. For example, this assumption would be violated if those who are assigned to but did not receive the job training (i.e., never takers) regretted their failure to take advantage of the intervention and improved job-searching skills by reading a book. Therefore, we develop a sensitivity analysis to assess the effect of violating this assumption for studies in which this assumption might be violated or not plausible, and we demonstrate this sensitivity analysis approach in the JOBS II example.

Under assumptions 1-3, the likelihood can be rewritten as

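$$
\begin{aligned}
L(\beta,\alpha,\lambda \mid \text{data}) ={}& \prod_{i \in S_{11}^{TZ}} f\{Y(z_i, m_i) = y_i \mid M(z_i) = m_i, P = c, X = x_i; \beta_{c1}\}\, f\{M(z_i) = m_i \mid P = c, X = x_i; \alpha_{c1}\}\, \pi_c(x_i; \lambda) \\
&\times \prod_{i \in S_{01}^{TZ}} f\{Y(z_i, m_i) = y_i \mid M(z_i) = m_i, P = n, X = x_i; \beta_{n1}\}\, f\{M(z_i) = m_i \mid P = n, X = x_i; \alpha_{n1}\}\, \pi_n(x_i; \lambda) \\
&\times \prod_{i \in S_{00}^{TZ}} \big[ f\{Y(z_i, m_i) = y_i \mid M(z_i) = m_i, P = c, X = x_i; \beta_{c0}\}\, f\{M(z_i) = m_i \mid P = c, X = x_i; \alpha_{c0}\}\, \pi_c(x_i; \lambda) \\
&\qquad + f\{Y(z_i, m_i) = y_i \mid M(z_i) = m_i, P = n, X = x_i; \beta_{n1}\}\, f\{M(z_i) = m_i \mid P = n, X = x_i; \alpha_{n1}\}\, \pi_n(x_i; \lambda) \big],
\end{aligned}
\tag{4}
$$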

where πp is the probability of Pi = p, given covariates. We offer four remarks regarding this likelihood. First, the compliance type for those who are assigned to the treatment is uniquely identified under strong monotonicity. Second, even with strong monotonicity, the compliance type for those who are assigned to the control condition is not uniquely identified. Therefore, their contribution to the likelihood is expressed as a mixture of the complier and never-taker distributions, as shown in the last two lines of equation (4). Third, under the exclusion restriction for never takers, the parameters of the never-taker distributions are fixed to αn1 and βn1 under either assignment. Fourth, the parameters by compliance type for the mediator and outcome models can be consistently estimated from this likelihood, although the estimates may not necessarily have a causal interpretation. Since Assumption 4 (LSI) is not yet imposed, the estimates are obtained given the correlation between the errors in the mediator and outcome models that is present in the data.

Based on the parameters among compliers obtained from the likelihood, we can write the following linear structural equation models (LSEM) with varying coefficients as

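$$
\begin{aligned}
Y_i(z) &= \gamma_{c,i} + \gamma_{cz,i} z + \gamma_{x,i} X_i + e_{c1,i}, \\
M_i(z) &= \alpha_{c,i} + \alpha_{cz,i} z + \alpha_{x,i} X_i + e_{c2,i}, \\
Y_i(z, m) &= \beta_{c,i} + \beta_{cz,i} z + \beta_{cm,i} m + \beta_{czm,i} z m + \beta_{x,i} X_i + e_{c3,i},
\end{aligned}
\tag{5}
$$

for z ∈ {0, 1} and m ∈ 𝓜, where $e_{cj,i} \sim N(0, \sigma_{cj})$, in which j ∈ {1, 2, 3}. We define $\gamma_c \equiv E(\gamma_{c,i})$, $\gamma_{cz} \equiv E(\gamma_{cz,i})$, $\gamma_x \equiv E(\gamma_{x,i})$, $\alpha_c \equiv E(\alpha_{c,i})$, $\alpha_{cz} \equiv E(\alpha_{cz,i})$, $\alpha_x \equiv E(\alpha_{x,i})$, $\beta_c \equiv E(\beta_{c,i})$, $\beta_{cz} \equiv E(\beta_{cz,i})$, $\beta_{cm} \equiv E(\beta_{cm,i})$, $\beta_{czm} \equiv E(\beta_{czm,i})$, and $\beta_x \equiv E(\beta_{x,i})$, where these terms are the mean parameters of the corresponding varying coefficients.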

Under assumption 1, we can causally identify the complier average effect of treatment on the mediator (i.e., αcz) and on the outcome (i.e., γcz). However, the complier average effect of mediator on the outcome (i.e., βcm and βczm) is not causally identified due to possible confounding in the mediator and outcome relationship among compliers. Therefore, we need to additionally invoke the local sequential ignorability assumption.

Assumption 4

Local sequential ignorability (LSI) [7]. This assumption asserts ignorability of the mediator with respect to the potential outcome among compliers, given treatment and pretreatment covariates. This assumption implies that 1) among compliers, there is no pre-treatment confounding between M and Y, given baseline covariates and 2) among compliers, there is no treatment-induced confounding in the M and Y relationship, given baseline covariates. In formal expression,

$$Y_i(z', m) \perp\!\!\!\perp M_i(z) \mid Z_i = z, P_i = c, X_i = x,$$

for z ∈ {0, 1}, z′ = 1 − z, and m ∈ 𝓜.

Instead of requiring no unmeasured confounding in the M-Y relationship for every participant, as in the standard causal mediation literature, the local sequential ignorability assumption requires unconfoundedness between the mediator and outcome to hold only for compliers. Although LSI is required for a smaller subset of participants, this assumption is still challenging to meet in practice. Therefore, it is essential to examine the sensitivity of results to violations of this assumption.

Under assumptions 1-4 and given the LSEM, we can identify the CACME and CANDE as δc(z) = αcz × (βcm + βczm z) and ζc(z) = βcz + βczm(αc + αcz z), respectively [2]. Under assumptions 2 and 3, the mediated and unmediated ITT effects are obtained by multiplying the CACME and CANDE, respectively, by the proportion of compliers, as δ(z) = δc(z) × πc and ζ(z) = ζc(z) × πc. The proof is provided in Appendix A.
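As a quick numerical illustration of these identification formulas, the following Python sketch evaluates the CACME, CANDE, and the two ITT components. The function names and all parameter values are hypothetical and chosen only to make the arithmetic visible; they are not estimates from the JOBS II data.

```python
def complier_effects(alpha_c, alpha_cz, beta_cz, beta_cm, beta_czm, z):
    """Return (CACME, CANDE) at assignment level z from the LSEM mean parameters."""
    cacme = alpha_cz * (beta_cm + beta_czm * z)
    cande = beta_cz + beta_czm * (alpha_c + alpha_cz * z)
    return cacme, cande


def itt_effects(cacme, cande, pi_c):
    """Scale the complier effects by the complier proportion pi_c."""
    return cacme * pi_c, cande * pi_c


# Hypothetical parameter values, for illustration only.
cacme_1, cande_1 = complier_effects(
    alpha_c=0.5, alpha_cz=0.4, beta_cz=-0.2, beta_cm=-0.5, beta_czm=0.1, z=1
)
mediated_itt, unmediated_itt = itt_effects(cacme_1, cande_1, pi_c=0.5)
```

Because never takers contribute no effect under the exclusion restriction, the ITT components are simply the complier effects shrunk by the share of compliers.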

4 Estimation

In this section, we propose a two-stage estimation method based on a joint modeling approach, in which distributional assumptions or additional covariates can be used to reduce the impact of violating some identification assumptions. The proposed estimation method consists of two stages. In the first stage, using joint modeling, we estimate the densities f(y|m, x, p; βpz) and f(m|x, p; αpz), which depend on parameters βpz and αpz, respectively, and the probability of compliers πc(x; λ), which depends on parameters λ. In the second stage, the CACME and CANDE are estimated based on the identification results presented in the previous section. Subsequently, the mediated and unmediated ITT effects are estimated on the basis of the CACME and CANDE estimates, respectively.

First Stage. In the first stage, we use joint modeling, which has been used for estimating the complier-average causal effect (CACE) [16, 23]. We generalize that work by formulating and fitting a model to investigate CACME and CANDE. The estimation procedure of this joint modeling approach is based on the expectation-maximization (EM) algorithm, in which the unobserved compliance type for each subject in the control group is treated as missing data. The E-step computes the expected values of sufficient statistics, given data and current estimates, and the M-step maximizes the likelihood shown in equation (4), given the updated sufficient statistics obtained from the E-step. These steps iterate until the estimates of the parameters become stabilized (See [11, 16, 24, 25] for further details on this procedure).
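To make the E-step concrete, the sketch below computes, for a single control-arm unit, the posterior probability of being a complier given current parameter values. It is a simplified illustration rather than the authors' implementation: it assumes normal densities and, as a simplification, independent mediator and outcome errors within each compliance type. The function names and dictionary keys are hypothetical.

```python
import math


def normal_pdf(value, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (value - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def complier_posterior(y, m, pi_c, complier, never_taker):
    """E-step weight for one control-arm unit: P(complier | y, m, x).

    `complier` and `never_taker` hold the current conditional means and SDs
    of the mediator and outcome for this unit (hypothetical keys:
    m_mean, m_sd, y_mean, y_sd), already evaluated at its covariates.
    """
    def joint(p):
        # Assumes independent errors, so the joint density factorizes.
        return normal_pdf(y, p["y_mean"], p["y_sd"]) * normal_pdf(m, p["m_mean"], p["m_sd"])

    weighted_c = pi_c * joint(complier)
    weighted_n = (1.0 - pi_c) * joint(never_taker)
    return weighted_c / (weighted_c + weighted_n)


# Illustrative values only.
comp = {"m_mean": 1.0, "m_sd": 1.0, "y_mean": 2.0, "y_sd": 1.0}
never = {"m_mean": 0.0, "m_sd": 1.0, "y_mean": 0.0, "y_sd": 1.0}
weight = complier_posterior(y=2.0, m=1.0, pi_c=0.5, complier=comp, never_taker=never)
```

In an M-step, weights of this kind would serve as fractional memberships of control units in the complier and never-taker components when the likelihood in equation (4) is maximized.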

Using the EM algorithm, we can obtain the probability of compliers. We assume that Pi, given covariates, follows a Bernoulli distribution with a probability of compliance πc(xi; λ), where

$$\pi_c(x_i; \lambda) = \frac{\exp(x_i \lambda)}{1 + \exp(x_i \lambda)}, \quad \text{and} \quad \pi_n(x_i; \lambda) = 1 - \pi_c(x_i; \lambda), \quad \text{for } x_i \in \mathcal{X}, \tag{6}$$

where λ is a vector of logistic regression coefficients. Compared to the previous IV-based method [7, 8], the proposed method provides additional information about the probability of compliers. This information will be used to create a pseudo-population of compliers in order to conduct a sensitivity analysis to violation of LSI.
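The logistic form in equation (6) can be evaluated as in the following Python sketch. The function name `complier_probability` and the example inputs are hypothetical; the branch on the sign of the linear predictor is only a numerical safeguard against overflow and does not change the formula.

```python
import math


def complier_probability(x, lam):
    """pi_c(x; lambda) = exp(x lambda) / (1 + exp(x lambda)) for covariate vector x."""
    eta = sum(x_j * lam_j for x_j, lam_j in zip(x, lam))
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical unit: an intercept term plus one covariate equal to zero.
pi_c = complier_probability([1.0, 0.0], [0.0, 2.0])
pi_n = 1.0 - pi_c
```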

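The conditional probability density functions of the random variables M and Y are obtained using the following parametric models. Given that we have two compliance types (compliers and never takers), the mediator and outcome models can be expressed as a mixture distribution between these two compliance types as

$$
\begin{aligned}
M_i ={}& N_i \alpha_n + C_i \alpha_c + N_i \alpha_{nz} Z_i + C_i \alpha_{cz} Z_i + \alpha_x X_i + N_i e_{n2,i} + C_i e_{c2,i}, \text{ and} \\
Y_i ={}& N_i \beta_n + C_i \beta_c + N_i \beta_{nz} Z_i + C_i \beta_{cz} Z_i + N_i \beta_{nm} M_i + C_i \beta_{cm} M_i \\
&+ N_i \beta_{nzm} Z_i M_i + C_i \beta_{czm} Z_i M_i + \beta_x X_i + N_i e_{n3,i} + C_i e_{c3,i},
\end{aligned}
\tag{7}
$$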

where Ci and Ni are indicators for compliers and never takers, respectively; αp and αpz are the mean parameters of the mediator model coefficients; and βp, βpz, βpm, and βpzm are the mean parameters of the outcome model coefficients when p ∈ {c, n}. The error terms for the mediator and outcome models are ep2,i and ep3,i for p ∈ {c, n}, respectively. These error terms follow a bivariate normal distribution with a mean of zero and covariance matrix

$$\Sigma_p = \begin{pmatrix} \sigma_{p2}^2 & \rho_p \sigma_{p2} \sigma_{p3} \\ \rho_p \sigma_{p2} \sigma_{p3} & \sigma_{p3}^2 \end{pmatrix},$$

where ρp is the correlation between ep2,i and ep3,i, and σp2 and σp3 are the standard deviations of the two error terms.

To impose ER, we fixed the effect of treatment on the mediator and the outcome among never takers to zero (that is, αnz = βnz = βnzm = 0), thus not allowing a treatment effect among never takers. To impose LSI, we fixed the covariance among compliers between the errors of the mediator and outcome models to zero, as

$$\Sigma_c = \begin{pmatrix} \sigma_{c2}^2 & 0 \\ 0 & \sigma_{c3}^2 \end{pmatrix}.$$

Second Stage. Based on parameter estimates obtained from the first stage, the CACME and CANDE can be estimated as $\hat{\delta}_c(z) = \hat{\alpha}_{cz} \times (\hat{\beta}_{cm} + \hat{\beta}_{czm} z)$ and $\hat{\zeta}_c(z) = \hat{\beta}_{cz} + \hat{\beta}_{czm}(\hat{\alpha}_c + \hat{\alpha}_{cz} z)$, respectively. The mediated and unmediated ITT effects are estimated by multiplying the CACME and CANDE estimates, respectively, by the estimated proportion of compliers, as $\hat{\delta}(z) = \hat{\delta}_c(z) \times \hat{\pi}_c$ and $\hat{\zeta}(z) = \hat{\zeta}_c(z) \times \hat{\pi}_c$. Two-stage estimation is known to be inefficient in terms of standard errors [24], so we employed a bootstrap procedure to obtain correct standard errors for the mediated and unmediated ITT effects.
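The text does not spell out the bootstrap procedure, so the following Python sketch should be read as one plausible version: a nonparametric bootstrap that resamples subjects with replacement and reruns the whole estimator on each resample. The function `bootstrap_se` and the placeholder estimator (a sample mean standing in for the full two-stage procedure) are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=500, seed=0):
    """Nonparametric bootstrap standard error of `estimator` applied to `data`.

    `estimator` is any function mapping a list of observations to a number;
    in the two-stage setting it would rerun both stages on the resample.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


# Placeholder estimator and toy data, for illustration only.
toy_data = [float(v) for v in range(100)]
se_mean = bootstrap_se(toy_data, statistics.mean)
```

Rerunning both stages inside each resample lets the standard error reflect the uncertainty in the first-stage estimates as well as in the second-stage products.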

5 Simulation Study

The purpose of this simulation study is to 1) assess the performance of the proposed joint modeling method and 2) examine the statistical power of the method. In addition, we examine the sensitivity of the estimates to violations of identification assumptions, and we explore changes in this sensitivity when the normality assumption is met or when a strong predictor of compliance exists. In the context of CACE, the impact of violation of ER can be mitigated by using additional covariates [11]. However, the reliance on modeling assumptions when the ER assumption is violated is not well known in a mediation setting. This will be addressed in our simulation study. For simplicity, we focus on the decomposition of τ = δ(1) + ζ(0) in this simulation study.

Data Generation. Our simulation results are based on 1000 replications with sample sizes of 200, 400, and 600. The assigned treatment Z is a binary variable that takes the value of 1 or 0. The two values of Z are randomly assigned for each observation with a proportion of 0.5. In line with the JOBS II data, we assume that there are two compliance types: compliers and never takers. The compliance type for each observation is determined by a pretreatment covariate following the logistic regression shown in equation (6), in which the pretreatment covariate (X) is generated to follow a standard normal distribution. The true ratio of compliers and never takers is 50 : 50. The mediator (M) and outcome (Y) are generated for each compliance type following the regression shown in equation (7). For simplicity, the average complier treatment effect on the mediator is set to αcz = 1, and the average complier mediator effect and its interaction with the treatment on the outcome are set to βcm = βczm = 1, respectively. Thus, the true values of the mediated and unmediated ITT effects are assumed to be δ(1) = ζ(0) = 1. The true residual variance is 1 for compliers and never takers (i.e., $\sigma_{p2}^2 = \sigma_{p3}^2 = 1$, where p ∈ {c, n}).
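A data-generating process of this kind might be sketched in Python as follows. Several ingredients are assumptions rather than values stated in the text: the zero intercepts, the value BETA_CZ = 2 (picked so that ζ(0) = (βcz + βczm αc) × 0.5 = 1), the sign convention of the compliance coefficient `lam`, and the knob `er_dev` for a treatment effect among never takers (zero when ER holds). The remaining coefficient values follow the simulation description.

```python
import math
import random

# Values taken from the simulation description.
ALPHA_CZ, BETA_CM, BETA_CZM = 1.0, 1.0, 1.0
ALPHA_X, BETA_X, BETA_NM = 1.0, 1.0, 1.0
# Assumed here (not given in the text): zero intercepts and BETA_CZ = 2.
ALPHA_C = ALPHA_N = BETA_C = BETA_N = 0.0
BETA_CZ = 2.0


def simulate(n, lam=2.3, er_dev=0.0, seed=0):
    """Generate n units; er_dev is the never-taker treatment effect (0 under ER)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # pretreatment covariate
        z = 1 if rng.random() < 0.5 else 0       # randomized assignment
        pi_c = 1.0 / (1.0 + math.exp(-lam * x))  # zero intercept gives a 50:50 split
        c = 1 if rng.random() < pi_c else 0      # 1 = complier, 0 = never taker
        t = z * c                                # only assigned compliers are treated
        if c:
            m = ALPHA_C + ALPHA_CZ * z + ALPHA_X * x + rng.gauss(0.0, 1.0)
            y = (BETA_C + BETA_CZ * z + BETA_CM * m + BETA_CZM * z * m
                 + BETA_X * x + rng.gauss(0.0, 1.0))
        else:
            m = ALPHA_N + er_dev * z + ALPHA_X * x + rng.gauss(0.0, 1.0)
            y = (BETA_N + er_dev * z + BETA_NM * m + er_dev * z * m
                 + BETA_X * x + rng.gauss(0.0, 1.0))
        rows.append({"x": x, "z": z, "t": t, "c": c, "m": m, "y": y})
    return rows


sample = simulate(4000, seed=1)
```

With these settings, δ(1) = αcz(βcm + βczm) × 0.5 = 1, matching the stated true value.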

One of the important conditions that we vary is the strength of the predictor (X) of compliance. In order to reflect the strong, medium, and small impact of the predictor, we vary the true values of λn = {2.3, 1.2, and 0.7}, which are equivalent to the odds ratios of 0.1, 0.3, and 0.5. This setting is in line with Jo and Stuart [25] and Stuart and Jo [26], which investigated the impact of predictors of compliance on estimating treatment effects conditional on compliance types.
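The correspondence between the coefficients and the odds ratios can be checked with a short calculation. The sketch assumes that the reported odds ratios equal exp(−λn), which is a reading of the sign convention rather than something the text states; under that reading the three values round to the reported figures.

```python
import math

coefficients = (2.3, 1.2, 0.7)
# Assumed mapping: odds ratio = exp(-lambda_n).
odds_ratios = [math.exp(-lam) for lam in coefficients]
rounded = [round(value, 1) for value in odds_ratios]
```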

In addition, we generated three types of data in which 1) both mediator and outcome follow a normal distribution, 2) the outcome follows a normal distribution but the mediator does not, and 3) the mediator follows a normal distribution but the outcome does not. For the case in which both mediator and outcome follow a normal distribution, we generated errors for the mediator and outcome from the standard normal distribution. When either the mediator or the outcome violated the normality assumption, we generated two normal distributions that follow N (−1, 1) and N (3, 1) separately and combined them, which generates a bimodal distribution.
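A bimodal error of this kind can be drawn as a two-component normal mixture, as in the Python sketch below. The equal mixing proportion (`p_first=0.5`) is an assumption, since the text says only that the two distributions were combined; the function name is hypothetical.

```python
import random


def bimodal_error(rng, p_first=0.5):
    """Draw from a mixture of N(-1, 1) and N(3, 1); p_first is the assumed weight of N(-1, 1)."""
    if rng.random() < p_first:
        return rng.gauss(-1.0, 1.0)
    return rng.gauss(3.0, 1.0)


rng = random.Random(0)
errors = [bimodal_error(rng) for _ in range(20000)]
mean_error = sum(errors) / len(errors)
```

Under equal mixing the draws are centered at 1 rather than 0, so the non-normal conditions in such a sketch shift the intercept as well as the shape of the error distribution.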

In order to create a situation in which the ER is violated, the effect of the treatment on the mediator and outcome among never takers is varied over αnz = βnz = βnzm ∈ {−0.5, −0.25, 0, 0.25, 0.5}. Since the residual variance is 1, these deviations from the ER can be considered as being in standard deviation (SD) units. We chose this range of values because the treatment effect on the mediator and the outcome for compliers is set to 1. We set the maximum values of αnz and βnz to half the size of the corresponding complier effect (i.e., αcz and βcz) because never takers did not actually receive the treatment. In the analytical model, in which we estimate the mediated and unmediated ITT effects using the generated data, we assumed the ER and LSI. The rest of the parameters are specified as follows: αx = βx = βnm = 1.

To assess the performance of the proposed method in various settings, we first examine the bias of the probability of compliers. This is crucial because this information will be used for sensitivity analysis in the later section. Then, we examine the percent bias (%bias), the percent normalized root mean square errors (%nRMSE), and coverage rate for the mediated and unmediated ITT effects to summarize our simulation results. The %bias measures the difference between the average of estimates and the true value relative to the true value. The %nRMSE measures the square root of the average of squared difference between the estimate and the true value relative to the true value. The coverage rate is defined as the proportion of replications where the true value is covered by the 95% confidence interval out of 1000 replications. To examine the statistical power in the method, we calculate the power under different sample sizes and distributions of the mediator and the outcome. The power is defined as the proportion of replications where the effect estimate is significantly different from zero (α = 0.05) out of 1000 replications.
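The summary measures defined above can be written compactly, as in this Python sketch. The function names are hypothetical, and power is computed here as the share of replications whose 95% confidence interval excludes zero, which is assumed to be equivalent to the significance test described in the text. The example inputs are made-up numbers, not simulation results.

```python
import math


def percent_bias(estimates, truth):
    """Difference between the mean estimate and the truth, relative to the truth."""
    return 100.0 * (sum(estimates) / len(estimates) - truth) / truth


def percent_nrmse(estimates, truth):
    """Root mean squared error relative to the truth."""
    mse = sum((e - truth) ** 2 for e in estimates) / len(estimates)
    return 100.0 * math.sqrt(mse) / truth


def coverage_rate(intervals, truth):
    """Share of (lower, upper) intervals that contain the truth."""
    return sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)


def power(intervals):
    """Share of (lower, upper) intervals that exclude zero."""
    return sum(not (lo <= 0.0 <= hi) for lo, hi in intervals) / len(intervals)


# Made-up replication results, for illustration only.
estimates = [0.9, 1.1, 1.0, 1.2]
intervals = [(0.5, 1.5), (0.8, 1.0), (1.1, 1.6), (-0.2, 1.3)]
bias = percent_bias(estimates, truth=1.0)
nrmse = percent_nrmse(estimates, truth=1.0)
coverage = coverage_rate(intervals, truth=1.0)
detected = power(intervals)
```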

Simulation Results. The simulation results are summarized in Figures 1a-1c. The top plots present the bias of P(c) as well as %bias, %nRMSE, and 95% confidence interval coverage rates of δ(1) with a normally distributed mediator and outcome. The middle and bottom plots present the same quantities with a non-normally distributed mediator and a non-normally distributed outcome, respectively.

Figure 1

Sensitivity of the estimates (P(c) and δ(1)) when the ER is violated

Note. 1) True effect: P(c) = 0.5 and δ(1) = 1, sample size: 600. 2) The results for ζ(0) are similar to the ones for δ(1). Given the similarity, we present the results for ζ(0) in the e-Appendix.

The estimates of P(c) under zero deviation from ER are unbiased regardless of whether or not normality holds. The estimates of P(c) tend to be biased when the data deviate from ER, although the bias is relatively small. Even when the ER is violated by 0.5 SD, the bias is less than 0.07 with the small impact of covariates (OR of 0.5).

Not surprisingly, the estimates of δ(1) under the deviation of zero from ER are unbiased, and the 95% coverage rate reaches the nominal level even when normality is not met. Although the bias is almost zero regardless of whether or not normality holds, the nRMSE tends to be large if the normality does not hold for either the mediator or outcome distribution. When normality holds, the nRMSE is less than 19% with a strong predictor of compliance. With the same setting, the nRMSEs are 32% and 22%, respectively, when normality is violated for the mediator and outcome. This indicates that standard errors tend to be large if normality is violated for the mediator or outcome distribution when all identification assumptions are met.

As expected, the effect estimates of δ(1) become biased when the data deviate from ER, regardless of whether normality holds. If normality does not hold, the nRMSE becomes larger. When normality is met for both the mediator and outcome and ER is violated by 0.25 S.D., the bias is less than 10% and the nRMSE is 21% with a covariate of medium impact (OR of 0.3) (Figure 1a). In the same setting but with normality violated for the mediator, the bias is about the same (10%) but the nRMSE is 35% (Figure 1b). The nRMSE is also larger (24%) when normality is violated for the outcome (Figure 1c).

Also, the bias is smaller in cases with a stronger predictor of compliance. In cases with a covariate with a strong effect size (OR of 0.1), the biases are about half what they are with a covariate with a medium effect size. In the same setting (with a covariate with a medium effect size), the bias is less than 5% when normality is met (Figure 1a), and the bias is the same when normality is not met for the mediator or the outcome (Figure 1b and Figure 1c).

In summary, when normality is met and a strong predictor of compliance exists, the bias due to the relatively smaller deviation from ER (one fourth of the complier average effect) may be negligible given that the bias is less than 5% of the true value. However, when normality is violated for either the mediator or outcome, the nRMSE becomes larger, which will result in large standard errors.

The statistical power for the mediated ITT effect (δ(1)) under different sample sizes and distributions of the mediator and the outcome is shown in Figure 2. The figure illustrates that statistical power to detect the mediated ITT effect is greatly influenced by whether normality holds (Figure 2a). For example, if normality holds, statistical power is greater than 0.8 regardless of whether the covariate has a strong or a small impact. If normality in the mediator does not hold, statistical power ranges from 0.4 (sample size of 200) to 0.9 (sample size of 600) (Figure 2b). Statistical power does not appear to differ if normality in the outcome does not hold. In summary, statistical power to detect the mediated ITT effect reaches a desirable level if normality holds, even with a small sample size (N = 200).

Figure 2: Statistical power of δ(1)

Note. 1) True effect: δ(1) = 1. 2) The results for ζ(0) are similar to the ones for δ(1). Given the similarity, we present the results for ζ(0) in the e-Appendix.

6 Joint Modeling-based Sensitivity Analysis

In this section, we propose sensitivity analyses that assess the robustness of results to a possible violation of ER for never takers and of LSI. We focus on these two assumptions because the identification of the mediated and unmediated ITT effects relies crucially on them. The proposed sensitivity analyses can be employed when investigating a mediating mechanism in any randomized experiment that suffers from treatment noncompliance and in which access to the treatment is prohibited for those who are assigned to the control condition.

Sensitivity analysis for ER for never takers. The ER assumption for never takers requires that there is no effect of the assigned treatment on the mediator (or on the outcome) and, hence, the treatment effect is zero for never takers. As shown in our simulation study, the impact of violation of ER is smaller if there is a strong predictor of compliance and the normality assumption is met. However, the validity of results may still be questioned if these modeling assumptions do not hold and/or the degree to which ER is violated could be severe.

Although many sensitivity analyses have been developed for ER, very few sensitivity analyses are available for a mediation setting. For example, an alternative sensitivity analysis technique has been developed by Park and Kürüm [8] on the basis of the IV-based method. This technique involves specifying a ratio of the predicted outcome (mediator) value given Z=1 to the predicted outcome (mediator) value given Z=0 among never takers relative to a corresponding ratio among compliers. This approach is similar to our proposed sensitivity analysis technique. However, an IV-based sensitivity analysis technique does not have any means to decrease the impact of violating ER and thus provides a relatively large range of estimates for the change in the sensitivity parameters. In contrast, our proposed sensitivity analysis technique provides a smaller range of results for the change in the sensitivity parameters when normality is met or additional covariates exist.

If ER is violated, we can no longer assume that the distributions of the mediator and outcome among never takers are the same under either assignment. Therefore, our sensitivity parameters are based on the expected differences in the mediator and outcome among never takers between those who are assigned to the treatment and control conditions. Specifically, let ϵm be the expected difference in the mediator value among never takers between those who are assigned to the treatment and control conditions, given covariates. Let ϵy1 + ϵy2m be the expected difference in the outcome value among never takers between those who are assigned to the treatment and control conditions, given covariates, for every m ∈ 𝓜. Formally,

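$$\epsilon_m = E[M(1) - M(0) \mid P = n, X = x] \quad\text{and}\quad \epsilon_{y1} + \epsilon_{y2}\, m = E[Y(1, m) - Y(0, m) \mid P = n, X = x], \quad\text{for all } m \in \mathcal{M} \text{ and } x \in \mathcal{X}.$$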

Suppose that ER is violated but other assumptions are met. Then, given particular values of ϵm, ϵy1, and ϵy2, the mediated and unmediated ITT effects are identified, respectively, as

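$$\delta(z) = \tilde{\pi}_c\{\tilde{\alpha}_{cz} \times (\tilde{\beta}_{cm} + \tilde{\beta}_{czm} z)\} + \tilde{\pi}_n\{\epsilon_m \times (\tilde{\beta}_{nm} + \epsilon_{y2} z)\} \quad\text{and}\quad \zeta(z) = \tilde{\pi}_c\{\tilde{\beta}_{cz} + \tilde{\beta}_{czm}(\tilde{\alpha}_c + \tilde{\alpha}_{cz} z)\} + \tilde{\pi}_n\{\epsilon_{y1} + \epsilon_{y2}(\tilde{\alpha}_n + \epsilon_m z)\}, \tag{8}$$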

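where $\tilde{\pi}_c$, $\tilde{\pi}_n$, $\tilde{\alpha}_n$, $\tilde{\alpha}_{cz}$, $\tilde{\beta}_{cz}$, $\tilde{\beta}_{cm}$, $\tilde{\beta}_{nm}$, and $\tilde{\beta}_{czm}$ are obtained from the maximized complete-data likelihood given particular values of ϵm, ϵy1, and ϵy2. The proof of this result is provided in Appendix B.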
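The identification result in equation (8) amounts to plugging the fitted parameters and the chosen sensitivity parameters into two closed-form expressions. A minimal sketch follows; the function name, argument names, and numeric inputs are hypothetical, and in practice the tilde parameters would come from refitting the joint model at each value of ϵm, ϵy1, and ϵy2.

```python
def itt_effects_under_er_violation(z, pi_c, pi_n,
                                   alpha_c, alpha_cz, alpha_n,
                                   beta_cz, beta_cm, beta_czm, beta_nm,
                                   eps_m, eps_y1, eps_y2):
    """Return (delta(z), zeta(z)) following the structure of equation (8).

    pi_c, pi_n : proportions of compliers and never takers
    alpha_*    : mediator-model parameters
    beta_*     : outcome-model parameters
    eps_*      : sensitivity parameters for the ER violation among never takers
    """
    # Mediated ITT effect: complier part plus never-taker part.
    delta = (pi_c * alpha_cz * (beta_cm + beta_czm * z)
             + pi_n * eps_m * (beta_nm + eps_y2 * z))
    # Unmediated ITT effect: complier part plus never-taker part.
    zeta = (pi_c * (beta_cz + beta_czm * (alpha_c + alpha_cz * z))
            + pi_n * (eps_y1 + eps_y2 * (alpha_n + eps_m * z)))
    return delta, zeta

# Hypothetical parameter values, for illustration only.
params = dict(pi_c=0.5, pi_n=0.5, alpha_c=0.0, alpha_cz=1.0, alpha_n=0.0,
              beta_cz=0.5, beta_cm=2.0, beta_czm=0.0, beta_nm=2.0)

# ER holds: all sensitivity parameters are zero.
delta_er, zeta_er = itt_effects_under_er_violation(
    z=1, eps_m=0.0, eps_y1=0.0, eps_y2=0.0, **params)

# ER violated in the mediator only.
delta_viol, zeta_viol = itt_effects_under_er_violation(
    z=1, eps_m=0.25, eps_y1=0.0, eps_y2=0.0, **params)
```

When all sensitivity parameters are zero, the never-taker terms vanish and the expressions reduce to the complier effects scaled by the complier proportion.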

Sensitivity analysis for LSI. The first part of assumption 2 states that among compliers, there is no unmeasured confounding in the mediator and outcome relationship given baseline covariates. In many cases, the more covariates we observe, the more plausible the assumption is. However, we may not be able to measure all the covariates to remove confounding between the mediator and outcome among compliers. Many studies have addressed this issue of unmeasured mediator and outcome confounding when perfect compliance was assumed (e.g., [1, 2, 29, 30]). However, very few studies have addressed this issue when perfect compliance was not assumed. The previous study based on the IV-based method [8] examined the sensitivity of the results to the violation of LSI by assuming the worst case scenario. In this study, we provide a systematic sensitivity analysis technique that can be used for all possible scenarios of unobserved confounding between the mediator and the outcome.

Imai et al. [1] identified the ACME given a value of the correlation between two error terms obtained from the mediator and outcome models when perfect compliance to the treatment was assumed. However, we cannot apply this approach in the presence of treatment noncompliance because the previously developed IV-based method does not provide any information on individual compliance status. Unlike the IV-based method, the joint modeling method provides the probability of an individual being a complier and this information can be used to assess the sensitivity to a possible violation of LSI.

Development of the sensitivity analysis for LSI relies on using an individual’s probability of being a complier as a weight to create a pseudo-population of compliers. The term “pseudo-population” is often used in survey sampling to describe a population that mimics the original population by replicating sample units according to their probability of being sampled. Here, we define the pseudo-population as the original population of compliers, which is only partially observed. For the treatment group, those who attended the job training are assigned a weight of 1, and those who did not attend are assigned a weight of 0, because the probability of being a complier is known without error under strong monotonicity. For the control group, we cannot uniquely identify the compliance type of each individual because it is not observed; yet we can create a weighted sample based on the probability of being a complier. Each individual is assigned a weight of πc(x)/πc, where πc(x) is the probability of being a complier given pretreatment covariates from equation (6) and πc is the proportion of compliers. With πc(x) in the numerator, those who have a high chance of being a complier receive more weight and those who have a low chance receive less. Dividing by the proportion of compliers (πc) recovers the total sample size of the control group. For example, an individual in the control group with a complier probability of πc(x) = 0.8 will be replicated 0.8/0.5 = 1.6 times (when πc = 0.5), delivering 1.6 clones for the pseudo-population. The same logic was used in Ding and Lu [27].
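The weighting scheme described above can be sketched as follows. The function and argument names are hypothetical, and the complier probabilities are made-up inputs standing in for the fitted values from equation (6).

```python
import numpy as np

def complier_weights(assigned, attended, pi_c_given_x, pi_c):
    """Weights that build a pseudo-population of compliers.

    assigned     : 1 if assigned to treatment, 0 if assigned to control
    attended     : 1 if the training was attended (only used when assigned == 1)
    pi_c_given_x : estimated probability of being a complier given covariates
    pi_c         : overall proportion of compliers
    """
    assigned = np.asarray(assigned)
    attended = np.asarray(attended, dtype=float)
    pi_c_given_x = np.asarray(pi_c_given_x, dtype=float)
    # Treatment arm: compliance is observed, so the weight is 1 or 0.
    # Control arm: weight by the complier probability, rescaled by pi_c.
    return np.where(assigned == 1, attended, pi_c_given_x / pi_c)

# Two treated subjects (one attended, one did not) and two control subjects.
weights = complier_weights(
    assigned=[1, 1, 0, 0],
    attended=[1, 0, 0, 0],
    pi_c_given_x=[0.9, 0.2, 0.8, 0.2],
    pi_c=0.5,
)
```

The third subject reproduces the example in the text: a control-group member with a complier probability of 0.8 receives a weight of 0.8/0.5 = 1.6.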

Based on this pseudo-population of compliers, the sensitivity of the results will be examined across the varying values of the correlation between the errors obtained from the mediator and the outcome models as in Imai et al. [1].

Suppose that LSI is violated, but the other assumptions are met. Let the correlation between the error terms from the mediator and outcome models fitted among the pseudo-population of compliers be denoted as ρc. Then, given a value of ρc, the mediated and unmediated ITT effects are identified as

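$$\delta(z) = \pi_c\,\alpha_{cz}\left\{\frac{\sigma_{c1}}{\sigma_{c2}}\left(\tilde{\rho}_{cz} - \rho_c\sqrt{\frac{1 - \tilde{\rho}_{cz}^{2}}{1 - \rho_c^{2}}}\right)\right\} \quad\text{and}\quad \zeta(z) = \mathrm{ITT} - \delta(z),$$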

where z′ = 1 − z for z ∈ {0, 1}. The term $\tilde{\rho}_{cz}$ is the correlation between the error terms $e_{c1,i}$ and $e_{c2,i}$ (from equations (5)) when Zi = z; and σc1 and σc2 are the standard deviations of these error terms, respectively, which are fixed to be constant across the values of Zi. The proof of this result is given in Appendix C.
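To illustrate how this result is used, the sketch below traces the mediated ITT effect over a grid of ρc values. It assumes the expression takes the form πc αcz (σc1/σc2){ρ̃cz − ρc sqrt((1 − ρ̃cz²)/(1 − ρc²))}, in line with Imai et al. [1] and Appendix C; the function name and all numeric inputs are hypothetical.

```python
import math

def mediated_itt_given_rho(rho_c, pi_c, alpha_cz, sigma_c1, sigma_c2, rho_tilde_cz):
    """Mediated ITT effect delta(z) for a fixed sensitivity correlation rho_c."""
    if not -1.0 < rho_c < 1.0:
        raise ValueError("rho_c must lie strictly between -1 and 1")
    scale = math.sqrt((1.0 - rho_tilde_cz ** 2) / (1.0 - rho_c ** 2))
    # Implied mediator coefficient in the outcome model among compliers.
    beta = (sigma_c1 / sigma_c2) * (rho_tilde_cz - rho_c * scale)
    return pi_c * alpha_cz * beta

# Hypothetical inputs, chosen only to illustrate the shape of the curve.
inputs = dict(pi_c=0.5, alpha_cz=1.0, sigma_c1=2.0, sigma_c2=1.0, rho_tilde_cz=0.3)
rho_grid = [round(-0.9 + 0.1 * k, 1) for k in range(19)]  # -0.9, -0.8, ..., 0.9
curve = {rho: mediated_itt_given_rho(rho, **inputs) for rho in rho_grid}
```

Two features of the curve are worth noting: at ρc = 0 the expression reduces to the estimate obtained under LSI, and the effect is exactly zero when ρc equals the observed correlation ρ̃cz.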

7 Application to Jobs II Study

Our question of interest is whether the effect of the JOBS II intervention on reducing job-seekers’ depression is transmitted through increased sense of mastery. To answer this question, we estimate the mediated and unmediated portion of the ITT effect via sense of mastery using the proposed joint modeling method. We then show how the sensitivity of the estimated mediated and unmediated ITT effects to the violation of ER and LSI can be investigated using the results from the previous section.

Results. Table 2 shows the estimates of the mediated and unmediated ITT effects given assumptions 1-4. The difference in the outcome between treatment and control subjects, -0.07, estimates the ITT effect. The mediated portions of the ITT effect under the treatment and control conditions are negative and significant (-0.03 and -0.04), accounting for 43.1% and 61.1% of the ITT effect, respectively. In contrast, the unmediated ITT effects under the treatment and control conditions are not significant. This implies that, under assumptions 1-4, the mechanism through which the job training affects job-seekers’ depression includes an enhanced sense of mastery.

Table 2: Estimates of the mediated and unmediated ITT effects

| Complier effects: Parameter | Est. | S.E. | P-Value | ITT effects: Parameter | Est. | S.E. | P-Value |
|---|---|---|---|---|---|---|---|
| δc(1) | −0.057 | 0.020 | 0.005 | δ(1) | −0.031 | 0.011 | 0.005 |
| δc(0) | −0.081 | 0.032 | 0.012 | δ(0) | −0.044 | 0.017 | 0.012 |
| ζc(1) | −0.053 | 0.054 | 0.324 | ζ(1) | −0.029 | 0.029 | 0.322 |
| ζc(0) | −0.077 | 0.066 | 0.244 | ζ(0) | −0.041 | 0.035 | 0.242 |
| CACE | −0.134 | 0.069 | 0.052 | ITT | −0.072 | 0.037 | 0.050 |

Note. Est. = estimates; S.E. = standard errors; CACE = complier average causal effect; ITT = intention-to-treat effect.
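The shares of the ITT effect attributed to the mediator follow directly from the rounded estimates in Table 2. The short calculation below reproduces that arithmetic; the variable names are illustrative.

```python
# Rounded point estimates from Table 2.
itt = -0.072
mediated = {"delta(1)": -0.031, "delta(0)": -0.044}

# Mediated effect as a percentage of the ITT effect.
shares = {name: round(100 * value / itt, 1) for name, value in mediated.items()}
```

The resulting shares are 43.1% for δ(1) and 61.1% for δ(0), matching the percentages reported in the text.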

However, for a valid causal interpretation of the estimates, it is crucial to examine the sensitivity of the estimates to a violation of the identification assumptions. We require randomization, strong monotonicity, ER for never takers, and LSI. Randomization is satisfied because job training is assigned randomly. Strong monotonicity is also guaranteed because the program protocol prohibits subjects in the control group from accessing the job search seminar. However, ER for never takers is controversial. ER might be violated due to psychological effects. For example, some participants who were assigned to the job training but failed to attend (never takers) may feel more depressed, which violates ER. Another controversial assumption is LSI because there could be unobserved confounding between sense of mastery and depression given the treatment level and pretreatment covariates. Therefore, we conduct sensitivity analyses for ER and LSI.

Sensitivity analysis for ER. In our study, we assume that this psychological effect is unlikely to be large because never takers did not actually attend the training. Hence, we limit the violation of ER to be at most half the size of the complier average effect. The sensitivity parameters of ϵm, ϵy1, and ϵy2 were given a value of one fourth (0.25) or half (0.5) the size of the corresponding complier average effect.

Table 3 shows the adjusted estimates of the mediated ITT effect for varying values of ϵm, ϵy1, and ϵy2. It appears that the mediated ITT effect for those who are assigned to the treatment (δ(1)) is robust to the violation of ER with respect to both the mediator and the outcome. For example, the mediated ITT effect for those who are assigned to the treatment is still negative and significant when the treatment effect among never takers is half the size of the corresponding complier average effect for either the mediator or the outcome model (ϵm = 0.5, or ϵy1 = ϵy2 = 0.5). In contrast, the mediated ITT effect for those who are assigned to the control (δ(0)) is relatively vulnerable to the violation of ER with respect to both the mediator and the outcome. The mediated ITT effect for those who are assigned to the control is still negative but loses its significance when the treatment effect among never takers is one fourth the size of the complier average effect for either the mediator or the outcome model (ϵm = 0.25 or ϵy1 = ϵy2 = 0.25).

Table 3: Sensitivity of estimates with the deviation from the ER

| ϵM | δ(1), ϵY = 0 × c.e. | δ(0), ϵY = 0 × c.e. | δ(1), ϵY = 0.25 × c.e. | δ(0), ϵY = 0.25 × c.e. | δ(1), ϵY = 0.5 × c.e. | δ(0), ϵY = 0.5 × c.e. |
|---|---|---|---|---|---|---|
| 0 × c.e. | -0.031** (0.011) | -0.044* (0.017) | -0.026** (0.009) | -0.033* (0.016) | -0.025* (0.010) | -0.026 (0.015) |
| 0.25 × c.e. | -0.032** (0.009) | -0.031* (0.014) | -0.031** (0.010) | -0.026 (0.016) | -0.030** (0.010) | -0.018 (0.013) |
| 0.5 × c.e. | -0.037** (0.010) | -0.025 (0.014) | -0.037** (0.011) | -0.022 (0.018) | -0.036** (0.011) | -0.014 (0.013) |

Note. 1) Standard errors are in parentheses. 2) c.e. = corresponding complier-average effect. 3) **: p<0.01, and *: p<0.05.

Sensitivity analysis for LSI. We next examine whether our conclusion about the mediated ITT effect changes if there are unmeasured pre-treatment covariates that confound the mediator and outcome relationship among compliers, while assuming that the other assumptions are satisfied.

Figures 3a and 3b show the sensitivity of the mediated ITT effect estimates under treatment and control conditions, respectively, to the violation of the LSI while assuming other assumptions are satisfied. These figures show how the change in ρc affects the mediated ITT effect estimates. The sensitivity parameter ρc represents the correlation among compliers between the errors obtained from the mediator and outcome model, and a non-zero value of ρc indicates the existence of unmeasured confounding among compliers in the mediator and outcome relationship. The bold line in the middle represents the changed mediated ITT effect estimates depending on the value of ρc, and the solid lines represent the lower and upper values of 95% confidence intervals.

Figure 3: Sensitivity of the mediated or unmediated portions of ITT effects to the violation of LSI

As shown in Figure 3a, the mediated ITT effect estimate for those who are assigned to the control will be close to zero if ρc is -0.4. However, the 95% confidence interval of the effect estimate will cover zero with a smaller value of ρc, which is -0.3. This value of ρc is equivalent to the amount of confounding that explains the variances of mediator and outcome, for example, by 25% and 36%, respectively [3]. This amount of confounding can be considered very large given that the strongest covariate (i.e., pre-measured depression) in the existing model explains the variances of mediator and outcome by 5.8% and 16.8%, respectively.

As shown in Figure 3b, the mediated ITT effect estimate for those who are assigned to the treatment will be zero if ρc is -0.3. However, the 95% confidence interval of the effect estimate will cover zero if ρc is -0.2, which is equivalent to the amount of confounding that explains the variances of mediator and outcome, for example, by 16% and 25%, respectively [4]. This amount of confounding can still be considered very large.
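The translation between ρc and explained variance used in the two paragraphs above can be checked numerically. The sketch assumes the relationship from the framework of Imai et al. [1], in which the squared correlation equals the product of the shares of (otherwise unexplained) mediator and outcome variance attributable to the unobserved confounder; the function name is hypothetical.

```python
import math

def rho_from_r2(r2_mediator, r2_outcome, sign=-1):
    """Correlation implied by a confounder explaining the given variance shares.

    Assumes rho^2 = r2_mediator * r2_outcome; `sign` sets the direction of
    the confounding (negative here, as in the application).
    """
    return sign * math.sqrt(r2_mediator * r2_outcome)

rho_control = rho_from_r2(0.25, 0.36)    # shares quoted for rho_c = -0.3
rho_treated = rho_from_r2(0.16, 0.25)    # shares quoted for rho_c = -0.2
rho_observed = rho_from_r2(0.058, 0.168) # strongest observed covariate
```

Under this assumption, a confounder as strong as the strongest observed covariate (5.8% and 16.8%) would correspond to a correlation of only about -0.1 in magnitude, well short of the values at which the confidence intervals begin to cover zero.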

In summary, the significant mediation effect under the treatment condition is robust to a potential violation of ER and to a potential violation of LSI when the other assumptions are satisfied. The mediation effect under the control condition may lose its significance if the effect among never takers is as large as one fourth of the corresponding complier-average effect; however, it is robust to a potential violation of LSI when the other assumptions are met. For these sensitivity analyses, we used Mplus [28]. Annotated Mplus code can be found in the online appendix.

8 Summary and Conclusions

In this article, we proposed a two-stage joint modeling method that combines a mediation analysis with a mixture analysis to conduct causal mediation analysis in the presence of treatment noncompliance. On the basis of the mediation analysis, the mediator and outcome models can be specified and estimated. On the basis of the mixture analysis, the compliance-specific parameters can be specified and estimated, considering the mixed distributions of compliers and never takers.

One useful feature of the joint modeling method is that it is conducive to conducting sensitivity analyses to the violation of identification assumptions. In this study, we offer a systematic sensitivity analysis that addresses the two identification assumptions (the ER and LSI), which was not available in the previous instrumental variables approach. Sensitivity analysis is an important component in any causal inference framework because many identification assumptions are not verifiable with empirical data. The proposed sensitivity analysis can be easily used by applied researchers to test their results against violation of these identification assumptions.

Another useful feature of the joint modeling method is that we can invoke modeling assumptions, such as normality or the existence of strong predictors of compliance, that decrease the sensitivity to violations of some identification assumptions such as the ER. In the context of CACE, including a strong predictor of compliance can decrease the bias due to violation of the ER and increase the precision of the estimates. We demonstrate in our simulation study that these benefits also apply when estimating the mediated ITT effect. Normality also plays a role in estimating compliance type more precisely, and the simulation study suggests that estimating compliance type is more affected by the outcome distribution than the mediator distribution.

However, these benefits come with a cost. From the simulation study, we observe a large variance in the estimates if normality is violated, even when all identification assumptions are met. If normality is violated, the advantages of the proposed joint modeling method disappear. In this case, one should consider using a propensity score method, as suggested by Jo and Stuart [25] and Ding and Lu [27], which relies only on pre-treatment covariates to identify unobserved compliance types and thus reduces reliance on particular parametric assumptions such as normality.

In this article, we introduced a two-stage joint modeling method to estimate the mediated and unmediated portion of the ITT effect and demonstrated the benefits of employing this method through simulation and case studies. A next logical step for future research is to compare relative performance between the proposed joint modeling method and the previous approach using the IV method [7, 8]. Unlike the joint modeling method, the IV method does not require modeling assumptions and hence, the identification of the mediated ITT effect relies only on identification assumptions. Comparing relative performance when modeling assumptions are met or not met would be an interesting subject for future study.

References

[1] K. Imai, L. Keele, and D. Tingley. A general approach to causal mediation analysis. Psychological Methods, 15:309–334, 2010. doi:10.1037/a0020761

[2] K. Imai and T. Yamamoto. Identification and sensitivity analysis for multiple causal mechanisms: Revisiting evidence from framing experiments. Political Analysis, 21:141–171, 2013. doi:10.1093/pan/mps040

[3] T. VanderWeele. Explanation in causal inference: methods for mediation and interaction. Oxford University Press, 2015.

[4] Johan Steen, Tom Loeys, Beatrijs Moerkerke, and Stijn Vansteelandt. Flexible mediation analysis with multiple mediators. American journal of epidemiology, 186(2):184–193, 2017. doi:10.1093/aje/kwx051

[5] Guido W Imbens and Donald B Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015. doi:10.1017/CBO9781139025751

[6] J. Pearl. The causal mediation formula—a guide to the assessment of pathways and mechanisms. Prevention Science, 13(4):426–436, 2012. doi:10.21236/ADA557663

[7] T. Yamamoto. Identification and estimation of causal mediation effects with treatment noncompliance. Unpublished manuscript, 2014.

[8] Soojin Park and Esra Kürüm. Causal mediation analysis with multiple mediators in the presence of treatment noncompliance. Statistics in medicine, 37(11):1810–1829, 2018. doi:10.1002/sim.7632

[9] J. L. Zhang, D. B. Rubin, and F. Mealli. Likelihood-based analysis of causal effects via principal stratification: new approach to evaluating job-training programs. Journal of the American Statistical Association, 104:166–176, 2009. doi:10.1198/jasa.2009.0012

[10] Peng Ding, Zhi Geng, Wei Yan, and Xiao-Hua Zhou. Identifiability and estimation of causal effects by principal stratification with outcomes truncated by death. Journal of the American Statistical Association, 106(496):1578–1591, 2011. doi:10.1198/jasa.2011.tm10265

[11] Booil Jo, Tihomir Asparouhov, Bengt O Muthén, Nicholas S Ialongo, and C Hendricks Brown. Cluster randomized trials with treatment noncompliance. Psychological methods, 13(1):1–18, 2008. doi:10.1037/1082-989X.13.1.1

[12] Amiram D Vinokur, Michelle Van Ryn, Edward M Gramlich, and Richard H Price. Long-term follow-up and benefit-cost analysis of the jobs program: a preventive intervention for the unemployed. Journal of Applied Psychology, 76(2):213–219, 1991. doi:10.1037/0021-9010.76.2.213

[13] R. Catalano and C. D. Dooley. Economic predictors of depressed mood and stressful life events in a metropolitan community. Journal of Health and Social Behavior, 18:292–307, 1977. doi:10.2307/2136355

[14] Sidney Cobb and Stanislav V Kasl. Termination; the consequences of job loss, volume 77. NIOSH, 1977.

[15] R. Catalano. The health effects of economic insecurity. American Journal of Public Health, 81(9):1148–1152, 1991. doi:10.2105/AJPH.81.9.1148

[16] R. J. Little and L. H. Y. Yau. Statistical techniques for analyzing data from prevention trials: Treatment of no-shows using Rubin’s causal model. Psychological Methods, 3(2):147–159, 1998. doi:10.1037/1082-989X.3.2.147

[17] C. E. Frangakis and D. B. Rubin. Principal stratification in causal inference. Biometrics, 58(1):21–29, 2002. doi:10.1111/j.0006-341X.2002.00021.x

[18] L. H. Y. Yau and R. J. Little. Inference for the complier-average causal effect from longitudinal data subject to noncompliance and missing data, with application to a job training assessment for the unemployed. Journal of the American Statistical Association, 96(456):1232–1244, 2001. doi:10.1198/016214501753381887

[19] R. D. Caplan, A. D. Vinokur, R. H. Price, and M. Van Ryn. Job seeking, reemployment, and mental health: a randomized field experiment in coping with job loss. Journal of applied psychology, 74(5):759–769, 1989. doi:10.1037/0021-9010.74.5.759

[20] Michael E Sobel and Bengt Muthén. Compliance mixture modelling with a zero-effect complier class and missing data. Biometrics, 68(4):1037–1045, 2012. doi:10.1111/j.1541-0420.2012.01791.x

[21] R. H. Price, M. Van Ryn, and A. D. Vinokur. Impact of a preventive job search intervention on the likelihood of depression among the unemployed. Journal of Health and Social Behavior, 33:158–167, 1992. doi:10.2307/2137253

[22] L. R. Derogatis, R. S. Lipman, K. Rickels, E. H. Uhlenhuth, and L. Covi. The Hopkins Symptom Checklist (HSCL): A self-report symptom inventory. Systems Research and Behavioral Science, 19(1):1–15, 1974. doi:10.1002/bs.3830190102

[23] J. D. Angrist, G. W. Imbens, and D. B. Rubin. Identification of causal effects using instrumental variables. Journal of the American statistical Association, 91(434):444–455, 1996. doi:10.3386/t0136

[24] Edward Bein, Jonah Deutsch, Guanglei Hong, Kristin E Porter, Xu Qin, and Cheng Yang. Two-step estimation in ratio-of-mediator-probability weighted causal mediation analysis. Statistics in medicine, 37(8):1304–1324, 2018. doi:10.1002/sim.7581

[25] Booil Jo and Elizabeth A Stuart. On the use of propensity scores in principal causal effect estimation. Statistics in medicine, 28(23):2857–2875, 2009. doi:10.1002/sim.3669

[26] Elizabeth A Stuart and Booil Jo. Assessing the sensitivity of methods for estimating principal causal effects. Statistical methods in medical research, 24(6):657–674, 2015. doi:10.1177/0962280211421840

[27] Peng Ding and Jiannan Lu. Principal stratification analysis using principal scores. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(3):757–777, 2017. doi:10.1111/rssb.12191

[28] Linda K Muthén, Bengt O Muthén, et al. Mplus (version 5.1). Los Angeles, CA: Muthén & Muthén, 2008.

[29] Guanglei Hong, Xu Qin, and Fan Yang. Weighting-based sensitivity analysis in causal mediation studies. Journal of Educational and Behavioral Statistics, 43(1):32–56, 2018. doi:10.3102/1076998617749561

[30] T. J. VanderWeele. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology (Cambridge, Mass.), 21(4):540–551, 2010. doi:10.1097/EDE.0b013e3181df191c

9 Appendix A: Identification of δ(z) and ζ (z)

The mediated and unmediated portions of the ITT effect are identified on the basis of CACME and CANDE, respectively. Therefore, we first identify the CACME and CANDE. From equations (5), note that the parameters in the second line of equations (5) are identified under randomization because ec2(z) ⊥ Z | X = x holds. The parameters in the third line of equations (5) are identified under randomization and LSI because ec3(z, m) ⊥ Z | X = x and ec3(z, m) ⊥ M | Z = z′, X = x, P = c. Given these parameters, the CACME is identified as

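$$
\begin{aligned}
\delta_c(z) &= E[Y_i(z, M_i(1)) - Y_i(z, M_i(0)) \mid P_i = c] \\
&= E[\beta_{c,i} + \beta_{cz,i} z + \beta_{cm,i}\{M_i(1) \mid P_i = c\} + \beta_{czm,i} z \{M_i(1) \mid P_i = c\} \\
&\qquad - \beta_{c,i} - \beta_{cz,i} z - \beta_{cm,i}\{M_i(0) \mid P_i = c\} - \beta_{czm,i} z \{M_i(0) \mid P_i = c\}] \\
&= E[(\beta_{cm,i} + \beta_{czm,i} z)\{M_i(1) - M_i(0) \mid P_i = c\}] \\
&= E[(\beta_{cm,i} + \beta_{czm,i} z)\,\alpha_{cz,i}] \\
&= (\beta_{cm} + \beta_{czm} z)\,\alpha_{cz}.
\end{aligned}
\tag{A-1}
$$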

The first equality is from the definition of CACME. The second equality holds after incorporating the outcome model (i.e., the third line of equations (5)). The fourth equality holds after incorporating the mediator model (i.e., second line of equations (5)). The fifth equality holds due to LSI (assumption 2). Specifically, given compliers, Yi(z, m) − Yi(z, m′) = βcm,i + βczm,iz is independent from Mi(z) for any z ∈ {0, 1} as in LSI.

Likewise, the CANDE is identified as

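$$
\begin{aligned}
\zeta_c(z) &= E[Y_i(1, M_i(z)) - Y_i(0, M_i(z)) \mid P_i = c] \\
&= E[\beta_{c,i} + \beta_{cz,i} + \beta_{cm,i}\{M_i(z) \mid P_i = c\} + \beta_{czm,i}\{M_i(z) \mid P_i = c\} - \beta_{c,i} - \beta_{cm,i}\{M_i(z) \mid P_i = c\}] \\
&= E[\beta_{cz,i} + \beta_{czm,i}\{M_i(z) \mid P_i = c\}] \\
&= E[\beta_{cz,i} + \beta_{czm,i}(\alpha_{c,i} + \alpha_{cz,i} z)] \\
&= \beta_{cz} + \beta_{czm}(\alpha_c + \alpha_{cz} z).
\end{aligned}
\tag{A-2}
$$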

The first equality is from the definition of CANDE. The second equality holds after incorporating the outcome model (i.e., third line of equations (5)). The fourth equality holds after incorporating the mediator model (i.e., second line of equations (5)). The fifth equality holds due to LSI (assumption 2). Specifically, given compliers, $Y_i(1, m) - Y_i(0, m) = \beta_{cz,i} + \beta_{czm,i}\, m$ is independent from Mi(z) for any z ∈ {0, 1} as in LSI.

Next, we identify the mediated and unmediated ITT effects on the basis of CACME and CANDE as

$$\delta(z) = \delta_c(z)\pi_c + \delta_n(z)\pi_n = \delta_c(z)\pi_c, \quad\text{and}\quad \zeta(z) = \zeta_c(z)\pi_c + \zeta_n(z)\pi_n = \zeta_c(z)\pi_c,$$

where δn and ζn are ACME and average natural direct effects among never takers, respectively. The first equality holds because of strong monotonicity. The second equality holds because of ER for never takers. This completes the proof.
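As a rough numerical illustration of this scaling, the ratio of each ITT estimate to its complier counterpart in Table 2 should be approximately the same number, namely the estimated proportion of compliers. That proportion is not reported in this part of the text, so the ratios below are only implied values computed from rounded estimates; the variable names are illustrative.

```python
# Rounded point estimates from Table 2 (complier effects and ITT effects).
complier = {"delta(1)": -0.057, "delta(0)": -0.081,
            "zeta(1)": -0.053, "zeta(0)": -0.077, "total": -0.134}
itt = {"delta(1)": -0.031, "delta(0)": -0.044,
       "zeta(1)": -0.029, "zeta(0)": -0.041, "total": -0.072}

# If each ITT effect equals the complier effect times pi_c, every ratio
# should be close to the same implied complier proportion.
implied_pi_c = {name: itt[name] / complier[name] for name in complier}
```

All five ratios fall between roughly 0.53 and 0.55, which is consistent with a common scaling factor up to rounding.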

10 Appendix B: Sensitivity analysis for ER

Our sensitivity parameters depend on the deviation from the ER as

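$$\epsilon_m = E[M(1) - M(0) \mid P = n, X = x] \quad\text{and}\quad \epsilon_{y1} + \epsilon_{y2}\, m = E[Y(1, m) - Y(0, m) \mid P = n, X = x] \quad\text{for all } m \in \mathcal{M} \text{ and } x \in \mathcal{X}.$$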

This implies that among never takers, the parameter for the treatment on the mediator is fixed to αnz = ϵm, and the parameters for the treatment on the outcome among never takers are fixed to βnz = ϵy1 and βnzm = ϵy2. Given particular values of ϵm, ϵy1, and ϵy2, we can rewrite linear structural models as

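$$
\begin{aligned}
M_i(z) &= N_i \tilde{\alpha}_{n,i} + C_i \tilde{\alpha}_{c,i} + N_i \epsilon_m z + C_i \tilde{\alpha}_{cz,i} z + \tilde{\alpha}_{x,i} X_i + N_i e_{n2,i} + C_i e_{c2,i}, \text{ and} \\
Y_i(z, m) &= N_i \tilde{\beta}_{n,i} + C_i \tilde{\beta}_{c,i} + N_i \epsilon_{y1} z + C_i \tilde{\beta}_{cz,i} z + N_i \tilde{\beta}_{nm,i} m + C_i \tilde{\beta}_{cm,i} m \\
&\qquad + N_i \epsilon_{y2} z m + C_i \tilde{\beta}_{czm,i} z m + \tilde{\beta}_{x,i} X_i + N_i e_{n3,i} + C_i e_{c3,i},
\end{aligned}
\tag{A-3}
$$

where $\tilde{\alpha}_{p,i}$, $\tilde{\beta}_{p,i}$, $\tilde{\beta}_{pm,i}$, and $\tilde{\beta}_{x,i}$ for $p \in \{c, n\}$ are obtained from the maximized complete-data likelihood given particular values of ϵm, ϵy1, and ϵy2. We define $\tilde{\alpha}_p \equiv E(\tilde{\alpha}_{p,i})$, $\tilde{\alpha}_{pz} \equiv E(\tilde{\alpha}_{pz,i})$, $\tilde{\alpha}_x \equiv E(\tilde{\alpha}_{x,i})$, $\tilde{\beta}_p \equiv E(\tilde{\beta}_{p,i})$, $\tilde{\beta}_{pz} \equiv E(\tilde{\beta}_{pz,i})$, $\tilde{\beta}_{pzm} \equiv E(\tilde{\beta}_{pzm,i})$, and $\tilde{\beta}_x \equiv E(\tilde{\beta}_{x,i})$ for $p \in \{c, n\}$.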

Based on equations (A-3), the ACME among never takers given particular values of ϵm,ϵy1, and ϵy2(δnϵ(z))is identified as

(A-4)δnϵ(z)=E[Yi(z,Mi(1))Yi(z,Mi(0))|Pi=n]=E[β~n,i+ϵy1z+β~nm,i{Mi(1)|Pi=n}+ϵy2z{Mi(1)|Pi=n}β~n,iϵy1zβ~nm,i{Mi(0)|Pi=n}ϵy2z{Mi(0)|Pi=n}],=E[(β~nm,i+ϵy2z){Mi(1)Mi(0)|Pi=c}]=E[(β~nm,i+ϵy2z)ϵm]=(β~nm+ϵy2z)ϵm.

The first equality is due to the definition of ACME among never takers. The second equality holds after incorporating the second line of equations (A-3). The fourth equality holds after incorporating the first line of equations (A-3). The fifth equality holds because ϵm is constant. In the same way, the ANDE among never takers (ζnϵ(z))is identified as ϵy1+ϵy2(α˜n+ϵmz).

Given equations (A-3), we can also obtain CACME and CANDE. Under LSI, δcϵ(z)=α~cz×β~cm+β~czmz,as in equations (A-1), and ζcϵ(z)=β~cz+β~czmα~c+α~czz,as in equations (A-2).

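Based on $\delta_c^{\epsilon}(z)$, $\delta_n^{\epsilon}(z)$, $\zeta_c^{\epsilon}(z)$, and $\zeta_n^{\epsilon}(z)$, the mediated and unmediated ITT effects are identified, respectively, as

$$
\begin{aligned}
\delta(z) &= \tilde{\pi}_c \delta_c^{\epsilon}(z) + \tilde{\pi}_n \delta_n^{\epsilon}(z) \\
&= \tilde{\pi}_c \{\tilde{\alpha}_{cz} \times (\tilde{\beta}_{cm} + \tilde{\beta}_{czm} z)\} + \tilde{\pi}_n \{\epsilon_m \times (\tilde{\beta}_{nm} + \epsilon_{y2} z)\}, \text{ and} \\
\zeta(z) &= \tilde{\pi}_c \zeta_c^{\epsilon}(z) + \tilde{\pi}_n \zeta_n^{\epsilon}(z) \\
&= \tilde{\pi}_c \{\tilde{\beta}_{cz} + \tilde{\beta}_{czm}(\tilde{\alpha}_c + \tilde{\alpha}_{cz} z)\} + \tilde{\pi}_n \{\epsilon_{y1} + \epsilon_{y2}(\tilde{\alpha}_n + \epsilon_m z)\},
\end{aligned}
\tag{A-5}
$$

where $\tilde{\pi}_c$, $\tilde{\pi}_n$, $\tilde{\alpha}_n$, $\tilde{\alpha}_{cz}$, $\tilde{\beta}_{cz}$, $\tilde{\beta}_{cm}$, $\tilde{\beta}_{nm}$, and $\tilde{\beta}_{czm}$ are obtained from the maximized complete-data likelihood given particular values of $\epsilon_m$, $\epsilon_{y1}$, and $\epsilon_{y2}$. The first equality is due to strong monotonicity. The second equality is due to incorporating the results for $\delta_p^{\epsilon}(z)$ and $\zeta_p^{\epsilon}(z)$ for $p \in \{c, n\}$. This completes the proof.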
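The arithmetic in (A-5) can be sketched in a few lines of Python. This is only an illustration: the `Estimates` container, the function name `itt_effects`, and every number in `example` are hypothetical and do not come from the Jobs II analysis. The sketch also holds the tilde estimates fixed, whereas in the procedure described above they are re-obtained from the maximized complete-data likelihood for each choice of the sensitivity parameters.

```python
from dataclasses import dataclass


@dataclass
class Estimates:
    """Hypothetical container for the tilde estimates that enter (A-5)."""
    pi_c: float      # proportion of compliers
    pi_n: float      # proportion of never takers
    alpha_c: float   # mediator intercept, compliers
    alpha_cz: float  # treatment effect on the mediator, compliers
    alpha_n: float   # mediator intercept, never takers
    beta_cz: float   # treatment effect on the outcome, compliers
    beta_cm: float   # mediator effect on the outcome, compliers
    beta_czm: float  # treatment-by-mediator interaction, compliers
    beta_nm: float   # mediator effect on the outcome, never takers


def itt_effects(est, eps_m, eps_y1, eps_y2, z):
    """Return (delta(z), zeta(z)): mediated and unmediated ITT effects in (A-5)."""
    delta_c = est.alpha_cz * (est.beta_cm + est.beta_czm * z)
    delta_n = eps_m * (est.beta_nm + eps_y2 * z)
    zeta_c = est.beta_cz + est.beta_czm * (est.alpha_c + est.alpha_cz * z)
    zeta_n = eps_y1 + eps_y2 * (est.alpha_n + eps_m * z)
    delta = est.pi_c * delta_c + est.pi_n * delta_n
    zeta = est.pi_c * zeta_c + est.pi_n * zeta_n
    return delta, zeta


# Made-up values, for illustration only.
example = Estimates(pi_c=0.6, pi_n=0.4, alpha_c=1.0, alpha_cz=0.5, alpha_n=0.8,
                    beta_cz=0.2, beta_cm=0.3, beta_czm=0.1, beta_nm=0.25)
```

Setting all three sensitivity parameters to zero removes the never-taker terms, so the result reduces to the complier effects scaled by the proportion of compliers, which corresponds to the case in which the ER holds.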

11 Appendix C: Sensitivity analysis for LSI

For this proof, we follow the work of Imai et al. [1]. We assume homogeneous effects, as in Imai et al. [1], but extend their work by conditioning on the pseudo-population of compliers. We omit pre-treatment confounding in equation (5) for simplicity, but the result remains the same with covariates. Under randomization (assumption 1), we can consistently estimate $\gamma_c$, $\gamma_{cz}$, $\alpha_c$, and $\alpha_{cz}$, as well as a variance measure for each error term, $\sigma_{c1}$ and $\sigma_{c2}$, and the correlations between the errors, $\tilde{\rho}_{c1} = \mathrm{cor}(e_{c1,i}, e_{c2,i} \mid Z_i = 1)$ and $\tilde{\rho}_{c0} = \mathrm{cor}(e_{c1,i}, e_{c2,i} \mid Z_i = 0)$. We assume that $\sigma_{c1}$ and $\sigma_{c2}$ are constant between $Z = 1$ and $Z = 0$.

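Using equations (5), $Y_i(0, M_i(0))$ among the pseudo-population of compliers can be expressed as

$$
\begin{aligned}
Y_i(0, M_i(0)) &= \beta_c + \beta_{cm} M_i(0) + e_{c3,i} \\
&= \beta_c + \beta_{cm}(\alpha_c + e_{c2,i}) + e_{c3,i} \\
&= \beta_c + \beta_{cm}\alpha_c + e_{c3,i} + \beta_{cm} e_{c2,i}.
\end{aligned}
\tag{A-6}
$$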

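By comparing this result with $Y_i(0) = \gamma_c + e_{c1,i}$ (using the first line of equations (5)), we know that $e_{c1,i} = e_{c3,i} + \beta_{cm} e_{c2,i}$ under $Z = 0$. Let $\rho_c$ be the correlation, among the pseudo-population of compliers, between the error terms of the mediator and outcome models (the second and third lines of equations (5)). Given a value of $\rho_c$, we have $\tilde{\rho}_{c0}\sigma_{c1}\sigma_{c2} = \rho_c\sigma_{c3}\sigma_{c2} + \beta_{cm}\sigma_{c2}^2$ and $\sigma_{c1}^2 = \sigma_{c3}^2 + \beta_{cm}^2\sigma_{c2}^2 + 2\beta_{cm}\rho_c\sigma_{c3}\sigma_{c2}$. Now assume that $\rho_c \neq 0$, which indicates a violation of LSI. Then, solving these equations with respect to $\beta_{cm}$, we have

$$
\beta_{cm} = \left\{ \frac{\sigma_{c1}}{\sigma_{c2}} \left( \tilde{\rho}_{c0} - \rho_c \sqrt{\frac{1 - \tilde{\rho}_{c0}^2}{1 - \rho_c^2}} \right) \right\}.
\tag{A-7}
$$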
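The step from the covariance and variance equations to (A-7) can be spelled out as follows. This is a sketch of the intermediate algebra, using only the two equations stated above and taking the positive root for $\sigma_{c3}$, with $|\rho_c| < 1$.

$$
\begin{aligned}
\beta_{cm}\sigma_{c2} &= \tilde{\rho}_{c0}\sigma_{c1} - \rho_c\sigma_{c3}
&& \text{(covariance equation divided by } \sigma_{c2}\text{)} \\
\sigma_{c1}^2 &= \sigma_{c3}^2 + (\tilde{\rho}_{c0}\sigma_{c1} - \rho_c\sigma_{c3})^2 + 2\rho_c\sigma_{c3}(\tilde{\rho}_{c0}\sigma_{c1} - \rho_c\sigma_{c3})
&& \text{(substitute into the variance equation)} \\
&= (1 - \rho_c^2)\,\sigma_{c3}^2 + \tilde{\rho}_{c0}^2\sigma_{c1}^2 \\
\sigma_{c3} &= \sigma_{c1}\sqrt{\frac{1 - \tilde{\rho}_{c0}^2}{1 - \rho_c^2}} \\
\beta_{cm} &= \frac{\tilde{\rho}_{c0}\sigma_{c1} - \rho_c\sigma_{c3}}{\sigma_{c2}}
= \frac{\sigma_{c1}}{\sigma_{c2}}\left(\tilde{\rho}_{c0} - \rho_c\sqrt{\frac{1 - \tilde{\rho}_{c0}^2}{1 - \rho_c^2}}\right)
\end{aligned}
$$

The cross terms $\pm 2\tilde{\rho}_{c0}\rho_c\sigma_{c1}\sigma_{c3}$ cancel in the second step, which is why $\sigma_{c3}$ can be solved for in closed form.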

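Likewise, $Y(1, M(1))$ among the pseudo-population of compliers can be expressed as

$$
\begin{aligned}
Y(1, M(1)) &= \beta_c + \beta_{cz} + \beta_{cm} M(1) + \beta_{czm} M(1) + e_{c3,i} \\
&= \beta_c + \beta_{cz} + (\beta_{cm} + \beta_{czm}) M(1) + e_{c3,i} \\
&= \beta_c + \beta_{cz} + (\beta_{cm} + \beta_{czm})(\alpha_c + \alpha_{cz} + e_{c2,i}) + e_{c3,i} \\
&= \beta_c + \beta_{cz} + (\beta_{cm} + \beta_{czm})(\alpha_c + \alpha_{cz}) + e_{c3,i} + (\beta_{cm} + \beta_{czm}) e_{c2,i}.
\end{aligned}
\tag{A-8}
$$

When comparing this result with $Y(1) = \gamma_c + \gamma_{cz} + e_{c1,i}$ (using the first line of equations (5)), we know that $e_{c1,i} = e_{c3,i} + (\beta_{cm} + \beta_{czm}) e_{c2,i}$ under $Z = 1$. Given a value of $\rho_c$, we have $\tilde{\rho}_{c1}\sigma_{c1}\sigma_{c2} = \rho_c\sigma_{c3}\sigma_{c2} + (\beta_{cm} + \beta_{czm})\sigma_{c2}^2$ and $\sigma_{c1}^2 = \sigma_{c3}^2 + (\beta_{cm} + \beta_{czm})^2\sigma_{c2}^2 + 2(\beta_{cm} + \beta_{czm})\rho_c\sigma_{c3}\sigma_{c2}$. Then, solving these equations with respect to $\beta_{cm} + \beta_{czm}$, we have

$$
\beta_{cm} + \beta_{czm} = \left\{ \frac{\sigma_{c1}}{\sigma_{c2}} \left( \tilde{\rho}_{c1} - \rho_c \sqrt{\frac{1 - \tilde{\rho}_{c1}^2}{1 - \rho_c^2}} \right) \right\}.
\tag{A-9}
$$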

Therefore, given a particular value of ρc, the CACME and CANDE are identified as

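$$
\delta_c(z) = \alpha_{cz} \left\{ \frac{\sigma_{c1}}{\sigma_{c2}} \left( \tilde{\rho}_{cz} - \rho_c \sqrt{\frac{1 - \tilde{\rho}_{cz}^2}{1 - \rho_c^2}} \right) \right\}, \quad \text{and} \quad \zeta_c(z) = \mathrm{CACE} - \delta_c(z'),
\tag{A-10}
$$

where $z' = 1 - z$ for $z \in \{0, 1\}$. Under assumptions 2 and 3, the mediated and unmediated ITT effects are obtained by multiplying the CACME and CANDE estimates, respectively, by the proportion of compliers: $\delta(z) = \delta_c(z) \times \pi_c = \pi_c \alpha_{cz} \left\{ \frac{\sigma_{c1}}{\sigma_{c2}} \left( \tilde{\rho}_{cz} - \rho_c \sqrt{\frac{1 - \tilde{\rho}_{cz}^2}{1 - \rho_c^2}} \right) \right\}$ and $\zeta(z) = \zeta_c(z) \times \pi_c = \mathrm{ITT} - \delta(z')$. This completes the proof.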
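As a rough illustration of how (A-7), (A-9), and (A-10) can be used in a sensitivity analysis, the sketch below evaluates the complier effects for a chosen value of ρc. The function names and all input values are hypothetical. The unmediated effect is computed as CACE minus the mediated effect at z′ = 1 − z, which is how the definition of z′ in the text is read here.

```python
import math


def beta_from_rho(sigma_1, sigma_2, rho_tilde, rho):
    """Slope implied by a given rho_c: right-hand side of (A-7) or (A-9)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    ratio = math.sqrt((1.0 - rho_tilde ** 2) / (1.0 - rho ** 2))
    return (sigma_1 / sigma_2) * (rho_tilde - rho * ratio)


def complier_effects(alpha_cz, cace, sigma_1, sigma_2,
                     rho_tilde_0, rho_tilde_1, rho):
    """Return {z: (delta_c(z), zeta_c(z))} for z in {0, 1}, following (A-10)."""
    rho_tilde = {0: rho_tilde_0, 1: rho_tilde_1}
    delta = {z: alpha_cz * beta_from_rho(sigma_1, sigma_2, rho_tilde[z], rho)
             for z in (0, 1)}
    # Assumed reading: zeta_c(z) = CACE - delta_c(z'), with z' = 1 - z.
    zeta = {z: cace - delta[1 - z] for z in (0, 1)}
    return {z: (delta[z], zeta[z]) for z in (0, 1)}


def itt_from_complier(effect_c, pi_c):
    """Scale a complier effect by the proportion of compliers."""
    return effect_c * pi_c
```

Calling `complier_effects` over a grid of `rho` values (for example, from −0.9 to 0.9) traces how the CACME and CANDE change as the assumed violation of LSI grows. At `rho = 0` the slope reduces to `(sigma_1 / sigma_2) * rho_tilde`, and the implied slope is zero when `rho` equals `rho_tilde`.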

Received: 2019-07-06
Accepted: 2020-04-05
Published Online: 2020-11-28

© 2020 S. Park and E. Kürüm, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
