Behavior Genetics
https://doi.org/10.1007/s10519-018-9918-y

ORIGINAL RESEARCH

Understanding "What Could Be": A Call for 'Experimental Behavioral Genetics'

S. Alexandra Burt 1 · Kathryn S. Plaisance 2 · David Z. Hambrick 1

Received: 27 November 2017 / Accepted: 3 August 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018

Abstract

Behavioral genetic (BG) research has yielded many important discoveries about the origins of human behavior, but offers little insight into how we might improve outcomes. We posit that this gap in our knowledge base stems in part from the epidemiologic nature of BG research questions. Namely, BG studies focus on understanding etiology as it currently exists, rather than etiology in environments that could exist but do not as of yet (e.g., etiology following an intervention). Put another way, they focus exclusively on the etiology of "what is" rather than "what could be". The current paper discusses various aspects of this field-wide methodological reality, and offers a way to overcome it by demonstrating how behavioral geneticists can incorporate an experimental approach into their work. We outline an ongoing study that embeds a randomized intervention within a twin design, connecting "what is" and "what could be" for the first time. We then lay out a more general framework for a new field, experimental BG, which has the potential to advance both scientific inquiry and related philosophical discussions.

Keywords G×E · Randomized intervention · Twin study

Introduction

Behavioral genetic (BG) research has yielded a number of important discoveries about human behavior that have been not only provocative and interesting, but also surprising. One of the most surprising is that virtually all human traits are heritable, usually substantially so (Turkheimer 2000). Nevertheless, the promise of BG research has not yet been fulfilled.
For example, although it is now clear that many psychological disorders are highly heritable, BG research has had little impact on the diagnosis and treatment of these disorders. Furthermore, the implications of findings from BG research for fundamental philosophical questions about human nature are still unclear. These critical gaps in BG research were the impetus for the Genetics and Human Agency (GHA) Project, a recent initiative by the John Templeton Foundation. This project brings together scholars from both empirical and philosophical traditions, with the ultimate aim of exploring both practical and philosophical implications of BG research (for more information, see http://www.geneticshumanagency.org/about/).

Our research group postulates that these gaps in BG research relate to the fundamentally epidemiologic nature of BG research methods. More specifically, BG studies focus on understanding etiology as it currently exists, rather than etiology in environments that could exist. To put it another way, BG research focuses on the etiology of what is rather than what could be. As a consequence, BG research provides virtually no information about how we might intervene to change behavior, or how interventions might alter etiology (Lewontin 1974).

This reality is, in many ways, intentional. The overarching goal of BG research is to understand the origins of individual differences in psychological traits and other characteristics. In other words, behavioral geneticists aim to answer the question, "Why are people different?" (Plomin 1990). A truly experimental approach to answering this question, which would require cloning humans and randomly assigning them to different rearing environments, is both ethically reprehensible and infeasible. Thus, behavioral geneticists cleverly leveraged an "experiment of nature" to tease apart genetic and environmental influences on naturally-occurring individual differences (Plomin 1990). In this so-called 'natural experiment', twin siblings have been compared across zygosity (monozygotic versus dizygotic) to estimate the various contributions to the variance in a given phenotype, including additive genetic variance (A), shared environmental variance (C), and nonshared environmental variance (E). Behavioral geneticists have also compared adoptive sibling similarity to biological sibling similarity to estimate these proportions of variance.

* S. Alexandra Burt, burts@msu.edu
1 Department of Psychology, Michigan State University, 107D Psychology Building, East Lansing, MI 48824, USA
2 Department of Knowledge Integration & Department of Philosophy, University of Waterloo, Waterloo, Canada

Decades of twin and adoption research have since yielded what are now known as the "three laws" of BG (Turkheimer 2000), which can be expressed in terms of A, C, and E as described above. The first law states that all behavioral traits demonstrate A, the second law that A is larger than C (in adulthood), and the third law that all traits demonstrate substantial E.

While epidemiological research in the ACE tradition has clearly advanced knowledge regarding the origins of human behavior, philosophers and other critics of BG have rightly observed that such work does not illuminate whether or how alternate environments (and perhaps most importantly, interventions) might change these origins (Kaplan 2000; Lewontin 1974; Tabery 2014). To be sure, there has been a recent surge of interest in genotype-by-environment interactions (G×E) (Caspi et al. 2002; Moffitt et al. 2006) and the ways in which etiology can be moderated by various environmental contexts (e.g., Burt 2015; Burt et al. 2016). This research provides clues about how interventions could alter etiology.
However, extant G×E work is still limited to the examination of naturally-occurring environmental moderators that exist in the data, and does not meaningfully consider the counterfactual (i.e., what the etiology would be in an environment that could, but does not yet, exist, such as the one following a yet-to-be implemented intervention).

In short, BG seeks to identify the origins of human behavior as it naturally occurs, but does not elucidate the ways in which these origins might change following interventions. We argue that, although this approach has borne a great deal of scientific insight to date, it also represents a fundamental limitation, in that it (1) limits our understanding of the full range of etiologies of a given condition, and (2) hamstrings our ability to meaningfully inform prevention and intervention science. To move ahead, we need to consider what heritability estimates can and cannot tell us, with the goal of using that information to go beyond the field's current epidemiological approach. In this article, we seek to do just this, outlining the relevant limitations of BG and identifying one promising path forward.

Limitations of BG research

Philosophers and scientists have long debated the uses and limitations of heritability estimates, and of twin and adoption designs more generally (Downes 2017; Longino 2013; Sesardic 2005). A pivotal moment in this debate was Lewontin's (1974) article arguing that heritability estimates are both population- and trait-relative, and thus cannot be generalized beyond the specific range of environments experienced by the participants. The classic illustration of this point, which has been called the "locality problem," is phenylketonuria (PKU), a condition that causes buildup of the amino acid phenylalanine in the body and which leads to severe intellectual disability, psychological disorders, and other problems if left untreated (Wahlsten 1997).
PKU is 100% heritable, but the phenotypic effects of PKU can be almost eliminated through changes in diet (Wahlsten 1997). The reason is that the heritability estimate reflects typical environments (i.e., diets that include phenylalanine, which is present in most foods) rather than the phenotypic effects in alternate environments that exist only after intervention (i.e., diets without phenylalanine).

A further problem with BG research is the level-of-analysis problem and the related 'black box problem' (Plomin et al. 2013). Heritability estimates are estimates of variance, and as such, reveal the overall origins of individual differences around the mean. They are thus population-level statistics, and accordingly, do not apply to any one individual (note that this problem is not unique to the field of BG). What's more, while latent genetic variance reflects statistical signals from susceptibility loci (Kendler 2005), it reveals nothing about the number or location of genes involved, just as estimates of shared and nonshared environmental influences do not reveal specific environmental experiences that give rise to these influences (Turkheimer 2004).

Although the descriptive nature of the estimated variance components is clearly a significant limitation of traditional BG research, others have defended the utility of heritability estimates, pointing out that in cases where there is little-to-no gene-environment interplay, such estimates can provide us with important information about the causal relationship between genetic variance and phenotypic variance (Sesardic 2005; Tal 2009). Namely, in this scenario, heritability estimates reveal the probability that genes have a larger effect than the environment when it comes to explaining individual variation for a particular trait (Tal 2009).
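To make the population-level nature of these variance components concrete, the sketch below computes standardized A, C, and E from twin correlations using Falconer's textbook approximation. This is offered only as an illustration: the function name and the two correlations are hypothetical, and contemporary twin studies typically fit structural equation models rather than apply these formulas directly. The approximation also assumes equal environments across zygosity, no dominance, and no assortative mating.

```python
# Sketch of Falconer's approximation for standardized ACE components.
# The correlations used below are hypothetical, not data from any study.

def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Approximate standardized A, C, and E from MZ and DZ twin correlations."""
    a = 2.0 * (r_mz - r_dz)  # additive genetic variance (A)
    c = 2.0 * r_dz - r_mz    # shared environmental variance (C)
    e = 1.0 - r_mz           # nonshared environment plus measurement error (E)
    return {"A": a, "C": c, "E": e}


# Hypothetical example: MZ twins correlate .70 and DZ twins correlate .40.
estimates = falconer_ace(r_mz=0.70, r_dz=0.40)
for name, value in estimates.items():
    print(f"{name} = {value:.2f}")
```

Note that the inputs are correlations computed across many twin pairs, so the resulting proportions describe a sample in a particular range of environments and say nothing about any one individual.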
Understanding this level-of-analysis distinction between estimating variance in a population on the one hand, and identifying specific causal factors on the other hand, is the key to explaining an apparent paradox: even highly heritable traits can be responsive to environmental interventions (Dickens and Flynn 2001; Sauce and Matzel 2018). This paradox, we argue, only seems like a paradox if one assumes that highly heritable traits are immutable (or, at least, would require drastic and long-lasting interventions to change). Yet, there is evidence that traits with relatively high heritability estimates can be altered through both intensive and less intensive environmental interventions. As one example, academic achievement is both stable and highly heritable (60+%) (Bartels et al. 2002), and yet this phenotype has been shown in randomized experiments to be affected by a variety of factors, including class size (Nye et al. 2000), proportion of female students in the classroom (Whitmore 2005), school quality (Hastings and Weinstein 2008), and, in the case of more impoverished students, academic mindset or beliefs about whether intelligence is fixed or can be improved with effort (Dweck and Leggett 1988).

Before proceeding, we need to acknowledge two major ways in which BG has already attempted to move beyond simple heritability estimates and the limitations they impose on interpretation: biometric G×E studies and molecular genetic studies. Biometric G×E studies center on environmental moderation of latent or inferred genetic risk, calculated via comparisons of resemblance among relatives with varying degrees of genetic similarity (Purcell 2002).
And, although biometric G×E methods are now well established, it should be noted that they do challenge two longstanding assumptions in BG: (1) that the genetic portion of variance is composed solely of additive genetic effects, and (2) that G×Es do not contribute to the phenotype in question (for an in-depth historical and philosophical analysis of G×E research, see Tabery 2014). Recent work indicates that, for many phenotypes, the genetic portion of variance is very likely to reflect both gene–environment interplay and simple additive genetic effects (Moffitt et al. 2006). Biometric G×E studies thus represent a major advance for the field of BG. Even so (and as noted earlier), they fail to solve the core epidemiological challenge to traditional BG, in that they are still limited to observations of naturally occurring variance. In other words, like traditional heritability estimates, biometric G×E studies are fundamentally limited to evaluating what is rather than what could be.

Molecular genetic analyses also resolve several of the pitfalls inherent to traditional BG, in large part by circumventing the descriptive nature of latent genetic risk. In molecular analyses, genetic risk is directly measured rather than inferred; accordingly, this approach has far more promise for elucidating underlying biological processes. Nevertheless, it has become abundantly clear from this research that specific genes have limited explanatory power. Effect sizes for individual genetic contributions to psychopathology have thus far been found to range from essentially zero to tiny, explaining well under 1% of the variance. Conducting G×E research at the level of any one specific gene (or small set of genes) thus has very limited utility for most psychiatric outcomes. Even when the entire genome is examined, molecular genetic contributions typically explain well under 50% of the genetic variance in any given trait.
Many have referred to this as the "missing heritability" problem (Turkheimer 2011). Much like the apparent paradox of highly heritable but malleable traits, the missing heritability problem is only surprising if one assumes that high heritability estimates represent primarily additive genetic effects (rather than G×E). However, there is no definitive evidence that highly heritable traits are more likely to be directly and additively influenced by genes. For example, heritability estimates for breast cancer are only around 27% (Lichtenstein et al. 2000; Möller et al. 2016), yet particular gene variants have been identified that are highly predictive of breast cancer for the individuals who have them (e.g., BRCA1 and BRCA2). One possible reason for this discrepancy between the size of the heritability estimate and the ability to identify influential genes is that at least some of the genes associated with breast cancer are highly penetrant (i.e., the presence of a gene variant is highly associated with the development of the phenotype), which may not be the case for more complex behavioral traits like IQ. Regardless, the molecular approach is currently limited in terms of its ability to enhance our understanding of what could be.

In sum, current BG approaches are not able to uncover the etiology of individual differences in the context of environments that do not (yet) exist. We contend that overcoming this methodological reality will provide invaluable insights in BG research going forward. These include (1) illuminating how novel environments influence a given trait's etiology, (2) detailing how specific interventions can change behavior, and (3) allowing scientists to make stronger causal inferences about the origins of individual differences in behavior. Put another way, it is well-nigh time for BG to go experimental.
Experimental BG

In a controlled psychological experiment, an independent variable (IV; typically an environmental or situational factor) is manipulated by the researcher to determine its effect on a dependent variable (DV; typically the participants' behavior), holding other factors constant. There are two variants of this design. In the between-subjects design, each participant is assigned to a group representing one level of the IV (e.g., drug or placebo). The assignment of participants to conditions is randomized to control for the influence of confounding variables on the DV (e.g., preexisting health in a drug study), that is, to ensure that these factors have about the same effect on the DV across groups. By contrast, in the within-subjects design, each participant is exposed to each level of the IV, and thus serves as their own control for confounding variables. In either design, causal inferences are permitted, because the effect of the IV on the DV presumably reflects the variable in question and not other variables.

Not surprisingly, then, experiments are at the heart of contemporary philosophical accounts of causation. Philosophers of science offer thoughtful and rich accounts of causation and what sorts of evidence scientists need to be able to make well-supported causal inferences. Many of these theories conceptualize causal variables in terms of their ability to manipulate particular effects (Woodward 2016). Interventions are accordingly seen as one of the best ways to determine a causal relationship between variables. Put another way, experiments are an ideal vehicle for evaluating the 'what could be' counterfactual for the DV.

What might experiments look like within the context of BG? In the next section of this article, we provide an illustrative example by describing a recently funded study that combines BG twin methods with a between-subjects experiment.
The goal of this ongoing experiment is to investigate whether and how a brief intervention designed to change people's beliefs about the nature of intelligence (i.e., mindset) alters the etiology of phenotypes related to academic achievement (e.g., locus of control, challenge-seeking behavior, grit, and IQ).

An experimental twin study

Rationale for specific phenotypes chosen

Children acquire academic skills at markedly different rates, with some children developing much more rapidly than others. As a case in point, while many high school students progress no further than algebra, others make it through geometry and trigonometry, and still others through advanced placement calculus courses, allowing them to "place out" of college mathematics courses. These individual differences in academic achievement can have enormous consequences later in life for occupational attainment and other outcomes.

What might account for this striking variability in academic achievement? A major part of the answer, at least in existing environments, appears to be general cognitive ability ("g"), which is typically indexed via overall IQ. In one 5-year prospective study of over 70,000 children (Deary et al. 2007), the correlation between childhood IQ and later academic achievement was .81. This very strong association has clear implications for etiology, since g evidences heritability estimates upwards of 60 or 70% (Plomin et al. 2013). Academic achievement during adolescence is similarly heritable, with typical heritability estimates hovering around 60% (Bartels et al. 2002).

At the same time, there is growing evidence that, at least in students from disadvantaged contexts, mindset also predicts academic achievement, albeit far less than IQ does (Dweck and Leggett 1988; Sisk et al. 2018). Mindset refers to a person's beliefs about whether their abilities are fixed (innate and unchanging) or malleable (modifiable through effort).
Prior work (Dweck 2006) has demonstrated that those who endorse a growth mindset are more likely to tackle difficult tasks than those with a fixed mindset, since growth-minded individuals see difficult tasks as an opportunity to learn while fixed-minded individuals see them as something to avoid (lest they appear 'simpleminded'). People with a growth mindset are also more likely to view cognitive ability as a product of effort and environmental opportunity, whereas those with a fixed mindset view cognitive abilities as immutable and genetic in origin.

What makes these results all the more interesting is that growth mindsets can be induced in some people using interventions as brief as 10 min, with significant, albeit small, consequences for later academic performance (Sisk et al. 2018). As an example, a recent pair of meta-analyses assessed the relationship between mindset and academic achievement, one of which evaluated experimental effects of growth mindset inductions on achievement (Sisk et al. 2018). They found a small but significant effect of growth mindset inductions on academic achievement, d = 0.08, 95% CI [0.02, 0.14], p = .01. These growth mindset interventions were particularly effective for students from low socioeconomic status (SES) backgrounds (d = 0.34, 95% CI [0.07, 0.62], p = .013) and for students who were academically high-risk (d = 0.19, 95% CI [0.02, 0.36], p = .031). Such findings have been interpreted to suggest that, in much the way that environmental influences have been shown to be more important to the etiology of IQ in lower SES contexts as compared to higher SES contexts (Tucker-Drob and Bates 2016; Turkheimer et al. 2003), environmental influences (perhaps including the growth mindset induction) may be more important to academic achievement in more disadvantaged groups.
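As a rough plausibility check on how the effect sizes, confidence intervals, and p values quoted above relate to one another, the sketch below recovers an approximate standard error from each 95% CI and converts it to a two-sided p value. It assumes a symmetric, normal-theory interval; the meta-analytic model actually used in the cited work may differ, so small discrepancies from the reported p values are to be expected. The function name is ours and purely illustrative.

```python
import math

# Assumption: the 95% CI is symmetric and based on the normal distribution.
Z_CRIT_95 = 1.959964


def p_from_ci(d: float, lower: float, upper: float) -> float:
    """Approximate two-sided p value for an effect size given its 95% CI."""
    se = (upper - lower) / (2.0 * Z_CRIT_95)  # standard error implied by the CI width
    z = d / se                                # test statistic against d = 0
    return math.erfc(abs(z) / math.sqrt(2.0)) # two-sided normal tail probability


# Values as quoted in the text (overall, low SES, academically high-risk).
p_overall = p_from_ci(0.08, 0.02, 0.14)
p_low_ses = p_from_ci(0.34, 0.07, 0.62)
p_high_risk = p_from_ci(0.19, 0.02, 0.36)

for label, p in [("overall", p_overall), ("low SES", p_low_ses), ("high-risk", p_high_risk)]:
    print(f"{label}: approximate p = {p:.3f}")
```

Under these assumptions the recovered p values fall close to the reported ones, which is consistent with the intervals and effect sizes having been transcribed correctly.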
Knowledge to be gained through experimental BG

Cumulative evidence thus indicates that academic achievement is both quite heritable and can be improved in some students following a brief online intervention. In our ongoing experimental BG study, we are addressing these apparent inconsistencies, employing a novel and state-of-the-science methodologic design that integrates standard BG methods with randomized experimental methods to connect what is with what could be. After completing baseline tests of mindset, locus of control, grit, and cognitive ability, twins are randomly assigned to complete either a growth mindset induction or a control task. They then repeat all measures a second time. Analyses will compare heritability estimates, as well as estimates of shared and nonshared environmental variance, before and after the intervention as well as across experimental and control conditions, allowing us to directly evaluate the role of our brief intervention in altering etiology.

Below, we discuss our experimental twin study design in detail, the very first of its kind [the only other BG intervention study conducted to date (Haworth et al. 2016) did not randomize the assignment of participants to conditions]. We then discuss the expectations for outcomes in terms of changes to ACE components, as well as what those possible outcomes would mean for our understanding of the etiology of human behavior.

Design and procedure

We are currently recruiting early-to-mid adolescent twin pairs (50% monozygotic, 50% dizygotic) for a ~1 h online assessment (using Qualtrics).
Twin pairs are randomly assigned to the experimental or control conditions using the following targets: 1/3 will be assigned to the control–control condition (in which both twins are assigned to the active control), 1/3 will be assigned to the growth–growth condition (in which both twins are assigned to the growth condition), and 1/3 will be assigned to the control–growth condition (in which one twin is assigned to the growth condition and the other to the control condition).

The assessment itself has three phases (see Table 1). In Phase 1, we ask all participants to complete measures of (a) mindset and the conceptually related constructs of grit and locus of control, (b) challenge-seeking behavior, as examined in Yeager et al. (2016; choosing hard versus easy math problems), and (c) cognitive ability (verbal and non-verbal reasoning ability). In Phase 2, we experimentally manipulate mindset using a state-of-the-art mindset induction paradigm developed specifically for early-to-mid adolescents (Yeager et al. 2016). The two conditions differ in terms of the content presented to participants. In the growth mindset condition, participants are presented with content suggesting that intelligence develops from stimulating environments and can be improved with effort [e.g., "the brain is like a muscle – it gets stronger (and smarter) when you exercise it"]. In the active control condition, participants are presented with content that reviews basic findings about the human brain but is neutral with respect to the influence of effort on intelligence (e.g., "the parietal lobe is where the brain interprets the sense of touch"). In Phase 3, all participants complete the aforementioned measures of mindset, grit, cognitive ability, and challenge-seeking behaviors a second time. They are then debriefed regarding the intervention (i.e., we clarify that both genes and environments contribute to intelligence).
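The pair-level assignment scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual randomization procedure (which is not detailed here): the function name, the pair identifiers, the use of balanced allocation to hit the one-third targets, and the coin flip deciding which twin in a discordant pair receives the growth condition are all hypothetical choices. Stratification by zygosity, which a real implementation might add, is omitted.

```python
import random

# The three pair-level configurations and the arm each twin receives.
PAIR_CONDITIONS = {
    "control-control": ("control", "control"),
    "growth-growth": ("growth", "growth"),
    "control-growth": ("control", "growth"),
}


def assign_pairs(pair_ids, seed=None):
    """Randomly allocate twin pairs to the three configurations in equal thirds."""
    rng = random.Random(seed)
    shuffled = list(pair_ids)
    rng.shuffle(shuffled)  # random order, so allocation is unrelated to enrollment order
    labels = list(PAIR_CONDITIONS)
    assignments = {}
    for index, pair_id in enumerate(shuffled):
        label = labels[index % len(labels)]  # cycle through conditions for balance
        twin1, twin2 = PAIR_CONDITIONS[label]
        # In discordant pairs, randomly decide which twin gets the growth condition.
        if label == "control-growth" and rng.random() < 0.5:
            twin1, twin2 = twin2, twin1
        assignments[pair_id] = {"pair_condition": label, "twin1": twin1, "twin2": twin2}
    return assignments


# Hypothetical example with 30 twin pairs.
allocation = assign_pairs(range(1, 31), seed=2018)
counts = {}
for record in allocation.values():
    counts[record["pair_condition"]] = counts.get(record["pair_condition"], 0) + 1
print(counts)
```

Because assignment is random with respect to genotype, the condition cannot be correlated with genetic influences on the outcome, which is the property the planned analyses rely on.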
Planned analyses

We expect de novo (Phase 1) mindset, grit, locus of control, challenge-seeking behavior, and cognitive ability to emerge as moderately-to-highly heritable (Tucker-Drob et al. 2016). We also expect that these Phase 1 heritabilities will not vary across experimental conditions, given that participants are randomly assigned to these conditions and have yet to receive the intervention. We do not have any a priori expectations for changes in etiology following the intervention (i.e., Phase 3), as there is no research to guide hypotheses. However, we do lay out the various possibilities for changes in components of variance and discuss possible interpretations of potential outcomes in the next section. We will specifically explore whether and how these genetic and environmental influences change from Phase 1 to 3, separately for those in the growth-mindset and active control conditions.

Analyses will be conducted as follows: We will first confirm that the etiologies of mindset, grit, locus of control, challenge-seeking behavior, and cognitive ability, respectively, do not vary across the two intervention groups de novo (i.e., Phase 1) via Purcell's univariate G×E model (Purcell 2002). This is the most appropriate and powerful G×E model when there is no possibility of gene–environment correlations (rGE) between the moderator and the outcome (van der Sluis et al. 2012), which is necessarily the case in our study given the use of random assignment. This analysis serves as an etiologic check on our randomization process. We will then repeat these analyses in those data collected after the mindset intervention (i.e., Phase 3), exploring whether the intervention altered etiology. We will then elaborate on the above results using a simple extension of the bivariate G×E model, estimating genetic and environmental similarity across Phases 1 and 3 via genetic and environmental correlations.
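For readers unfamiliar with the moderation model referenced above, its core expectations can be written out as below. The notation is illustrative and is our own assumption rather than the exact parameterization used in the study: M_j denotes the moderator for twin j (assumed here to be coded 0 for the active control and 1 for the growth mindset condition), a, c, and e are the baseline path coefficients, and the beta terms capture moderation. The full model also includes a main effect of the moderator on the phenotypic mean, which is omitted here.

```latex
% Moderated ACE expectations in the spirit of Purcell (2002); notation is illustrative.
% M_j: moderator for twin j (assumed coding: 0 = active control, 1 = growth mindset).
\begin{align}
\mathrm{Var}(P_j \mid M_j) &= (a + \beta_a M_j)^2 + (c + \beta_c M_j)^2 + (e + \beta_e M_j)^2 \\
\mathrm{Cov}_{MZ}(P_1, P_2 \mid M_1, M_2) &= (a + \beta_a M_1)(a + \beta_a M_2) + (c + \beta_c M_1)(c + \beta_c M_2) \\
\mathrm{Cov}_{DZ}(P_1, P_2 \mid M_1, M_2) &= \tfrac{1}{2}(a + \beta_a M_1)(a + \beta_a M_2) + (c + \beta_c M_1)(c + \beta_c M_2) \\
h^2(M_j) &= \frac{(a + \beta_a M_j)^2}{\mathrm{Var}(P_j \mid M_j)}
\end{align}
```

Under this formulation, a test of whether the intervention altered etiology amounts to asking whether the beta terms differ from zero. The last line also shows why standardized and unstandardized estimates can diverge: the standardized heritability can rise even when the unstandardized genetic variance is unchanged, provided the environmental terms in the denominator shrink.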
Table 1 Study design

Condition | Phase 1: pre-test | Phase 2: intervention | Phase 3: post-test
Experimental (growth) group | Complete measures | Learn that intelligence develops from stimulating environments and can be improved with effort | Complete measures
Active control group | Complete measures | Learn basic information about the human brain | Complete measures

Condition is assigned so as to elicit the following configuration of twin pairs: a third will be assigned to the control–control condition, in which both twins are assigned to the active control. A third will be assigned to the growth–growth condition, in which both twins are assigned to the growth condition. The final third will be assigned to the control–growth condition, in which one twin is assigned to the growth condition and the other to the control condition. The latter allows us to confirm condition-induced changes at the level of the individual twin, and also allows for the possible use of the co-twin control model in the future

Interpretive framework

The presence of any etiologic change from Phase 1 to Phase 3 would be interesting, in that it would point to etiologic instability over the course of an hour. Extant data have already highlighted the instability of non-shared environmental influences in particular over lags as short as a few minutes (Burt et al. 2015). However, the Burt et al. (2015) study still captures etiology as it is, not as it could be. In this light, etiologic change specifically in response to the growth mindset intervention, relative to both the individual's baseline and to those in the active control condition, would be quite meaningful. Indeed, our study would be the first ever to identify experimentally-induced changes in etiology directly in response to a randomly-assigned intervention (to our knowledge).
Moreover, our findings would represent a proof-of-concept for the idea that interventions need not be particularly long and intense to accomplish etiologic change. Such findings would have major implications for our conceptual understanding of G×E, and for our philosophical understanding of BG results. The absence of etiologic moderation would have similarly important (though obviously different) implications for these same questions, though of course would not rule out the possibility that other brief interventions could alter etiology. Should there be relevant etiologic change in response to the growth mindset intervention (see Table 2), however, we further suggest that the specific pattern of etiologic change may also be theoretically and practically meaningful (assuming, as before, that similar change is not observed in the active control condition). This interpretation is, to a considerable extent, predicated on the positive or negative valence of both the environmental moderator and the outcome. In our case, the growth mindset induction appears to function as a protective or positively valenced environmental manipulation (e.g., it improves grades). Achievement outcomes are similarly positively valenced, in that we nearly always want to maximize these outcomes. Given the valence structure embedded in our study, there are two ready-made interpretive frameworks: bioecological G×E and environmental main effects. Under the bioecological model, deleterious environments would amplify shared environmental influences, whereas genetic influences would be more important under protective environmental conditions. In this case, the model would specifically predict absolute (or unstandardized) decreases in environmental influences with exposure to protective environments (the growth mindset). Genetic influences would simultaneously be expected to increase (as is the case with Turkheimer's seminal SES and IQ findings). 
One key caveat, however, is that the change in genetic influences may only be observable when examined relative to the environmental moderation (i.e., via standardized estimates): "unlike in a diathesis-stress model, the environmental (risk) factor in a bioecological interaction does not necessarily act on the same biological substrate as the genetic risk factors. Instead, it may just allow those genetic risk factors to account for more of the variance in outcome, because environmental risk factors that affect that outcome have been minimized" (Pennington et al. 2009, p. 80). In short, increases in standardized or unstandardized genetic influences on a given outcome following the growth mindset induction would likely be interpreted via the bioecological framework, particularly when accompanied by decreases in environmental influences (Bronfenbrenner and Ceci 1994; Pennington et al. 2009), such that the growth mindset intervention enhanced genetic influences on achievement-related outcomes.

Table 2 Possible configurations of etiologic change following the growth mindset induction

Possible configurations of unstandardized ACE (A, C, E) | Interpretive framework (BE / DE)

A | C | E | BE / DE
↑ | – | – | *
↑ | ↑ | – |
↑ | ↑ | ↑ |
↑ | – | ↑ |
↑ | ↓ | – | **
↑ | ↓ | ↓ | **
↑ | – | ↓ | **
↑ | ↑ | ↓ | *
↑ | ↓ | ↑ | *
↓ | – | – | *
↓ | ↑ | – | **
↓ | ↑ | ↑ | **
↓ | – | ↑ | *
↓ | ↓ | – |
↓ | ↓ | ↓ |
↓ | – | ↓ |
↓ | ↑ | ↓ | **
↓ | ↓ | ↑ | *
– | ↑ | – | **
– | ↑ | ↑ | **
– | – | ↑ | *
– | ↓ | – | **
– | ↓ | ↓ | **
– | – | ↓ | *
– | ↑ | ↓ | **
– | ↓ | ↑ | *

BE stands for the bioecological model of G×E, and DE stands for environmental 'direct effect'. A, C, and E stand for unstandardized (or absolute) additive genetic, shared environmental, and non-shared environmental variation, respectively. Because they are not standardized, they do not sum to 100%, and so each variance component can increase or decrease simultaneously. An arrow pointing up indicates that the magnitude of variance increases following the intervention, whereas an arrow pointing down indicates that the magnitude of variance decreases. A dash indicates no change
**Strong evidence in favor of that interpretation
*Some evidence in favor of that interpretation

In sharp contrast, increases in environmental influences (especially shared) on a given outcome following the growth mindset induction, perhaps especially when accompanied by decreases in genetic influences, would likely be interpreted as something akin to an environmental 'direct effect'. Put differently, under this scenario, common exposure to the growth mindset intervention would increase twin similarity regardless of their level of genetic similarity. This would be very interesting, in that direct or 'main' effects of the environment on a given outcome have not historically been considered a form of G×E (since G×E clearly implies a statistical interaction), and it hints at the new insights we could gain from experimental BG.

The utility of the oft-discussed diathesis-stress G×E model is murkier here, in large part because in its original form, the model postulates that environmental risk experiences activate or increase genetic influences on psychopathology (i.e., a negatively-valenced moderator acting on a negatively-valenced outcome). Although this model has been extended to positively-valenced moderators acting on negatively-valenced outcomes (e.g., prosocial peer affiliation appears to suppress genetic influences on youth antisocial behavior; Burt and Klump 2014), it is more difficult to extend it here, given that both the moderator and the outcome are positively valenced. With these caveats in mind, the diathesis-stress model might perhaps predict absolute decreases in genetic influences with exposure to the growth mindset induction, such that genetic influences would be more influential in the 'riskier' active control condition.
There are no clear predictions for environmental influences on the outcome in the diathesis-stress model (regardless of the valence of the environmental experience itself), so we do not anticipate any in our case either.

Discussion

Limitations

By merging experimental science into the twin study design, our first-of-its-kind experimental twin study will allow us to directly evaluate not only the etiology of what could be but also how this etiology changes relative to what is. Although poised to offer meaningful insights that are otherwise not accessible to BG, this novel design only partially remedies the other inherent limitations in traditional BG designs. Should there be evidence of etiologic moderation by the growth mindset intervention, it would necessarily point to an identified environmental influence (the intervention) on the phenotype (either directly or by altering the importance of genetic influences). Even so, the "black box problem" discussed above remains a significant limitation, and particularly so for the genetic component of variance. That is, it does not yield any information on the number or location of genes influencing the phenotype. And should the genetic correlation between Phases 1 and 3 be < 1.0, we will not gain any insight into the specific genes that contribute to the phenotype in one phase but not the other. The experimental twin design also fails to address the level of analysis issue discussed above, in that results remain confined to the level of the population and would not apply to the individual. Critically, however, there are other iterations of experimental BG that could more fully address both the level of analysis and the black box problems. One such design would involve examinations of epigenetic marks before and after a randomized intervention, an approach that would more clearly reveal specific epigenetic mechanisms of effect and move closer to applying results to the level of the individual.
As already noted, however, molecular and epigenetic approaches each have limitations of their own, including issues of missing heritability and the inability to evaluate epigenetic alterations in the brain in living humans. The single best design, then, might incorporate both twin and more molecular approaches, allowing us to evaluate changes in genetic and environmental variance with the intervention and to identify specific epigenetic marks in the periphery. Future work should thus develop and refine integrative experimental behavior genetic approaches.

Conclusions

The project described above represents (to our knowledge) the very first randomized intervention conducted within a twin study design, providing an ideal platform for scientifically and philosophically rigorous advances in our philosophical and conceptual understandings of the etiology of human behavior, and in particular the concepts of heritability and G×E. This innovation allows us to empirically address one of the core philosophical critiques of BG, namely that BG research describes only what is and not what could be. It will also allow us to simultaneously consider and empirically piece together the (seemingly opposed) concepts of heritability and malleability, furthering our foundational understanding of the interplay between genetic influences and environmental experiences. In these ways, the project should meaningfully contribute to two important discussions in the philosophical literature. The first such discussion relates to the issue of causal inference in BG, and in particular the utility of heritability estimates for understanding relative influences on behavioral trait variation. Our study will clarify whether using randomized, controlled interventions enables us to make different and/or more valid causal claims than epidemiological data, or whether the limitations of BG findings apply whenever one is studying population variation.
The second discussion relates to the recent debate over the relationship of the human behavioral sciences to one another (i.e., Helen Longino's pluralistic account which claims that the questions and methods in each subfield are sufficiently different so as to prevent integration; see Longino 2013). The current study will provide some insight into our ability to intentionally integrate, at least to some extent, the methods of different subfields, with corresponding implications for our understanding of the nature and relationship of different scientific disciplines.

Conceptually, the proposed study should further our foundational understanding of the interplay between genetic influences and environmental experiences. Implicit in much of the prior theorizing regarding G×E is that genetic influences are potent and stable, and are altered primarily in response to "major" environmental experiences. Philosophical work has not supported these assumptions (e.g., Lewontin 1974), but they have yet to be tested empirically. The proposed project will tackle these assumptions head on, and moreover, will do so in an experimental design immune to gene–environment correlation (rGE) confounds (given our use of random assignment), thus providing an ideal platform for scientifically and philosophically rigorous advances in our conceptualization of G×E. We are excited to see where else this approach might lead.

Funding
The funding for this research was provided by the Templeton Foundation through the Genetics of Human Agency Initiative.

Compliance with ethical standards

Conflict of interest
S. Alexandra Burt, Kathryn S. Plaisance, and David Z. Hambrick declare that they have no conflict of interest.

Human and animal rights
The described study has been approved by the Michigan State University IRB.

Informed consent
All participants give informed consent (informed consent is obtained by one parent since the twins are younger than 18 years old).
References

Bartels M, Rietveld MJ, Van Baal GCM, Boomsma DI (2002) Heritability of educational achievement in 12-year-olds and the overlap with cognitive ability. Twin Res 5(06):544–553
Bronfenbrenner U, Ceci SJ (1994) Nature-nurture reconceptualized in developmental perspective: a bioecological model. Psychol Rev 101:568–586
Burt SA (2015) Evidence that the GxE underlying youth conduct problems vary across development. Child Dev Perspect 9:217–221
Burt SA, Klahr AM, Klump KL (2015) Do non-shared environmental influences persist over time? An examination of days and minutes. Behav Genet 45(1):24–34
Burt SA, Klump KL (2014) Prosocial peer affiliation suppresses genetic influences on non-aggressive antisocial behaviors during childhood. Psychol Med 44:821–830
Burt SA, Klump KL, Gorman-Smith D, Neiderhiser JM (2016) Neighborhood disadvantage alters the origins of children's non-aggressive conduct problems. Clin Psychol Sci 4:511–526
Caspi A, McClay J, Moffitt TE, Mill J, Martin J, Craig IW et al (2002) Role of genotype in the cycle of violence in maltreated children. Science 297:851–854
Deary IJ, Strand S, Smith P, Fernandes C (2007) Intelligence and educational achievement. Intelligence 35(1):13–21
Dickens WT, Flynn JR (2001) Heritability estimates versus large environmental effects: the IQ paradox resolved. Psychol Rev 108(2):346
Downes SM (2017) Heritability. The Stanford Encyclopedia of Philosophy (Spring 2017 Edition). In: Zalta EN (ed). https://plato.stanford.edu/archives/spr2017/entries/heredity/
Dweck CS (2006) Mindset: the new psychology of success. New York, Random House
Dweck CS, Leggett EL (1988) A social-cognitive approach to motivation and personality. Psychol Rev 95(2):256
Hastings JS, Weinstein JM (2008) Information, school choice, and academic achievement: evidence from two experiments. Q J Econ 123(4):1373–1414
Haworth CM, Nelson SK, Layous K, Carter K, Bao KJ, Lyubomirsky S, Plomin R (2016) Stability and change in genetic and environmental influences on well-being in response to an intervention. PLoS ONE 11(5):e0155538
Kaplan JM (2000) The limits and lies of human genetic research: dangers for social policy. Routledge, London
Kendler KS (2005) Psychiatric genetics: a methodological critique. Am J Psychiatry 162:3–11
Lewontin RC (1974) The analysis of variance and the analysis of causes. Am J Hum Genet 26(3):400–411
Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M et al (2000) Environmental and heritable factors in the causation of cancer-analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343(2):78–85
Longino HE (2013) Studying human behavior: how scientists investigate aggression and sexuality. University of Chicago Press, Chicago
Moffitt TE, Caspi A, Rutter M (2006) Measured gene-environment interactions in psychopathology. Perspect Psychol Sci 1:5–27
Möller S, Mucci LA, Harris JR, Scheike T, Holst K, Halekoh U et al (2016) The heritability of breast cancer among women in the Nordic Twin Study of Cancer. Cancer Epidemiol Prev Biomark 25(1):145–150
Nye B, Hedges LV, Konstantopoulos S (2000) The effects of small classes on academic achievement: the results of the Tennessee class size experiment. Am Educ Res J 37(1):123–151
Pennington BF, McGrath LM, Rosenberg J, Barnard H, Smith SD, Willcutt EG et al (2009) Gene × environment interactions in reading disability and attention-deficit/hyperactivity disorder. Dev Psychol 45:77–89
Plomin R (1990) Nature and nurture: an introduction to human behavioral genetics. Thomson Brooks/Cole Publishing Co, Belmont
Plomin R, DeFries JC, Knopik VS, Neiderhiser JM (2013) Behavioral genetics, 6th edn. Worth Publishers, New York
Purcell S (2002) Variance components model for gene-environment interaction in twin analysis. Twin Res 5:554–571
Sauce B, Matzel LD (2018) The paradox of intelligence: heritability and malleability coexist in hidden gene-environment interplay. Psychol Bull 144(1):26
Sesardic N (2005) Making sense of heritability. Cambridge University Press, Cambridge
Sisk VF, Burgoyne AP, Sun J, Butler JL, Macnamara BN (2018) To what extent and under which circumstances are growth mindsets important to academic achievement? Two meta-analyses. Psychol Sci 29:549–571
Tabery J (2014) Beyond versus: the struggle to understand the interaction of nature and nurture. MIT Press, London
Tal O (2009) From heritability to probability. Biol Philos 24(1):81–105
Tucker-Drob EM, Bates TC (2016) Large cross-national differences in gene × socioeconomic status interaction on intelligence. Psychol Sci 27:138–149
Tucker-Drob EM, Briley DA, Engelhardt LE, Mann FD, Harden KP (2016) Genetically-mediated associations between measures of childhood character and academic achievement. J Pers Soc Psychol 111:790
Turkheimer E (2000) Three laws of behavior genetics and what they mean. Curr Dir Psychol Sci 9:160–164
Turkheimer E (2004) Spinach and ice cream: why social science is so difficult. In: DiLalla LF (ed) Behavior genetics principles: perspectives in development, personality, and psychopathology. American Psychological Association, Washington
Turkheimer E (2011) Genetics and human agency: comment on Dar-Nimrod and Heine. Psychol Bull 137(5):825–828
Turkheimer E, Haley A, Waldron M, D'Onofrio B, Gottesman II (2003) Socioeconomic status modifies heritability of IQ in young children. Psychol Sci 14:623–628
van der Sluis S, Posthuma D, Dolan CV (2012) A note on false positives and power in GxE modeling of twin data. Behav Genet 42:170–186
Wahlsten D (1997) The malleability of intelligence is not constrained by heritability. In: Devlin B, Fienberg SE, Resnick DP, Roeder K (eds) Intelligence, genes, and success. Springer, New York, pp 71–87
Whitmore D (2005) Resource and peer impacts on girls' academic achievement: evidence from a randomized experiment. Am Econ Rev 95(2):199–203
Woodward J (2016) Causation and manipulability. The Stanford Encyclopedia of Philosophy (Winter 2016 Edition). In: Zalta EN (ed). https://plato.stanford.edu/archives/win2016/entries/causation-mani/
Yeager DS, Romero C, Paunesku D, Hulleman CS, Schneider B, Hinojosa C et al (2016) Using design thinking to improve psychological interventions: the case of the growth mindset during the transition to high school. J Educ Psychol 108(3):