Three puzzles and eight gaps: what heritability studies and critical commentaries have not paid enough attention to

Biology & Philosophy

Abstract

This article examines eight “gaps” in order to clarify why the quantitative genetics methods of partitioning variation of a trait into heritability and other components have very limited power to show anything clear and useful about genetic and environmental influences, especially for human behaviors and other traits. The first two gaps should be kept open; the others should be bridged or the difficulty of doing so should be acknowledged: 1. Key terms have multiple meanings that are distinct; 2. Statistical patterns are distinct from measurable underlying factors; 3. Translation from statistical analyses to hypotheses about measurable factors is difficult; 4. Predictions based on extrapolations from existing patterns of variation may not match outcomes; 5. The partitioning of variation in human studies does not reliably estimate the intended quantities; 6. Translation from statistical analyses to hypotheses about the measurable factors is even more difficult in light of the possible heterogeneity of underlying genetic or environmental factors; 7. Many steps lie between the analysis of observed traits and interventions based on well-founded claims about the causal influence of genetic or environmental factors; 8. Explanation of variation within groups does not translate to explanation of differences among groups. At the start, I engage readers’ attention with three puzzles that have not been resolved by past debates. The puzzles concern generational increases in IQ test scores, the possibility of underlying heterogeneity, and the translation of methods from selective breeding into human genetics. After discussing the gaps, I present each puzzle in a new light and point to several new puzzles that invite attention from analysts of variation in quantitative genetics and in social science more generally.
The article’s critical perspectives on agricultural, laboratory, and human heritability studies are intended to elicit further contributions from readers across the fields of history, philosophy, sociology, and politics of biology and in the sciences.


Notes

  1. The “classical” quantitative genetics methods discussed in this article include partitioning variation into heritability and other components, but not the technique of mapping quantitative trait loci (QTL). QTL are regions of the genome containing genetic factors that influence a continuously variable trait. QTL mapping has had most success in animal and plant varieties that can be replicated and raised in controlled conditions. Reliable QTL results for human populations have been few (Majumder and Ghosh 2005), but genome-wide association (GWA) studies may be changing this picture (Khoury et al. 2007).

  2. For key points in the debate, see the Harvard Educational Review article by psychometrician Arthur Jensen (1969), which elicited a critical response from, among others, the population geneticist Richard Lewontin (1970a, b, 1974); see also Jensen (1970). Jencks and Phillips (1998) review research on the black-white test score gap, and Parens (2004) provides an even-handed overview of past and potential contributions of human behavioral genetics to discussions of social importance well beyond IQ tests.

  3. There has been some success recently in using regression analysis to identify associations between environmental factors and differences between the mean test scores for racial groups (Fryer and Levitt 2004).

  4. Heritability can be related to correlations between parents and offspring, but to do so requires models of hypothetical genes that determine the trait and a suite of assumptions (Lynch and Walsh 1998, 48–50, 142; see also Gaps 3–6).

  5. Strictly this defines “across-location” heritability. An alternative, “within-location” heritability, is relevant where researchers envisage that the variety will continue to be raised in the same location. In effect, this quantity takes the heritability estimated in each location separately and averages these estimates over all locations. This estimation means that differences among the averages for the trait from one location to the next are not taken into account. Across-location heritability, which always has a smaller value than the within-location heritability, is relevant when the varieties could be raised or grown again in any of the original set of locations. Strictly it also defines “broad” heritability. “Narrow” heritability, which is used to predict change under selective breeding, is a construct that depends on assumptions about the action of hypothetical genes in the standard models of quantitative genetics (see note 10). Heritability can also be estimated through path analysis, a data analysis technique that quantifies the relative contributions (“path coefficients”) of variables to the variation in a focal variable once a certain network of interrelated variables has been accepted (Lynch and Walsh 1998, 823). The reliability of the estimates depends on the assumptions built into the networks, such as the similarity of relatives of different degrees and the inclusion/exclusion of coefficients for variety-location interaction (see Gap 5 and note 10). When the same assumptions are used, ANOVA and path analysis estimate the same quantities.

  6. Variance is the common measure of variation. The variety variance, for example, assesses the size of the deviations of the variety means from the overall mean for the trait by averaging the square of those deviations.

  7. In an ANOVA the components of variation are conventionally labeled “effects,” a term that misleadingly connotes the influence of some causal factor. This connotation tends to be especially confusing in discussions of shared versus non-shared environmental effects (Turkheimer 2000).

  8. The existence of such gradients is suggested by the symbols often used in equations, e.g., P = G + E, where P is the measurement of the trait (“phenotype”), G a contribution from the variety (“genotype”) and E a contribution from the location (“environment,” or environment plus error). However, such contributions are derived using statistical methods, such as ANOVA and path analysis, that partition the variation in traits across some specific set of varieties and locations into components. The results are thus conditional on that set of varieties and locations (see #d to follow in the text, as well as the other points under Gap 2).

  9. In the terminology of this article, genotype-environment correlation is “variety-location correlation” (or covariance), which is distinct from “variety-location-interaction variance.” The correlation is readily explained by referring to the general case of an agricultural evaluation trial, in which the means of varieties, locations, and variety-location combinations can be estimated. For the full data set, the usual method of estimation ensures that the variety and location means (averaged across, respectively, all locations and all varieties) are uncorrelated. Within a subset of the full data, however, those same means may be positively or negatively correlated—this is the variety-location correlation. In human studies, varieties are raised in at most two locations (identical twins raised apart), so the variety and location means across all locations and varieties are unknown; variety-location correlation is thus difficult to estimate (Jacquard 1983), and, if estimated (e.g., Otto et al. 1995), requires many additional assumptions (Lynch and Walsh 1998, 142).

  10. The first step in the construction of the standard models of quantitative genetics is to consider the case of a trait governed by a pair of alleles of a single gene (i.e., at a single locus) and where all individuals are raised in a single location. In that location, the presence or level of such a trait depends only on whether the individual has two copies of one allele (i.e., is a “homozygote” for that allele), two of the other, or one of each (a “heterozygote”). For example, phenylketonuria (PKU) in humans is associated with having two copies of a non-functioning allele for the enzyme phenylalanine hydroxylase (PAH). The development of such individuals is extremely impaired by phenylalanine at the level present in normal diets. In this “normal-diet” location relatives will resemble each other more than unrelated individuals because if, say, a twin has PKU, both parents have at least one copy of the non-functioning PAH allele, so the other twin is more likely to have two non-functioning PAH alleles than an unrelated individual (i.e., one chosen at random from the population). This seems straightforward, but few traits are dictated only by alleles at a single locus. The standard models of quantitative genetics envisage the influence of alleles at many loci adding up to shape the traits to be analyzed, allowing also for the effect of one allele to eclipse that of the other (“dominance”) at any given locus and some degree of interaction (“epistasis”) among alleles at different loci. Next, the models allow for some noise (from measurement error or unsystematic variation among the replicates of the variety). Finally, to allow for the variety to be raised in a number of locations, the standard models of quantitative genetics incorporate a term for variance across locations of the mean value of the trait in each location (i.e., “location variance” or “shared environmental variance”).
Application of the models to the analysis of data from related and unrelated individuals, such as human twins, requires additional assumptions for which plausible alternatives exist (Taylor 2007, 2009 and Gap 5). Most notably, it is conventional to assume that, all other things being equal, fraternal twins are half as similar as identical twins because fraternal twins share half the genes that vary in the species or population, while identical twins share them all (e.g., Kendler and Prescott 2006, 42). However, it is straightforward to invent plausible models of the contributions of multiple genes to a trait that do not result in this ratio (see example to follow in the text). Ratios other than .5 should not be surprising because measures of similarity (such as “intraclass correlations”) are based on observed traits and, as such, are not directly given by the number of shared genes involved in the development of those traits (Taylor 2007, 2009).

  11. Genealogical relatedness can be taken into account without the models of hypothetical multiple genes and additional assumptions sketched in note 10. Analyses without those assumptions may, however, require data collected under special conditions, such as replicates of a variety raised in separate, randomly chosen locations (e.g., twins raised apart) and replicates from different varieties raised in the same location (e.g., unrelated individuals raised in the same family) (Taylor 2007, 2009). Whether these special conditions obtain for humans in any actual cases remains under debate (Richardson and Norgate 2005).

  12. This assumption runs through all quantitative genetics, not only in human studies; see note 10. Models based on this assumption can be fitted to observations (and the fit of different models compared with each other), but support has not been shown for the models’ assumptions independent of that fit. This practice runs counter to the idea in philosophy of science that confirmation of a model requires both aspects (Taylor 2005, 35ff).

  13. See the next-to-last paragraph under Gap 3 for an example of a model in which DZ twins are almost always more than half as similar as MZ twins. (Simulation available from author on request.)

  14. The standard methods of human quantitative genetics cannot demonstrate that the variety-location-interaction variance is negligible, so the resulting estimates of heritability, which incorporate the interaction fraction, may be systematically inflated (Taylor 2007). It is important to be able to separate out the interaction fraction because the claim that the effect of family members growing up in the same location (family) is of small importance (e.g., Turkheimer 2000) requires showing not only that the location variance is a small component of the total variance, but also that the variety-location-interaction variance is. In agricultural plant evaluation trials, variety-location-interaction variance is typically as large a fraction of the total as variety variance (heritability), but it is not known whether this is also the case for animal or human populations observed in a typical range of locations. To estimate interaction variance separately from heritability requires data collected under the special conditions mentioned in note 11.

  15. Residual variance is a “non-shared” component in the sense of not being variation among location averages, variety averages, or additional contributions from averages for variety-location combinations. However, this component should not be labeled an environmental component given the two sources of residual variance, namely: measurement error (after subtracting any systematic differences in measurement error across varieties or across locations); and differences among replicates within variety-location combinations in the ways that the (unknown) genetic and environmental factors possessed or experienced by the replicates influence the trait. Greater accuracy in measurement can reduce the first source of residual variation. The second source can be reduced if replicates of a variety are more uniform and positioned randomly within the location. The neutral terms “noise” or “unsystematic” are more appropriate for this fraction of the variation than “non-shared environmental effects.”

  16. This sense of “heterogeneity” should be distinguished from three other uses of the term in the arenas of statistics and of genetics (Kaplan 2000, 18): statistical methods often assume equality or “homogeneity” of variances from one sample to the next; mutations in a gene may be heterogeneous in the sense that they occur at a variety of points in the gene and the clinical expression of such mutations can vary significantly; and different genetic factors may be expressed as the same clinical entity. This last form of heterogeneity can be viewed as a special subset of the underlying heterogeneity referred to here, which also considers environmental factors acting in conjunction with genetic factors when allowing for the possibility that different underlying factors may be expressed as the same clinical entity.

  17. Associations have not been found in the few instances where genetic factors have been examined, e.g., genes that mark degree of African ancestry (see summary in Nisbett 1998, 89–90).

  18. The limitations of nested analysis for comparing groups can be overcome using multilevel (hierarchical) regression analysis when data are available on measurable factors within groups or at the group level (Gelman and Hill 2007). Such data are not available in conventional heritability studies, but, if they were, interpretations of the resulting regression coefficients might still be confounded by heterogeneity in the underlying factors.

  19. Although Lindman’s note and the preceding discussion and diagrams in the text center on ANOVA, the points about the possible heterogeneity of underlying factors, about membership in different groups being analyzed as different locations, and about the limitations of nested analysis might also apply to drawing hypotheses and insights from regression analysis and experimental trials (Gap 7; see Table 2). This idea and its implications warrant further inquiry.

  20. Kendler (2005, 10) responded confidently to a trenchant criticism of some key assumptions of twin studies as follows: “It is one thing to criticize the methodology of specific studies. It is quite another to suggest… that we reject the results of an entire field of scientific inquiry. This might have been warranted for some pseudoscientific systems, such as astrology, alchemy, and the Ptolemaic astronomic system. It is highly unlikely that modern psychiatric genetics will be judged by future historians of science to be in such company.”
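
Notes 5, 6, 14 and 15 describe the partitioning of trait variation in words. As a rough illustration only, the Python sketch below applies those descriptions to a small, made-up balanced data set (two varieties, two locations, two replicates per combination). The function name `partition_variation`, the nested-list data layout, and the numbers are assumptions introduced for this example; they are not taken from the article or from any study. Each component is computed descriptively, as in note 6, by averaging squared deviations, without the degrees-of-freedom corrections that a formal ANOVA would apply.

```python
from statistics import fmean


def partition_variation(data):
    """Partition trait variation for a balanced variety-by-location trial.

    data[v][l] is the list of replicate measurements for variety v raised in
    location l (hypothetical layout; every combination has the same number
    of replicates).
    """
    varieties = range(len(data))
    locations = range(len(data[0]))

    # Means for each variety-location combination, each variety, each location.
    cell = [[fmean(data[v][l]) for l in locations] for v in varieties]
    variety_mean = [fmean(cell[v]) for v in varieties]
    location_mean = [fmean(cell[v][l] for v in varieties) for l in locations]
    grand = fmean(variety_mean)

    # Note 6: a variance averages the squared deviations from the overall mean.
    variety = fmean((m - grand) ** 2 for m in variety_mean)
    location = fmean((m - grand) ** 2 for m in location_mean)
    interaction = fmean(
        (cell[v][l] - variety_mean[v] - location_mean[l] + grand) ** 2
        for v in varieties for l in locations
    )
    # Note 15: residual ("noise") variation among replicates within a combination.
    residual = fmean(
        (y - cell[v][l]) ** 2
        for v in varieties for l in locations for y in data[v][l]
    )
    total = variety + location + interaction + residual

    # Note 5: within-location heritability is estimated in each location
    # separately and then averaged, so differences among location means
    # play no part in it.
    per_location = []
    for l in locations:
        between = fmean((cell[v][l] - location_mean[l]) ** 2 for v in varieties)
        spread = fmean(
            (y - location_mean[l]) ** 2 for v in varieties for y in data[v][l]
        )
        per_location.append(between / spread)

    return {
        "variety": variety,
        "location": location,
        "interaction": interaction,
        "residual": residual,
        "total": total,
        "across_location_heritability": variety / total,
        "within_location_heritability": fmean(per_location),
    }


# Made-up measurements: trial[variety][location] = replicates.
trial = [
    [[9, 11], [13, 15]],  # variety A in location 1 and location 2
    [[5, 7], [13, 15]],   # variety B in location 1 and location 2
]
for name, value in partition_variation(trial).items():
    print(f"{name}: {value:.3f}")
```

For these invented numbers the variety, location, interaction and residual components come to 1, 9, 1 and 1 (total 12), so the across-location heritability is 1/12 while the within-location figure is 0.4. The gap arises because the large differences between the location means enter the denominator of the first quantity only. The results are conditional on this particular set of varieties and locations, which is the point made in note 8.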
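
Note 10 observes that the similarity of fraternal (DZ) twins relative to identical (MZ) twins is a property of the model of gene action, not something given directly by the fraction of genes shared. The sketch below is a minimal illustration of that point under assumptions of my own choosing: a single locus with two equally frequent alleles, random mating, and no environmental variation or measurement noise, so that MZ twins always have the same trait value (a correlation of 1). It is much simpler than the multi-gene models the note has in mind, and it is not the simulation mentioned in note 13. The function name `dz_correlation` and its parameters are hypothetical.

```python
from itertools import product


def dz_correlation(trait, p=0.5):
    """Exact trait correlation between two full siblings (DZ twins).

    trait[k] is the trait value of an individual with k copies of allele A
    (k = 0, 1 or 2); p is the assumed frequency of allele A. The trait is
    assumed to depend on this one locus only, with random mating.
    """
    allele_prob = {1: p, 0: 1 - p}  # 1 = allele A, 0 = the other allele
    e_x = e_xx = e_xy = 0.0
    # Enumerate both alleles of the mother (m1, m2) and of the father (f1, f2).
    for m1, m2, f1, f2 in product((0, 1), repeat=4):
        parents_prob = (allele_prob[m1] * allele_prob[m2]
                        * allele_prob[f1] * allele_prob[f2])
        # A child receives one allele from each parent; the four
        # (maternal, paternal) combinations are equally likely.
        child_values = [trait[m + f] for m in (m1, m2) for f in (f1, f2)]
        # The two siblings draw their alleles independently of each other.
        for x, y in product(child_values, repeat=2):
            weight = parents_prob / 16
            e_x += weight * x
            e_xx += weight * x * x
            e_xy += weight * x * y
    variance = e_xx - e_x ** 2
    return (e_xy - e_x ** 2) / variance


additive = (0, 1, 2)  # each copy of allele A adds one unit to the trait
dominant = (0, 1, 1)  # one copy of allele A already gives the full trait value
print(dz_correlation(additive))  # 0.5
print(dz_correlation(dominant))  # about 0.4167 (5/12)
```

With the additive trait the DZ correlation is exactly half the MZ correlation of 1, the conventional ratio. With complete dominance it is 5/12, even though the siblings share the same fraction of their genes in both cases. The standard models already allow for dominance, so this is not a criticism of them; it only shows that the ratio depends on the assumed gene action.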

References

  • Byth DE, Eisemann RL, DeLacy IH (1976) Two-way pattern analysis of a large data set to evaluate genotypic adaptation. Heredity 37(2):215–230

  • Davey-Smith G, Ebrahim S (2007) Mendelian randomization: genetic variants as instruments for strengthening causal inference in observational studies. In: Weinstein M, Vaupel JW, Wachter KW (eds) Biosocial surveys. National Academies Press, Washington, DC, pp 336–366

  • Dickens WT, Flynn JR (2001) Heritability estimates versus large environmental effects: the IQ paradox resolved. Psychol Rev 108(2):346–369

  • Downes SM (2004) Heredity and heritability. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/heredity/ (viewed 11 May 2006)

  • Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics, 4th edn. Longman, Harlow

  • Flynn JR (1994) IQ gains over time. In: Sternberg RJ (ed) Encyclopedia of human intelligence. Macmillan, New York, pp 617–623

  • Flynn JR (2000) How to defend humane ideals: substitutes for objectivity. University of Nebraska Press, Lincoln, NE

  • Freedman DA (2005) Linear statistical models for causation: a critical review. In: Everitt B, Howell D (eds) Encyclopedia of statistics in the behavioral sciences. Wiley, Chichester

  • Fryer R, Levitt S (2004) Understanding the black-white test score gap in the first two years of school. Rev Econ Stat 86(2):447–464

  • Gelman A, Hill J (2007) Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, New York

  • Jacquard A (1983) Heritability: one word, three concepts. Biometrics 39:465–477

  • Jencks C, Phillips M (eds) (1998) The black-white test score gap. Brookings Institution Press, Washington, DC

  • Jensen AR (1969) How much can we boost IQ and scholastic achievement? Harv Educ Rev 39:1–123

  • Jensen AR (1970) Race and the genetics of intelligence: a reply to Lewontin. Bull At Sci 26:17–23

  • Kaplan JM (2000) The limits and lies of human genetic research. Routledge, New York

  • Kendler KS (2005) Reply to J. Joseph, research paradigms of psychiatric genetics. Am J Psychiatry 162:1985–1986

  • Kendler KS, Prescott CA (2006) Genes, environment, and psychopathology: understanding the causes of psychiatric and substance abuse disorders. The Guilford Press, New York

  • Khoury MJ, Little J, Gwinn M, Ioannidis JP (2007) On the synthesis and interpretation of consistent but weak gene-disease associations in the era of genome-wide association studies. Int J Epidemiol 36:439–445

  • Lewontin RC (1970a) Race and intelligence. Bull At Sci 26:2–8

  • Lewontin RC (1970b) Further remarks on race and the genetics of intelligence. Bull At Sci 26:23–25

  • Lewontin RC (1974) The analysis of variance and the analysis of causes. Am J Hum Genet 26:400–411

  • Lindman HR (1992) Analysis of variance in experimental design. Springer, New York

  • Lynch M, Walsh B (1998) Genetics and analysis of quantitative traits. Sinauer, Sunderland

  • Majumder PP, Ghosh S (2005) Mapping quantitative trait loci in humans: achievements and limitations. J Clin Investig 115(6):1419–1424

  • McLaughlin P (1998) Rethinking the agrarian question: the limits of essentialism and the promise of evolutionism. Hum Ecol Rev 5:25–39

  • Miele F (2002) Intelligence, race, and genetics: conversations with Arthur Jensen. Westview Press, Boulder

  • Moffitt TE, Caspi A, Rutter M (2005) Strategy for investigating interactions between measured genes and measured environments. Arch Gen Psychiatry 62(5):473–481

  • Neisser U, Boodoo G, Bouchard TJ, Boykin AW, Brody N, Ceci SJ, Halpern DF, Loehlin JC, Perloff R, Sternberg RJ, Urbina S (1996) Intelligence: knowns and unknowns. Am Psychol 51:77–101

  • Nisbett RE (1998) Race, genetics, and IQ. In: Jencks C, Phillips M (eds) The black-white test score gap. Brookings Institution Press, Washington, DC, pp 86–102

  • Nuffield Council on Bioethics (2002) Genetics and human behavior: the ethical context. http://www.nuffieldbioethics.org (viewed 22 Jun. 2007)

  • Otto SP, Christiansen FB, Feldman MW (1995) Genetic and cultural inheritance of continuous traits. Stanford University Morrison Institute for Population and Resource Studies Working Paper Series No. 64. http://www.stanford.edu/group/morrinst/pdf/64.pdf (viewed 24 March 2009)

  • Parens E (2004) Genetic differences and human identities: on why talking about behavioral genetics is important and difficult. Hastings Center Report (January–February), pp S1–S36

  • Plomin R, Asbury K (2006) Nature and nurture: genetic and environmental influences on behavior. Ann Am Acad Political Soc Sci 600(1):86–98

  • Plomin R, DeFries JC, Loehlin JC (1977) Genotype-environment interaction and correlation in the analysis of human behavior. Psychol Bull 84:309–322

  • Richardson K, Norgate S (2005) The equal environments assumption of classical twin studies may not hold. Br J Educ Psychol 75(3):339–350

  • Rutter M (2002) Nature, nurture, and development: from evangelism through science toward policy and practice. Child Dev 73(1):1–21

  • Sesardic N (2005) Making sense of heritability. Cambridge University Press, Cambridge

  • Taylor PJ (2005) Unruly complexity: ecology, interpretation, engagement. University of Chicago Press, Chicago

  • Taylor PJ (2006a) Heritability and heterogeneity: on the limited relevance of heritability in investigating genetic and environmental factors. Biol Theory Integr Dev Evol Cognit 1(2):150–164

  • Taylor PJ (2006b) Heritability and heterogeneity: on the irrelevance of heritability in explaining differences between means for different human groups or generations. Biol Theory Integr Dev Evol Cognit 1(4):392–401

  • Taylor PJ (2007) The unreliability of high human heritability estimates and small shared effects of growing up in the same family. Biol Theory Integr Dev Evol Cognit 2(4):387–397

  • Taylor PJ (2008a) Puzzles in the history and philosophy of heredity that warrant more attention. http://sicw.wikispaces.com/HeredityVariationPuzzles (viewed 12 Aug. 2008)

  • Taylor PJ (2008b) The under-recognized implications of heterogeneity: opportunities for fresh views on scientific, philosophical, and social debates about heritability. Hist Philos Life Sci 30:423–448

  • Taylor PJ (2009) Critical assumptions of classical quantitative genetics and twin studies that warrant more attention (manuscript)

  • Turkheimer E (2000) Three laws of behavior genetics and what they mean. Curr Dir Psychol Sci 9(5):160–164

  • Turkheimer E, Haley A, Waldron M, D’Onofrio B, Gottesman II (2003) Socioeconomic status modifies heritability of IQ in young children. Psychol Sci 14(6):623–628

  • Wikipedia (2008) Heritability. http://en.wikipedia.org/wiki/Heritability (viewed 14 Mar 2008)

Acknowledgments

This article is based on research supported by the National Science Foundation under grant SES–0634744. The comments of Hamish Spencer and anonymous reviewers of this manuscript and a related one helped in the revision process.

Author information

Correspondence to Peter Taylor.

Cite this article

Taylor, P. Three puzzles and eight gaps: what heritability studies and critical commentaries have not paid enough attention to. Biol Philos 25, 1–31 (2010). https://doi.org/10.1007/s10539-009-9174-x
