Abstract
This is the story, told in the light of a new analysis of historical data, of a mathematical biology problem that was explored in the 1930s in Thomas Morgan’s laboratory at the California Institute of Technology. It is one of the early developments of evolutionary genetics and quantitative phylogeny, and deals with the identification and counting of chromosomal inversions in Drosophila species from comparisons of genetic maps. A re-analysis of the data produced in the 1930s using current mathematics and computational technologies reveals how a team of biologists, with the help of a renowned mathematician and against their first intuition, came to an erroneous conclusion regarding the presence of phylogenetic signals in gene arrangements. This example illustrates two different aspects of the same episode: (1) the appearance of a mathematical problem in biology, solved with the development of a combinatorial algorithm, which was unusual at the time, and (2) the role of errors in scientific activity. Underlying both is the possible role of computational complexity in explaining the directions taken by research in biology.
Notes
Theodosius Dobzhansky, letter to Milislav Demerec 1936
Crucial differences in the biological objects have also been discussed (Darden, 2005).
I.e., not explicit.
The term homology, in the sense of “common evolutionary origin”, was not commonly used at the beginning of the 20th century. The terminology was discussed and ranged from “allelomorph” to “corresponding”. I use the current terminology for the sake of consistency and clarity.
Renamed melanogaster shortly after.
Later named the X-chromosome in order to emphasize its peculiarity. The link between chromosomes and linkage groups was already well established, as can be seen by the natural use of “chromosome” in genetic studies from the 1910s onward.
I.e., homologous; see footnote 3. Homology was deduced from the similarity of phenotype variations during crossing experiments.
Several types of translocations, i.e. other mutations of the linear organization of genes along chromosomes, were predicted at the same time (Bridges, 1917; Mohr, 1919; Morgan, Bridges, & Sturtevant, 1925) and later demonstrated using cytology (Muller, 1929; Dobzhansky, 1930). They were generally considered to be “deficiencies”, or abnormalities of karyotypes, possibly resulting from mutagenic conditions. By contrast, inversions were immediately seen as evolutionary patterns that could be used to differentiate species, and thus as a character for taxonomy. Translocations were later used in plant taxonomy by Babcock and Stebbins (1938).
Examined by Kohler (1994), who writes that the unpublished part is of wider significance.
It is a coincidence that the maximum number found in 1995 is precisely the one that biologists struggled with in 1937. That we have not been able to greatly improve our handling of the data is indicative of the inherent computational complexity of the problem.
Note that the corrected values given here were obtained only with the published data and the statistical test proposed by the original authors. However, this analysis requires computational tools that were not available at the time. There would probably be a lot more to discover if we were to redo this analysis with new data.
A bona fide statistical test in this case would require a p-value rather than standard deviations. This was not considered in the 1937 and 1941 articles, but it is possible to compute an empirical p-value from a sample of 1,000 uniformly sampled random permutations. This gives a probability of 0.06 of achieving six or fewer inversions for 13 genes, a probability of 0.2 of achieving two or fewer inversions for six genes, and a probability of 0.35 of achieving three or fewer inversions for seven genes. Considering each chromosome independently is hardly conclusive. When all chromosomes are taken into account, the numbers of inversions can be considered significantly different from what would be expected at random based on the usual significance thresholds.
This story is attributed to Charlie Munger in Belevin (2007).
Note that this contrasts with the history of protein sequence alignment, where it became possible to compare two related sequences without excessive mathematical involvement (see, for example, Margoliash, 1963). I am not saying that sequence alignment did not pose an interesting mathematical problem, but it was inherently easier to solve with the intuitive ideas of biologists than computing an inversion distance.
Finding the minimum number of inversions to transform a sequence of letters into alphabetical order is NP-hard, and therefore considered computationally intractable (Caprara & Lancia, 2000).
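The empirical p-values and the inversion counts discussed in these notes can be illustrated with a minimal sketch. It is not the code used for the analysis reported here; it assumes unsigned gene orders on a single linear chromosome, and the function names `inversion_distances` and `empirical_p_value` are hypothetical. Because the minimum number of inversions is hard to compute in general, the sketch simply enumerates all arrangements by breadth-first search, which is only practical for small numbers of genes (six or seven, not 13).

```python
import random
from collections import deque
from itertools import combinations


def inversion_distances(n):
    """Return a dict mapping every arrangement of n genes (a tuple over
    range(n)) to the minimum number of inversions separating it from the
    sorted order. Brute-force breadth-first search: small n only."""
    start = tuple(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        perm = queue.popleft()
        for i, j in combinations(range(n), 2):
            # One inversion: reverse the segment between positions i and j.
            nxt = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
            if nxt not in dist:
                dist[nxt] = dist[perm] + 1
                queue.append(nxt)
    return dist


def empirical_p_value(n, observed, samples=1000, seed=0):
    """Estimate the probability that a uniformly random arrangement of
    n genes needs at most `observed` inversions to be sorted."""
    dist = inversion_distances(n)
    rng = random.Random(seed)  # fixed seed: an assumption, for repeatability
    order = list(range(n))
    hits = 0
    for _ in range(samples):
        rng.shuffle(order)  # uniform random permutation
        if dist[tuple(order)] <= observed:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    print(empirical_p_value(6, 2))  # two or fewer inversions, six genes
    print(empirical_p_value(7, 3))  # three or fewer inversions, seven genes
```

Calling `empirical_p_value(6, 2)` or `empirical_p_value(7, 3)` gives estimates that can be set against the values quoted in the note on statistical testing; the exact figures vary with the random sample. The 13-gene case is out of reach of this enumeration, since there are more than six billion arrangements of 13 genes.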
References
Anderson, E. (1937). Cytology in its relation to taxonomy. The Botanical Review, 3(7), 335–350.
Babcock, E. B., & Stebbins, G. L. (1938). The American species of Crepis: Their interrelationships and distribution as affected by polyploidy and apomixis. Carnegie Institution of Washington.
Bafna, V., & Pevzner, P. A. (1996). Genome rearrangements and sorting by reversals. SIAM Journal on Computing, 25(2), 272–289.
Belevin, P. (2007). Seeking wisdom: From Darwin to Munger. PCA Publications.
Bosch, G. (2018). Train PhD students to be thinkers not just specialists. Nature, 554(7692), 277–277.
Bowler, P. J. (2003). Evolution: The history of an idea. University of California Press.
Boyden, A. (1934). Precipitins and phylogeny in animals. The American Naturalist, 68(719), 516–536.
Brehm, A. (1990). Phylogénie de neuf espèces de Drosophila du groupe obscura d’après les homologies de segments des chromosomes polytènes. Ph. D. thesis, Université de Lyon 1.
Bridges, C. B. (1917). Deficiency. Genetics, 2, 445–465.
Camin, J. H., & Sokal, R. R. (1965). A method for deducing branching sequences in phylogeny. Evolution, 19(3), 311–326.
Caprara, A., & Lancia, G. (2000). Experimental and statistical analysis of sorting by reversals. In D. Sankoff & J. H. Nadeau (Eds.), Comparative genomics (pp. 171–183). Springer.
Carson, H. L., & Kaneshiro, K. Y. (1976). Drosophila of Hawaii: Systematics and ecological genetics. Annual Review of Ecology and Systematics, 7(1), 311–345.
Castle, W. E. (1918). Is the arrangement of the genes in the chromosome linear? Proceedings of the National Academy of Sciences, 5(2), 25–32.
Castle, W. E. (1919). Are genes linear or non-linear in arrangement? Proceedings of the National Academy of Sciences, 5(11), 500–506.
Chabot, H. (1999). Enquête historique sur les savoirs scientifiques rejetés à l’aube du positivisme (1750-1835). Ph. D. thesis, Université de Nantes.
Darden, L. (2005). Relations among fields: Mendelian, cytological and molecular mechanisms. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 36(2), 349–371.
de Chadarevian, S. (1996). Sequences, conformation, information: Biochemists and molecular biologists in the 1950s. Journal of the History of Biology, 29(3), 361–386.
de Chadarevian, S., & Kamminga, H. (Eds.) (1998). Molecularizing biology and medicine: New practices and alliances, 1920s to 1970s. Taylor and Francis.
Dietrich, M. R. (1994). The origins of the neutral theory of molecular evolution. Journal of the History of Biology, 27(1), 21–59.
Dietrich, M. R. (1998). Paradox and persuasion: Negotiating the place of molecular evolution within evolutionary biology. Journal of the History of Biology, 31(1), 85–111.
Dietrich, M. R. (2016). History of molecular evolution. In Encyclopedia of evolutionary biology. Elsevier.
Dobzhansky, T. (1930). Translocations involving the third and the fourth chromosomes of Drosophila melanogaster. Genetics, 15(4), 347–399.
Dobzhansky, T., & Powell, J. R. (1975). Drosophila pseudoobscura and its American relatives, Drosophila persimilis and Drosophila miranda. In R. King (Ed.), Invertebrates of genetic interest (pp. 537–587). Plenum Press.
Dobzhansky, T., & Wright, S. (1941). Genetics of natural populations. V. Relations between mutation rate and accumulation of lethals in populations of Drosophila pseudoobscura. Genetics, 26, 23–51.
Dutrillaux, A.-M., & Dutrillaux, B. (2012). Chromosome analysis of 82 species of Scarabaeoidea (Coleoptera), with special focus on NOR localization. Cytogenetic and Genome Research, 136, 208–219.
Fertin, G., Labarre, A., Rusu, I., Tannier, E., & Vialette, S. (2009). Combinatorics of genome rearrangements. MIT Press.
Firestein, S. (2015). Failure. Oxford University Press.
Galvão, G. R., & Dias, Z. (2015). An audit tool for genome rearrangement algorithms. Journal of Experimental Algorithmics 19, 1.7:1.1–1.7:1.34.
Gannett, L., & Griesemer, J. R. (2004). Classical genetics and the geography of genes. In Rheinberger & Gaudilliere (Eds.), Classical genetic research and its legacy (pp. 57–88). Routledge.
Hagen, J. B. (1982). Experimental taxonomy, 1930-1950: The impact of cytology, ecology, and genetics on ideas of biological classification. Ph. D. thesis, Oregon State University.
Hagen, J. B. (1984). Experimentalists and naturalists in twentieth-century botany: Experimental taxonomy, 1920–1950. Journal of the History of Biology, 17(2), 249–270.
Hagen, J. B. (1999). Naturalists, molecular biologists, and the challenges of molecular evolution. Journal of the History of Biology, 32(2), 321–341.
Hagen, J. B. (2000). The origins of bioinformatics. Nature Reviews Genetics, 1(3), 231.
Hagen, J. B. (2001). The introduction of computers into systematic research in the United States during the 1960s. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 32, 291–314.
Hagen, J. B. (2003). The statistical frame of mind in systematic biology from quantitative zoology to biometry. Journal of the History of Biology, 36(2), 353–384.
Hagen, J. B. (2010). Waiting for sequences: Morris Goodman, immunodiffusion experiments, and the origins of molecular anthropology. Journal of the History of Biology, 43(4), 697–725.
Kay, L. E. (1993). The molecular vision of life: Caltech, the Rockefeller foundation, and the rise of the new biology. Oxford University Press.
Kececioglu, J., & Sankoff, D. (1995). Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement. Algorithmica, 13, 180–210.
Kohler, R. E. (1994). Lords of the fly: Drosophila genetics and the experimental life. University of Chicago Press.
Lehmer, D. H. (1993). The mathematical work of Morgan Ward. Mathematics of Computation, 61, 307–311.
Livio, M. (2014). Brilliant blunders: From Darwin to Einstein - Colossal mistakes by great scientists that changed our understanding of life and the universe. Brilliance Audio.
Margoliash, E. (1963). Primary structure and evolution of cytochrome c. Proceedings of the National Academy of Sciences, 50(4), 672–679.
McClung, C. E. (1908). Cytology and taxonomy. Kansas University Science Bulletin, 4(7), 199–215.
Metz, C. W. (1914). Chromosome studies in the Diptera. I. A preliminary survey of five different types of chromosome groups in the genus Drosophila. Journal of Experimental Zoology Part A: Ecological Genetics and Physiology, 17(1), 45–59.
Metz, C. W. (1916). Chromosome studies on the Diptera. III. Additional types of chromosome groups in the Drosophilidae. The American Naturalist, 50(598), 587–599.
Metz, C. W. (1918). Chromosome studies on the Diptera. Zeitschrift für induktive Abstammungs-und Vererbungslehre, 19(3), 211–213.
Mohr, O. L. (1919). Character changes caused by mutation of an entire region of a chromosome in Drosophila. Genetics, 4, 275–282.
Morgan, G. J. (1998). Emile Zuckerkandl, Linus Pauling, and the molecular evolutionary clock, 1959–1965. Journal of the History of Biology, 31(2), 155–178.
Morgan, T. H., & Bridges, C. B. (1916). Sex-linked inheritance in Drosophila. Carnegie Institution of Washington.
Morgan, T. H., Bridges, C. B., & Sturtevant, A. H. (1925). The genetics of Drosophila. Bibliographia Genetica.
Morgan, T. H., Sturtevant, A. H., & Bridges, C. B. (1920). The evidence for the linear order of the genes. Proceedings of the National Academy of Sciences, 6(4), 162–164.
Muller, H. J. (1929). The first cytological demonstration of a translocation in Drosophila. The American Naturalist, 63(689), 481–486.
Murphy, W. J., Larkin, D. M., Everts-van der Wind, A., Bourque, G., Tesler, G., Auvil, L., Beever, J. E., Chowdhary, B. P., Galibert, F., Gatzke, L., Hitte, C., Meyers, S. N., Milan, D., Ostrander, E. A., Pape, G., Parker, H. G., Raudsepp, T., Rogatcheva, M. B., Schook, L. B., Skow, L. C., Welge, M., Womack, J. E., O’Brien, S. J., Pevzner, P. A., & Lewin, H. A. (2005). Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science, 309(5734), 613–617.
Novitski, E. (2005). Sturtevant and Dobzhansky: Two scientists at odds, with a student’s recollections. Xlibris Corporation.
Painter, T. S. (1933). A new method for the study of chromosome rearrangements and the plotting of chromosome maps. Science, 78, 585–586.
Painter, T. S. (1934). Salivary chromosomes and the attack on the gene. Journal of Heredity, 25(12), 465–476.
Papadimitriou, C. H. (1993). Computational complexity. Pearson.
Pevzner, P., & Tesler, G. (2003). Genome rearrangements in mammalian evolution: Lessons from human and mouse genomes. Genome Research, 13(1), 37–45.
Popper, K. (1959). The logic of scientific discovery. Routledge.
Porter, T. M. (1996). Trust in numbers. Princeton University Press.
Smocovitis, V. B. (2006). Keeping up with Dobzhansky: G. Ledyard Stebbins, Jr., plant evolution, and the evolutionary synthesis. History and Philosophy of the Life Sciences, 28, 9–48.
Smocovitis, V. B. (2009). The “Plant Drosophila”: E. B. Babcock, the genus Crepis, and the evolution of a genetics research program at Berkeley, 1915–1947. Historical Studies in the Natural Sciences, 39(3), 300–355.
Sommer, M. (2008). History in the gene: Negotiations between molecular and organismal anthropology. Journal of the History of Biology, 41(3), 473–528.
Strasser, B. J. (2010a). Collecting, comparing, and computing sequences: The making of Margaret O. Dayhoff’s atlas of protein sequence and structure, 1954–1965. Journal of the History of Biology, 43(4), 623–660.
Strasser, B. J. (2010b). Laboratories, museums, and the comparative perspective: Alan A. Boyden’s quest for objectivity in serological taxonomy, 1924-1962. Historical Studies in the Natural Sciences, 40(2), 149–182.
Sturtevant, A. H. (1921). A case of rearrangement of genes in Drosophila. Proceedings of the National Academy of Sciences, 7(8), 235–237.
Sturtevant, A. H. (1942). The classification of the genus Drosophila, with descriptions of nine new species. Austin: The University of Texas Publication 4213, 5–51.
Sturtevant, A. H., Bridges, C. B., & Morgan, T. H. (1919). The spatial relations of genes. Proceedings of the National Academy of Sciences, 5(5), 168–173.
Sturtevant, A. H., & Dobzhansky, T. (1936). Inversions in the third chromosome of wild races of Drosophila pseudoobscura, and their use in the study of the history of the species. Proceedings of the National Academy of Sciences, 22(7), 448–450.
Sturtevant, A. H., & Novitski, E. (1941). The homologies of chromosome elements in the genus Drosophila. Genetics, 26, 517–541.
Sturtevant, A. H., & Plunkett, C. R. (1926). Sequence of corresponding third-chromosome genes in Drosophila melanogaster and D. simulans. The Biological Bulletin, 50, 56–60.
Sturtevant, A. H., & Tan, C. C. (1937). The comparative genetics of Drosophila pseudoobscura and D. melanogaster. Journal of Genetics, 34, 415–432.
Suárez-Díaz, E. (2009). Molecular evolution: Concepts and the origin of disciplines. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 40(1), 43–53.
Suárez-Díaz, E. (2010). Making room for new faces: Evolution, genomics and the growth of bioinformatics. History and Philosophy of the Life Sciences, 32(1), 65–89.
Suárez-Díaz, E. (2014). The long and winding road of molecular data in phylogenetic analysis. Journal of the History of Biology, 47(3), 443–478.
Suárez-Díaz, E., & Anaya-Muñoz, V. H. (2008). History, objectivity, and the construction of molecular phylogenies. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 39(4), 451–468.
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(1), 230–265.
Turrill, W. B. (1938). The expansion of taxonomy with special reference to spermatophyta. Biological Reviews, 13, 342–373.
Watterson, G., Ewens, W., Hall, T., & Morgan, A. (1982). The chromosome inversion problem. Journal of Theoretical Biology, 99(1), 1–7.
Zuckerkandl, E., & Pauling, L. (1965). Molecules as documents of evolutionary history. Journal of Theoretical Biology, 8(2), 357–366.
Acknowledgements
Thanks to Istvan Miklos for showing me the 1937 article by Sturtevant, and to Vincent Daubin and Bastien Boussau for giving me the opportunity to present part of this work at the Jacques Monod conference in 2016, “Molecules as documents of evolutionary history: 50 years after”. Thanks also to several anonymous historians who have kindly helped me improve the historical aspects of this article, find the relevant secondary literature and get rid of most teleological and anachronistic arguments.
Funding
This work was funded by a grant from the Agence Nationale de la Recherche (ANR-19-CE45-0010 Evoluthon).
Cite this article
Tannier, E. A hapless mathematical contribution to biology. HPLS 44, 34 (2022). https://doi.org/10.1007/s40656-022-00514-x
DOI: https://doi.org/10.1007/s40656-022-00514-x