Multi-level computational methods for interdisciplinary research in the HathiTrust Digital Library

PLoS ONE 12 (9) (2017)

Authors
Jun Otsuka
Kyoto University
David Bourget
University of Western Ontario
Colin Allen
University of Pittsburgh
2 more
Abstract
We show how faceted search using a combination of traditional classification systems and mixed-membership topic models can go beyond keyword search to inform resource discovery, hypothesis formulation, and argument extraction for interdisciplinary research. Our test domain is the history and philosophy of scientific work on animal mind and cognition. The methods can be generalized to other research areas and ultimately support a system for semi-automatic identification of argument structures. We provide a case study for the application of the methods to the problem of identifying and extracting arguments about anthropomorphism during a critical period in the development of comparative psychology. We show how a combination of classification systems and mixed-membership models trained over large digital libraries can inform resource discovery in this domain. Through a novel approach of “drill-down” topic modeling (simultaneously reducing both the size of the corpus and the unit of analysis), we are able to reduce a large collection of full-text volumes to a much smaller set of pages within six focal volumes containing arguments of interest to historians and philosophers of comparative psychology. The volumes identified in this way did not appear among the first ten results of a keyword search in the HathiTrust Digital Library, and the pages bear the kind of “close reading” needed to generate original interpretations that lies at the heart of scholarly work in the humanities. Zooming back out, we provide a way to place the books onto a map of science originally constructed from very different data and for different purposes. The multilevel approach advances understanding of the intellectual and societal contexts in which writings are interpreted.
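The "drill-down" idea in the abstract (shrinking the corpus and the unit of analysis at the same time) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `TOPICS`, `topic_mixture`, and `drill_down`, the seed vocabularies, the toy volumes, and the 0.6 threshold are all assumptions made for illustration, and the keyword-share scorer is only a crude stand-in for a fitted mixed-membership topic model such as LDA.

```python
from collections import Counter

# Hypothetical seed vocabularies standing in for learned topics. A real
# pipeline would fit a mixed-membership model (e.g. LDA) at each level.
TOPICS = {
    "animal_mind": {"animal", "instinct", "mind", "intelligence"},
    "botany": {"plant", "leaf", "root", "flower"},
}


def topic_mixture(text):
    """Return a crude topic mixture: each topic's share of topic-word hits."""
    counts = Counter()
    for word in text.lower().split():
        for topic, vocab in TOPICS.items():
            if word in vocab:
                counts[topic] += 1
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in TOPICS}


def drill_down(volumes, topic, threshold):
    """Keep volumes weighted toward `topic`, then keep their pages likewise.

    `volumes` maps a volume id to a list of page strings. The result maps
    each retained volume id to the indices of its retained pages.
    """
    selected = {}
    for vol_id, pages in volumes.items():
        # Level 1: the unit of analysis is the whole volume.
        if topic_mixture(" ".join(pages))[topic] < threshold:
            continue
        # Level 2: smaller corpus, smaller unit (individual pages).
        hits = [i for i, page in enumerate(pages)
                if topic_mixture(page)[topic] >= threshold]
        if hits:
            selected[vol_id] = hits
    return selected


# Toy corpus (invented for this sketch, not HathiTrust data).
volumes = {
    "vol_a": ["the animal mind and instinct",
              "a plant leaf and root",
              "animal intelligence"],
    "vol_b": ["plant flower leaf", "root and leaf of the plant"],
}
result = drill_down(volumes, "animal_mind", 0.6)
```

In this toy run only `vol_a` survives the volume-level pass, and only its first and third pages survive the page-level pass. In a fuller pipeline each level would retrain the topic model on the reduced corpus rather than reuse a fixed vocabulary.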
Keywords: topic modelling, text classification
DOI 10.1371/journal.pone.0184188

Similar books and articles

Computational Scientific Discovery. Peter D. Sozou, Peter C. Lane, Mark Addis & Fernand Gobet - 2017 - In Lorenzo Magnani & Tommaso Bertolotti (eds.), Springer Handbook of Model-Based Science. Dordrecht: Springer. pp. 719-734.
Cross-Cutting Categorization Schemes in the Digital Humanities. Colin Allen - 2013 - Isis: A Journal of the History of Science 104 (3):573-583.
Theoretical Foundations for Digital Text Analysis. Gabe Ignatow - 2016 - Journal for the Theory of Social Behaviour 46 (1):104-120.
What is Multi–Level Modelling For? Stephen Gorard - 2003 - British Journal of Educational Studies 51 (1):46-63.
The Cambridge Handbook of Computational Psychology. Ron Sun (ed.) - 2008 - Cambridge University Press.

Analytics

Added to PP index
2018-03-22
