Research on bias in peer review examines scholarly communication and funding processes to assess the epistemic and social legitimacy of the mechanisms by which knowledge communities vet and self-regulate their work. Despite vocal concerns, a closer look at the empirical and methodological limitations of research on bias raises questions about the existence and extent of many hypothesized forms of bias. In addition, the notion of bias is predicated on an implicit ideal that, once articulated, raises questions about the normative implications of research on bias in peer review. This review provides a brief description of the function, history, and scope of peer review; articulates and critiques the conception of bias unifying research on bias in peer review; characterizes and examines the empirical, methodological, and normative claims of bias in peer review research; and assesses possible alternatives to the status quo. We close by identifying ways to expand conceptions and studies of bias to countenance the complexity of social interactions among actors involved directly and indirectly in peer review.
To arrive at their final evaluation of a manuscript or grant proposal, reviewers must convert a submission’s strengths and weaknesses for heterogeneous peer review criteria into a single metric of quality or merit. I identify this process of commensuration as the locus for a new kind of peer review bias. Commensuration bias illuminates how the systematic prioritization of some peer review criteria over others permits and facilitates problematic patterns of publication and funding in science. Commensuration bias also foregrounds a range of structural strategies for realigning peer review practices and institutions with the aims of science.
An empirically sensitive formulation of the norms of transformative criticism must recognize that even public and shared standards of evaluation can be implemented in ways that unintentionally perpetuate and reproduce forms of social bias that are epistemically detrimental. Helen Longino’s theory can explain and redress such social bias by treating peer evaluations as hypotheses based on data and by requiring a kind of perspectival diversity that bears, not on the content of the community’s knowledge claims, but on the beliefs and norms of the culture of the knowledge community itself. To illustrate how socializing cognition can bias evaluations, we focus on peer-review practices, with some discussion of peer-review practices in philosophy. Data include responses to surveys by editors from general philosophy journals, as well as analyses of reviews and editorial decisions for the 2007 Cognitive Science Society Conference.
Scholars belong to multiple communities of credit simultaneously. When these communities disagree about a scholarly achievement’s credit assignment, this raises a puzzle for decision and game theor...
Psychometrically oriented researchers construe low inter-rater reliability measures for expert peer reviewers as damning for the practice of peer review. I argue that this perspective overlooks different forms of normatively appropriate disagreement among reviewers. Of special interest are Kuhnian questions about the extent to which variance in reviewer ratings can be accounted for by normatively appropriate disagreements about how to interpret and apply evaluative criteria within disciplines during times of normal science. Until these empirical-cum-philosophical analyses are done, it will remain unclear to what extent low inter-rater reliability measures represent reasonable disagreement rather than arbitrary differences between reviewers.
Under the traditional system of peer-reviewed publication, the degree of prestige conferred to authors by successful publication is tied to the degree of the intellectual rigor of its peer review process: ambitious scientists do well professionally by doing well epistemically. As a result, we should expect journal editors, in their dual role as epistemic evaluators and prestige-allocators, to have the power to motivate improved author behavior through the tightening of publication requirements. Contrary to this expectation, I will argue that the publication bias literature in academic medicine demonstrates that editor interventions have had limited effectiveness in improving the health of the publication and trial registration record, suggesting that much stronger interventions are needed.
Using the corpus of JSTOR articles, we investigate the role of gender in collaboration patterns across the scholarly landscape by analyzing gender-based homophily: the tendency for researchers to co-author with individuals of the same gender. For a nuanced analysis of gender homophily, we develop methodology necessitated by the fact that the data comprises heterogeneous sub-disciplines and that not all authorships are exchangeable. In particular, we distinguish three components of gender homophily in collaborations: a structural component that is due to demographics and non-gendered authorship norms of a scholarly community, a compositional component which is driven by varying gender representation across sub-disciplines, and a behavioral component which we define as the remainder of observed homophily after its structural and compositional components have been taken into account. Using minimal modeling assumptions, we measure and test for behavioral homophily. We find that significant behavioral homophily can be detected across the JSTOR corpus and show that this finding is robust to missing gender indicators in our data. In a secondary analysis, we show that the proportion of female representation in a field is positively associated with significant behavioral homophily.
On the surface, developing a social psychology of science seems compelling as a way to understand how individual social cognition – in aggregate – contributes towards individual and group behavior within scientific communities (Kitcher, 2002). However, in cases where the functional input-output profile of psychological processes cannot be mapped directly onto the observed behavior of working scientists, it becomes clear that the relationship between psychological claims and normative philosophy of science should be refined. For example, a robust body of social psychological research demonstrates implicit gender bias in the evaluation of others (e.g., Steinpreis, Anders, & Ritzke, 1999). Many expected implicit bias to be a major cause of women’s underrepresentation in math-intensive fields of science; however, quantitative sociological research of hiring and manuscript and grant evaluation has discovered no gender disparity in outcomes (Ceci & Williams, 2011). Why might this be so? This paper will discuss methodological challenges that go beyond classic problems of external validity in extrapolating psychological effects and explanations to scientific communities. These problems include more complex external validity issues raised by the introduction of multi-process models of cognition (e.g., implicit versus explicit social cognition) as well as the reflexive role that folk and experimental theories of social psychology play in guiding the behavior of scientists at the individual and community level.
There is an increasing push by journals to ensure that data and products related to published papers are shared as part of a cultural move to promote transparency, reproducibility, and trust in the scientific literature. Yet few journals commit to evaluating their effectiveness in implementing reporting standards aimed at meeting those goals (1, 2). Similarly, though the vast majority of journals endorse peer review as an approach to ensure trust in the literature, few make their peer review data available to evaluate effectiveness toward achieving concrete measures of quality, including consistency and completeness in meeting reporting standards. Remedying these apparent disconnects is critical for closing the gap between guidance recommendations and actual reporting behavior. We see this as a collective action problem requiring leadership and investment by publishers, who can be incentivized through mechanisms that allow them to manage reputational risk and through continued innovation in journal assessment.
Psychologists and philosophers have not yet resolved what they take implicit attitudes to be, and some, concerned about limitations in the psychometric evidence, have even challenged the predictive and theoretical value of positing implicit attitudes in explanations for social behavior. In the midst of this debate, prominent stakeholders in science have called for scientific communities to recognize and countenance implicit bias in STEM fields. In this paper, I stake out a stakeholder conception of implicit bias that responds to these challenges in ways that are responsive to the psychometric evidence, while also being resilient to the sorts of disagreements and scientific progress that would not undermine the soundness of this call. Along the way, my account advocates for attributing collective (group-level) implicit attitudes rather than individual-level implicit attitudes. This position raises new puzzles for future research on the relationship (metaphysical, epistemic, and ethical) between collective implicit attitudes and individual-level attitudes.
Psychologists' work on conversational pragmatics and judgment suggests a refreshing approach to charitable interpretation and theorizing. This charitable approach, which I call Gricean charity, recognizes the role of conversational assumptions and norms in subject-experimenter communication. In this paper, I outline the methodological lessons Gricean charity gleans from psychologists' work in conversational pragmatics. In particular, Gricean charity imposes specific evidential standards requiring that researchers collect empirical information about (1) the conditions of successful and unsuccessful communication for specific experimental contexts, and (2) the conversational norms governing communication in experimental contexts. More generally, the Gricean turn in psychological research shifts focus from attributional to reflexive, situational explanations. Gricean charity does not primarily seek to rationalize subject responses. Rather, it imposes evidential requirements on psychological studies for the purpose of gaining a more accurate picture of the surprising and muddled ways in which we weigh evidence and draw. Key Words: Gricean charity • methodological rationalism • interpretation • principle of charity • cognitive psychology • conversational pragmatics • heuristics and biases • reflexive analysis.
In his debates with Daniel Kahneman and Amos Tversky, Gerd Gigerenzer puts forward a stricter standard for the proper representation of judgment heuristics. I argue that Gigerenzer’s stricter standard contributes to naturalized epistemology in two ways. First, Gigerenzer’s standard can be used to winnow away cognitive processes that are inappropriately characterized and should not be used in the epistemic evaluation of belief. Second, Gigerenzer’s critique helps to recast the generality problem in naturalized epistemology and cognitive psychology as the methodological problem of identifying criteria for the appropriate specification and characterization of cognitive processes in psychological explanations. I conclude that naturalized epistemologists seeking to address the generality problem should turn their focus to methodological questions about the proper characterization of cognitive processes for the purposes of psychological explanation.
Previous research has found that funding disparities are driven by applications’ final impact scores and that only a portion of the black/white funding gap can be explained by bibliometrics and topic choice. Using National Institutes of Health R01 applications for council years 2014–2016, we examine assigned reviewers’ preliminary overall impact and criterion scores to evaluate whether racial disparities in impact scores can be explained by application and applicant characteristics. We hypothesize that differences in commensuration (the process of combining criterion scores into overall impact scores) disadvantage black applicants. Using multilevel models and matching on key variables including career stage, gender, and area of science, we find little evidence for racial disparities emerging in the process of combining preliminary criterion scores into preliminary overall impact scores. Instead, preliminary criterion scores fully account for racial disparities (yet do not explain all of the variability) in preliminary overall impact scores.
What is the current status of Asian Americans in philosophy? How do Asian Americans fare in comparison to other minority groups? And what professional strategies might they use (more or less successfully) in response to their counterstereotypical status in philosophy? In this piece, I will address these questions empirically by extrapolating from available demographic, survey, and experimental studies. This analysis will be too fast and loose, but I offer it in the spirit of constructing a broad-brushed sketch, painted from a palette of variegated data, for others to critique, improve, and displace.
is normative in the sense that it aims to make recommendations for improving human judgment; it aims to have a practical impact on morally and politically significant human decisions and actions; and it studies normative, rational judgment qua rational judgment. These nonstandard ways of understanding ACP as normative collectively suggest a new interpretation of the strong replacement thesis that does not call for replacing normative epistemic concepts, relations, and inquiries with descriptive, causal ones. Rather, it calls for recognizing that the aims and normative inquiries of epistemology and normative psychology have become intermutual in nature. Key Words: Heuristics and biases • applied cognitive psychology • normative psychology • rationality • naturalized epistemology • Epistemics • Applied Naturalized Epistemology • strong replacement • strategic reliabilism • ameliorative psychology.
Merton envisioned his norms of science at a time when peer-reviewed journals controlled scientific communication. Technologies for sharing and finding content have since divorced the certification and amplification of science, generating systemic vulnerabilities. Certified amplification – a new Mertonian-styled norm – enjoins their recoupling and introduces a taxonomy of strategies adopted by institutions to close the certification-amplification gap, including the proportioning of the one to the other. Examples illustrating each taxonomic type collectively paint a picture of an ethos employing a rich range of certification and amplification techniques and emerging in a decentralized fashion across heterogeneous objects, communication modalities, and institutions.
In his early experimental work with Suppes, Davidson adopted rationality assumptions, not as necessary constraints on interpretation, but as practical conceits in addressing methodological problems faced by experimenters studying decision making under uncertainty. Although the content of their theory has since been undermined, their methodological approach, a Galilean form of methodological rationalism, lives on in contemporary psychological research. This article draws on Max Weber’s verstehen to articulate an account of Galilean methodological rationalism; explains how anomalies faced by Davidson’s early experimental work gave rise to his later, canonical claims about rationality and interpretation; and reclaims this Galilean framework for use in contemporary psychological research.
I motivate and articulate a dispositional account of aversive racism. By conceptualizing and measuring attitudes in terms of their full distribution, rather than in terms of their mode or mean preference, my account of dispositional attitudes gives ambivalent attitudes (qua attitude) the ability to predict aggregate behavior. This account can be distinguished from other dispositional accounts of attitude by its ability to characterize ambivalent attitudes such as aversive racism at the attitudinal rather than the sub-attitudinal level and its deeper appreciation of the analogy between traits and attitudes.
The Mertonian norms of science were envisioned at a time when scientific communication was relatively centralized and hierarchical. However, Web 2.0 technologies and social media platforms have generated new systemic vulnerabilities by divorcing the certification and amplification of science. This paper argues for certified amplification, a Mertonian-styled norm that enjoins their recoupling, and introduces a taxonomy of strategies institutions have adopted to close the certification-amplification gap. The examples illustrating each taxonomic type collectively paint a picture of an ethos emerging in a decentralized fashion across a heterogeneous range of objects, communication modalities, and institutional contexts.
Considerable attention has focused on studying reviewer agreement via inter-rater reliability (IRR) as a way to assess the quality of the peer review process. Inspired by a recent study that reported an IRR of zero in the mock peer review of top-quality grant proposals, we use real data from a complete range of submissions to the National Institutes of Health and to the American Institute of Biological Sciences to bring awareness to two important issues with using IRR for assessing peer review quality. First, we demonstrate that estimating local IRR from subsets of restricted-quality proposals will likely result in zero estimates under many scenarios. In both data sets, we find that zero local IRR estimates are more likely when subsets of top-quality proposals rather than bottom-quality proposals are considered. However, zero estimates from range-restricted data should not be interpreted as indicating arbitrariness in peer review. On the contrary, despite different scoring scales used by the two agencies, when complete ranges of proposals are considered, IRR estimates are above 0.6, which indicates good reviewer agreement. Furthermore, we demonstrate that, with a small number of reviewers per proposal, zero estimates of IRR are possible even when the true value is not zero.
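The range-restriction point in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method or data: it uses synthetic proposals, two hypothetical reviewers per proposal, normally distributed quality and reviewer noise, and a plain Pearson correlation between the two reviewers as a stand-in for IRR. The names `simulate`, `pearson`, `noise_sd`, and `top_fraction` are all illustrative.

```python
import random


def pearson(xs, ys):
    """Pearson correlation, used here as a simple proxy for IRR."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_proposals=5000, noise_sd=0.8, top_fraction=0.1, seed=0):
    """Return (full-range agreement, agreement among top-scored proposals).

    Assumption: each score is latent quality plus independent reviewer noise.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_proposals):
        quality = rng.gauss(0, 1)  # hypothetical latent merit of a proposal
        pairs.append((quality + rng.gauss(0, noise_sd),
                      quality + rng.gauss(0, noise_sd)))

    full = pearson([a for a, _ in pairs], [b for _, b in pairs])

    # Restrict the range: keep only the proposals with the highest summed score.
    ranked = sorted(pairs, key=lambda p: p[0] + p[1], reverse=True)
    top = ranked[:int(n_proposals * top_fraction)]
    restricted = pearson([a for a, _ in top], [b for _, b in top])
    return full, restricted


if __name__ == "__main__":
    full, restricted = simulate()
    print(f"full-range agreement: {full:.2f}")
    print(f"top-slice agreement:  {restricted:.2f}")
```

Under these assumptions the same reviewers who agree well across the full range of proposals show little or no agreement within the top-scored slice, because selecting on the score removes most of the between-proposal variance that agreement statistics depend on.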
The Doris Duke Charitable Foundation's competitive career development award selects awardees annually. This paper describes changes DDCF made to its grant-making process to improve gender representation in its applicant and awardee pools.
The White Coats for Black Lives and #ShutDownSTEM movements have galvanised biomedical practitioners and researchers to eliminate institutional and systematic racism, including barriers faced by Black researchers in biomedicine and science, technology, engineering, and mathematics. In our study on Black–White funding gaps for National Institutes of Health Research Project grants, we found that the overall award rate for Black applicants is 55% of that for white applicants. How can systems for allocating research grant funding be made more fair while improving their efficiency?