Abstract
Student evaluations of teaching (SETs) are ubiquitous in academe as a metric for assessing teaching and are frequently used in critical personnel decisions. Yet there is ample evidence documenting both measurement and equity bias in these assessments. SETs have low or no correlation with learning. Furthermore, scholars using different data and different methodologies routinely find that women faculty, faculty of color, and other marginalized groups are disadvantaged in SETs. Extant research on bias in teaching evaluations tends to review only the aspect of the literature most pertinent to that study. In this paper, we review a novel dataset of over 100 articles on bias in student evaluations of teaching and provide a nuanced review of this broad but established literature. We find that women and other marginalized groups do face significant biases in standard evaluations of teaching; however, the effect of gender is conditional upon other factors. We conclude with recommendations for the judicious use of SETs and avenues for future research.
Notes
For now, the entirety of this discussion and related research is binary in its orientation. We recognize that gender is more complex than women and men, and we acknowledge that gender identities that do not overtly conform to the binary likely complicate evaluations of teaching further than the existing body of knowledge has identified.
A full list of articles and article summaries is available at < redacted >.
Though see Basow and Montgomery (2005), which finds no significant interactions between student and faculty gender.
Research also finds that attractiveness is more relevant for women, who are more likely to receive comments about their appearance (Mitchell & Martin, 2018; Key & Ardoin, 2019). This is problematic given that attractiveness has been shown to be correlated with evaluations of instructional quality (Rosen, 2018).
References
Abel, M. H., & Meltzer, A. L. (2007). Student ratings of a male and female professors’ lecture on sex discrimination in the workforce. Sex Roles, 57(3–4), 173–180
Abrami, P. C. (2001). Improving judgments about teaching effectiveness using teacher rating forms. New Directions for Institutional Research, 2001(109), 59–87
Adams, M. J. D., & Umbach, P. D. (2012). Nonresponse and online student evaluations of teaching: understanding the influence of salience, fatigue, and academic environments. Research in Higher Education, 53(5), 576–591
Anderson, K. J. (2010). Students’ stereotypes of professors: An exploration of the double violations of ethnicity and gender. Social Psychology of Education, 13(4), 459–472
Anderson, K. J., & Kanner, M. (2011). Inventing a Gay Agenda: Students’ Perceptions of Lesbian and Gay Professors. Journal of Applied Social Psychology, 41(6), 1538–1564
Anderson, K. J., & Smith, G. (2005). Students’ preconceptions of professors: Benefits and barriers according to ethnicity and gender. Hispanic Journal of Behavioral Sciences, 27(2), 184–201
Aguirre Jr, A. (2000). Women and Minority Faculty in the Academic Workplace: Recruitment, Retention, and Academic Culture. ASHE-ERIC Higher Education Report, 27(6). Jossey-Bass
APSA. (2011). Political science in the 21st century: Report of the task force on political science in the 21st century
Arbuckle, J., & Williams, B. D. (2003). Students’ perceptions of expressiveness: Age and gender effects on teacher evaluations. Sex Roles, 49(9–10), 507–516
Arreola, R. A. (2004). Developing a comprehensive faculty evaluation system. Magna Publications
Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in college students’ evaluations of faculty. Communication Education, 48(3), 193–210
Baker, P., & Copp, M. (1997). Gender matters most: The interaction of gendered expectations, feminist course content, and pregnancy in student course evaluations. Teaching Sociology, 29–43
Barbezat, D. A., & Hughes, J. W. (2005). Salary structure effects and the gender pay gap in academia. Research in Higher Education, 46(6), 621–640
Bos, A. L., Sweet-Cushman, J., & Schneider, M. C. (2019). Family-friendly academic conferences: a missing link to fix the “leaky pipeline”? Politics, Groups, and Identities, 7(3), 748–758
Basow, S. A., & Distenfeld, M. S. (1985). Teacher expressiveness: More important for male teachers than female teachers? Journal of Educational Psychology, 77(1), 45
Basow, S. A., & Howe, K. G. (1987). Evaluations of college professors: Effects of professors’ sex-type, and sex, and students’ sex. Psychological Reports, 60(2), 671–678
Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87(4), 656
Basow, S. A. (2000). Best and worst professors: Gender patterns in students’ choices. Sex Roles, 43(5–6), 407–417
Basow, S. A., & Montgomery, S. (2005). Student ratings and professor self-ratings of college teaching: Effects of gender and divisional affiliation. Journal of Personnel Evaluation in Education, 18(2), 91–106
Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79(3), 308
Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74(2), 170
Benton, S. L., & Cashin, W. E. (2012). Student ratings of teaching: a summary of research and literature (IDEA Paper no. 50). Manhattan, KS: The IDEA Center
Bian, L., Leslie, S.-J., & Cimpian, A. (2017). Gender stereotypes about intellectual ability emerge early and influence children’s interests. Science, 355(6323), 389–391
Boring, A. (2017). Gender biases in student evaluations of teaching. Journal of Public Economics, 145, 27–41
Boring, A., Ottoboni, K., & Stark, P. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research
Bray, J. H., & Howard, G. S. (1980). Interaction of teacher and student sex and sex role orientations and student evaluations of college instruction. Contemporary Educational Psychology, 5(3), 241–248
Burns-Glover, A. L., & Veith, D. J. (1995). Revisiting gender and teaching evaluations: Sex still makes a difference. Journal of Social Behavior and Personality, 10(4), 69
Centra, J. A. (2000). Evaluating the Teaching Portfolio: A Role for Colleagues. New Directions for Teaching and Learning, 83, 87–93
Centra, J. A., & Gaubatz, N. B. (1998). Is there gender bias in student ratings of instruction? Journal of Higher Education, 70, 17–33
Chamberlin, M. S., & Hickey, J. S. (2001). Student evaluations of faculty performance: The role of gender expectations in differential evaluations. Educational Research Quarterly, 25(2), 3
Chapman, D. D., & Joines, J. A. (2017). Strategies for Increasing Response Rates for Online End-of-Course Evaluations. International Journal of Teaching and Learning in Higher Education, 29(1), 47–60
Chávez, K., & Mitchell, K. M. (2020). Exploring bias in student evaluations: Gender, race, and ethnicity. PS: Political Science & Politics, 53(2), 270–274
Chism, N. V. N. (2007). Peer Review of Teaching. A Sourcebook. Bolton Massachusetts: Anker
Eagly, A. H., & Karau, S. J. (2002). Role congruity theory of prejudice toward female leaders. Psychological Review, 109(3), 573
El-Alayli, A., Hansen-Brown, A. A., & Ceynar, M. (2018). Dancing backwards in high heels: Female professors experience more work demands and special favor requests, particularly from academically entitled students. Sex Roles, 79(3–4), 136–150
Elmore, P. B., & LaPointe, K. A. (1974). Effects of teacher sex and student sex on the evaluation of college instructors. Journal of Educational Psychology, 66(3), 386
Elmore, P. B., & LaPointe, K. A. (1975). Effect of teacher sex, student sex, and teacher warmth on the evaluation of college instructors. Journal of Educational Psychology, 67(3), 368
Esarey, J., & Valdes, N. (2020). Unbiased, reliable, and valid student evaluations can still be unfair. Assessment & Evaluation in Higher Education
Ewing, V. L., Stukas Jr, A. A., & Sheehan, E. P. (2003). Student prejudice against gay male and lesbian lecturers. The Journal of Social Psychology, 143(5), 569–579
Fan, Y., Shepherd, L. J., Slavich, E., Waters, D., Stone, M., Abel, R., & Johnston, E. L. (2019). Gender and cultural bias in student evaluations: Why representation matters. PLoS One, 14(2), e0209749
Feldman, K. A. (1992). College students’ views of male and female college teachers: Part I—Evidence from the social laboratory and experiments. Research in Higher Education, 33(3), 317–375
Fischer, E., & Hänze, M. (2019). Bias hypotheses under scrutiny: investigating the validity of student assessment of university teaching by means of external observer ratings. Assessment & Evaluation in Higher Education, 44(5), 772–786
Franklin, J. (2001). Interpreting the numbers: Using a narrative to help others read student evaluations of your teaching accurately. New Directions for Teaching and Learning, 87, 85–100
Franklin, J., & Theall, M. (1995). The relationship of disciplinary differences and the value of class preparation time to student ratings of teaching. New Directions for Teaching and Learning, 1995(64), 41–48
Freeman, H. R. (1994). Student evaluations of college instructors: Effects of type of course taught, instructor gender and gender role, and student gender. Journal of Educational Psychology, 86(4), 627
Greenwald, A. G., & Gillmore, G. M. (1997). No pain, no gain? The importance of measuring course workload in student ratings of instruction. Journal of Educational Psychology, 89(4), 743
Hamermesh, D. S., & Parker, A. (2005). Beauty in the classroom: Instructors’ pulchritude and putative pedagogical productivity. Economics of Education Review, 24(4), 369–376
Harris, M. B. (1975). Sex role stereotypes and teacher evaluations. Journal of Educational Psychology, 67(6), 751
Ḥaṭiva, N. (2013a). Student ratings of instruction: a practical approach to designing, operating, and reporting. Oron Publications
Ḥaṭiva, N. (2013b). Student ratings of instruction: Recognizing effective teaching. Oron Publications
Hessler, M., Pöpping, D. M., Hollstein, H., Ohlenburg, H., Arnemann, P. H., Massoth, C., et al. (2018). Availability of cookies during an academic course session affects evaluation of teaching. Medical Education, 52(10), 1064–1072
Himelein, M. J. (2018). Pitfalls of using student comments in the evaluation of faculty. Academic Briefing: Expert Advice for Higher Ed Leaders. https://www.academicbriefing.com/human-resources/faculty-evaluation/pitfalls-of-using-student-comments-evaluation-of-faculty/
Kaschak, E. (1978). Sex bias in student evaluations of college professors. Psychology of Women Quarterly, 2(3), 235–243
Kaschak, E. (1981). Another look at sex bias in students’ evaluations of professors: Do winners get the recognition that they have been given? Psychology of Women Quarterly, 5(5_suppl), 767–772
Key, E., & Ardoin, P. (2019). Students rate male instructors more highly than female instructors. We tried to counter that hidden bias. Washington Post. Accessed 3 Sep 2019. https://www.washingtonpost.com/politics/2019/08/20/students-rate-male-instructors-more-highly-than-female-instructors-we-tried-counter-that-hidden-bias/
Kierstead, D., D’agostino, P., & Dill, H. (1988). Sex role stereotyping of college professors: Bias in students’ ratings of instructors. Journal of Educational Psychology, 80(3), 342
Leslie, S.-J., Cimpian, A., Meyer, M., & Freeland, E. (2015). Expectations of brilliance underlie gender distributions across academic disciplines. Science, 347(6219), 262–265
Lindahl, M. W., & Unger, M. L. (2010). Cruelty in student teaching evaluations. College Teaching, 58(3), 71–76
Linse, A. R. (2017). Interpreting and using student ratings data: Guidance for faculty serving as administrators and on evaluation committees. Studies in Educational Evaluation, 54, 94–106
MacNell, L., Driscoll, A., & Hunt, A. N. (2015). What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4), 291–303
Marsh, H. W. (1980). Research on students’ evaluations of teaching effectiveness. Instructional Evaluation, 4(5), 5–13
Marsh, H. W. (1982a). Factors affecting students’ evaluations of the same course taught by the same instructor on different occasions. American Educational Research Journal, 19(4), 485–497
Marsh, H. W. (1982b). Validity of students’ evaluations of college teaching: A multitrait–multimethod analysis. Journal of Educational Psychology, 74(2), 264
Marsh, H. W. (1984). Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential biases, and utility. Journal of Educational Psychology, 76(5), 707
Martin, E. (1984). Power and authority in the classroom: Sexist stereotypes in teaching evaluations. Signs: Journal of Women in Culture and Society, 9(3), 482–492
McPherson, M. A., Todd Jewell, R., & Kim, M. (2009). What determines student evaluation scores? A random effects analysis of undergraduate economics classes. Eastern Economic Journal, 35(1), 37–51
Mengel, F., Sauermann, J., & Zölitz, U. (2018). Gender bias in teaching evaluations. Journal of the European Economic Association, 17(2), 535–566
Miles, P., & House, D. (2015). The Tail Wagging the Dog; An Overdue Examination of Student Teaching Evaluations. International Journal of Higher Education, 4(2), 116–126
Miller, J., & Seldin, P. (2014). Changing Practices in Faculty Evaluations: Can Better Evaluation Make a Difference? Academe, 100(3), 35–38
Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28(4), 283
Mitchell, K. M. W., & Martin, J. (2018). Gender bias in student evaluations. PS: Political Science & Politics, 51(3), 648–652
Murray, H. G. (1984). The impact of formative and summative evaluation of teaching in North American universities. Assessment and Evaluation in Higher Education, 9(2), 117–132
Murray, H. G. (1997). Does evaluation of teaching lead to improvement of teaching? The International Journal for Academic Development, 2(1), 8–23
Perna, L. W. (2005). The benefits of higher education: Sex, racial/ethnic, and socioeconomic group differences. The Review of Higher Education, 29(1), 23–52
Peterson, D. A. M., Biederman, L. A., Andersen, D., Ditonto, T. M., & Roe, K. (2019). Mitigating gender bias in student evaluations of teaching. PLoS One, 14(5), e0216241
Piatak, J., & Mohr, Z. (2019). More gender bias in academia? Examining the influence of gender and formalization on student worker rule following. Journal of Behavioral Public Administration, 2(2)
Reid, L. D. (2010). The role of perceived race and gender in the evaluation of college teaching on RateMyProfessors.com. Journal of Diversity in Higher Education, 3(3), 137
Ridgeway, C. L. (2011). Framed by gender: How gender inequality persists in the modern world. Oxford University Press
Rivera, L. A., & Tilcsik, A. (2019). Scaling Down Inequality: Rating Scales, Gender Bias, and the Architecture of Evaluation. American Sociological Review, 84(2), 248–274
Rosen, A. S. (2018). Correlations, trends and potential biases among publicly accessible web-based student evaluations of teaching: a large-scale study of RateMyProfessors.com data. Assessment & Evaluation in Higher Education, 43(1), 31–44
Rowden, G. V., & Carlson, R. E. (1996). Gender issues and students’ perceptions of instructors’ immediacy and evaluation of teaching and course. Psychological Reports, 78(3), 835–839
Seldin, P., Miller, J. E., & Seldin, C. A. (2010). The teaching portfolio: A practical guide to improved performance and promotion/tenure decisions. John Wiley & Sons
Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19(2), 174–197
Sinclair, L., & Kunda, Z. (2000). Motivated stereotyping of women: She’s fine if she praised me but incompetent if she criticized me. Personality and Social Psychology Bulletin, 26(11), 1329–1342
Smith, B. P., & Hawkins, B. (2011). Examining student evaluations of black college faculty: does race matter? Journal of Negro Education, 80(2)
Spooren, P., Brockx, B., & Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83(4), 598–642
Sprague, J., & Massoni, K. (2005). Student evaluations and gendered expectations: What we can’t count can hurt us. Sex Roles, 53(11–12), 779–793
Stark, P., & Freishtat, R. (2014). An evaluation of course evaluations. ScienceOpen. Center for Teaching and Learning, University of California, Berkeley. Retrieved https://www.scienceopen.com/document
Storage, D., Horne, Z., Cimpian, A., & Leslie, S.-J. (2016). The frequency of “brilliant” and “genius” in teaching evaluations predicts the representation of women and African Americans across fields. PLoS One, 11(3), e0150194
Subtirelu, N. C. (2015). “She does have an accent but…”: Race and language ideology in students’ evaluations of mathematics instructors on RateMyProfessors.com. Language in Society, 44(1), 35–62
Theall, M., & Franklin, J. (2001). Looking for bias in all the wrong places: A search for truth or a witch hunt in student ratings of instruction? New Directions for Institutional Research, 2001(109), 45–56
Uttl, B., White, C. A., & Gonzalez, D. W. (2017). Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation, 54, 22–42
Uttl, B., White, C. A., & Morin, A. (2013). The numbers tell it all: students don’t like numbers! PLoS One, 8(12), e83443
Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment & Evaluation in Higher Education, 23(2), 191–212
Wagner, N., Rieger, M., & Voorvelt, K. (2016). Gender, ethnicity and teaching evaluations: Evidence from mixed teaching teams. Economics of Education Review, 54, 79–94
Wallace, S. L., Lewis, A. K., & Allen, M. D. (2019). The State of the Literature on Student Evaluations of Teaching and an Exploratory Analysis of Written Comments: Who Benefits Most? College Teaching, 67(1), 1–14
Wallisch, P., & Cachia, J. (2019). Determinants of perceived teaching quality: the role of divergent interpretations of expectations
Wigington, H., Tollefson, N., & Rodriguez, E. (1989). Students’ ratings of instructors revisited: Interactions among class and instructor variables. Research in Higher Education, 30(3), 331–344
Whitworth, J. E., Price, B. A., & Randall, C. H. (2002). Factors that affect college of business student opinion of teaching and learning. Journal of Education for Business, 77(5), 282–289
Wright, S. L., & Jenkins-Guarnieri, M. A. (2012). Student evaluations of teaching: combining the meta-analyses and demonstrating further evidence for effective use. Assessment & Evaluation in Higher Education, 37(6), 683–699
Youmans, R. J., & Jee, B. D. (2007). Fudging the numbers: Distributing chocolate influences student evaluations of an undergraduate course. Teaching of Psychology, 34(4), 245–247
Young, S., Rush, L., & Shaw, D. (2009). Evaluating Gender Bias in Ratings of University Instructors’ Teaching Effectiveness. International Journal for the Scholarship of Teaching and Learning, 3(2), n2
Ethics declarations
Conflict of Interest
The authors hereby acknowledge no financial or non-financial conflict of interest.
Cite this article
Kreitzer, R.J., Sweet-Cushman, J. Evaluating Student Evaluations of Teaching: a Review of Measurement and Equity Bias in SETs and Recommendations for Ethical Reform. J Acad Ethics 20, 73–84 (2022). https://doi.org/10.1007/s10805-021-09400-w
DOI: https://doi.org/10.1007/s10805-021-09400-w