Abstract
Though the rapid collection of big data is ubiquitous across domains, from industry settings to academic contexts, the ethics of big data collection and research remain contested. A nexus of data ethics issues is the concept of creep: the repurposing of data for applications or research beyond the conditions of their original collection. Data creep has proven controversial and has prompted concerns about the scope of ethical oversight. Institutional review boards offer little guidance regarding big data, and problematic research can still meet ethical standards. While ethics seem concrete through their institutional deployment, I frame ethics as produced. Informed by my ethnographic research at a large public university in the U.S., I explore ethics through two models: ethics as institutional procedures and ethics as acts and intentions. The university where I conducted fieldwork is the development ground for a predictive model that uses student data to anticipate academic success. While students consent to data collection, the circumstances of consent and the degree to which they are informed are less apparent, as many data are a product of creep. Drawing on interviews and participant observation with administrators, data scientists, developers, and students, I examine data ethics from a larger institutional model to everyday enactments related to data creep. After demonstrating the limits of such models, I propose a remodeling of ethics that draws on recent work on data, justice, and refusal to pose generative questions for rethinking ethics in institutional contexts.
Availability of Data and Materials
Not applicable.
Code Availability
Not applicable.
Notes
Derived from “function creep.” Notably, Koops (2021) has described function creep as a “pejorative” term for innovation.
Much of this research is on “learning analytics,” which encompasses a broad range of analytics projects including predictive tools, in-classroom analytics, and pedagogical interventions. The specific ethnographic context I discuss is less focused on learning, and so “data analytics” or “predictive analytics” is more descriptive of my participants’ work.
The flexibility of “models” and “modeling” is evident in existing scholarship on science, technology, and ethics, and my employment of these terms follows from such work. For example, Reardon et al. (2015) use “modeling” and “models” to discuss their efforts to create space and collective interventions on issues of science and justice, Moss and Metcalf (2020) report on ethics models, and in talks Hoffmann (2020) has also discussed “models” more expansively than I do here to further complicate notions of models and modeling. I have also used “modeled” to describe the formation of certain kinds of predicted subjects in data analytics work (see Whitman 2020b).
This is a reference to governance in George Orwell’s 1984 and extensive surveillance in everyday life.
References
Ahmed, S. (2017). No. feministkilljoys. Retrieved July 7, 2021 from https://feministkilljoys.com/2017/06/30/no/.
Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973–989. https://doi.org/10.1177/1461444816676645
Barocas, S., & boyd, d. (2017). Engaging the ethics of data science in practice. Communications of the ACM, 60(11), 23–25. https://doi.org/10.1145/3144172
Benjamin, R. (2016). Informed refusal: Toward a justice-based bioethics. Science, Technology, & Human Values, 41(6), 967–990. https://doi.org/10.1177/0162243916656059
Bernard, H. R. (2011). Research methods in anthropology: Qualitative and quantitative approaches. AltaMira Press.
Bourdieu, P. (1977). Outline of a theory of practice. Cambridge University Press.
Bourdieu, P. (1980). The logic of practice. Stanford University Press.
boyd, d. (2016). Untangling research and practice: What Facebook’s “emotional contagion” study teaches us. Research Ethics, 12(1), 4–13. https://doi.org/10.1177/1747016115583379
boyd, d., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
Cifor, M., Garcia, P., Cowan, T.L., Rault, J., Sutherland, T., Chan, A., Rode, J., Hoffmann, A.L., Salehi, N., & Nakamura, L., (2019). Feminist data manifest-no. Retrieved January 12, 2021 from https://www.manifestno.com/.
Cooky, C., Linabary, J. R., & Corple, D. J. (2018). Navigating big data dilemmas: Feminist holistic reflexivity in social media research. Big Data & Society, 5(2), 1–12. https://doi.org/10.1177/2053951718807731
Cool, A. (2019). Impossible, unknowable, accountable: Dramas and dilemmas of data law. Social Studies of Science, 49(4), 503–530. https://doi.org/10.1177/0306312719846557
Corple, D. J., & Linabary, J. R. (2020). From data points to people: Feminist situated ethics in online big data research. International Journal of Social Research Methodology, 23(2), 155–168. https://doi.org/10.1080/13645579.2019.1649832
Custers, B. (2016). Click here to consent forever: Expiry dates for informed consent. Big Data & Society, 3(1), 1–6. https://doi.org/10.1177/2053951715624935
Department of Health and Human Services. (2018). Revised common rule. Retrieved December 20, 2020 from https://www.hhs.gov/ohrp/regulations-and-policy/regulations/finalized-revisions-common-rule/index.html/.
D’Ignazio, C., & Klein, L. F. (2020). Data feminism. The MIT Press.
Dencik, L., Hintz, A., & Cable, J. (2016). Towards data justice? The ambiguity of anti-surveillance resistance in political activism. Big Data & Society, 3(2), 1–12. https://doi.org/10.1177/2053951716679678
Drachsler, H., & Greller, W. (2016). Privacy and analytics—It’s a DELICATE issue: A checklist for trusted learning analytics. In Proceedings of the sixth international conference on learning analytics & knowledge—LAK ’16 (pp. 89–98). ACM Press. https://doi.org/10.1145/2883851.2883893
Flick, C. (2016). Informed consent and the Facebook emotional manipulation study. Research Ethics, 12(1), 14–28. https://doi.org/10.1177/1747016115599568
Froomkin, A. M. (2019). Big data: Destroyer of informed consent. Yale Journal of Health Policy, Law, and Ethics, 18(3), 1–29.
Ganesh, M. I. (2019). The difference that difference makes. Spheres: Journal for Digital Cultures, 5, 1–6.
Greene, D., Hoffmann, A.L., & Stark, L. (2019). Better, nicer, clearer, fairer: A critical assessment of the movement for ethical artificial intelligence and machine learning. In Proceedings of the 52nd Hawaii international conference on system sciences (pp. 2122–2131).
Haggerty, K. D. (2004). Ethics creep: Governing social science research in the name of ethics. Qualitative Sociology, 27(4), 391–414. https://doi.org/10.1023/B:QUAS.0000049239.15922.a3
Haraway, D. (1988). Situated knowledges: The science question in feminism and the privilege of partial perspective. Feminist Studies, 14(3), 575–599. https://doi.org/10.2307/3178066
Hilgartner, S., Prainsack, B., & Hurlbut, J. B. (2017). Ethics as governance in genomics and beyond. In U. Felt, R. Fouché, C. A. Miller, & L. Smith-Doerr (Eds.), The handbook of science and technology studies (pp. 823–852). The MIT Press.
Hockey, J., & Forsey, M. (2012). Ethnography is not participant observation: Reflections on the interview as participatory qualitative research. In J. Skinner (Ed.), The interview: An ethnographic approach (pp. 69–87). Bloomsbury Academic.
Hoffmann, A. L. (2020). Something had been ruined forever: Interrupting AI ethics. Presented at NeurIPS.
Hoffmann, A. L. (2021). Even when you are a solution you are a problem: An uncomfortable reflection on feminist data ethics. Global Perspectives, 2(1), 1–5. https://doi.org/10.1525/gp.2021.21335
Holland, D., Lachicotte, W., Jr., Skinner, D., & Cain, C. (1998). Identity and agency in cultural worlds. Harvard University Press.
Hope, A. (2018). Creep: The growing surveillance of students’ online activities. Education and Society, 36(1), 55–72. https://doi.org/10.7459/es/36.1.05
Jasanoff, S. (2005). Designs on nature: Science and democracy in Europe and the United States. Princeton University Press.
Jasanoff, S. (2011). Constitutional moments in governing science and technology. Science and Engineering Ethics, 17(4), 621–638. https://doi.org/10.1007/s11948-011-9302-2
Jasanoff, S. (2016). The ethics of invention: Technology and the human future. W. W. Norton & Company.
Jasanoff, S. (2017). Virtual, visible, and actionable: Data assemblages and the sightlines of justice. Big Data & Society, 4(2), 1–15. https://doi.org/10.1177/2053951717724477
Jones, K. M. L. (2019a). “Just because you can doesn’t mean you should”: Practitioner perceptions of learning analytics ethics. Portal: Libraries and the Academy, 19(3), 407–428. https://doi.org/10.1353/pla.2019.0025
Jones, K. M. L. (2019b). Learning analytics and higher education: A proposed model for establishing informed consent mechanisms to promote student privacy and autonomy. International Journal of Educational Technology in Higher Education, 16(1), 1–22. https://doi.org/10.1186/s41239-019-0155-0
Keyes, O., Hutson, J., & Durbin, M. (2019). A mulching proposal: Analysing and improving an algorithmic system for turning the elderly into high-nutrient slurry. In Extended abstracts of the 2019 CHI conference on human factors in computing systems (pp. 1–11). ACM Press. https://doi.org/10.1145/3290607.3310433
Keyes, O., Hitzig, Z., & Blell, M. (2021). Truth from the machine: Artificial intelligence and the materialization of identity. Interdisciplinary Science Reviews, 46(1–2), 158–175. https://doi.org/10.1080/03080188.2020.1840224
Kitto, K., & Knight, S. (2019). Practical ethics for building learning analytics. British Journal of Educational Technology, 50(6), 2855–2870. https://doi.org/10.1111/bjet.12868
Knox, J., Williamson, B., & Bayne, S. (2020). Machine behaviourism: Future visions of ‘learnification’ and ‘datafication’ across humans and digital technologies. Learning, Media and Technology, 45(1), 31–45. https://doi.org/10.1080/17439884.2019.1623251
Koops, B. (2021). The concept of function creep. Law, Innovation and Technology, 13, 29–56.
Kosinski, M., & Wang, Y. (2020). Author’s note. August 10. Retrieved July 8, 2021 from https://docs.google.com/document/d/11oGZ1Ke3wK9E3BtOFfGfUQuuaSMR8AO2WfWH3aVke6U/edit.
Leurs, K. (2017). Feminist data studies: Using digital methods for ethical, reflexive and situated socio-cultural research. Feminist Review, 115(1), 130–154. https://doi.org/10.1057/s41305-017-0043-1
Liboiron, M. (2017). Compromised agency: The case of babylegs. Engaging Science, Technology, and Society, 3, 499. https://doi.org/10.17351/ests2017.126
Mamo, L., & Fishman, J. R. (2013). Why justice? Introduction to the special issue on entanglements of science, ethics, and justice. Science, Technology, & Human Values, 38(2), 159–175.
Markham, A. N., Tiidenberg, K., & Herman, A. (2018). Ethics as methods: Doing ethics in the era of big data research—Introduction. Social Media + Society, 4(3), 1–9. https://doi.org/10.1177/2056305118784502
Mattson, G. (2017). Artificial intelligence discovers gayface. Sigh. Scatterplot. Retrieved September 30, 2020 from https://scatter.wordpress.com/2017/09/10/guest-post-artificial-intelligence-discovers-gayface-sigh/.
Metcalf, J. (2016). Big data analytics and revision of the common rule. Communications of the ACM, 59(7), 31–33. https://doi.org/10.1145/2935882
Metcalf, J. (2017). ‘The study has been approved by the IRB’: Gayface AI, research hype and the pervasive data ethics gap. Medium. Retrieved September 30, 2020 from https://medium.com/pervade-team/the-study-has-been-approved-by-the-irb-gayface-ai-research-hype-and-the-pervasive-data-ethics-ed76171b882c/.
Metcalf, J., & Crawford, K. (2016). Where are human subjects in big data research? The emerging ethics divide. Big Data & Society, 3(1), 1–14. https://doi.org/10.1177/2053951716650211
Metcalf, J., Moss, E., & boyd, d. (2019). Owning ethics: Corporate logics, silicon valley, and the institutionalization of ethics. Social Research, 86(2), 449–476.
Mittelstadt, B. D., Allo, P., Taddeo, M., Wachter, S., & Floridi, L. (2016). The ethics of algorithms: Mapping the debate. Big Data & Society, 3(2). https://doi.org/10.1177/2053951716679679
Moats, D., & Seaver, N. (2019). “You social scientists love mind games”: Experimenting in the “divide” between data science and critical algorithm studies. Big Data & Society, 6(1), 1–11. https://doi.org/10.1177/2053951719833404
Moss, E., & Metcalf, J. (2020). Ethics owners: A new model of organizational responsibility in data-driven technology companies. Data & Society.
Neff, G., Tanweer, A., Fiore-Gartland, B., & Osburn, L. (2017). Critique and contribute: A practice-based framework for improving critical data studies and data science. Big Data, 5(2), 85–97. https://doi.org/10.1089/big.2016.0050
O’Connell, A. (2016). My entire life is online: Informed consent, big data, and decolonial knowledge. Intersectionalities, 5(1), 68–93.
Ortner, S. B. (2006). Anthropology and social theory: Culture, power, and the acting subject. Duke University Press.
Prinsloo, P., & Slade, S. (2015). Student privacy self-management: Implications for learning analytics. In Proceedings of the fifth international conference on learning analytics and knowledge (pp. 83–92). ACM Press. https://doi.org/10.1145/2723576.2723585
Prinsloo, P., & Slade, S. (2016). Student vulnerability, agency and learning analytics: An exploration. Journal of Learning Analytics, 3(1), 159–182. https://doi.org/10.18608/jla.2016.31.10
Prinsloo, P., & Slade, S. (2017a). Big data, higher education and learning analytics: Beyond justice, towards an ethics of care. In B. Kei-Daniel (Ed.), Big data and learning analytics in higher education (pp. 109–124). Springer.
Prinsloo, P., & Slade, S. (2017b). An elephant in the learning analytics room: The obligation to act. In Proceedings of the seventh international learning analytics & knowledge conference (pp. 46–55). ACM Press. https://doi.org/10.1145/3027385.3027406
Reardon, J. (2017). The postgenomic condition: Ethics, justice and knowledge after the genome. The University of Chicago Press.
Reardon, J., Metcalf, J., Kenney, M., & Barad, K. (2015). Science & justice: The trouble and the promise. Catalyst: Feminism, Theory, Technoscience, 1(1), 1–49. https://doi.org/10.28968/cftt.v1i1.28817
Richterich, A. (2018). The big data agenda: Data ethics and critical data studies. University of Westminster Press.
Rubel, A., & Jones, K. M. L. (2016). Student privacy in learning analytics: An information ethics perspective. The Information Society, 32(2), 143–159. https://doi.org/10.1080/01972243.2016.1130502
Sandler, J., & Thedvall, R. (2017). Exploring the boring: An introduction to meeting ethnography. In J. Sandler & R. Thedvall (Eds.), Meeting ethnography (pp. 1–23). Routledge.
Scholes, V. (2016). The ethics of using learning analytics to categorize students on risk. Educational Technology Research and Development, 64(5), 939–955. https://doi.org/10.1007/s11423-016-9458-1
Schrag, Z. M. (2011). The case against ethics review in the social sciences. Research Ethics, 7(4), 120–131. https://doi.org/10.1177/174701611100700402
Selwyn, N. (2019). What’s the problem with learning analytics? Journal of Learning Analytics, 6(3), 11–19. https://doi.org/10.18608/jla.2019.63.3
Shilton, K. (2013). Values levers: Building ethics into design. Science, Technology, & Human Values, 38(3), 374–397. https://doi.org/10.1177/0162243912436985
Simpson, A. (2007). On ethnographic refusal: Indigeneity, ‘voice’ and colonial citizenship. Junctures, 9, 67–80.
Simpson, A. (2017). The ruse of consent and the anatomy of ‘refusal’: Cases from indigenous North America and Australia. Postcolonial Studies, 20(1), 18–33. https://doi.org/10.1080/13688790.2017.1334283
Slade, S., & Prinsloo, P. (2013). Learning analytics: Ethical issues and dilemmas. American Behavioral Scientist, 57(10), 1510–1529. https://doi.org/10.1177/0002764213479366
Slade, S., & Prinsloo, P. (2014). Student perspectives on the use of their data: Between intrusion, surveillance and care. In Proceedings of the European distance and e-learning network research workshop (pp. 291–300). Presented at the Challenges for Research into Open & Distance Learning, Oxford.
Stark, L. (2012). Behind closed doors: IRBs and the making of ethical research. The University of Chicago Press.
Stark, L., & Hoffmann, A. L. (2019). Data is the new what? Popular metaphors & professional ethics in emerging data culture. Journal of Cultural Analytics. https://doi.org/10.22148/16036
Sun, K., Mhaidli, A. H., Watel, S., Brooks, C. A., & Schaub, F. (2019). It’s my data! Tensions among stakeholders of a learning analytics dashboard. In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1–14). ACM Press. https://doi.org/10.1145/3290605.3300824
TallBear, K. (2013). Native American DNA: Tribal belonging and the false promise of genetic science. University of Minnesota Press.
Taylor, L. (2017). What is data justice? The case for connecting digital rights and freedoms globally. Big Data & Society, 4(2), 1–14. https://doi.org/10.1177/2053951717736335
Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving decisions about health, wealth, and happiness. Yale University Press.
Wang, Y., & Kosinski, M. (2017). Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology, 114(2), 246–257.
Weinberg, L. (2020). Feminist research ethics and student privacy in the age of AI. Catalyst: Feminism, Theory, Technoscience, 6(2), 1–10.
Whitman, M. (2020a). “We called that a behavior”: The making of institutional data. Big Data & Society, 7(1), 1–13. https://doi.org/10.1177/2053951720932200
Whitman, M. (2020b). Bodies of data: The social production of predictive analytics. Purdue University.
Williamson, B., Potter, J., & Eynon, R. (2019). New research problems and agendas in learning, media and technology: The editors’ wishlist. Learning, Media and Technology, 44(2), 87–91. https://doi.org/10.1080/17439884.2019.1614953
Willis, J. E., Slade, S., & Prinsloo, P. (2016). Ethical oversight of student data in learning analytics: A typology derived from a cross-continental, cross-institutional perspective. Educational Technology Research and Development, 64(5), 881–901. https://doi.org/10.1007/s11423-016-9463-4
Wood, L. A., & Kroger, R. O. (2000). Doing discourse analysis: Methods for studying action in talk and text. SAGE Publications.
Yeung, K. (2017). ‘Hypernudge’: Big data as a mode of regulation by design. Information, Communication & Society, 20(1), 118–136. https://doi.org/10.1080/1369118X.2016.1186713
Zeide, E. (2016). Student data privacy: Going beyond compliance. National Association of State Boards of Education, 16, 21–25.
Zook, M., Barocas, S., boyd, d., Crawford, K., Keller, E., Gangadharan, S. P., Goodman, A., Hollander, R., Koenig, B. A., Metcalf, J., & Narayanan, A. (2017). Ten simple rules for responsible big data research. PLOS Computational Biology, 13(3), 1–10. https://doi.org/10.1371/journal.pcbi.1005399
Zwitter, A. (2014). Big data ethics. Big Data & Society, 1(2), 1–6. https://doi.org/10.1177/2053951714559253
Acknowledgements
I wish to thank the topical collection editors, Nina Frahm and Kasper Schiølin, for their feedback on this manuscript at various stages, and a special thanks to Nina in particular for her input on revisions. Geneva Smith provided early helpful commentary on an initial draft. Members of the Science, Knowledge, and Technology working group at Columbia University generously gave constructive feedback; thank you to Gil Eyal, Diane Vaughan, Josh Whitford, Joonwoo Son, Larry Au, Ari Galper, and Maïlys Gantois. My thinking about ethics and justice has been influenced by conversations with Kendall Roark while working with her on research supported by a Purdue Mellon Global Grand Challenges Grant for Big Data Ethics. Finally, I also want to thank the three anonymous reviewers for their deep readings and critique, which were tremendously useful in improving the manuscript.
Funding
This work was supported by a Purdue Research Foundation Research Grant.
Ethics declarations
Conflict of interest
Not applicable.
Ethics Approval and Consent to Participate
This study was approved by the Institutional Review Board at Purdue University. Informed consent was obtained from all participants in the study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Whitman, M. Modeling Ethics: Approaches to Data Creep in Higher Education. Sci Eng Ethics 27, 71 (2021). https://doi.org/10.1007/s11948-021-00346-1