1 Introduction

Over the last decade, the concept of ‘Deep Learning’ (LeCun et al. 2015) in the field of machine learning (ML) has become a central topic of research in artificial intelligence (AI). What is distinctive about this generation of ML is that its algorithms can change the internal parameters they use ‘to compute representations in each layer from representations in the previous layer’ (giving rise to the idea of algorithmic agency, which we discuss later in the paper), thereby enabling them to learn from datasets—supervised, unsupervised, etc.—in different ways, and its convolutional neural networks have been ‘designed to process data that comes in the forms of multiple arrays, for example, a colour image composed of three 2D arrays containing pixel intensities in the three colour channels’ (LeCun et al. 2015, pp. 436, 439, italicisation in original). Speculating at the time, LeCun et al. (2015, p. 436) observed that ‘deep learning will have many more successes in the near future because it requires very little engineering by hand, so it can easily take advantage of increases in the amount of available computation and data. New learning algorithms and architectures that are currently being developed for deep neural networks will only accelerate this progress.’ Noting both the prescience of this observation and the powerful reservations that Hinton (2023), among others, has recently expressed about Silicon Valley’s direction of ML development, the paper adopts a socio-cultural and -material perspective on the relationship between human + machine learning. This perspective accepts that ML, unlike previous generations of AI or the concept of a cultural tool in Socio-cultural theory more generally, is capable of ‘some kind of learning’ (author), because it generates patterns and predictions from data. As a consequence, the human + machine assemblages emerging in professional work, which we define below, can be viewed as a form of collective learning, albeit different from human collective learning where activities and artefacts can be reimagined.
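
To make the quoted description concrete, the following minimal sketch (ours, not LeCun et al.’s; the layer sizes and data are invented for illustration) shows what computing ‘representations in each layer from representations in the previous layer’ involves; the parameters w1, b1, w2, b2 stand for the internal parameters that learning would adjust.

```python
# A minimal, hypothetical sketch: each layer computes its representation
# from the previous layer's output via adjustable internal parameters.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    # One layer: a linear map of the previous representation,
    # followed by a non-linearity (ReLU).
    return np.maximum(0.0, w @ x + b)

# A colour image as three 2D arrays of pixel intensities (one per channel),
# flattened here for simplicity; a convolutional network would instead
# slide small filters over the arrays.
image = rng.random((3, 8, 8))
x = image.reshape(-1)                     # input-layer representation

w1, b1 = rng.standard_normal((16, x.size)), np.zeros(16)
w2, b2 = rng.standard_normal((4, 16)), np.zeros(4)

h1 = layer(x, w1, b1)    # layer-1 representation, computed from the input
h2 = layer(h1, w2, b2)   # layer-2 representation, computed from layer 1
print(h2)                # a compact representation; training would adjust w1, w2
```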

Our starting point to make this case is Hutchins’s original work (1995a, b) on ‘distributed cognition’ and his argument that his concept of ‘cultural ecosystems’ constitutes a unit of analysis to investigate collective human + machine working and learning (Hutchins 2013). We argue that, first, the former offers a way to reveal the cultural constitution and enactment of human + machine cognition and, in the process, the limitations of the computational and connectionist assumptions about learning that underpin, respectively, Good Old-Fashioned AI (GOFAI) and Deep Learning. Second, the latter offers a way to identify, when amplified with insights about algorithmic agency from Socio-Materialism (Orlikowski and Scott 2015; Jaton 2020) and new kinds of hybrid human + machine activity from Cultural-historical Activity Theory (Ekbia and Nardi 2012, 2017), how ML is further rearranging and reorganising the distributed basis of cognition in assistive assemblages as a result of its capability to learn from data and thereby generate new issues for humans to engage with.

We make this argument by, first, explaining that cognition, for Hutchins as a result of the Vygotskian provenance of his thinking (Hutchins 1995a, pp. 283–4), has always been distributed between mind, technology and environment and that cultural practices have always facilitated that distribution. Second, using his symmetrical account of the distribution of cognition to demonstrate that computational (i.e. GOFAI) and connectionist (deep learning) accounts of cognition are predicated on, but eviscerate, the role of cultural practices. Third, demonstrating how Hutchins’s symmetrical unit of analysis offers a way out of a possible impasse between Sociomaterialism and Cultural-historical Activity Theory as regards the symmetry/asymmetry between humans, technologies and practices as ontological categories (Kaptelinin and Nardi 2006; Orlikowski and Scott 2008) by allowing a more nuanced view of the agency of humans and technologies to co-exist alongside one another. The paper concludes by arguing that our amplification of Hutchins’s unit of analysis has enabled us to: (1) develop an inter-theoretical socio-cultural and -material (SCM) perspective on the relationship between human learning (HL) + ML, and (2) outline a set of conjectures researchers could use to guide their investigations into the ongoing deployment of and challenges associated with the interaction between HL + ML. In drawing this conclusion, we echo Dingemanse et al.’s (2023) recent call for the cognitive science community to recognise that ‘interaction co-constitutes cognition.’

2 Machine + human learning: cognitive science and distributed cognition perspectives

2.1 Cognition, cultural practices and cultural ecosystems

There is a curious tension running through Hutchins’s (1995a, b) formulation of his concept of distributed cognition. Hutchins developed the concept, as he (Hutchins 1995a, pp. 367–70) and others have noted (Bazerman 1996, p. 51; Latour 1996, p. 55), both as a critique of the prevailing ‘disembodied’ computational assumption in cognitive science that the mind is an ‘information processing system’ and as the basis of a cultural perspective on the development of cognition, yet he retains the term computational throughout Cognition in the Wild. We commence our argument that Hutchins’s concept of distributed cognition and his unit of analysis—cultural ecosystem—constitutes a symmetrical socio-cultural and -material, and therefore non-computational, perspective on cognition by explaining this apparent paradox.

Computational and social processes are, for Hutchins (1995a, p. 283), ‘inextricably intertwined’ because they both have ‘consequences’ for one another. Hutchins explains their intertwined relationship through recourse to Vygotsky’s (1987) ‘general genetic law of cultural development’. The law formed the basis of Vygotsky’s (1987, p. 57) conceptualisation of learning as the ‘internalisation’ of information, knowledge and skill first acquired in interpersonal SCM practices and, following their transformation and personalisation, the ‘externalisation’ of learners’ knowledge and skill through their enactment of those SCM practices. Vygotsky, therefore, identified, for Hutchins (2008, p. 2018), the way in which humans organise interactions with the world: ‘Cultural practices organize the interactions of persons with their social and material surroundings. These interactions are the locus of inter-psychological processes. Culturally constituted inter-psychological processes change through historical time. They are also targets for internalization as intra-psychological processes.’ The subtlety of Vygotsky’s law is that, as Hutchins (1995a, p. 283) noted, although he identified how humans internalise cultural practices, Vygotsky never assumed that they externalised them in identical ways: hence the improvisatory and novel ways in which people in the same field personalise their use of cultural practices.

Over several decades, Hutchins (1995a, b) demonstrated the ways in which cognition is distributed through his research on navigation teams on the bridge of a ship or in the cockpit of a descending aircraft, showing that cognitive processes relating to knowledge acquisition, memory and problem-solving were too complex to be fully internalised and externalised by any individual actor. Consequently, these processes are embedded in the social and material division of labour between humans and machines. For instance, a cockpit system—one outcome of the above division of labour—performs cognitive and computational tasks of remembering that can only be explained by the symmetrical interaction within an aeronautical cultural ecosystem composed of: (1) the pilots, the intersubjective sharing of representations among them, and the use of material artefacts such as devices, technical systems and pieces of paper that contribute to cockpit memory; and (2) the provision of data from external navigation satellite systems—another outcome of the aeronautical division of labour—that regularly update the cockpit with details about changing weather conditions as well as the flight paths of other aircraft (Hutchins 1995b; Hollan et al. 2000). In the symmetrical perspective, the mediating technologies ‘stand with the user as resources used in the regulation of behaviour’ and ‘transform the task the person has to do by representing [it] in the domain where the answer or the path to the solution is apparent’ (Hutchins 1995a, p. 155). Hence, artefact mediation has always entailed, for Hutchins (1995a, p. 290), the coordination of the ‘many structural elements’ that organise collective behaviour, rather than, as in the cognitive science tradition, ‘standing between’ an independent, clearly delineated person and a task.

Furthermore, by extending Vygotsky’s law of genetic development to take explicit account of the division of labour, Hutchins shed light on how cognition emerges from distributed processes and, therefore, why distributed cognition is a perspective on all kinds of cognition. Noting that ‘the inter-psychological level has properties of its own some of which may not be properties of any of the individuals who make it up,’ Hutchins (1995a, p. 284) draws attention to how that level is created and sustained by the network of SCM interactions, practices and artefacts which, as they stabilise over time, historically result in the establishment of cultural ecosystems, as exemplified by aeronautical navigation. In highlighting the role SCM practices play in organising how humans interact with the world by (1) ‘furnishing the world with the cultural artefacts that comprise most of the structure with which we interact’ and (2) ‘orchestrat[ing] our interactions with natural phenomena and cultural artefacts that produce cognitive outcomes’ (Hutchins 2008, p. 2018), Hutchins allows us, therefore, to appreciate that computationalism (and, as we shall argue below, connectionism) eviscerates the role of cultural practices. In the case of the former, he makes visible, first, that humans extend their cognitive capacities into the world to use the environment as a ‘partner or cognitive ally’ (Hollan et al. 2000, p. 192) to accomplish complex work tasks, by devising SCM practices and artefacts that provide context and resources entwining individual and collective social and computational processes. Hutchins (2010b) notes, for example, how navigators compute a ship’s speed after their gyrocompass breaks down by drawing on a set of learnt cultural practices (e.g. the three-minute rule, bodily practices and gestures of plotting lines of position) and materials (e.g. maps, dividers) that ultimately enable them to engage in an activity where ‘what is seen is not simply what is visible’ in the physical environment (Hutchins 2010b, p. 433). Second, that when we give increased attention to real-world activity, our understanding of canonical instances of cognitive process changes. Acknowledging that ‘private disembodied thinking is undoubtedly an important kind of thinking,’ Hutchins (2010a, p. 712) observes: ‘It is also deceptive. Far from being free from the influences of culture, private reflection is a deeply cultural practice that draws on and is enacted in coordination with rich cultural resources.’ For these reasons, he concluded that studies of cognition require a unit of analysis—cultural ecosystem—that can take account of the SCM practices which facilitate the distribution of cognition among humans and machines (Hutchins 1995a, pp. 353–356).

2.2 Conceptions of cognition: computational and connectionist

The reason for this evisceration of cultural practices can be found in the guiding metaphor of the classical ideas about human learning in the literature on AI, until the ‘connectionist turn’: that human cognition is an artefact capable of symbol manipulation (Boden 2006, 2018). For instance, Simon (1996, pp. 18–19) argued that the main commonality between human cognition/brain and computers can be conceptualised as the two belonging to the same ‘family of artifacts’ that manipulate symbols (or process information) to meet the demands of the environment. Learning as information processing in the symbolic AI tradition is then equivalent to computations carried out through ‘sequential calculation using symbols that have both physical reality and a semantic, representational value’ (Dupuy 2009, p. 64).
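
To fix ideas, the following minimal sketch (our illustration, not Simon’s or Dupuy’s; the facts and rules are invented) shows what cognition as sequential symbol manipulation amounts to: a forward-chaining system that derives new symbols from existing ones by applying explicit IF–THEN rules.

```python
# A minimal, hypothetical sketch of the symbolic (GOFAI) conception of
# cognition: explicit IF-THEN rules applied sequentially to symbols.
facts = {"fever", "cough"}                      # symbols currently held true
rules = [
    ({"fever", "cough"}, "flu_suspected"),      # IF fever AND cough THEN ...
    ({"flu_suspected"}, "recommend_rest"),
]

changed = True
while changed:                                  # forward chaining to a fixpoint
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)               # derive a new symbol
            changed = True

print(facts)  # {'fever', 'cough', 'flu_suspected', 'recommend_rest'} (order may vary)
```

On this view, everything the system ‘knows’ is explicitly represented as a symbol, and learning would consist of adding or revising such rules.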

Historically, the computational conception of learning has been influential because of the close exchange of ideas between psychologists and AI researchers (Boden 2006), resulting from a shared vision to establish ‘a unified science that would discover the representational and computational capacities of the human mind’ and find their correlates in the human brain (Miller 2003, p. 144). This exchange goes back to the 1960s, when Psychology, one of the foundational disciplines of Cognitive Science, borrowed concepts and vocabulary from computer science to re-imagine the human mind in computational terms and bring it back as a credible topic of psychological study (Neisser 1976). With the help of AI terminology such as parallel processing, feature extraction, executive routines, procedures and programmes (Neisser 1976), the human mind was re-conceptualised as a virtual machine or information processing system (Boden 2006, 2018). In return, AI researchers drew on ideas from Psychology and Neuroscience to conceptualise specific models of the performance of users and computers (e.g. Card et al. 1983/2008). Cognitive Science thus became ‘the study of mind as machine’ (Boden 2006, p. 9) and AI became ‘the science of the mind in the machine’ (Cardon et al. 2018, p. 185).

A different account of computation arose, however, as a result of the subsequent connectionist turn in AI (Schmidhuber 2015). Connectionism’s main appeal, according to Childers et al. (2023, p. 73), lay in its ‘parsimonious model’ of what they initially refer to as the ‘mind’ but subsequently clarify as the ‘brain’ because, unlike the complex model associated with computationalism, the ‘connectionist model consisted of a simple network made of three (or more) layers’. A major influence on connectionism, according to Cardon et al. (2018), was the behaviourist ideas which informed and inspired cybernetics, rather than the science of the mind in the machine. The fundamental learning principle of early behaviourism can be described as establishing associations and connections between the stimulus (input) and the desired behaviour or response (output). Similarly, early connectionist work was premised on the idea that organisms and machines learn by correcting their erroneous responses (outputs) to inputs; hence learning corresponds to a ‘self-correcting’ mechanism that occurs as the organism/machine is ‘adapting its behaviour according to its own mistakes’ (Cardon et al. 2018, p. 185). This key principle of learning through self-correction of output has been taken further in the contemporary neural networks that, according to LeCun et al. (2015, p. 436), underpin the deep learning generation of ML. They arrive at predictions ‘step by step, through tiresome mechanical processes of gradual adjustment’ and ‘operate on the basis of continuous infinitesimal adjustments’ of output (i.e. algorithm) and input (i.e. a training data set) (Pasquinelli and Joler 2021, p. 1271). This assumption has been embedded in the deep learning generation of ML, resulting in learning being premised on cognition ‘without a subject’ (Dupuy 2009, p. 19) that processes information from the world in a ‘statistical’, ‘sub-symbolic’ and ‘distributed’ way, since that information is represented by the state of an entire network and each unit can be part of many different overall patterns (Boden 2018, pp. 150–151).
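
The contrast with the symbolic sketch above can be made concrete with another minimal, hypothetical sketch (ours, not the cited authors’): a single linear unit that, rather than applying rules, repeatedly nudges its weights in whatever direction reduces its current mistakes, learning through ‘continuous infinitesimal adjustments’.

```python
# A minimal, hypothetical sketch of learning as self-correction of output:
# gradient descent on a single linear unit (invented data, for illustration).
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((100, 2))              # inputs (stimuli)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1]     # desired responses
w = np.zeros(2)                       # connection weights, initially blank
lr = 0.1                              # size of each small adjustment

for _ in range(2000):
    pred = X @ w                      # current responses (outputs)
    error = pred - y                  # the machine's "own mistakes"
    w -= lr * (X.T @ error) / len(y)  # adjust weights to reduce those mistakes

print(np.round(w, 2))                 # converges towards [ 2. -1.]
```

No rule describing the relationship between inputs and outputs is ever written down; the regularity is absorbed into the weights.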

It is, however, beyond this paper’s scope to judge whether the classic computational view, that cognition resembles digital processing where strings are produced in sequence according to the instructions of a (symbolic) program, and the connectionist view, of mental processing as the dynamic and graded evolution of activity in a neural net, are very different, or whether some kind of accommodation can be established between the two perspectives (Stanford, 1997 Section 5). For the purposes of this paper, what emerged from the computationalist and connectionist accounts of cognition are the assumptions that (a) human cognition and information processing are analogous and (b) learning can, therefore, be viewed as a form of information processing.

2.3 Cultural ecosystems: a symmetrical unit of analysis for researching HL + ML working and learning

There have been, for Hutchins, significant costs for cognitive science—initially computationalism, and we extend his argument to connectionism—in failing to consider the role of cultural practices in facilitating cognition, although he does not deny that connectionism has revealed significant insights about cognition and socio-cognitive processes (Hutchins 2010a). The root of this avoidance of any investigation of the relationship between culture and cognition lay, originally, in cognitive science’s ‘overattribution’ of intelligence to the ‘inside’ of the inside/outside boundary, with the result that cognitive scientists make indirect inferences about cognitive processes that they cannot observe and attribute to ‘intelligent systems a set of structures and processes that could have produced the observed evidence’ (Hutchins 1995a, pp. 355–6).

By putting symbols in the head and viewing computation as information processing, cognitive science paved the way for an, initially, uneasy alliance with information theory and ‘speculations by McCulloch and Pitts that neurons could be characterized as on-/off devices and …the brain might be seen as a digital machine’ (Hutchins 1995a, p. 357). Subsequently, the series of breakthroughs in the AI-connectionist community whole-heartedly retained this assumption (Childers et al. 2023; Cardon et al. 2018). Intelligent human behaviour and learning are, from a connectionist perspective, as Boden (2018, pp. 136–140) observed, based on the ‘fire together, wire together’ neuropsychological principle, and entail ‘making adaptive changes in the weights and also sometimes in the connections of artificial neural networks.’ The key to understanding learning as information processing in connectionist approaches is that ‘information’ is seen as a signal devoid of meaning rather than as a symbolic representation or coded information (Cardon et al. 2018, p. 184).
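
As a hedged sketch of what the ‘fire together, wire together’ principle amounts to (our illustration, not Boden’s; the network sizes and activities are invented), consider a plain Hebbian update in which each connection strengthens in proportion to the joint activity of the units it links:

```python
# A minimal, hypothetical sketch of Hebbian learning: connections between
# co-active units strengthen; no symbol or meaning is represented anywhere.
import numpy as np

rng = np.random.default_rng(2)
pre = rng.random(5)                  # activity of five input ("pre-synaptic") units
W = 0.1 * rng.random((3, 5))         # small initial connection weights
eta = 0.01                           # learning rate

for _ in range(20):
    post = W @ pre                   # output units fire given current weights...
    W += eta * np.outer(post, pre)   # ...and co-active pairs "wire together"

print(W.round(3))                    # weights grow fastest where activity co-occurs
```

In practice such unbounded growth is checked by normalisation, but the point stands: the ‘information’ here is nothing more than a pattern of signal strengths.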

The connectionist turn in AI has undoubtedly enabled the deep learning generation of algorithms to develop the functionality to learn from data (see, inter alia, Alpaydin 2016; Russell 2019; Wooldridge 2021). From Hutchins’s perspective, it nevertheless shares with its predecessor, GOFAI, the same cognitive science tendency to assume that intelligence lies inside the inside/outside boundary. Consequently, it can be argued that deep learning, in common with GOFAI, has a ‘tendency to put much more inside than should be there’ (Hutchins 1995a, p. 356). For this reason, the argument that Hutchins advanced in Cognition in the Wild, that studies of human + machine cognition require a unit of analysis—cultural ecosystem—to take account of the SCM practices which facilitate the distribution of cognition, applies as much to deep learning as to GOFAI.

When the relationship between humans + machines is viewed as a cultural ecosystem, it is possible to identify, first, the cultural practices which provide regularities, structure and predictability to human + machine working and learning, and how those regularities etc., in turn, make cultural practices and their associated artefacts ‘learnable’ and, as such, describable in computational terms as ‘underlying formal processes’ and generalised sets of procedures and rules (Hutchins 2013, p. 46). Second, the way in which these cultural practices have ‘cognitive consequences for individuals’ (ibid.) and machines, since they are ‘both enabling and constraining’ (Hollan and Hutchins 2009, p. 242) cognition; ‘creating’, ‘scaffolding’ and ‘holding in place’ certain ways of thinking, acting and problem-solving (Hutchins 2011, p. 440); and ‘blind[ing] us to other ways of thinking, leading us to believe that certain things are impossible when in fact they are possible when viewed differently’ (Hollan et al. 2000, p. 187). Hutchins’s unit of analysis allows us, therefore, to move beyond even the attempts of writers such as Clark (2008), who have considered the ecological assemblies surrounding an individual person, and to focus on diverse forms of working and learning that mutually constitute one another and the larger spatial and temporal scales in which they frequently operate.

We can appreciate the value of Hutchins’s unit of analysis when we consider the way in which the deep learning generation of machine learning is being developed and deployed. Over the last decade, it has been possible to identify two different approaches. The best-known example is the ‘surveillance’ (Zuboff 2019) variety associated with Silicon Valley companies (e.g. Google), which is based on a platform business model designed to monetise platform users’ ‘behavioural surplus.’ The lesser-known one we refer to as an assistive assemblage. This term denotes, on the one hand, the way that expert communities deploy ML to support their activity; and, on the other hand, following Deleuze and Guattari, ML’s ‘dual form’, that is, the content of its design and its mode of expression (Poster and Savat 2010, pp. 15–16). Typically, assistive ML algorithms are co-developed by producer–user teams, consisting of computer and data scientists and domain-specific experts, before being interfaced with data pertaining to the specific issue that a producer–user team is working on (Navarrete-Dechent et al. 2018; Choy et al. 2018). Such ML-enabled tools are used, for example, by chemical engineers to optimise models and plan processes in real-time with high accuracy (Dobbelaere et al. 2021), in architectural design to provide expanded opportunities to model design and fabrication options for clients (Tamke et al. 2018) and in healthcare to develop predictive models for cancer diagnosis (Van der Schaar 2020b).

Using a cultural ecosystem as a unit of analysis for assistive ML assemblages, it is possible to identify how the entanglement of cultural practices and the material facilitates the re-distribution and re-arrangement of cognition as algorithms are constituted and interfaced with data, before being deployed by producer–user teams in their own work and their work with down-the-line user groups and beneficiaries. In doing so, we go with and beyond the grain of Hutchins’s use of his own unit of analysis because he never explored the constitution of algorithms (an issue discussed in the next section). Focussing on the interaction between members of healthcare producer–user teams and down-the-line user groups, we can see that the former are involved with designing algorithms that are transparent, understandable and validated clinically and, therefore, trusted by down-the-line user groups, such as doctors, nurses and beneficiaries (patients) (van der Schaar and Zame 2018, p. 2). To achieve this goal, the producer–user team have developed three new SCM practices to assist members of assemblages to distribute cognition in the constitution and deployment phases. They are practices to enable producer–user teams and down-the-line user groups to ‘interpret’ and ‘explain’ ML-generated patterns and predictions, and practices to build a ‘culture of trust’ (van der Schaar 2020a) among user communities about their validity. These SCM practices both enable and constrain the distribution of cognition about health conditions, such as cancer, among members of producer–user teams and down-the-line user groups. Moreover, they scaffold and hold in place particular ways to respond to the data, for example, the ML-generated prognosis in relation to the next stage of treatment and the type of out-patient support health systems make available to cancer patients, or the questions about the prognosis based on the data that deep learning algorithms have generated.

The significance of a cultural ecosystem as a unit of analysis is that it enables researchers to have a symmetrical perspective on interactions between humans, artefacts and SCM practices within the human and machine assistive assemblages emerging in professional work contexts. The symmetry between the elements of a cultural ecosystem is understood primarily in methodological terms since, for Hutchins (2010b, p. 426): ‘the proper unit of analysis for cognition should not be set a priori, but should be responsive to the nature of the phenomena under study. For some sorts of phenomena, the skin or skull of an individual is exactly the correct boundary (…) For other phenomena, setting the boundary of the unit of analysis at the skin will cut lines of interaction in ways that leave key aspects of the phenomena unexplained or unexplainable.’ Hutchins, however, never used his unit of analysis to explore the distribution of cognition associated with the constitution of technology; instead, he accepted technology as a cultural tool with associated cultural practices that could be analysed within his unit of analysis. To do so, we consider insights from Socio-Material theory and Cultural-historical Activity Theory and then demonstrate how these insights can be used to make an inter-theoretical argument to amplify Hutchins’s unit of analysis to take account of ML’s capability to engage in ‘some kind of learning’ (author).

3 Human and machine learning: insights from sociomateriality and cultural-historical activity theory

3.1 Sociomateriality and distributed cognition

The term sociomaterial (SM), as Scott and Orlikowski (2014, pp. 876–7) note, ‘provides for multiple potential underpinnings.’ Rather than attempt an overview of that diversity of underpinnings, we instead explain the assumptions and influences that the writers we focus on (Jaton and Orlikowski) share with each other as well as with Hutchins. One common foundation is that they adopt a relational ontology. A primary influence on both Orlikowski’s (see Feldman and Orlikowski 2011) and Jaton’s respective conceptualisations of technology at work is Latour’s (1993) assumption about the existence of a ‘symmetrical’ perspective when researching the human + machine relationship (see Latour 1993a for his analysis of the relationship between Hutchins’s work and his own). Another is to see cultural practices as being bound up with the material means through which they are performed. These assumptions enable Jaton and Orlikowski, as we highlight below, to take explicit account of the operation of algorithms in distributed systems of cognition by highlighting the ways in which SCM practices have been embedded in ML algorithms and how these algorithms shape and are shaped by their use in workplace SCM practices.

Drawing on Latour and other influences (Barad 2007; Schatzki 2002; Suchman 2007), Orlikowski and colleagues’ relational ontology perspective accepts that the social and material in work practice ‘start out and forever remain in relationship’ (Slife 2005, p. 159 in Orlikowski and Scott 2008, p. 455). Accordingly, they view the relationship between technology and work as a form of ‘materialised practice’, that is, never having a separate existence or independent characteristics outside of relations in practice. Echoing Hutchins’s concept of the emergent cognitive properties of ecosystems as assembled from elements that are neither initially pre-established nor settled, Orlikowski and Scott consider material and social relations as ‘mangled together in the process to produce specific, situated instantiations’ (Jones 1998, p. 299 in Orlikowski and Scott 2008, p. 460).

3.2 Concept of algorithmic agency

The implication for ML in general, and for the symmetrical perspective on mediating technologies we outlined, is that ML constitutes dynamic artefacts and new materialised practices. ML can be seen as ‘algorithmic phenomena’ (Orlikowski and Scott 2016, p. 5) that no longer operate on a defined set of computational IF–THEN rules and simple inputs to achieve specific outcomes or perform discrete tasks that could be performed manually by human agents. From Orlikowski and Scott’s (2016, p. 90) perspective, the new algorithmic technologies consist of ‘complex, dynamic, and interconnected algorithms’; a ‘relational mash of software code, weighted priorities and filtering processes’ which have been designed to ‘gather, store, assemble and distil’ information about the world (Orlikowski and Scott 2014, p. 34). When undertaking a task, they have a dynamic capacity to filter, aggregate and ‘process and organise massive amounts of heterogenous data’ of unprecedented ‘volume, velocity and volatility’ (ibid.), often in near real-time, thus, unlike previous generations of cultural tools (author), reorganising and reconfiguring the tasks and activities of producer–user groups in fundamental and manifold ways.

In conceiving of algorithms as ‘algorithmic-apparatus-in-practice’, that is, ‘materialised practices that perform in the world’, Orlikowski and Scott (2019, p. 170), unlike Hutchins, make algorithmic artefacts’ agency explicit. Algorithmic agency is, however, very different from the evolution of agency from animals to humans (Tomasello 2023). It exists as a result of the way in which (1) human intentions have been embedded into algorithms’ SCM practices, for example, software code, including its weighted priorities, as well as the filtering and aggregation processes, and (2) this embedding process gives algorithms the capacity to act as executable procedures with an emerging and dynamic temporality and to introduce categories that operate with continuous data flowing from distributed sources (Scott and Orlikowski 2014).

These practices, according to Orlikowski and Scott, afford algorithms a perspective on the world, enabling them to be seen to exercise agency by producing new meanings, categories and possibilities for action. Through their capacity to include and exclude aspects of the world and to simplify and aggregate information, algorithms ‘don’t just search and sort reality, they also create it’ (Orlikowski and Scott 2015, p. 214). Illustrating their argument with the online rating practices of TripAdvisor, Scott and Orlikowski (2012, p. 113) reveal how the emergence of such algorithmic technologies has transformed the practices of hoteliers and travellers by creating a ‘homogenising effect’ through collapsing and blending previously distinct categories such that ‘different things will be paid attention, connected and compared’ and, in the process, creating a SCM ecosystem where travellers become users and different classes of hotels become beneficiaries or rivals. Within this new assemblage, TripAdvisor continually transforms the tasks for users and the actions needed to complete them by generating data from user activity, prompting users to leave reviews, tacitly nudging travellers in a particular direction by employing automatically generated personalised emails, and offering them new categories to make sense of their travel and themselves, for example, new identities such as ‘star contributor’ and ‘informed traveller’ (Orlikowski and Scott 2015).

The SCM work practices into which algorithmic technologies have been embedded are continually reconstructed and re-assembled, and this is particularly evident in relation to the deep learning generation of ML. Working with Scott (Scott and Orlikowski 2014), Orlikowski showed how the new SCM assemblages re-arrange and re-distribute cognition between humans and machines even further by including new actors in an activity (e.g. users providing public evaluations), excluding others (e.g. expert reviewers), and offering new SCM constraints for travelling (e.g. producing classifications that prioritise certain criteria for evaluation, and determining the weight of criteria when ranking hotels). Scott and Orlikowski’s works make visible the extent to which these re-assembled practices, underpinned by ML technologies, become new and dynamic structural elements that, as they are coordinated, organise collective behaviour in new ways.

3.3 Constitution of algorithmic agency

Orlikowski’s approach is, therefore, anticipatory: it conceives algorithmic agency as an ongoing process. This issue has recently been opened up further by Jaton (2020, p. 13), who provides a relational, in his terms ‘processual’, socio-cultural (drawing on Star and Strauss 1999 and Theureau 2003) and sociomaterial (via his Latourian influences) ontological account of the creation of an ML algorithm for image recognition. Positioning his work as an alternative to sociological work, such as Orlikowski’s, on the agency of algorithms, which is primarily concerned with what algorithms do in practice after they have been deployed in distributed ecosystems, Jaton (2020, p. 7) identified three different types of SCM practices that facilitate their constitution and hence the basis of their agency. ML algorithms are assembled through: ‘ground-truthing’ practices that establish ground-truth databases (i.e. conceptualising the problem that an ML algorithm will address and assembling an associated dataset curated to reflect the ground truth); ‘programming’ practices (writing computer programmes to compute data); and ‘formulating’ practices (transforming social phenomena into ‘mathematical entities’ that can be manipulated in vectoral space).

These SCM practices contribute to the creation of an ‘emergent and intertwined agency’ between humans and machines (Bowker in Jaton 2020, p. i)—in Jaton’s case a research group and their partners at a university—by shedding light on how algorithms are created in a distributed assistive SCM assemblage and then interfaced with data, thereby generating their capacity to have a point of view on the world. Using his three SCM practices, Jaton identifies how an algorithm is assembled: ground-truthing allows problems to be identified, defined and discussed and data to be curated, in other words, cultural practices and entities are translated into mathematical objects; programming embeds connectionist assumptions about perception into algorithms as code is written; and formulating facilitates an algorithm’s interaction, via neural nets engaging in pattern recognition, with the encoding of different elements of visual, textual or numerical data (Jaton 2020, pp. 275–80). When examined as a distributed process underpinned by these three SCM practices, deep learning algorithms act as, for Jaton (2020, p. 85), a ‘retrieving entity’ that retrieves and reproduces the preconceptions embedded in ground-truth databases to generate new patterns which may be perceived to be verifiable or questioned on the grounds of bias. Bias in ML is, however, for Jaton (2021, p. 3), not a problem that can be remedied by technical means; rather, it is constitutive of ML. For example, when the computer scientists he studied recast the saliency detection problem in relation to the assumptions underpinning ML (i.e. from the saliency detection model as a binary object-related problem to a model that can process a larger range of images and detect the contours of faces), Jaton (2020, p. 64) highlights how the connectionist assumptions about perception built into the algorithm placed closure around objects of inquiry and, therefore, around post-inquiry deliberations about ML-generated outcomes.
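
Although Jaton’s study concerns image recognition, a deliberately simple, hypothetical sketch may help to indicate how his three practices map onto even the most minimal supervised-learning workflow (the task, data and labels below are our invention, not his):

```python
# A hypothetical sketch mapping Jaton's three practices onto a toy workflow.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# 1. Ground-truthing: curating a dataset that encodes what the team takes
#    "urgent" to mean - the preconceptions the algorithm will later retrieve.
texts = ["urgent referral", "routine check", "urgent scan", "routine follow-up"]
labels = [1, 0, 1, 0]                   # human decisions about the ground truth

# 2. Formulating: translating cultural entities (texts) into mathematical
#    objects that can be manipulated in a vector space.
vectoriser = CountVectorizer()
X = vectoriser.fit_transform(texts)

# 3. Programming: writing the code that computes over those vectors.
model = LogisticRegression().fit(X, labels)

# The trained model retrieves the preconceptions built into the ground truth.
print(model.predict(vectoriser.transform(["urgent biopsy"])))  # [1] on this toy data
```

Any bias in the curated examples is not an accident that later tuning could remove; it is, as Jaton argues, constitutive of what the model is able to ‘see’.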

In attending to the distributed cognitive processes of a research group, such as the group’s varied expertise that informed the creation of the new object of inquiry in saliency detection, the shared goal of publishing in a peer-reviewed journal, and the range of existing algorithms and programs used to devise new algorithms and mathematical formulae, Jaton makes the invisible visible. He allows a previously hidden set of SCM practices pertaining to the constitution of ML algorithms and the constitution of HL + ML assemblages to become a topic of discussion and negotiation, in addition to well-established technical perspectives on algorithms that examine them as almost exclusively computational entities. In the process, Jaton reveals the subtle but significant difference between the form of algorithmic agency Orlikowski identified—reality creation—and the form of agency he identified—retrieving-generating. These two modes of algorithmic agency reflect the different but interconnected forms of expertise associated with creating and deploying ML technologies at work, both of which need to be deployed in assistive HL + ML assemblages, and to which we turn below.

3.4 Cultural-historical activity theory and distributed cognition

Writers in Cultural-historical Activity Theory (CHAT) stress slightly different issues when discussing its relationship to Distributed Cognition. Cole and Engeström (1993, p. 42) observe that ‘when one takes mediation through artefacts as the central distinctive characteristic of human beings, one is declaring one’s adoption of the view that human cognition is distributed’, whereas the asymmetry of humans and machines is, for Kaptelinin and Nardi (2006), a fundamental ontological and methodological principle because humans have biological and cultural needs upon which they act. They nevertheless acknowledge that technologies and artefacts have a form of agency and, as such, Kaptelinin and Nardi have identified how agency and cognition are distributed for different kinds of human + machine interaction according to their forms of mediation, division of labour, social rules, etc. (see e.g. Nardi 1996, 2005; Ekbia and Nardi 2014). Hence, CHAT is closer to Distributed Cognition when the analysis of human and machine interaction considers the cognitive aspects of that interaction; however, ‘when a cognitive system of like nodes is proposed, distributed cognition has more in common with actor-network theory’ (Kaptelinin and Nardi 2006, p. 204). Having acknowledged the complex relationship between CHAT and Distributed Cognition, we tread a fine line between them by focussing on the establishment of new human and machine activity and the respective agency of both.

3.5 Agency and new SCM activity

In comparison to Hutchins, Jaton and Orlikowski, CHAT recognises that an object of activity (i.e. the historically developed and societally defined purpose of an activity) allows us to understand why different sets of concerns resulted in human + machine assemblages which direct ‘cognition and action in activity settings including the use of available technology’ in particular ways (Kaptelinin and Nardi 2012, p. 30). In the case of assistive assemblages, the object directs attention towards the interpretability–explainability–trustability nexus.

Originally, Nardi (1996, p. 43) followed the CHAT tradition and contended that human and technical agents could not be considered symmetrical because the former are creative and intentional beings whereas the latter are artefacts and ‘an artifact cannot know anything’ (ibid.). Subsequently, the emergence of algorithmic technologies led Nardi (2010, p. 153) to accept that some can embody ‘a powerful agency not strictly under human control’ and ‘regulate human behaviour in an expectant manner, drawing them in or pushing them away from certain kinds of activities’ (Ekbia and Nardi 2012, p. 158). For example, Nardi highlights how online social media platforms can bring people to work together in joint object-oriented activities, such as organising a political protest, thereby learning new forms of activism (Kou et al. 2020), or prompt individuals in online gaming communities to continually develop their learning about gaming software and offer their expertise to other players (Nardi 2010; Ekbia and Nardi 2012). Echoing Orlikowski and Jaton, Nardi argues that SCM tools have a form of agency that is ‘delegated’ through designers’ and commissioners’ intentional decisions and actions regarding (a) the problem space the technological artefact is addressing and the conditions and constraints it imposes on how to accomplish the task, and (b) how users are expected to interact with the artefact. She makes the mediated basis of these new structural elements explicit, however, by noting that they are ‘conditional’ on the distributed cultural ecosystem in which they are deployed, which cannot be fully anticipated by the creators of tools (Kaptelinin and Nardi 2012, p. 41).

Exploring the tight coupling between the accumulation of capital and the development of algorithmic technologies, in which the latter often embody the interests and needs of the former, Ekbia and Nardi (2014, p. 1) state they have identified a new form of human + computer activity—heteromation—which pushes ‘critical tasks to end users as indispensable mediators.’ They noted that in the diverse distributed human + machine assemblages they examined, software artefacts are no longer programmed according to the tenets of GOFAI cognitive science, but instead to ‘expectantly leave gaps to be closed by human intelligence’ (Ekbia and Nardi 2012, p. 167). Hence, heteromated human + computer relationships now include humans in a mediating role rather than replacing their work with computers. Ekbia and Nardi (2014, Section 1.2, para 1; 2017) pursue the implication of their argument through reference to the assemblage of non-ML human–machine ecosystems such as the use of social robots in eldercare in Japan and branchless banking in Brazil. They show how the uptake of technology in both cases was made possible by the mediating working and learning of care workers and merchants who, respectively, extended the technological system within this human + technology assemblage. They either imbued the technology with meaning and emotion through narratives and symbols or enabled the technology to work for the users through direct intervention (e.g. helping them with tests, providing advice on technology and on services such as loans and savings accounts).

The concept of heteromation is, however, more generative than Ekbia and Nardi acknowledge. It offers a way to elaborate and extend the implications of our earlier observations that ML is an algorithmic-apparatus-in-practice which producer–user groups (e.g. clinicians and patients) can assemble for their own particular needs. The concept of heteromation allows us to see that as producer–user groups in fields such as healthcare develop and introduce ML, they can (a) negotiate which aspects of the work process are automated (e.g. detection and classification of tumours) and which are heteromated (e.g. diagnostics) and (b) anticipate new internal and external challenges of an HL + ML assemblage. One challenge for producer–user groups is to develop new SCM practices to enable them to explain the ML-generated patterns and predictions to one another to create trust about their veracity, before encouraging wider user groups to use ML as a resource to inform their professional judgement and action. Another challenge is to develop new SCM practices to explain ML-mediated professional judgements to patients and their families to secure their trust in the courses of action being recommended. Heteromation is, therefore, not only ‘potentially creating a new form of expertise’ for professionals and users in the contexts Ekbia and Nardi (2017, p. 135) study, but also in other contexts and types of assemblages.

4 Human + machine learning: a SCM symmetrical perspective

4.1 Amplified cultural ecosystems as the unit of analysis

The previous two sections of this paper have made an inter-theoretical argument based on insights from socio-cultural and -material perspectives to engage not only with the way in which the deployment of ML in assistive assemblages is further rearranging and reorganising the distribution of cognition in those contexts, but also with the challenge this poses for researching human + machine interaction.

Our starting point for this exploration was our claim that Hutchins’s concept of cultural ecosystems constitutes a unit of analysis to investigate human + machine working and learning because it enables researchers to identify, first, the SCM practices which provide regularities, structure and predictability to working and learning and how those regularities etc., in turn, make SCM practices and their associated artefacts learnable. Second, the way in which these SCM practices have cognitive consequences for individuals by both enabling and constraining cognition by holding in place certain ways of thinking, acting and problem-solving, for example, in the case of HL + ML working and learning, the interpretability–explainability–trustability nexus.

We nevertheless acknowledged that Hutchins’s concepts of distributed cognition and cultural ecosystem were developed before the emergence of algorithms which are capable of learning from data sets and, as such, ‘bring the future into the present’ by generating predictions from patterns detected in data, for example, ML’s role in the development of vaccines to combat COVID (Nowotny 2021, pp. 10–11). We have, therefore, amplified his unit of analysis—cultural ecosystem—with insights from Sociomateriality (Jaton and Orlikowski) and Cultural-historical Activity Theory (Ekbia and Nardi) to take account of, first, algorithms as apparatus-in-practice, that is, SCM practices that, as dynamic structural elements, perform in the world via the two different kinds of algorithmic agency—reality creation and retrieving-generating—which we have identified. Second, the way that the introduction of ML in work contexts results in a new division of labour—heteromation—with the result that SCM work practices are re-arranged and reorganised and, in the process, heteromation is creating a new form of expertise for members of professional and user communities.

We justified our decision to draw on the work of the above SCM perspectives and writers by noting that they have in common several methodological concerns. These include a concern to (a) operate with a nondualist account of the relationship between humans and machines and (b) understand these relations and the outcomes of human–machine interaction through concepts, such as mediation and assemblage, that stress performative mutuality and reciprocity. Moreover, they also accept that resources for working and learning have always been distributed between cognition, technology and the environment, and acknowledge the agency of objects and, therefore, that technological developments result in a re-distribution and re-organisation of working and learning. Hence, these writers have complementary points of view, even if there are sometimes differences of emphasis, thereby allowing for dialogue.

Specifically, our amplification of Hutchins has allowed us to show, first, that the distribution of cognition between humans, technology and the environment is reorganised as ML becomes an integral feature of assistive assemblages because emerging algorithms have a form of agency which the algorithms underpinning the paradigmatic bounded forms of distributed cognition Hutchins researched, such as navigation, lacked. Second, the SCM practices of ground-truthing, programming and formulating build connectionist assumptions into an algorithm and, in so doing, create its point of view by enabling algorithms to learn from datasets and generate predictions in relation to the field pertaining to that dataset. Third, the heteromation of the work process in assistive HL + ML assemblages calls for new SCM practices to be developed, for example, the interpretability–explainability–trustability nexus, to hold distributed cognition together among producer–user and down-the-line user teams and beneficiaries.

4.2 Researching human + machine working and learning from a cultural ecosystem perspective: a set of conjectures

To assist researchers in using our amplification of Hutchins’s symmetrical cultural ecosystem as a unit of analysis, we have formulated six SCM conjectures they could use to investigate the new modes of working and learning called for in the assistive assemblages emerging in professional work contexts. The conjectures propose that attention should focus on: the constitution of assistive HL + ML assemblages (Conjecture 1), the creation of ML algorithmic agency in those assemblages (Conjecture 2), the multi-faceted consequences of algorithmic agency in HL + ML assemblages (Conjecture 3) and, more specifically, how the HL + ML assemblage is reshaping: the working and learning that scaffolds the use of ML technology (Conjecture 4), the work process (Conjecture 5) and the assumptions about what constitutes machine as opposed to human learning (Conjecture 6).

Conjecture 1. To reveal the extent to which an assistive HL + ML assemblage is transparent or opaque, it will be helpful to attend to the interplay between purpose and materiality in all its guises.

Focussing on the interplay between the SC concern for the purpose (i.e. object of activity) of human activity and SM concerns for materiality can reveal why different types of HL + ML assemblages—surveillance or assistive (the focus of this paper)—have emerged or are emerging. From this perspective, assistive HL + ML assemblages are the outcome of the underlying needs and intentions of the team who commissioned the assemblage of an algorithm and data source which, in turn, generates affordances and constraints in relation to a range of users and beneficiaries. By examining how these plans, intentions and negotiations are actualised, light can be shed on the extent to which members of producer–user groups, first, defined and enacted the rationale, including algorithmic design, underpinning an assistive HL + ML assemblage in a transparent or opaque way among themselves. Second, involved other parties, for example, user groups and external beneficiaries, in a transparent debate about the purpose, choices and practices associated with a new HL + ML assistive assemblage, to develop a culture of trust among them about the outcomes it will generate.

Key research questions: What is the purpose of an HL + ML assistive assemblage? Who was involved in formulating and negotiating the purpose? How transparent was the process and how far are the outcomes trusted, and by whom?

Conjecture 2. To account for the mode of algorithmic agency in an HL + ML assemblage, it will be helpful to attend to the interplay between computational assumptions and ground-truthing-programming-formulating SCM practices.

Further focussing on the interplay between purpose and materiality can reveal how connectionist and SCM principles and practices are both central to the constitution of algorithms. From this perspective, algorithms emerge from the interplay between the connectionist assumption that perception and processing of inputs constitutes a model of human learning, and the ground-truthing-programming-formulating SCM practices a team used to embed that assumption into an algorithm to generate predictions from the data with which it has been interfaced. Paying attention to the interplay between SCM and computational practices can reveal how a particular mode of agency—reality creation or retrieving-generating—was embedded into an algorithm. This occurs as a team deploys their expertise to select a particular ground truth for investigation, i.e. the definition and conceptualisation of the issue to be investigated, and to programme and format an algorithm to use its binary function to process curated numerical, textual or visual data. Furthermore, by observing team dynamics, it is possible to reveal (a) which members were involved in the process of discussion and deliberation about the selection of the data source that the algorithm would investigate; (b) how far they made the assumptions that underpinned those data transparent to minimise the risk that they were seen as sources of bias at a later stage, and (c) whether they appreciated that the ground truth they had formulated would, inevitably, introduce preconditions and parameters that influenced its interrogation of data and the predictions generated.

Key research questions: Who was involved with the formulation of ground truths and the selection of data that ML algorithms have been formatted to interact with? Who was involved in the identification process? How were they identified? What criteria and considerations influenced the selection process? How far were the criteria and considerations made explicit when they were shared with user groups?

Conjecture 3. To determine ML algorithms’ multi-faceted consequences in HL + ML assemblages, it is helpful to adopt a retrospective (how far algorithmic consequences were considered during the commissioning and development of HL + ML assemblages) and prospective (how far producer–user groups anticipated that ML-generated predictions will need to be interpreted) perspective.

By focussing on how the interplay between connectionist assumptions and ground-truthing, programming and formulating practices influences the constitution of algorithms, one can reveal how algorithmic agency can have multi-faceted consequences in HL + ML assemblages. Adopting, first, a retrospective perspective on algorithmic agency, it is possible to identify how far the teams that commissioned and developed an algorithm had anticipated or were surprised by the way their delegation of agency to algorithms had resulted in the creation of new realities in an HL + ML assemblage, and whether producer–user groups were responding to algorithmic ‘nudges’ or problematising them. Algorithmic agency can also contribute to the kind of large-scale transformation of work practices well documented in ‘surveillance-type’ assemblages (e.g. TripAdvisor’s transformation of travelling for guests and hoteliers alike), which is also taking shape in assistive assemblages, as when an ML algorithm, by virtue of its capacity to make predictions based on an unprecedented number of features and their interactions, can suggest new risk factors, or collapse well-established but broad categories of patients by suggesting new categories of patients and personalised medical procedures. Second, by adopting a prospective perspective on algorithmic agency, one may identify how far producer–user groups were aware that ML-generated predictions would have to be interpreted and explained, rather than accepted at face value, if they were to become a trusted resource among all interested parties.

Key research questions: To what extent does an algorithm have agency to (a) re-shape old and create new realities? and (b) generate predictions? How far was this envisioned when the algorithm was programmed and formulated?

Conjecture 4. To identify how cognition is being re-distributed and re-arranged in HL + ML assemblages, it will be helpful to attend to the emerging SCM practices required to scaffold human learning with machines capable of learning.

Focussing on the interplay between SCM practices can reveal the different ways in which cognition is being re-distributed and re-arranged as producer–user groups, down-the-line user groups and beneficiaries in HL + ML assemblages interact in different ways in response to ML-generated predictions. Viewed from this perspective, it is important to identify what kinds of new SCM practices are being established by, first, producer–user groups to interpret the ML-generated predictions among themselves to determine their validity and significance for down-the-line user groups. Second, producer–user groups to explain the validity and significance of those predictions to down-the-line user groups, to enable them to use the predictions to inform their professional judgement and the courses of action they will recommend to their clients (beneficiaries). Third, down-the-line user groups and beneficiaries to develop a culture of trust, rather than suspicion, as regards ML-generated predictions.

Research questions: How are predictions generated by ML being interpreted and explained? Are predictions and explanations trusted? How are they justified to groups such as clients, professional groups and beneficiaries? How are predictions being deployed to inform professional judgement and action?

Conjecture 5. To examine the learning and working challenges of people working with the deep learning generation of ML, it will be helpful to observe changes in the work process (e.g. through automation and heteromation) and how those changes re-shape the way in which different professionals and user groups work together.

By further focussing on the interplay between SCM practices, it is possible to determine how working and learning are re-distributed and re-arranged in HL + ML assemblages. From this perspective, the introduction of ML, with its capacity to learn and generate predictions, has implications for the division of labour and for learning practices. In the case of the former, it will, therefore, be helpful to explore the extent to which producer–user groups, first, deploy ML to (a) automate an entire work process or only phases of work (e.g. through the use of chatbots to provide initial triage) and (b) heteromate subsequent work processes or phases of work (e.g. including ML in the coordination of care for each patient). Second, identify how automation–heteromation re-structuring has reshaped the SCM practices, routines and protocols that previously enabled producer–user and user groups to coordinate their work, and how far this reshaping requires the establishment of new criteria to facilitate coordination and decision-making. In the case of learning, identify how all parties in an HL + ML assemblage are using the new SCM interpreting and explaining practices to justify the value of decisions underpinned by ML-generated predictions among themselves and thereby develop trust in them.

Key research questions: How has the introduction of ML reorganised distributed cognition in emerging HL + ML assemblages? What are the newly emerging relations and division of labour between humans and machines? What aspects of work have been automated and/or heteromated? How are SCM practices being deployed to develop trust in predictions and justify the decisions made based on them?

Conjecture 6. To account for the new forms of expertise and learning challenges related to working with HL + ML assemblages, it will be helpful to attend to the knowledge and beliefs of producer–user groups about the constitution of HL + ML assemblages and the extent to which, when these groups take action, they are aware that while assemblages afford some new insights, they can ‘blind’ them to others.

By focussing on the re-distribution and re-arrangement of cognition and expertise between humans and machines, it is possible to reveal whether this re-distribution and re-arrangement generates new learning challenges for actors internal and external to an HL + ML assemblage. From this perspective, further light may be shed on how the relationship between connectionist and SCM assumptions and practices impacts expertise in HL + ML assemblages. It will, therefore, be helpful to consider how far members of producer–user groups, first, are aware that the connectionist assumptions that have been built into an algorithm, and the data with which it has been interfaced, create an algorithmic point of view and, as such, place closure around objects of inquiry and, as a result, around post-inquiry deliberations about ML-generated outcomes. Second, are exploring ways to relate these different conceptions of learning and agency to develop new forms of expertise and strengthen professional judgement to engage with ML ‘speaking back’ and surprising them, thus countering its tendency to encourage certain actions and ‘blind’ them to other actions. Third, are developing practices and artefacts that will ‘hold in place’ two different conceptions of learning and agency as a resource for working with ML.

Key research questions: What new types of expertise does the introduction of ML entail? What types of revised or new SCM work practices are emerging? What conception of learning is underpinning that SCM work practice? What do producer–user groups need to know about human learning and machine learning to underpin their work with ML?

5 Conclusion

The paper is part of a tradition of making an inter-theoretical argument, based either on the incorporation of insights from SCM theories (Hasse 2020) or on the reappraisal of SC theories and perspectives (Karanasios et al. 2021), to analyse the challenges posed by ML. Our amplification of Hutchins’s unit of analysis for human + machine interaction—cultural ecosystem—has broadened his concept by taking account of ML’s agentic capability to learn as well as the new heteromated division of labour and, therefore, the work context for assistive assemblages. Furthermore, by formulating a number of conjectures to reflect our amplification of Hutchins’s unit of analysis, we have provided researchers with a non-dualistic and non-human-exceptionalist methodological approach to investigate how cognition is re-distributed symmetrically by new SCM practices among machines and humans in emerging assistive HL + ML assemblages. The conjectures are non-dualistic because, following Hutchins (2013, pp. 1, 12), we view human cognition as ‘embedded’ and ‘distributed’ in cultural ecosystems and co-constituted through interaction between humans and non-humans. Moreover, this interaction can be ‘fractural’, in other words denser or sparser depending upon the enactment of its purpose. The conjectures are also non-human-exceptionalist because they invite researchers to focus on co-constituted human + machine interaction and connectivity, rather than on the rich dimensions of human learning and creativity to which the Vygotskian tradition that influenced Hutchins has drawn attention.

Finally, the paper was conceptualised before ChatGPT and other Generative Pre-trained Transformers (GPTs) were launched. We acknowledge that GPTs could be incorporated as an additional element in an assistive assemblage to facilitate ground-truthing or programming. GPTs are unlikely at present, however, to replace the pattern generation and predictive forms of ML that are the focus of our paper. Consequently, we suggest that the conjectures we propose can nevertheless aid researchers in examining the transformative potential that GPTs’ incorporation in an assistive assemblage may have on professional practices in a range of settings.