1 Introduction

Human–machine interaction is situated in an interdisciplinary debate in which the core terms interaction and information are given very different definitions and meanings depending on context. The interaction between human and machine can be interpreted in two ways. On the one hand, the machine belongs to the acting individual as part of her extended mind (Clark & Chalmers, 1998). This is based on the theory of extended cognition, which states that cognitive processes occur not only in biological individuals but also in the surrounding world, including the technical and social environment. The machine, e.g., a mobile phone, is part of the shared world and provides an efficient connection between people (Smart, 2017: 363). On the other hand, we expect that the machine can be an interaction partner that is not just a tool but a real counterpart for successful collaboration (Brinck & Balkenius, 2018; Sandry, 2015). In the special case of social robots or social AI, both aspects play a fundamental role in exploring the relationship between the human agent and the artificial partner and in improving the user’s experience in using machines (Breazeal et al., 2016). In order to adequately describe the conditions for successful human–machine interaction, a suitable concept of information is needed, one that is not limited to a syntactic description but also includes semantic meaning that can be understood through use. The classic information concept in communication theory is not sufficient to explain how meaning arises and how it is incorporated into communicative action patterns. We argue that information as pragmatic action patterns can be understood as physically embodied units that are not purely formal but are generated in interactive processes and stabilized in use.

We start with the concept of triangulation in social cognition research. The triadic relation in social interaction not only offers an appropriate model for describing interpersonal interaction but also points out the conditions for social cognition. In the frame of the 4E cognition model, human cognition is to be understood as embodied, embedded, enactive, and extended (Menary, 2010; Rowlands, 2010). Having illustrated the triangulation model in the study of social cognition, we will turn to current technological developments in terms of the approach to Active Technological Environments. We discuss the postphenomenological approach of technological mediation and the theory of material engagement. The central focus is on examining the interactive relationship between human cognitive abilities and their technological environments. By combining the human-technology relationship in general with the 4E cognitive model, we draw attention to the accessibility of interaction in the current description of the human–machine relationship. In accordance with the theoretical strategies of social cognition and philosophy of technology, we propose a pragmatic interactive approach to understanding the concept of information in human–machine interaction.

2 Triangulation in Interaction

The concept of triangulation in social interaction is an alternative to the classical dyadic sender-receiver model of communication. The basic assumption of the latter is that both partners, the sender and the receiver, occupy symmetrical positions because of their shared function of information processing. In this model, meaning in communication is replaced by the term information. For example, in the theory of Shannon and Weaver (1949), information remains purely syntactic and is explicitly defined independently of levels of meaning. However, this model does not make clear how humans acquire the ability to receive, transform, and understand meaning in social interactions. The transformation of explicit information implies material conditions and normative frames of meaning. These conditions are subject to commitment and entitlement in the process of social practices, in which pragmatic patterns of language use are stabilized in the form of patterns of communicative action (Brandom, 1994).
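The claim that Shannon-style information is purely syntactic can be made concrete with a small sketch. It is our own illustration, not taken from Shannon and Weaver: the function name `shannon_entropy` and the example sentences are assumptions, and the measure is the simple per-symbol entropy estimated from character frequencies. Because it depends only on how often each symbol occurs, it cannot distinguish a sentence from one with reversed meaning or from a scrambled string made of the same characters.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Entropy in bits per symbol, estimated from symbol frequencies only.

    Hypothetical helper for illustration: nothing here looks at word order,
    reference, or use, which is the sense in which the measure is syntactic.
    """
    counts = Counter(message)
    total = len(message)
    # Sort the counts so the summation order is the same for any two
    # messages that share the same frequency profile.
    return -sum((n / total) * math.log2(n / total)
                for n in sorted(counts.values()))


original = "the dog bites the man"
reversed_roles = "the man bites the dog"   # same characters, different meaning
scrambled = "".join(sorted(original))       # same characters, no meaning

h_original = shannon_entropy(original)
h_reversed = shannon_entropy(reversed_roles)
h_scrambled = shannon_entropy(scrambled)

# All three values coincide, although only the first two messages are
# meaningful and they describe opposite events.
print(h_original, h_reversed, h_scrambled)
```

On this measure the three strings are indistinguishable, which is one way of seeing why an account of how meaning arises has to draw on something beyond the formal quantity.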

Persons need experience to learn and to develop skills and abilities from implicit knowing-how to explicit knowing-that (Ryle, 1949). The concept of triangulation aims at explaining this issue. The triangle consists of an embodied relation between two or more partners and the shared world/objects around them. As Davidson (2001: XV) has put it, “this three-way relation among two speakers and a common world I call ‘triangulation’.” Triangulation is understood as “a result of a threefold interaction, an interaction which is twofold from the point of view of each of the two agents: each is interacting simultaneously with the world and with the other agent” (ibid.: 128). Specifically, triangulation arises from the two-way interaction of three parties, eventually forming a basic model of social cognition and interactive relationships. Each agent in the triangulation is entitled to interact with the other and shares an exchangeable “self-other” relationship with the other agent. Triangulation has been a guiding principle in the research of social interaction between persons (Fuchs, 2012; Tomasello, 1999). In recent years, human–machine interaction research has increasingly focused on social cognition and social interaction (Bartneck et al., 2020; Brinck & Balkenius, 2018; Fischer et al., 2011; Saunders et al., 2016). Therefore, we seek to clarify the basic model of social cognition that can lead to a better understanding of interaction between humans and other types of agents.

3 Human–Human Interaction

The basic assumption for triangulation is dyadic engagement, which can be differentiated into two kinds: interaction between one agent and objects, or interaction between two agents. According to the literature in Early Child Development, both primary dyadic relations require joint engagement or joint attention (Trevarthen, 1979; Tomasello, 1999: 62; Fuchs & De Jaegher, 2009). In expanding a child’s understanding of the other’s perspective, these two kinds of dyadic relations are combined and transformed into a triadic relation through declarative pointing and intersubjective gesture from the other agent (Fuchs, 2012: 13). In a nutshell, declarative pointing affords access to symbolic interaction (Fuchs, 2012: 13; Werner & Kaplan, 1963: 63f.). During that process of embodied symbolic interaction, as in the act of pointing, objects transform into shared symbolic objects. This opens the enclosed primary dyad for a triadic relation, in which the other agent takes another perspective on the objects. This triadic relation gives rise to an object triangulation, which necessarily leads to co-awareness of self and other, namely, to shared intentionality (Tomasello et al., 2005; Fuchs, 2012: 13). To put it more precisely, in triangulation, the other agent is perceived as an intentional agent (see Footnote 1). This has been discussed by Tomasello. In his research, the triadic relation is necessary for a child to get into the shared intentional space, in which they learn how to engage with others and objects. For a young child, the objects are discovered by observing how the adults use them in shared situations. “By engaging this imitative learning, the child joins the other person in affirming what ‘we’ use this object ‘for’” (Tomasello, 1999: 84).
In the frame of Early Developmental Psychology, the triadic relation between the child, the caregiver, and the object provides the possibilities for entering the joint intentional space and achieving the knowledge of how to engage with the object. “Children now come to comprehend how ‘we’ use the artifacts and practices of our culture — what they are ‘for.’ Monitoring the intentional relations of others to the outside world also means that the infant — almost by accident, as it were — monitors the attention of other person as they attend to her. This then starts the process of self-concept formation, in the sense of the child understanding how others are regarding ‘me’ both conceptually and emotionally” (ibid.: 91). That means, the social cultural group into which the child is born forms a certain habitus. In particular, social practices generate certain forms of habitus as structured frames in their long historical process. These frames shape cognitive and motivating conditions for further social actions. In other words, a single action is constrained by performance schemes based on earlier practical experiences. With these experiences, the child acquires pragmatic patterns of actions, which are then also combined with its language learning. Accordingly, the “active instructions from adults” (ibid.: 79) also play a fundamental role in raising a child in their familiar surroundings.

Object-triangulation (human–human-object) has mostly been investigated in social cognition research, although the involvement of a third person plays a fundamental role in the development of social cognition. This type of interpersonal triangulation includes a third person (human–human + human) and is not necessarily tied to an explicit object. The third person plays a significant role for understanding other’s perspective, since the third person provides a view from the outside as observer or witness. The integration of a third view on the dyadic relation itself leads to a triadic relation of interaction, which provides the possibility of the self-other metaperspective (Fuchs, 2012: 14; Fivaz-Depeursinge & Corboz-Warnery, 1999). The key point is that the agent can become aware of her own point of view and that of others. This awareness of different perspectives allows her to shift and compare distinct views in a flexible way: “[T]his shifting and comparing of perspectives is only possible from a vantage point at a higher level, namely from a self-other metaperspective which provides an equidistance to both one’s own and the other’s point of view — a ‘bird’s eye view’” (Fuchs, 2012: 16).

The development of understanding other’s perspective begins with sharing perspectives in joint attention in the first year after birth. At around 2.5 years of age, children are able to take on different perspectives and at least at an age of 4–5 they start to really confront perspectives, meaning that they are now able to understand others’ perspectives and beliefs (Moll and Meltzoff 2011a, b). In this developmental process, sensorimotor skills are embodied with learning language abilities. First, to acquire explicit knowledge of oneself and of others means to run through triadic interactive experiences, which are embodied, e.g., in meaningful gestures, and embedded in shared situational contexts. The origin of social understanding in interpersonal interaction is founded on intercorporality and interaffectivity, which implies “a primary co-awareness of self-with-other” (Fuchs, 2012: 24). Second, to achieve a metaperspective requires social interaction, which involves reciprocity and reversibility of standpoints and speech roles (Fuchs, 2012; Stawarska, 2009). Especially in verbal communication, the self-other metaperspective plays a crucial role in flexible perspective shifting. As Merleau-Ponty (2000: 150) has pointed out, “The ‘I’ arises when the child understands that every ‘you’ that is addressed to him is for him an ‘I’; that is, there must be a consciousness of the reciprocity of points of view in order that the word ‘I’ may be used … The pronoun ‘I’ has its full meaning only when the child uses it not as an individual sign to designate his own person […] but when he understands that each person is an ‘I’ for himself and a ‘you’ for others”. This means that in achieving the capability of perspective switching by language learning, it is possible to build recursive sentences such as “I believe that you believe what I believe.” This linguistic competence enables further mentalizing of others (Fuchs, 2012: 25). 
It is notable that the development of self-other metaperspectival capacities can only proceed through embodied and embedded social interactions. The presumed functional ability of communication is shaped by bodily social interaction and situatedness. The related current approach to cognition is known as 4E cognition: embodied, embedded, enactive, and extended. The important implication of 4E cognition is that human cognition is shaped and structured by dynamic interactions of humans with their physical and social world. If successful social interaction, based on 4E cognition, shapes our functional ability of communication, the sender-receiver model of communication cannot be taken for granted. The question is how pragmatic interaction patterns between persons and between humans and machines are to be conceptualized.

4 Human–Machine Interaction

With triangulation as the guiding principle for social cognition, human–machine interaction can also be described with this basic model. The interaction between human and machine takes place in a shared environment in which both human and machine can interact. The technological environment affords space for human and machine to work together. Technological environments are in turn constituted by human practices, from making stone tools to our current information and communication technologies. The concept of machine is so closely interwoven with human practice that humans and machines must always be understood together.

Machines can be mechanical, electrical, or electronic, and they can be self-regulated or self-organized automata. They can be understood as simple tools on the one hand or as complex mathematical models on the other. Simple machines in the pre-industrial sense are tools like a hammer, for example, or machines that are made up of various components. They do not move on their own; they have to be moved. The development of mechanical machines benefited from the concepts of mechanism that emerged during the Renaissance, i.e., precise knowledge of specific processes in component complexes. The term machine gradually came to be applied to apparatuses made of moving parts for applying mechanical power. If a machine has a continuous energy drive and can therefore run independently, it is commonly called an automaton. In this sense, the term machine is usually ambivalent (see Footnote 2). Today, in the age of information technologies, computers as “information artifacts” provide a richer meaning of machine and wider possibilities of user experience. It seems that modern machines can work more and more independently and in a self-organized way, but the institutionalization of machine production and usage to serve human purposes is embedded in more complicated social processes and systematic practices.

Human–machine interaction (HMI) is currently defined as an interactive system, namely, as a “combination of hardware and/or software and/or services, and/or people that users interact with in order to achieve specific goals” (ISO 9241-11:2018: 3.1.5). This general definition also includes specific research fields such as human–computer interaction (HCI) and human–robot interaction (HRI). Specifically, HRI is an inter- and multidisciplinary research area which involves engineering, psychology, design, anthropology, sociology, and philosophy (Bartneck et al., 2020: 9). The aim of this discipline is to develop social robots that are able to take on social roles, e.g., “co-workers, tutors, and assistants in the medical field and to provide services and care settings, in education, and in people’s home” (ibid.: 201).

This new form of machine expands expectations of how humans and machines work together in socio-technological environments. As mentioned in the previous section, the implication of the 4E cognition approach is that cognitive processes do not only take place in the individual, but rather are embodied, embedded, enactive, and extended in a physical and social environment. When the guiding principle of triangulation is applied in the context of HMI, the machine takes on a role in this triangle. Therefore, it is not sufficient to focus on a concept of the seemingly given interaction partner; the focus should rather be on the question of what leads to a successful human–machine interaction. According to the mediation approach in philosophy of technology, “humans and technology should not be seen as two ‘poles’ between which there is an interaction; rather, they are the result of this interaction […] they are not pre-given entities but rather ones that mutually shape each other in the relations that come about between them” (Verbeek 2015: 28). Moreover, “the relation between humans and technologies is in fact part of a larger relation, between human beings and their world, in which technologies play a mediating role” (ibid.).

5 Human-Technology-World

In the discussion about the relationship between human and technology, two accounts in respect of the 4E-cognition approach developed above are helpful for our investigation of human–machine interaction. On the one hand, the postphenomenological approach in philosophy of technology focuses on the general relation between humans and the world (Ihde, 1990). On the other hand, an archeological as well as an anthropological framework of Material Engagement Theory (MET) is very prominent in the debate and raises the question of how material environment shapes human cognition (Malafouris, 2013). Both theories underline that our understanding of cognition relies on what we called 4E-cognition (Sect. 3) in connection with the general relation of human and technology (Ihde and Malafouris 2019).

The postphenomenological strategy first represented in the work of Don Ihde (1990) draws attention to several ways of describing human-world relations mediated by technologies. He distinguished four main relations of human-technologies-world: (1) the embodied relation, (2) the hermeneutic relation, (3) the alterity relation (quasi-other), and (4) the background relation (ibid.: 72–112). For Ihde, the first relation, the embodied relation, is the starting point for our engagement with technologies: “I shall begin with a focus upon experientially recognizable features that are centered upon the ways we are bodily engaged with technologies […] I-as-body interact with my environment by means of technologies” (ibid.: 72). Once one has learned to deal with a certain technology, the artifact becomes transparent and belongs to her ordinary experience or is in some sense part of her extended body, as with eyeglasses. Second, Ihde talked about the hermeneutic relation, i.e., when an interpretation of an artifact is needed, as in the case of reading a thermometer (ibid.; Verbeek 2001: 127–133; Aydin et al., 2019). Whereas embodied and hermeneutic relations stress the mediated form of technologies, alterity relations characterize technologies as a counterpart or quasi-other in interaction, like computers or cars, which have autonomous functions. In the case of social robots, this relation is crucial for the direct interaction of people with an artificial counterpart. This supposed interaction partner appears in a double aspect: on the one hand, it is an objectified counterpart, and on the other hand, it still has the mediated form. According to Ihde (1990: 106), the otherness of the computer “remains a quasi-otherness, and its genuine usefulness still belongs to the borders of its hermeneutic capacities. Yet in spite of this, the tendency to fantasize its quasi-otherness into an authentic otherness is pervasive.” The user’s attitude towards the artificial interaction partner decides whether the social robot is perceived as a quasi-other or as an authentic other. Finally, in the background relation, technologies do not play a fundamental role in our experience. Instead, they frame the context of our human experience in an implicit way. That is, these kinds of technologies can work without being actively perceived, e.g., a WLAN router.

The Material Engagement Theory (MET) goes back to the archeologist and anthropologist Lambros Malafouris, who investigated the relation of human cognition and tool use in cultural evolution. The main question is how material things “are capable of transforming and rearranging the structure of a cognitive task” (Malafouris, 2013: 247). This account refers to the extended mind approach, which claims that tools in the external world can be part of cognitive processes, rather than only brain or body processes. In this view, there is no clear boundary between humans and technologies, because of “the strong plasticity of human cognition” (Aydin et al., 2019: 329).

Both approaches, the postphenomenological strategy and the MET, share the same premises. First, they propose the reciprocal relation of human beings and material objects as a relational ontological standpoint (see Footnote 3). It “refers to the way human self-consciousness is technically and intersubjectively mediated” (Ihde and Malafouris 2019: 197). Based on this view, material objects are not merely passive tools for humans but also construct an “ecology” of the material world, which in turn shapes human cognitive “ecology” and extends the possibilities of human action (ibid.: 198). Second, they focus on “issues of practice and experience, not on representations,” to bridge the gap between knowing and doing (ibid.). Their aim is to find the fundamental structures and features of the interplay between humans and their surrounding world by analyzing archeological as well as current technical phenomena. Accordingly, the inseparable intertwinement and co-constitution of humans and their technical environment are the core condition for humanization. The interdependency of humans and things bridges the distinction between mind and matter. Furthermore, from the relational ontological point of view, human cognition can be understood as a process of co-constitution within the technical-cognitive ecology. In this ecological perspective, human cognition not only recognizes reality but also creates reality (Welsch, 2012: 138). Malafouris uses the word “metaplasticity” to describe the reciprocal relation of human cognition and the surrounding world without reducing plasticity to the brain (Malafouris, 2013: 46). Metaplasticity characterizes “the emergent properties of the enactive constitutive intertwining between brain and culture” (ibid.).

Following both strategies, postphenomenology and MET, Aydin et al. (2019: 322) attempt to give an advanced insight: today’s information and communication technologies present a new type of environmental technology, which they call “Active Technological Environments”: “they are not just a mute and stable background for human existence, but they are actively involved with the human being and material objects for whom and which they form an environment.” Smart environments arise from technological developments like the Internet of Things (IoT). They are active because their presence works on us. In contrast to low-tech environments, such as traditional material environments including simple tools, these Active Technological Environments are actively doing something and can lead to a new active technical condition for human cognition and agency (ibid.: 331). Although artifacts have always been intertwined with human practice, and in that sense low-tech environments are not just passive, the pragmatic dimension of action and interaction comes to the fore with this new technical condition. Aydin, González Woge, and Verbeek expanded Ihde’s background relation into a so-called immersion relation, “in which technologies merge with our world, and at the same time have a bidirectional intentional relation with humans” (ibid.: 336; see Footnote 4). The current Internet of Things, supported by semantic technologies, connects cyberspace and the physical environment. In this new technological environment, physical things can be linked to each other, receiving sensory data, processing it centrally or de-centrally, and interacting with humans. That is, the technical environment and interlinked things influence people’s experience and shape the way one acts, in the sense of “giving direction,” which corresponds to the literal sense of the Latin word “intendere” (Verbeek 2009: 235). The further question is what role Active Technological Environments play in human–machine interaction.

According to the postphenomenological account, technologies mediate human and world in particular ways, including organizing ordinary experiences, sustaining embodied habits, and “scaffolding their cognitive abilities” (Aydin et al. 2019: 324). This account provides a broader perspective of cognitive abilities, which bridges the dichotomy of human and world and focuses on the interaction between human activities and their active surroundings as Active Technological Environments. In this view, it is possible to characterize the human cognitive abilities in their technical-cognitive ecology as co-constitution by indicating the mediated, active role of technologies, rather than by the classic functionalist sender-receiver model. Indeed, the ideal functional sender and receiver as well as processable information are not predetermined but will be co-constituted in a broader dynamic process of cognitive ecology. With regard to the triangulation as guiding principle for social interaction (Sect. 2), human cognition can be understood as a dynamic process and as a result of social interaction including the physical, technical, social, and cultural environments.

6 Information as Pragmatic Action Pattern

As stated above, the experiences that sustain habits for action are mediated, scaffolded, and organized by technologies. That is, actions are framed by the material social environment. We understand these forms of action as pragmatic action patterns. These forms of action are units of physically embodied information that are not purely formal but are generated in an interactive process and stabilized in use. Particularly, information as pragmatic action pattern can be understood in different ways. First, this understanding of information corresponds to the 4E approach presented above. The pragmatic patterns are embodied in forms of engagement (Vollzugsform) with material and social surroundings. Cognitive processes are embedded in social technical frameworks. Minds are extended by technologies and the new possibilities for action are afforded by technical artifacts that constitute the material environment in enactive manners. In short, pragmatic patterns play the role of mediation as part of our technical-cognitive ecology. Second, pragmatic patterns can be objectified as technological artifacts with stabilized usage like our linguistic-pragmatic patterns with shared meanings. More specifically, our uses of technologies constitute our Active Technological Environments. And these environments constitute and process information as meaning in an active and non-directional manner. But how is meaning implemented in these pragmatic patterns?

This leads to the broader discussion on (socio-)technology in the field of social philosophy. This discussion is prominent in the work of Bourdieu. As Gallagher and Ransom (2016: 341) pointed out, the context of action is situated in the social and cultural background, which Bourdieu calls “habitus.” Bourdieu (1990: 54) wrote, “The habitus, a product of history, produces individual and collective practices — more history — in accordance with the schemes generated by history.” As mentioned in Sect. 3, social practices in their long historical process generate certain forms of habitus as structured frames. These frames shape cognitive and motivating conditions for further social actions. In other words, a single action is constrained by performance schemes based on earlier practical experiences. Following Tomasello (1999: 79), the “habitus thus has direct effect of cognitive development in terms of the ‘raw material’ with which the child has to work.” From the ontogenetic point of view, the child is born into a particular habitus, which determines her learning experience and how she interacts with her physical and social environment in her later life.

If human cognition is structured by social practices, including the material and cultural achievements, then changing technologies (material- and socio-technologies) can also reshape human cognition. The approach of 4E cognition can be seen more clearly within the scope of the socio-technical institutionalization, which frames the developmental environment for an individual’s cognitive abilities.

The active role of the material environment can also be seen in the affordance theory of Gibson (see Footnote 5). Gallagher and Ransom (2016: 341) summarize Gibson’s notion of affordance as follows: “it defines a range of possibilities for action that depend on both body and environment.” Accordingly, affordance space can be physical as well as social and cultural (ibid.). They pointed out that the occurring affordance space of an individual is defined by “evolution (the fact that she has hands, for instance), development (her life-stage: infant, adult, aged), and by social and cultural practices (normative constraints) — all of which enable and constrain the individual’s action possibilities” (ibid.). Regarding these three aspects, the engagement with the material world is a dynamic process of interaction. This has also been discussed by Tomasello (see Sect. 3). In his research, the theory of affordance is combined with children’s early learning. For a young child, the affordances of objects are discovered by observing how the adults use them in shared situations. Tomasello (1999: 84–86) illustrates three dimensions of the affordance of artifacts: sensorimotor affordance, conventional use (we use it as), and symbolic play (I can use it as). This is quite similar to Gallagher and Ransom’s discussion of the three aspects of affordance space. In short, in the triadic relation between the child, the caregiver, and the artifact, we can see the three aspects more precisely. While entering the attentional scene and intentional space, a child becomes involved in material affordances, parental scaffolding for conventional use, and their own experience in social and cultural practices.

Norman linked the theory of affordance to human-centered design. The original definition of affordances by Gibson as action possibilities is transformed by Norman into user’s perception of action possibilities. According to Norman, affordance “refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used” and, therefore, “affordances provide strong clues to operations of things” (2002: 9). This adoption of the Gibsonian affordance theory shifts the focus to the user and her perspective. Norman (1999: 41) emphasizes that “affordances are of little use if they are not visible to the users.” He distinguishes physical from perceived affordances to focus the accessibility of usability: “What the designer cares about is whether the user perceives that some action is possible” (Norman, 2018). That means, affordances specifying the range of possible activities should be perceivable for the users, but “[R]eal affordances do not always have to have visible presence” (1999: 40). In this approach, the perceived affordances are more relevant to designers of interactive systems. “A good designer makes sure that appropriate actions are perceptible and inappropriate ones invisible” (2002: xii). Norman’s main point is to stress the interests and needs of possible users, “with emphasis on making products usable and understandable” (ibid.: 188).

Combining the four relations of human-technologies-world proposed by Ihde with the different aspects of affordance theory mentioned above, the embodied relation of human, technology, and world depends on the individual’s development in her social and cultural context. For example, a wearer of glasses is used to his experience of wearing and, therefore, can look through the glasses intuitively, whereas a 6-month-old baby can hardly wear the glasses without trying to look at them, grasp them, and to remove them. In other words, when designing technical products, it is necessary to take into account the different affordance spaces of different users.

By linking the general relations between human and technologies to the approach of 4E cognition, we have demonstrated the conditions for successful human–machine interaction. Furthermore, we have explained the generation and stabilization of pragmatic action patterns as units of information in interaction with the focus on socio-technical habitus and affordance. Within this theoretical framework, we discuss current development processes of successful human–machine interaction in the next section.

7 Machine in the Triangle

With respect to the ambivalence of the term machine mentioned above, today’s information and communication technologies provide a richer meaning of machine and wider possibilities for user experience. Because machines are embedded in more complicated social processes and systematic practices, the study of human–machine interaction leads to the investigation of how machine production and its use for human purposes are institutionalized.

As mentioned in Sect. 4, HMI is standardly defined as an interactive system. The key concept for an interactive system is usability, which describes the extent to “which a system, product, or service can be used by specified users to achieve specified goals […] in a specified context of use” (ISO 9241-11:2018: 3.1.4). Users are generally persons “who interact with a system, product or service” (ibid.: 3.1.7). Users can play different roles in the interaction: the end user, who operates the system and makes use of the output, and people who support the system. The latter include the team of developers, designers, supervisors, or communities who maintain the system. These user roles are defined by actions that have a direct impact on the interactive system. HMI refers mostly to end users, who ultimately use the product. To ensure consistent operation and usability, developers and communities should consider the needs, interests, and experiences of end users, including further investigation of end users’ perceptions and responses. However, developers have expert knowledge, whereas end users normally expect to be able to engage with technology without specialized knowledge. A human-centered design for developing interactive systems therefore requires that designers take the perspectives of users into account.

If the affordance space of the user includes both the physical and the mental, we should keep in mind the relationship between the ecology of the material world and the human cognitive ecology in terms of the 4E approach. This means that the process of machine development in the context of HMI (including HCI and HRI) is embedded in the socio-technological framework constituted by human practices to extend the affordance space for action. In today’s software development, multidisciplinary skills and acknowledgement of different perspectives within the development team are essential. To enable successful interaction of the potential target user with a system, product, or service under development, it is necessary to have material affordances in the very first phase of development. This means that a prototype must be developed so that the user has access to the interaction. As soon as a prototype is available, the interaction between user and prototype can be tested, observed, and evaluated to meet the needs of the users. In this way, the prototype can be further adapted to enable successful human–machine interaction. The process should be iterative, providing for interaction between the users and the product on the one hand, and reciprocal interactions between end users and the development team on the other. The iterative process is central to developing and designing within the frame of Agile Software Development in order to obtain a more “effective, humane, and sustainable way of working” (Agile Alliance, 2021). In this process, iteration is represented in the spiral model, with which progressive adaptive development can be implemented. Unlike predictive models with plan-driven methods, adaptive methods are value-oriented in that they focus on users’ changing needs based on their experiences.

The iterative developmental process involves three different constellations of interaction: (1) the collaboration of the end user and the development team, (2) the interaction between the end user and the product (system or service), and (3) the relation between the product and the team during the developmental process. The relations of end user, development team, and product build a triadic interaction that is consistent with the triangulation principle for social cognition presented in Sects. 2 and 3. It is especially notable that, in the case of software development, the object with which the end user interacts is itself developed in the process of dynamic three-way interaction. That is, the latent usability of the product is not clearly defined in the first phase of prototype development, whereas the accessibility of use is of crucial importance. Accessibility relates to the questions of whether the prototype can work and whether its usability can be perceived. At this point, it is worth asking whether the three-way interaction focuses on human–machine interaction or on human–human interaction with respect to the use of the machine. During this development process, both models play a role. On the one hand, it is about developing a machine that is capable of interaction. On the other hand, the collaboration between end user and development team is crucial for the realization of the machine as an extension of the users’ affordance spaces for action.

The user and the development team share the same intentional goal, which is to achieve a successful interaction between the end user and the product. The origin of the shared goal lies in the common world, which can be described in two ways: (1) Existing infrastructures of our global network offer compatibility between a new product and the current technological environment of the product. (2) Socio-cultural practices provide a common intentional space in which our actions with certain meanings are stabilized in their use. The needs of the individual are shaped by the structured frames of meanings, which we called habitus in Sect. 5. The first relates more to the concept of physical or material affordance, including technological infrastructure, such as the Internet of Things, as well as our familiar material environment, such as cutlery in a kitchen. The second refers to psychical and cultural aspects of affordance or to the affordance space of an individual, which enables and constrains her action possibilities. With respect to the Active Technological Environments mentioned in Sect. 5, current smart information and communication technology can provide new possibilities of actions. Semantic technologies based on global networking afford patterns of meanings which are embedded in our social interaction.

With the notion of pragmatic information, it is possible to describe how the access to usability is shaped by 4E cognition and how meaning is generated through triangulation and stabilized in our active use. On the one hand, a pragmatic pattern works as a medium for conveying meaning. On the other hand, a specific descriptive meaning is formed by using it. With the general shift from classic mechanical engineering to current software engineering, artifacts or material objects are creating new Active Technological Environments. Throughout our history and in our social practice, we have always interacted with our surroundings. The technologically and intersubjectively mediated social interactions expand our linguistic behavior, our communication possibilities, and the experience of individuals in engaging with one another by forming, sharing, and using pragmatic patterns.

8 Outlook

In the age of digitalization, the reciprocal influence of material and cognitive ecologies becomes apparent. Today’s digitalized networking makes it possible to combine consumption and production. The roles of consumers or end users and of producers as developers become blurred. This new phenomenon is characterized by the term prosumer. End users are not only consumers but also producers of software systems, products, or services by using them. With respect to triangulation as a guiding principle, the triangle consists of the prosumer, the development team, and the software system, product, or service. This leads to the questions of how we want to deal with the data that we produce and how data is used to describe our collective actions. In order to simulate our habitus, we have to be clear that our habitus arose intersubjectively. The condition for this is the ability to switch between multiple perspectives. A flexible shift of perspective should therefore be seen as a fundamental ability and as part of digital competence. As we have illustrated, explicit knowledge of oneself and of others is acquired through triadic interactive experiences and refers to the primary co-awareness with others. The origin of social understanding in interpersonal interaction is founded on intercorporality and interaffectivity. Achieving a meta-perspective on oneself and others through social interaction involves reciprocity and reversibility of perspectives and social roles. In the triadic constellation of prosumer, development team, and software systems, products, or services, it is obvious that the semantic meaning comes from the social habitus. These shared frames of action are also part of the machine because the construction of the machine is based on these intersubjective schemes of performance. Our new information and communication technologies are changing our understanding of the world and of ourselves.
Together with these new technologies, we are creating new environments in which we form, interpret, and learn to deal with patterns of action among ourselves and with machines.