Introduction

Hypotheses and Framework

The above question is part of a general hypothesis suggesting that animals are capable of adequate use of utterances in different kinds of communication. To utter (Bakhtin 1986) thus presupposes ability to use references (Genone and Lombrozo 2012), which presupposes existence of mental concepts, which presupposes ability to create, store, and handle differentiated percepts (Herzog et al. 2016: 1). These capabilities accordingly presuppose an overall orchestrating mind (Deacon 2013) that can handle different kinds of communication or what here will be termed (life-)genres (Voloshinov 1973; Luckmann 2009; Bawarshi and Reiff 2010; Ongstad 2019).

Further, evolution has moved from matter to matter and mind (Deacon 2013; Dennett 2018; Bertolero and Bassett 2019; Kawade 2013) and so has communication. To utter involves a simultaneous mixing of concrete extrinsic and abstract intrinsic aspects. Consequently, a study which intends to ‘meta-study’ other studies of animal references will find itself in the midst of paradigmatic conflicts over mind/matter in-/compatibilities and in disputes over methodologies, approaches, and designs for studying combined nature/culture phenomena (Stegmann 2013). Although disputes over a ‘theory of mind’ (Cheney and Seyfarth 1990; Andrews and Beck 2017; Deacon 2013) are relevant to a discussion of animals’ ability to refer when uttering, this sub-study will, due to space constraints, just take animal minds for granted.

To study other researchers’ studies of animal communication implies, as hinted, studying utterances in the light of genres (Ongstad 2019). Such studies rarely use the term utterance. Exceptions are Fedurek et al. (2016) and Scott-Phillips and Kirby (2013). Since hardly any study in the field seems based on utterance/genre-theory, an outline is sketched in the following. Ongstad (2019) presents a more detailed framework. Here just the necessary key terms are highlighted.

By combining, among others, Bühler (1934), Bakhtin (1986), Halliday (1994), and Habermas (1981), it is claimed that animal communication can be analysed as semiotic utterances, defined as a compound of five inter-related constitutive aspects. Firstly, of structured concrete, physical form, mainly concrete structured matter. Secondly, of something this form may point to, stand for, or refer to, mainly an abstract based in minds. Thirdly, of an act the form-content combination may function as, being both concrete and abstract. Fourthly and fifthly, of contextually incorporated time and space (chronotope), mainly concrete, possibly even mental. This socio-semiotic perception of utterances abandons a strict verbal and linguistic approach in favour of a general, communicative, semiotic understanding (Hoffmeyer and Kull 2011). The study has accordingly no ambition of researching animal utterances as ‘pre-language’. Yet, quite a few studies do so, as shown in Wheeler and Fischer (2012) and Rendall and Owren (2013).

The overall project this sub-study is part of has two major hypotheses. Firstly, it is assumed that, at least for so-called higher order species, repeated and routinised use of certain structure-reference-act-time-space-compounds, in other words, utterances, may, in an evolutionary perspective, develop more generalised perceptions of certain kinds of communication (or genres). They are seen as socially shared and function as a system for a community of users or species (Ongstad 2019). Communicational systems are claimed to be systemic, which roughly means partly pre-structured, partly open. In the words of Cartmille (2015), for instance, ape gestural communication as a system can be characterised both by its flexibility and by a striking similarity in gestural repertoires across groups (Cartmille 2015: 66/SO’s italics). Also, Zuberbühler (2006) argues that studies of “[…] the Campbell’s Diana monkey system showed that alarm calls undergo semantic adjustments in the minds of the recipients, depending on the sequencing of alarm calls” (Zuberbühler 2006: 10–11/SO’s italics). For analyses of references the principle of partial openness is crucial since it both raises the question of existence of animal genres and challenges epistemologies and methodologies in the field.

In research on and theories of human communication several concepts have been coined for typifying higher order kinds of cultural communication, such as Bernsteinian code, Hallidayan register, Bourdieuan habitus, Foucauldian discourse, Dawkinsian meme, or simply genre, advocated by theorists in different fields (Bawarshi and Reiff 2010). Similarities and differences between these notions, as well as inquiries into their weak and strong sides, are discussed at length in Ongstad (1997) and more briefly in Ongstad (2019). In this project genre is preferred, since, in a cross-disciplinary, biosemiotic context, it can be stripped of its intricate connotations to specific ‘human’ fields it historically has been associated with. Genre can simply be defined as a kind of communication; a meaning close to its etymological origin (Cambridge Dictionary 2020).

In principle, this advanced, general level ‘beneath’ utterances usually leaves a species’ communicational system partly open. Beneath is used as an iceberg metaphor where an utterance is seen as the concrete tip above and at the surface and genre is the hidden, abstract underwater part. Genres are further dynamic and contextual, since they are also situational, and not just given. While use of genres at the outset is both biologically and genetically conditioned, some might have to be learned. Utterances are uttered or interpreted by a mind at an individual-subjective level, while genres operate at a collective, social, and intersubjective level and thus function in-between and among individuals. Genres are societal and cultural phenomena even in animal worlds (Ongstad 2019). Hence, applied semiotics needs to be social, not just sign-oriented. There are somewhat similar ideas already in the field. For instance, Tomasello (2014) argues:

[…] the organism must represent its experiences as types, that is to say, in some generalized, schematized, or abstract form. One plausible hypothesis is a kind of exemplar model in which the individual in a sense ‘saves’ the particular situations and components to which it has attended […]. There is then generalization or abstraction across these in a process that we might call schematization […]. (Tomasello 2014: 12).

Further, Cartmille (2015: 66) holds that ape gestures might be based on “[…] heritable prototypes “tuned” during development (Hobaiter and Byrne 2011), or they might arise from common actions ritualized into gestures resembling their shared roots.” It follows that animals should have minds capable of schematisation and of acting functionally.

The second major assumption for the overall project is that repeated behavioral patterns may originate from activities such as hunting, feeding, playing, and mating. In other words, basic life conditions may generate so-called life-genres, kinds of communication closely related to basic life-functions (Voloshinov 1973; Luckmann 1992, 2009). At least some vertebrates (birds and mammals), and especially great apes, may have to make sense of utterances interpreted through more general communicational ‘lenses’. Further, such animals may have to learn to balance use of life-genres in their lifeworld or Umwelt (Habermas 1981; Ongstad 2009; Uexküll 1921). It is according to a species’ lifeworld that phenomena make sense or not. The concept lifeworld is preferred over Umwelt since Habermas sees lifeworld as mental and communicative.

The sketched framework accordingly consists of four inter-related analytical levels, sign, utterance, genre, and lifeworld. Both utterances and genres are seen as constituted by the mentioned five joined, reciprocally defined, key aspects.

Although Fig. 1, illustrating the particular relationship between utterance and genre, has two levels, a full framework model would operate with four. Signs are parts of utterance as a whole. Utterances are interpreted as kinds of genres. Behind these three is the lifeworld, a deep mental ‘world’ from which sense and functionality of utterances and genres are considered. Since this study focuses on utterances and how they might be influenced by genre(s), the levels sign and lifeworld are here backgrounded. To utter is to communicate but concerns individual utterers. Communication is a collective phenomenon and should be studied at the level of genre.

Fig. 1
Five basic aspects constituting utterance as communication. Utterance and genre are modelled as a shortened or cut pentagonal pyramid with utterance as a concrete surface plane and genre as an underlying abstract part, marked by dotted lines. The pentagonal relationship between the five basic aspects applies to both levels. The double-headed arrows between the two planes symbolise the dynamic, diagonal, reciprocal relationship between utterance and genre as well as the openness of the system. These processes work both in development of utterers/interpreters (seen diachronically). [© The Author]

Complications Studying Communication

A study of this kind needs an understanding of communication that enables a productive connection between such fields as biocommunication and biosemiotics on the one hand and biology and ethology on the other. A definition partly in line with the framework’s epistemology is given by Witzany (2014: vii):

Communication is defined as the sign-mediated interaction between at least two living agents, which share a repertoire of signs (which represents a kind of natural language) that are combined (according to syntactic rules) in varying contexts (according to pragmatic rules) to transport content (according to semantic rules).

This view of communication does not reduce biology to ‘communication’. It adds to and incorporates it. As Witzany (2014: vii) further argues, biocommunication of animals integrates the biology of rather different species with their communicative competencies. Within the pentagonal Fig. 1 is even a triad of form, reference, and act. Fields studying these aspects are roughly syntax, semantics, and pragmatics (respectively). Historically unity between and segregation of these dimensions were probably first outlined by Morris (1938). His triad was sign-based, not only designed for language studies. Even more important, he argued that interrelationships between aspects challenged studies of (all) sciences.

Morris takes semiosis as his point of departure. He calls the relations of signs to objects the semantical dimension of semiosis, and the study of this dimension semantics. The relation of signs to interpreters is called the pragmatical dimension of semiosis, and the study of this relation pragmatics (Morris 1938: 6). Signs also relate to other signs, and these relations are called the syntactical dimension of semiosis. He sees syntactics as the study of how signs relate to one another (Morris 1938: 13). In his view the “[…] various dimensions are only aspects in a unitary process. Nevertheless, one might speak of a pure semiotics, and thus of pure semantics, pure syntactics, and pure pragmatics” (Morris 1938: 9). For each of these aspects it is possible to trace rules, and taken as a whole “A language in the full semiotical sense of the term is any intersubjective set of sign vehicles whose usage is determined by syntactical, semantical, and pragmatical rules” (Morris 1938: 35). Syntax, semantics, and pragmatics, to use today’s established terms, are components of the science of semiotics, but they are also mutually irreducible components (Morris 1938: 54). Believing that the 1930s were heading toward specialised research in syntax, semantics, or pragmatics, Morris stressed the importance of interrelations of these disciplines within semiotics. Posner (1984: 1) saw this somewhat differently, claiming that among semioticians between the wars holism seemed to be the strongest trend:

They differed in the terms which they coined in order to refer to their goals, "Structuralism" (Jakobson), "Functionalism" (Mathesius and Mukarovský), "Philosophy of Symbolic Forms" (Cassirer), "Umwelt Research" (von Uexküll), "Structural Description" (Carnap), "Sematology" (Bühler), "Significs" (Mannoury), and "Glossematics" (Hjelmslev); but they discovered more and more that these terms only focused on different aspects of positions and oppositions they shared: against atomism and mechanism they all developed a holistic approach; against formalism they investigated sign function; against psychologism they showed the possibility of an inter-subjective analysis of meaning; against biographism and historicism they favored synchronic studies; against academic conservatism they introduced criteria for the criticism of sign behavior; against the self-isolation of the academic disciplines they practiced interdisciplinarity.

Epistemological positions within communicational studies can be paradigmatic, representing antagonistic schools of thought regarding views on language, communication, and ‘reality’ (Kuhn 1962). Today many variations are at hand. Beaugrande found more than 80 kinds of grammar and Wikipedia lists a range of communicational theories (de Beaugrande 1998; Wikipedia 2020b). Yet, the number of utterance aspects relevant for this study can roughly be reduced to the aspects constituting utterances in contexts: form, content, and act. What could be termed ‘monadic’ approaches focus on just syntax, semantics, or pragmatics. Following the framework’s definition of utterance and applying Morris’ interrelated components, these may form dyads (syntax/semantics, semantics/pragmatics, and pragmatics/syntax) and even triads, by combining key aspects in different ways (Bühler 1934; Searle 1971; Habermas 1981; Miller 1984; Halliday 1994; Kattenbelt 1994; Martin 1997; Altman 1999; Ongstad 2019). While most triadic theories tend to prioritise the pragmatic dimension, this study applies Habermas’ principle of balanced simultaneity between aspects (Habermas 1998). Finally, utterances occur in time-space contexts. Some studies might depart from environment or time (Agnus 2012; Bakhtin 1981; Magnus 2011; Watson 2014). Such studies are relevant, but are not included in this study, where focus is on the internal triad.

By focusing on semantics as such, one risks losing sight of the other dimensions: syntax, pragmatics, and context (“the blindness of focusing”). This dilemma thus brings to the surface a series of epistemological and methodological challenges of studying animal semantics. The overarching nature of the framework with its five aspects and four levels is designed to be broad and deep enough to reveal such backgrounding. The overall project hence aims at operationalising the framework by developing a simple design for meta-studies of other researchers’ studies as empirical data, mainly by content analyses and studies of communicative positionings (Ongstad 2007). Other adequate frameworks do exist, but tend to favor one or two aspects, for instance pragmatics/functions (Scott-Phillips and Kirby 2013) or form and function (Kleisner 2015).

Scope, Data Resources, and Approaches (‘Positioning’)

Narrowing the Scope

The overall project’s aim is to trace utterances that might reveal existence of animal life-genres generated by life-functions. Life-functions are based on physiological functions, such as homeostasis, organization, metabolism, growth, adaptation, response to stimuli, and reproduction (Wikipedia 2020a). While the first four mainly concern inner (intrinsic) communicational processes, the last three can be associated with extrinsic observable utterances in different communicational channels (Barbieri 2012; Wikipedia 2020b).

Vertebrates, including human beings, may use four or more of these senses for communicational purposes (Finnegan 2014). Many studies of animal semantics have been concerned with just one of these sub-types, vocalisation, often restricted to a handful of taxa, mostly motivated by dis-/proving evolutionary connections to human language (Stegmann 2013). The many sub-kinds of vocalities are, seen from genre theory, of particular interest since they might have different functions in a species’ lifeworld. In Witzany (2014) three chapters dwell on vocalisations: Jensvold et al. (2014) (chimps), Stoeger and de Silva (2014) (elephants), and Faragó et al. (2014) (wolves and dogs). Table 1 is based on their investigations.

Table 1 Animal communicational modes and their sub-types

These sound types are examples of potential kinds of utterances and therefore potential life-genres. As differentiated sets of sounds they are integrated into a communicational system. Among these, calls are probably some of the most researched and discussed vocalities in the field’s literature, especially alarm calls. Gill and Bierema (2013: 451) have made a summary of six bird species for which evidence exists for functional reference in alarm calls: Scream, Cut!_cut-cut-cut-KAAAAH!, Buzz, Trill, Alert, Attack, Perched hawk, Ki-ki, Long croak, Short croak, gargle, Mew-a, mew, Siren, Chicka, Jar, Seet, Chuck, Yeep/shirp, Seet, Chip, metallic chip. (The ‘Englishness’ of these onomatopoeic terms will not be discussed here.)

The above step-by-step narrowing, starting from all kinds of animal utterances ‘down’ to specific alarm calls in particular species, is meant to draw attention both to what is ‘bypassed’ by the focusing and to the size of the domain for further semantic investigation in the animal world. Besides, ‘stepwise’ narrowing even hints at certain historical, epistemological routes studies of animal semantics have taken to trace missing links or reveal ‘red herrings’ in animal semantics (Wheeler and Fischer 2012: 195).

Empirical Studies as ‘Data-Resources’

Twenty-five years ago, there was a lack of relevant empirical studies of animal utterances. Today the situation is altered, as the challenge rather is to simplify searches for adequate studies. Confronted with a surplus of research, the choice of studies is roughly restricted to three anthologies, two scientific journals, and a book, namely Stegmann (2013), Witzany (2014), Andrews and Beck (2017), Biosemiotics and Animal Behavior, and Håkansson and Westander (2013). All sources provide relevant, recent, cutting-edge research and debates on animal communication in general, and animal semantics in particular (Manser 2013). In total this ‘excerpt’ makes up more than a hundred studies published between 2010 and 2020. They are not necessarily representative of the whole field of animal semantics but seem sufficient to trace different kinds of studied content concepts (Ongstad 2019). By making content a focused dominant, prioritising reference, and thus semantics, as aspect and utterance as level, other communicational aspects and levels are backgrounded (Jakobson 1935).

Positioning/s

Given hypotheses, framework, and resources, how could the question Can animals refer? be answered? It seems precise, but difficulties occur: Is it sufficient to make it likely that animals might be able to? Is it enough to find just one case? How should refer be defined? These worries are mentioned but not addressed. The main focus is rather the sub-title, Meta-positioning studies of animal semantics, making clear that positioning is the method and animal semantics the study-object. Communicatively animals are positioning when uttering, basically by accentuating either themselves or something or others. Further, researchers position animals’ communication by studying the weight of their expressivity, referenciality, and/or addressivity in utterances, to use Bakhtin’s own concepts (Bakhtin 1986: 60–101; Ongstad 2004). Meta-positioning is hence to analyse both these levels of communication in parallel while making explicit one’s own position when analysing (Ongstad 2007). In all three cases genres play a role for balancing between aspects (Ongstad 2014). Even if content, and thus semantics, is prioritised both by researchers and study, a key research aim will be to study critically the role of other key aspects.

Examining Resources

Searching Key Semantic Concepts

This sub-section presents terms for content in the field of animal semantics (Manser 2013). They stem from a range of theories, schools of thought, and research traditions, most of them from the six resources. They were first collected in Ongstad (2019: Table 2): aboutness, code, concept, denotation, functional reference, indication, information, meaning, mental object, representamen, representation, symbol, semantic signal, representational function. It is not claimed that these notions necessarily cover the ‘same’ phenomenon, just that they seem coined as semantic aspects, not what communicative utterances look like or do, but what they could contain or be about. Except for three, they are commented upon briefly, considering relevance and adequacy for this study. Information, functional reference, and reference are discussed in detail. For the sake of visual order, terms are marked in italics and separated into paragraphs.

Table 2 Researched animal vocalities for chimps, elephants, dogs, and wolves

Aboutness (Adams and Beighley 2013: 404) is an interesting term when discussing a content’s degree of specificity. Although its low specificity could be functional, it will not be discussed in detail.

Concept is primarily a mental phenomenon and should be seen as source for content.

Code is in use in bio-communication but is hard to find an adequate place for in the framework.

Denotation is not relevant since a linguistic approach is refuted.

Indexing is pointing out and indication its result, implication, or consequence. This concept is closer to signal than to content (Bühler 1934). Besides, its logic could indicate (sic!) that utterance is seen from an interpreter’s (meta-)perspective rather than being a content.

Meaning is, according to Adams and Beighley (2013: 402–403), too open-ended to be used validly in empirical studies. It is associated with verbal meaning in textual studies and quite controversial (Stegmann 2013).

Mental objects are content elements, but to avoid confusion objects are seen as something referred to, in the animal world probably mostly concrete and material.

Representamen is a semiotic concept, but since utterance is the study’s main focus and not sign, it is dropped.

Representation is not relevant since a linguistic approach is refuted.

Semantic signal as part of an utterance should primarily be seen as an act, and not be confused with content, when analysing.

Symbol could be a content element, related to a semantic sphere, but this term might be confused with the literary concept symbol.

Of the notions left, information, functional reference, and (just) reference, the two first dominate debates in the field of animal semantics. These discussions will be tapped into, not only searching for an adequate overall concept, but even to locate different communicational positions.

Disputes over ‘Information’

In his article What is information? Barbieri (2012) claims, based on paradigmatic discussions on information, that we are confronted with a new kind of non-computable observables. He differentiates between kinds of information, starting by stating:

The discovery of the double helix suggested in no uncertain terms that the sequence of nucleotides is the information carried by a gene (Watson and Crick 1953). […] Heredity became the transmission of information from one generation to the next, the short-term result of molecular copying (Barbieri 2012: 1).

Information and modern biology did in this sense establish a symbiotic relationship, although some biologists, such as Wächtershäuser (1997), consider the concept information to be teleological, and others, for instance physicalists, see information as a metaphor and not the real thing. Barbieri thinks these biologists have overlooked that “life is artefact-making”. Accepting that there exist manufactured molecules, it becomes clear for Barbieri that sequences and coding rules are real, what he calls observables. From this he draws three conclusions, of which the third is particularly relevant for further clarifications:

The third consequence is a new understanding of information. Biological information is indeed the sequence of genes and proteins, but the nature of these sequences has so far eluded us. Now we realize that they are objective and reproducible but non-computable observables. They are nominable entities, a new type of fundamental observables without which we simply cannot describe the world of life (Barbieri 2012: 5).

In other words, life contains different kinds of, and even levels of, semantics (Manser 2013). One type obviously works between lower parts in a living system where something simply is sent, such as a recipe for making new cells. To call this information or coded information should not be controversial. The challenge occurs when a more advanced system, for instance an animal’s brain, has become so complex that basic life-functions in principle can be affected by a conscious mind (de Jong 2002; Deacon 2013; Suzuki 2016). Although all aspects of communication are involved in developing minds, semantics seems especially important.

As stated, the project will not compare verbal language and animal communication or try to reveal assumed direct lines between the two. Rendall and Owren (2013) have dealt with this issue, critically. They document how theorists in the field since the 1980s have imported linguistic constructs in analyses of animal communication (Rendall and Owren 2013: 154–155). In addition, they add updated examples from articles in Animal Behavior in 2011 of how information theories and different concepts are used in current animal communicational research (Rendall and Owren 2013: 158–161). They refute these approaches and claim that animal signs should be seen, with Peirce, as semiotic indexes, not as linguistic symbols.

Adams and Beighley (2013) defend the use of information, arguing that what makes it relevant and worthwhile is exactly that of indication or what Grice called ‘natural meaning’ (Grice 1991; Adams and Beighley 2013: 416). It is included in their understanding that signals are (loosely) about objects or events in the environment. They argue that “[…] aboutness is sufficient for the purposes of explaining animal calls and signals. There is no genuine linguistic meaning or linguistic reference in animal calls, but they are about things in virtue of indicating them” (Adams and Beighley 2013: 416). Against this view Rendall and Owren again argue:

On the one hand, we concur with the authors’ argument that the indication function of non-human signalling is fundamentally different from the falsifiable reference value of human language. On the other hand, Adams and Beighley fail to address the deeper issue that information is not a scientifically useful construct if understood only as folk-intuition and metaphor (Rendall and Owren 2013: 419–420).

A problem with information, from the perspective of utterance theory, is its weak connection to newer, updated theories of communication and semiotics. Historically it is closer to hard sciences (Battail 2009). Besides, it is not easy to combine with either the idea of a dynamic sign or genre theories (Ongstad 2019). Although the project will not use information, empirical work based on information theory will not be refuted. Further, regarding Adams and Beighley’s concept aboutness, it might, in its extreme generality and openness, be useful for describing content in animal communication, although its general nature will make research more of an interpretational than an empirical matter.

Disputes over ‘Functional Reference’

‘Functional reference’ has been studied empirically by Macedonia and Evans (1993) and later by many others. It could be seen as a notion developed in studies of human communication, to describe animal, and particularly non-human primate, capacity to handle possible semantic elements. Wheeler and Fischer (2012: 195) (therefore?) abandon the concept, arguing that focusing on context-specific calls plays down the potentially more complex processes underlying responses to more unspecific calls: “In this sense, we argue that the concept of functional reference, while historically important for the field, has outlived its usefulness and become a red herring in the pursuit of the links between primate communication and human language.” They recommend removing it from the animal communication literature in favour of more accurate and linguistically neutral descriptions such as “context-specific signals”, “predator-specific alarm calls”, or “food-specific calls” (Wheeler and Fischer 2012: 204). These alternatives seem to dismiss the idea of a semantic content though. In other words, they prioritise function (pragmatics) and context, seemingly seeing operationalisation as feasible. From the perspective of utterance theory, admittedly stemming partly from text-theory, their argument might make sense in a more restricted perspective. However, their conclusion could be questioned, since in utterances, and thus in communication, all key aspects may have a joint function (Bühler 1934; Morris 1938; Bakhtin 1986). A key point is simultaneity and thus balanced complementarity (Habermas 1998). They recommend ending the search for true referenciality. If it exists it can only be associated with verbal language or scientific discourse, not with animal communication.

Scarantino and Clay have worked out a re-definition of functional reference, arguing that semantics and pragmatics should be seen as complementary rather than alternative to one another when analysed in context. A common ambition for definitions of some concepts, such as code, semantic signal, and information, is an effort to make them into measurable, researchable closed categories, coined for independence of context. Scarantino and Clay (2015: e2) investigate studies that might threaten the idea of such context independence (Rendall et al. 1999). Such studies demonstrate how calls may change their referent depending on contextual cues: “Determining how context contributes to the derivation of meaning is integral to understanding what signals functionally refer to” (Scarantino and Clay 2015: e3).

Their redefinition should be seen as a combination of the sign and features of the context of production that contribute to determining the referent (Scarantino and Clay 2015: e4). They underline that the ‘standard’ definition of functional reference now should be seen as a special case of their new definition, instantiated when signals have a referent independently of contextual cues. With this expansion the epistemological basis has been moved away from essentialism toward contextualism:

The traditional definition of functional reference requires that signals and referents are strongly correlated and that contextual cues play no role in determining reference. We have rejected both assumptions and formulated a new diagnostic test for functional reference that allows signals to functionally refer by virtue of contextual cues and in the absence of a strong correlation with their referents (Scarantino and Clay 2015: e7).

However, they reject a return to the notion meaning. They point to Wheeler and Fischer’s (2012: 204) emphasising the importance of context in shaping how receivers respond to signals. Scarantino and Clay have examined Wheeler and Fischer’s alternative meaning-based proposal, and concluded that a comparative advantage of the functional reference framework is that it “[…] avoids the ambiguities surrounding the notion of meaning and it has empirically verifiable criteria of application that constitutively tie what signals refer to with how receivers respond to them” (Scarantino and Clay 2015: e7). Wheeler and Fischer (2015: e12) are still not convinced. They believe the revised definition does little to rescue the concept of functional reference.

Fitch (2005), a defender of a Chomskyean view on language, stressing its systemic character (Hauser et al. 2002; Fitch 2010), has made an extensive review of studies of animals’ capacities to handle semantic entities, focusing on debates over functional reference. He concludes that apes appear to lack ‘functionally referential’ calls and that a set of few calls does not “constitute any language”. He finds no evidence for a rich propositional semantics in the natural communication system of any nonhuman species, referring to Hauser (1996) (Fitch 2005: 205). A key criterion is intentionality, which all studied animals seemingly fail to meet; here he refers, among others, to Tomasello’s works. He concludes that a variety of experiments strongly suggests that calls are not intentionally referential on the part of signallers: “[…] callers do not appear to shape their calling in ways relevant to the knowledge (or lack thereof) of their listeners [...] ” (Fitch 2005: 205). Jackendoff and Pinker (2005) criticised Chomskyean syntax for being too narrow as a model for the description of animal communication.

There are some additional points to be made about functional reference. Firstly, the notion mostly, and somewhat contradictorily, seems to be treated implicitly as both a monadic category and a pragmatic-semantic dyad. Secondly, to study individual utterances and to study a species’ communicational system are two different, yet connected enterprises. Fitch’s critique that calls are too few to constitute a language could be critically rephrased: a species’ communicational system is not necessarily restricted to the number of different calls used. To conclude: functional – yes. Reference – yes. Functional reference – no.

Reference as Joint Concept for Animal Semantics?

Already in the 1990s Evans wrote on referential signals (Evans 1997). Both Rendall and Owren and Wheeler and Fischer have indirectly critiqued the concept of reference. However, to refer might be simplified to mean to establish, in the mind of an utterer, a relation between a concrete, uttered form and something else (abstract). This perception sees referenciality as one of five constituting aspects of utterances (Bakhtin 1986). It further connects entities in the inner and the outer world (Habermas 1998). It can be seen as general, as aboutness, and at this stage not yet necessarily specific (Adams and Beighley 2013). An investigation of the nature of different kinds of references would require a further development of semiotic theories based on a neurological understanding of animals’ minds (van der Vaart and Hemelrijk 2014). Versions of newer Peircean semiotics are theoretically interesting, but seem generally hard to apply, and relevant empirical studies are few (Ongstad 2019). In this phase it seems premature to make final statements about the more specific nature of animal references. They are in any case likely to differ quite dramatically, as hinted in the tables.

A particular complication with ‘refer’, is its implicit direction in the chain of uttering. It points from the utterer to something, by which the faculty of referring might seem cognitively and communicatively more advanced than to take in, understand, or conceptualise uttered references. It even reveals an important theoretical conflict between active and passive communicational positions. The same holds for utterance and interpretation. This dilemmatic split asks for a broader discussion of animals’ mental categorisation of phenomena in their lifeworld (Habermas 1981; Ongstad 2019).

The inspection of different content concepts in the field has, in spite of criticisms, led to the temporary conclusion that existence of referenciality in animal utterances is possible, and that further investigations seem worthwhile and feasible. Referenciality and references are expected to be found along with expressivity of form and addressivity toward others in utterances in particular contexts. Referenciality can roughly be understood as aboutness, whereby reference might seem somewhat vague. This generality might even be an advantage though. In specific cases, the content element is probably mostly indexical.

Hence, what is sought are well-documented animal studies of referential content elements which, at the end of the day, might contribute to a discussion of the degree to which actual kinds of utterances can serve as examples of animal life-genres. Having landed on reference as a preferred concept for a possible semantic content in animal communication, a critical next question is to what degree such assumed references are understood by animal receivers, if at all. Direct dynamics between utterers and receivers should be seen as just a part of a species’ more overall communicational system (Scott-Phillips and Kirby 2013). In other words, a framework should contain a defined set of interrelated levels between which the research process should commute.

The next part positions antagonistic studies of biocommunication. Some debates look like ‘semantic wars’ over right and wrong concepts. However, the deeper conflict seems more paradigmatic, a rivalry between schools of thought and between different scientific positions (Posner 1984). Meta-positioning of studies of animal semantics, the second part of the article’s title, implies inspecting such debates from the extended perspective offered by the framework, moving beyond both reference and utterance as delimited aspects or elements. In the last parts the focus is therefore moved from reference and utterance more toward genre and system. The variety of descriptions of animal semantics seems rooted in the choice of communicational theories. Priority is here given to studies of vocal animal utterances since these seem best suited for a paradigmatic discussion over how the question Can animals utter? has been answered.

Positioning Communicational Positions

Two Major Challenges

Meta-analyses of studies searching for possible references in birds’ vocalisations, especially birdsong, can illustrate the challenge of balancing different epistemological regimes. Inspections have been faced with basically two major epistemological challenges: firstly, the interrelatedness of aspects, or parts versus whole (Morris 1938), and secondly, the possible systemic nature of animal communication systems and life-genres (Ongstad 2019). The former will be presented by sketching a historical development from monads (of either syntactic, semantic, or pragmatic aspects) toward different dyadic combinations, and finally toward triads. The latter will be illustrated by examples of how researchers and projects have coped with these issues.

Interrelatedness – On Monads, Dyads, and Triads as Basic Research Designs

Within communication theory one could in the past often come across studies of monadic entities such as form (structure) or content (reference) or use (act) as closed, delimited categories (Nystrand et al. 1993). In other words, as objects for pure syntax, pure semantics, or pure pragmatics, mainly by leaving out the role of context (Morris 1938). Today such approaches in ethology and animal communication are rather rare. As Kleisner (2015: 369) claims: “Current, twenty-first century biology brings so many new findings that it may be the right time to reject the theoretically naive efforts to explain life from just one level of biological organisation.”

A clarifying text about the semantics of calls is Manser (2013), where she outlines briefly historical lines of animal semantics and discusses controversies around referential function, including debates in Stegmann (2013). Despite the interest of bridging animal communication with human language, her examination seems relevant for a meta-study of animal utterances and life-genres.

Early theory and research were often, in my word, monadic. Studies of animal communication departed from a Darwinian legacy, stressing the affective/emotional dimension. With Struhsaker’s 1967 study and Seyfarth et al.’s further development, focus moved from ‘affective’ to semantic signalling (Struhsaker 1967; Seyfarth et al. 1980a, 1980b). The debate that followed even created an early hybrid, functional reference. Historically, all three key aspects in utterances were at this point in play, but still separated. In Manser’s words and with my italics:

Traditionally, animal signals were considered as being ‘affective’, providing information about the internal motivational state of the signaller and/or the behaviour in which the signaller was likely to engage (Smith 1977, 1981). In addition, rather than using a linguistic approach based on a transfer of information and subsequent representation and trying to identify the underlying cognitive mechanisms of vocal communication, behavioural ecologists had defended a more functional approach. (Manser 2013: 492/SO’s italics).

Most studies touched upon seem aware of the challenge of separating utterances from their context, environment, habitat, or situation. Researchers search for combinations, especially of calls dependent on contextual information. Under the headline Pragmatics, Manser argues that it has become increasingly evident that “[…] even semantic signals with a high production and perception specificity depend to some extent on the context/circumstances in which they are produced” (Manser 2013: 494). Manser (2013: 494) ends by pointing out a difference between two particular fields: Primate communication has focused mainly on identifying cognitive mechanisms in communication over recent years. In contrast, investigations of bird communication have mainly adopted a functional approach and have been less concerned with the representational nature of vocalizations, with the exception of the work on chickens (Gyger et al. 1987; Evans and Evans 2007) and on several corvid species (Bugnyar et al. 2001). In the framework’s terms, these are pragmatic and semantic approaches. If combined, they make up a dyadic design, briefly exemplified in the following.

First: In dyads a ‘third’ aspect is given an insignificant role or is hardly mentioned. A recent example of a dyadic relational approach is Suzuki et al. (2020: 1). They have introduced a framework for exploring the interaction between syntax and semantics (i.e. the syntax-semantic interface) in animal vocal sequences: “[…] We outline methods to test the cognitive mechanisms underlying the production and perception of animal vocal sequences and suggest potential evolutionary scenarios for syntactic communication” (Suzuki et al. 2020: 1).

Second: Combination of form and function. Kleisner (2015: 371) holds:

In biological discourse, all aspects of organisms are usually treated within the framework of conceptual dualism of form and function. Scientists usually emphasise either the formal or the functional aspect and promote it to a general organising principle of life. Sometimes, form follows function, other times the order is reversed (Kleisner 2007; Russell 1916).

Third: Examples of combining the semantic/cognitive aspect with a pragmatic/social one (as Manser above suggested) are fewer in studies of animal communication. Scott-Phillips and Kirby (2013: 430–431) suggest a framework for animal communication by adopting and prioritising a functional definition. A behaviour’s function: “[…] is the task that it performs that at least in part explains why it is produced from one generation to the next.” Communication in their view is basically a matter of effects. In the spirit of Austin (1962) they underline that signals and thus utterances do things. With a notion from Jakobson (1935) they, in my view, make pragmatics the dominant. Content/semantics is not totally backgrounded though: “Only once we know what they do can we identify information, conventional meaning, and other associated phenomena […]” (Scott-Phillips and Kirby 2013: 433).

Their framework has been applied in various fields. They argue that an advantage of their view is that it explains how communicational systems emerge, an argument close to their claim that the social influence view is “[…] particularly compatible with evolutionary theory” (Scott-Phillips and Kirby 2013: 433). Their outlined framework can be characterised as dyadic in the sense that pragmatics (functions, doings) are explicitly combined with semantics (information, ‘meaning’). Yet, it is not balanced and not triadic. Function seems to be given the upper hand in the dyad and the ‘third dimension’ seems absent. Nevertheless, there are overlaps with the framework applied in this meta-study. When I first read their article I underlined the passage: “[…] we need a theoretical framework and a vocabulary with which to discuss different possible grades of pragmatic competence” (Scott-Phillips and Kirby 2013: 435). In the margin I added: Animals can feel, think, and act. When they utter they have to do this at once. Their emotions are in a certain state. The utterance is about something that may need thinking/consideration. And it does something. This ‘at-once’ perspective (of the three aspects) offers a more holistic approach and hints at this meta-study’s position in the paradigmatic tug of war. Without going into details, another example of a semantic/pragmatic dyad is of course Scarantino and Clay (2015), as touched upon when discussing functional reference. They see the two aspects as complementary. Triadic approaches in general theories of communication are plentiful. For an overview see Ongstad (2014).

In studies of animal communication explicit triadic approaches are still rare or non-existent (Francescoli 2017; Witzany 2014; Ongstad 2019). Zuberbühler (2006) comes close to a triadic design. His combination of aspects seems inherited with the triad established within the pentagonal model of communication, and seemingly in line with Witzany (2014: vii): He investigates structure, or in his own word syntax, the cognitive basis for alarm calls, or in his own word semantics, and use. He is not using the word pragmatics, which might imply that his theoretical approach (here) is not primarily or dominantly ‘functional’.

According to Zuberbühler the structure of alarm calls is (after all) quite simple and to a high degree genetically related. As Seyfarth and Cheney (1997) have suggested, primates seem predisposed to divide animals into just a few groups and only develop alarm calls to a restricted number of predators. Regarding a particular species, East African vervet monkeys, individuals are claimed to be able to produce calls differentiating between five entities: large terrestrial carnivores, eagles, snakes, baboons, and unfamiliar humans (Zuberbühler 2006: 7).

Meta-studies of alarm calls have made clear that discussions of reference in animal communication should (normatively) search for a wider theoretical horizon and thus even a more differentiated methodological scope. Manser (2013) hints that cognitive and functional approaches could be combined in the future. This is still a dyad though. To these aspects of communication, one could add form (Prum 2018) and emotion (Altenmüller et al. 2013). Some researchers stress a syntax–semantics interface in studies of animal vocal communication (Suzuki et al. 2020). Such approaches ask for adding a pragmatic perspective. It should be kept in mind that the concept pragmatics is complex and challenging (Bar-On and Moore 2017). Finally, context has been pointed out as a crucial aspect in some investigated articles (Witzany 2014; Weible 2011; Wheeler and Fischer 2012; Fedurek et al. 2016), but hardly discussed in a triadic perspective or related to kinds of communication.

Traces of Systemic Openness of Utterances and Life-Genres

By leaving the level of utterance one faces phenomena and processes seemingly hard to catch with traditional methods, such as lifeworld (Habermas 1981), systemic system (Martin 1997), dynamic context (Duranti and Goodwin 1992; Ongstad 2005) and genricity (Frow 2015).

Zuberbühler (2006) argues that alarm calls in an evolutionary perspective may contain an existential factor for some, but not necessarily for other species. Existential could be interpreted here as a crucial life function in some animals’ lifeworld. He holds that to study receivers’ reactions or use is to study the context of communication, not just the utterance itself (Zuberbühler 2006: 6). He further discusses monkeys’ and non-human primates’ ability to tap into other animals’ alarm calls, so-called eavesdropping. He concludes that:

[…] primates may be unique in their abilities to attend to the semantic properties of other species’ alarm calls. […] Research on Campbell’s Diana monkey system showed that alarm calls undergo semantic adjustments in the minds of the recipients, depending on the sequencing of alarm calls (Zuberbühler 2006: 10-11).

To accept that reference relates intimately to other aspects, and that utterance relates closely to genre, context, and lifeworld, implies accepting the idea of openness and hence a systemic system. A closer inspection of Seyfarth et al. (2010) can illustrate what could be meant by openness. They discuss the importance of (semantic) information in studies of animal communication. Three headlines can indicate some claimed basic principles for relationships between utterances and their reception:

a) Calls with similar acoustic features can elicit different responses.

b) Calls with different acoustic features are judged to be similar.

c) Animals’ responses depend on the relationship instantiated by two signals.

In my terms they claim that a species’ communicational system is not necessarily bound to an exact preciseness of the reference (information) in an utterance (a signal, a call, a song etc.). In their terms there exists a receiver’s flexibility (Seyfarth et al. 2010: 5) and contextual flexibility is crucial for animal communication (Snowdon 2008).

Regarding a) it is commonplace that animals might react differently to acoustically similar calls. They refer to Evans and Evans (1999) who found that chickens “[...] respond in very different ways to food calls and to ground predator alarm calls even though the calls have similar acoustic characteristics” (Seyfarth et al. 2010: 5). Evans and Evans (2007) showed that chickens also responded differently to the same food call, depending on whether they already knew about the presence of food (Seyfarth et al. 2010: 5). Seyfarth et al. then give more examples (of other animals) before they conclude that response differences to “[…] acoustically similar calls cannot be attributed to acoustic features alone, but they are consistent with the hypothesis that responses depend upon the integration of information acquired from calls and other contextual cues” (Seyfarth et al. 2010: 5).

Regarding b) Seyfarth et al. argue that several studies of alarm calls (Zuberbühler et al. 1999) are consistent with their own hypothesis: “[…] listeners acquire information from a call, store it in memory, and respond to subsequent vocalizations depending on some combination of acoustic features, information provided by the current vocalization and context, and information stored in memory” (Seyfarth et al. 2010: 5). In other words, a vocalized animal utterance seems to follow the Bakhtinian dialogical principle of openness for differential uptake (Freadman 1987) or in their words, receiver flexibility. However, lack of direct correspondence does not necessarily imply that there is no concurrence. The sounds’ reference values are interpreted relative to different needs and contexts. Information (their term) or reference (my term) within the call/utterance is extracted, interpreted, and related by genres and made sense of in a species’ lifeworld. The communicational system is thus systemic (Martin 1997).

Point c) concerns the role of context. Through the lenses of the framework a singing bout can be seen as a genre, a communicational contest involving a fight over territory. Seyfarth et al. refer to Beecher et al. (1996) who found that song sparrows, Melospiza melodia, can mirror a neighbour’s song, by picking a song type from their own repertoire that can ‘match’ what a neighbour sparrow just sang. Burt et al. (2001) found that subjects responded more strongly to a match than a nonmatch (Seyfarth et al. 2010: 6): “It was not just the acoustic properties of the song that determined the neighbour’s behaviour […], but whether or not the two song types matched” (Seyfarth et al. 2010: 6).

Their conclusion so far can be rephrased as follows: When interpreters of the same species react differently to (two) identical utterances coming from different locations, they combine reference elements in the utterance just heard with memory elements about where the utterer is typically located. The use of reference elements in utterances (calls) hence depends on a non-fixed context. Such differentiation is not only connected to place (territory) but even to dominance in a relationship. Eavesdropping is common among songbirds: interpreters may adjust their reaction depending not just on the calls’ acoustic properties, but also on the relationship between callers (Seyfarth et al. 2010: 6).

Taken as a whole these three points (a, b, and c) disturb essentialist perceptions and definitions of reference as a closed category (as ‘just’ information). They illustrate that a physically oriented research design should be combined with a communicational one in order to grasp a possible deeper nature of reference in animal vocalisations. They also show that utterances should not be seen as closed ‘objects’, but as contextually dependent. Besides, some animals may have schematised contextual ‘situations’ (Tomasello 2014) and, through genrification, stereotyped utterances (Frow 2015; Ongstad 2010). With Bakhtin (1986), such utterances have ‘sense’ and effect in a dynamic (dialogic) chain of ever new utterances, adjusted to new situations. Such patterns may tend to be ‘frozen’ into genres (typified utterances) since they have proved functional over time, and hence contributed to a flexible system for an animal community.

The project’s framework sees such genres as contextual in basically two ways. Not only are they situational, as they tend to stereotype patterns for interaction for participants in an event. In addition, utterances will not necessarily create the exact same reaction each time since genres are both ritualised and open-ended (Bakhtin 1986). They are just similar and possibly similar enough (Ongstad 2019). An actual call is a plain utterance. However, the setting, the relationship between utterers/callers and receivers/listeners, can be seen both as a context and a genre, simultaneously. An observer is therefore forced to interpret at a new level. The actual habitat may have a say. Manser (2013) explains for instance that of two species living in the same region, meerkats and ground squirrels, one uses alarm calls, the other not (Macedonia and Evans 1993). This is not due to different possibilities for escape for the two species, but probably because of their different ways of foraging in the environment. In other words, for contextual reasons. The alarm system is needed for the sympatric meerkats, Suricata suricatta (Manser et al. 2002; Furrer and Manser 2009). Manser (2013: 492) argues:

The evolution of predator type-specific alarm calls in meerkats seems more likely to be related to the cohesion of the foraging group and the need for individuals to coordinate their escape, while this is not the case for the ground squirrels which typically forage next to their burrow system.

Further, Manser has pointed to the role of predator type when interpreting acoustic structures in suricates’ alarm calls (Manser 2001). Yet another example of the possible open-ended character of alarm calls, seen as genre, is given by Zuberbühler et al. (1999a, 1999b). In playback experiments Diana monkeys transferred habituation across semantically similar calls but not acoustically similar ones (Manser 2013: 493): “This suggests they were not responding to the acoustic features alone; instead, their responses were mediated by the similarity of the meaning of the presented stimuli.” This is another example of the complementary nature of the relationship between utterances and context, and thus even of genre and context. Elsewhere, Zuberbühler (2006: 7) has suggested that alarm calls might be the output of certain psychological states, invoked by certain types of events, in my word genres.

A (self-)critique of these interpretations could be that the open nature of utterances and genres makes it too easy to ‘discover’ ever new genres in animal communication. Openness even complicates validation. It increases the possibility of finding new cases, but makes them harder to prove. The project risks being stuck with just degrees of likelihood.

Conclusions on whether Animals ‘Refer’?

It should be asked critically what has been put aside during this meta-study. Admittedly, inspections have brought to the surface a few issues that have not been covered. Some of these were explicitly dropped, in spite of their relevance for the ‘Can animals refer?’ question. For instance, a deeper discussion of mind (Andrews and Beck 2017), the issue of perceived (‘passive’) understanding of concepts (Pilley and Reid 2011), discussions related to Gricean criteria applied to studies of animal semantics (Moore 2016), and the problem of using English onomatopoeic terms for animals’ communicative sounds as universal research categories (Ongstad 2019). They were dropped due to lack of space, not to lack of relevance.

This study originally set out asking Can animals refer? During initial investigations it progressively became clear that the different ways this question was answered by the different research traditions resulted in a rather splayed outcome. The second part of the title, Meta-positioning studies of animal semantics, therefore became the reframed, ‘second’ prioritised focus, aimed at catching different communicational theories behind the many answers. What the article has therefore in addition been up to is to position, communicatively, some of these tacit epistemologies.

By means of a broad definition of communication, related specifically to a framework designed to connect utterance and genre, debates over key content concepts used in animal studies have been inspected. Inspections have highlighted and given credit to studies underlining the importance of and the need for studying other aspects in relation to semantics. The meta-study further gives support to researchers interpreting information and (functional) reference in a wider context. Although one could be positive to such extensions, a main, normative conclusion nevertheless remains – studies of animal communication should theoretically be seen as abstract, real wholes (Morris 1938) while in parallel investigating their ‘parts’ or aspects empirically.

Regarding the specific issue of reference in animal utterances, the study has cautiously taken for granted that utterances at least seem to be about something (Adams and Beighley 2013). On the one hand, solid proof of uttered and received references that might have been indexical was not found. On the other hand, the possibility of an animal capacity to ‘passively’ store semantic elements as mental percepts, as hinted in the introduction (Herzog et al. 2016), has not yet been refuted: If some animals are able to relate adequately to received utterances containing possible specific content, for instance given as tasks to dogs by humans (Pilley and Reid 2011), it may increase the likelihood of the existence of an advanced ability functioning as an evolutionary pre-condition to achieve the faculty of referring. The active/passive dilemma therefore asks for further studies of (some) animals’ ability to understand and relate rationally, for instance to human terms and concepts. The next study will problematise, in the light of theories of mind and specific animal life-genres, how such and other content elements may relate to structure, addressivity, and dynamic context.