1 Introduction

In the last two decades, research in dynamic semantics has yielded a breadth of insights into the relation between compositional semantics and dynamic semantics by viewing prima facie non-compositional phenomena as arising from computational side effects. The effectful approach to meaning allows one to view compositionally recalcitrant features (e.g., discourse referents, intensionality, and scope-taking) as giving rise to a rich but uniform structure which integrates them with truth-conditional meaning. This integration is done by injecting truth-conditional meaning into the effectful structure in a natural way, such that the structure gives rise to a functor. This pattern of analysis has come in a variety of forms: continuations to study scope (Barker 2002; Barker and Shan 2014; de Groote 2001), graded applicative functors to study quantification (Kobele, 2018a), and monads to study discourse referents, anaphora, and indefiniteness, among other phenomena (Shan, 2002; Giorgolo & Unger, 2009; Giorgolo & Asudeh, 2012; Charlow, 2014, 2020a, 2020b, i.a.).

Progress in understanding dynamic and scopal phenomena in terms of effects, however, has presented two basic methodological questions. On the one hand, given effectful treatments of individual phenomena (say, discourse referents, quantification, and conventional implicature), how does one integrate them into a semantic analysis encompassing them all? On the other hand, how does one study interactions between these phenomena, while simultaneously preserving their individual treatments in the result? In this paper, we address both questions by providing a general framework, based on algebraic effects, for characterizing individual dynamic semantic phenomena, as well as their interactions, in terms of algebraic laws.

Stated in other terms, our goal is to improve the compositionality properties of functor-based theories of dynamic semantics by recasting them as algebraic theories:

  • At a meta-theoretical level, when two phenomena are described by two distinct theories within our framework, we provide a systematic recipe for obtaining a combined theory of both phenomena. The combination is monotonic, in the sense that the predictions of the original theories, regarding either phenomenon, remain unchanged in the combined theory.

  • In individual analyses, when two syntactically adjacent constituents feature two distinct (yet possibly interacting) phenomena, their meanings may always be combined compositionally, in order to obtain a meaning for their combination.

We begin in Sect. 2 with background on monadic dynamic semantics, and we present the issue of compositionality that pertains to monads and monad transformers. In Sect. 3, we axiomatize our approach in terms of the meta-language we use to describe algebraic theories, and we show how meanings may be provided in terms of this meta-language. Section 4 provides an interpretation of this axiomatization in terms of a simply typed \( \lambda \)-calculus with products. We discuss related work in Sect. 5 before concluding in Sect. 6.

2 Monadic Dynamic Semantics

Since the work of Shan (2002), monads have provided a popular interface for semantic analyses employing computational effects. Monads have been used to study anaphora (Giorgolo & Unger, 2009) and conventional implicature (Giorgolo and Asudeh 2012), and have more recently been taken up by Charlow (2014, 2020a, 2020b) to study the interactions among quantification, anaphora, indefiniteness, and binding in a framework that relies on monad transformers.

We assume a general familiarity with monads, but we briefly remind the reader of their structure, in order to introduce notation. A monad M is an endofunctor that takes a given type \( \alpha \) onto a type \(M \alpha \) of computations exhibiting structure that encapsulates some desired side effect, e.g., reading and writing to a store, or non-determinism.Footnote 1 Each monad M is associated with two operators, \( \eta \) (‘return’) and \(\,\star \,\) (‘bind’), having the following type signatures, for any types \( \alpha \) and \(\beta \):

$$\begin{aligned} \eta&: \alpha \rightarrow M \alpha \\ (\,\star \,)&: M \alpha \rightarrow ( \alpha \rightarrow M \beta ) \rightarrow M \beta \end{aligned}$$

The role of \( \eta \) is to inject pure (i.e., non-effectful) values into the structure provided by M, while \(\,\star \,\) sequences a computation of type \(M \alpha \) with an indexed computation of type \( \alpha \rightarrow M \beta \) to produce a sequenced computation of type \(M \beta \).

2.1 Using Monad Transformers: Charlow (2014)

Charlow (2014) introduces a monadic dynamic semantics that combines analyses of anaphora, indefiniteness, and quantification by relying on monad transformers. In particular, Charlow uses a Powerset monad to characterize indefiniteness, and then applies a State monad transformer, in order to obtain a system to characterize both indefiniteness and anaphora in the same grammar. He then applies a Continuation monad transformer, in order to provide a setting to study quantification. Crucially, the analyses that he provides for individual phenomena are extended compositionally to obtain analyses of their combinations with new phenomena.Footnote 2

The Powerset monad \(\textsf {P}\) allows one to analyze indefinite noun phrases (and the expressions with which they compose) as denoting sets, encoded as functions of type \( \alpha \rightarrow t\):

$$\begin{aligned} \textsf {P} \alpha&= \alpha \rightarrow t\\ \eta&: \alpha \rightarrow \alpha \rightarrow t\\ \eta a&= \{a\}(= \lambda x.x = a)\\ (\,\star \,)&: ( \alpha \rightarrow t) \rightarrow ( \alpha \rightarrow \beta \rightarrow t) \rightarrow \beta \rightarrow t\\ m \,\star \,k&= \bigcup _{x \in m}k x(= \lambda y.\exists x : m x \wedge k x y) \end{aligned}$$

This way, the noun phrase a linguist, for instance, will denote the set \(\{x \ |\ \textsf {ling} x\}\) and may be composed with an intransitive verb such as sleeps by injecting the latter into the monad via \( \eta \): \( \eta \textsf {sleep}\). To compose them, Charlow employs monadic functional application (which he overloads with forward and backward application, to be disambiguated by the types of arguments). Functional application (FA) is defined as follows for an arbitrary monad M:

$$\begin{aligned} {\textbf {FA}}\;:\;&M ( \alpha \rightarrow \beta ) \rightarrow M \alpha \rightarrow M \beta \\ \text {or}\;&M \alpha \rightarrow M ( \alpha \rightarrow \beta ) \rightarrow M \beta \\ {\textbf {FA}}\,m\,n\; =\;&m \,\star \, \lambda f.n \,\star \, \lambda x. \eta (f x)\\ \text {or}\;&m \,\star \, \lambda x.n \,\star \, \lambda f. \eta (f x) \end{aligned}$$

Now, a linguist sleeps may be interpreted as \({\textbf {FA}} \{x \ |\ \textsf {ling} x\} ( \eta \textsf {sleep})\), which can be reduced to \(\{\textsf {sleep} x \ |\ \textsf {ling} x\}\); that is, a set of truth values containing \({\textit{True}}\) iff some linguist sleeps.
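The Powerset computation above can be sketched in Python. This is only an illustration under stated assumptions: sets are modelled as finite `frozenset` values rather than as characteristic functions of type \( \alpha \rightarrow t\), and the individuals `ann`, `bo`, `cy` and the extensions `LINGUISTS` and `SLEEPERS` are a hypothetical toy model, not data from the text.

```python
# Powerset monad over finite sets (assumption: frozenset stands in for
# the characteristic functions used in the text).

def eta(a):
    """Return: inject a pure value as a singleton set."""
    return frozenset({a})

def bind(m, k):
    """Bind: apply k to every member of m and take the union."""
    return frozenset(y for x in m for y in k(x))

def fa_backward(m, n):
    """Backward monadic application: argument first, function second."""
    return bind(m, lambda x: bind(n, lambda f: eta(f(x))))

# Hypothetical toy model.
LINGUISTS = {"ann", "bo"}
SLEEPERS = {"bo", "cy"}

def sleep(x):
    return x in SLEEPERS

a_linguist = frozenset(LINGUISTS)                # {x | ling x}
result = fa_backward(a_linguist, eta(sleep))     # {sleep x | ling x}

# The sentence holds iff the resulting set of truth values contains True.
some_linguist_sleeps = True in result
```

In this model `result` contains both truth values, since one linguist sleeps and the other does not.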

To incorporate anaphora, he invokes the following State monad transformer, which takes an underlying monad M onto a new monad \(\textsf {S}_T M\), for some fixed type s of states:

$$\begin{aligned} \textsf {S}_T\,M\, \alpha&= s \rightarrow M ( \alpha \times s)\\ \eta&: \alpha \rightarrow s \rightarrow M ( \alpha \times s)\\ \eta a&= \lambda s. \eta \langle a, s\rangle \\ (\,\star \,)\; :\;&(s \rightarrow M ( \alpha \times s)) \rightarrow \\&( \alpha \rightarrow s \rightarrow M (\beta \times s)) \rightarrow \\&s \rightarrow M (\beta \times s)\\ m \,\star \,k&= \lambda s.m s \,\star \, \lambda \langle x, s^\prime \rangle .k x s^\prime \end{aligned}$$

In this example M will be instantiated to the Powerset monad, and s to the type of lists of individuals. This transformation of the Powerset monad to provide State functionality allows such lists to be accessed and updated throughout semantic composition as lists of discourse referents. To allow the indefinite a linguist to introduce a discourse referent, for example, Charlow defines the following operation, \((\cdot )^\triangleright {}\), for an underlying Powerset monad; we give it here for an arbitrary underlying monad M in the presence of State functionality:

$$\begin{aligned} (\cdot )^\triangleright {}&: \textsf {S}_T M \alpha \rightarrow \textsf {S}_T M \alpha \\ m^\triangleright {}&= m \,\star \, \lambda \langle x, s\rangle . \eta x (x::s) \end{aligned}$$

Here, the operation \(::\) conses a new individual onto a list, thus providing it as a discourse referent. Now, one can associate the sentence a linguist sleeps with a discourse referent by having a linguist introduce it (given an updated instance of FA):

$$\begin{aligned}&{\textbf {FA}} ( \lambda s.\{\langle x, s\rangle \ |\ \textsf {ling} x\}^\triangleright {}) ( \eta \textsf {sleep})\\&\quad = \lambda s.\{\langle \textsf {sleep} x, x::s\rangle \ |\ \textsf {ling} x\} \end{aligned}$$

Thus the meaning of a linguist has changed, given our use of the State-transformed Powerset monad. The new meaning is nevertheless straightforward to obtain from the old one, by means of a function lifting values from \(M \alpha \) to \(\textsf {S}_T M \alpha \):

$$\begin{aligned} lift _\textsf {S}&: M \alpha \rightarrow \textsf {S}_T M \alpha \\ lift _\textsf {S}\, m&= \lambda s.m \,\star \, \lambda x. \eta \langle x, s\rangle \end{aligned}$$

It is in this sense that the addition of State functionality to meanings stated with respect to the Powerset monad is (in principle) compositional. Both the monadic combinators of the Powerset monad, and the meanings it is used to characterize, may be injected into the State setting.
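As a rough sketch of the State-transformed Powerset monad, the following Python fragment replays the a linguist sleeps example with a discourse referent. It assumes finite sets (`frozenset`), states as tuples of individuals, and a hypothetical toy model; `push` is our reading of the operation \((\cdot )^\triangleright {}\), written out over the underlying Powerset bind.

```python
# State-transformed Powerset monad: s -> P(alpha x s).
# Assumptions: frozenset for sets, tuples of individuals for states.

def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    # Run m on the incoming state, then feed each result and its
    # updated state to k.
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def lift_s(m):
    # Embed a Powerset computation; the state passes through unchanged.
    return lambda s: bind_p(m, lambda x: eta_p((x, s)))

def push(m):
    # Cons each returned value onto the state as a discourse referent.
    return lambda s: bind_p(
        m(s), lambda pair: eta_p((pair[0], (pair[0],) + pair[1]))
    )

def fa_backward(m, n):
    return bind_s(m, lambda x: bind_s(n, lambda f: eta_s(f(x))))

# Hypothetical toy model.
LINGUISTS = {"ann", "bo"}
SLEEPERS = {"bo", "cy"}

def sleep(x):
    return x in SLEEPERS

a_linguist = lift_s(frozenset(LINGUISTS))   # old meaning, lifted
sentence = fa_backward(push(a_linguist), eta_s(sleep))
outcomes = sentence(())                     # start from the empty state
```

Each outcome pairs a truth value with the state extended by the linguist in question, mirroring \( \lambda s.\{\langle \textsf {sleep} x, x::s\rangle \ |\ \textsf {ling} x\}\).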

Charlow uses this strategy to introduce analyses of quantificational noun phrases into the monadic setting. Taking inspiration from the continuation-style treatments of quantifiers of Barker (2002) and Barker and Shan (2014), he employs the following Continuation monad transformer, \(\textsf {C}_T\).

$$\begin{aligned} \textsf {C}_T M \alpha&= ( \alpha \rightarrow M t) \rightarrow M t\\ \eta&: \alpha \rightarrow ( \alpha \rightarrow M t) \rightarrow M t\\ \eta a&= \lambda c.c a\\ (\,\star \,)\; :\;&(( \alpha \rightarrow M t) \rightarrow M t) \rightarrow \\&( \alpha \rightarrow (\beta \rightarrow M t) \rightarrow M t) \rightarrow \\&(\beta \rightarrow M t) \rightarrow M t\\ m \,\star \,k&= \lambda c.m ( \lambda x.k x c) \end{aligned}$$

The underlying monad, in this case, is the State-transformed Powerset monad. Like the State monad transformer, the Continuation monad transformer also comes with a lifting function \( lift _\textsf {C}\):

$$\begin{aligned} lift _\textsf {C}&: M \alpha \rightarrow \textsf {C}_T M \alpha \\ lift _\textsf {C}\, m&= \lambda k.m \,\star \,k \end{aligned}$$

Now, a quantificational noun phrase such as every philosopher can be given the meaning \( \lambda c, s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle True, s^\prime \rangle \in c x s, s\rangle \}\).Footnote 3 Moreover, a sentence such as every philosopher sees a linguist may be composed as follows, given a version of FA appropriate to the Continuation monad:

$$\begin{aligned}&{\textbf {FA}} ( \lambda c, s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle True, s^\prime \rangle \in c x s, s\rangle \})\\&({\textbf {FA}} ( \eta \textsf {see}) ( lift _\textsf {C} ( \lambda s.\{\langle y, s\rangle \ |\ \textsf {ling} y\})))\\&\quad = \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle True, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \end{aligned}$$

Finally, as Charlow shows, such meanings of type \(\textsf {C}_T (\textsf {S}_T \textsf {P}) t\) may be lowered to ones of type \(\textsf {S}_T \textsf {P} t\) by applying them to the \( \eta \) of the State-transformed Powerset monad:

$$\begin{aligned}&{\textit{lower}}_\textsf {C} \left( \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle {\textit{True}}, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \right) \\&\quad = \left( \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle {\textit{True}}, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \right) \eta \\&\quad = \lambda s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle {\textit{True}}, s^\prime \rangle \in \{\langle \textsf {see} y x, s\rangle \ |\ \textsf {ling} y\}, s\rangle \}\\&\quad = \lambda s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x, s\rangle \}\\&\quad = \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \end{aligned}$$

Such lowered meanings may, in turn, be lifted back into the Continuation monad, e.g., in order to further compose them with quantificational meanings:

$$\begin{aligned}&lift _\textsf {C} ( \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x))\\&\quad = \lambda k. \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \,\star \,k\\&\quad = \lambda k.k (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x)\\&\quad = \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \end{aligned}$$

Note that once a quantifier has been lowered, its scope is fixed. Thus Charlow composes \( lift _\textsf {C}\) and \({\textit{lower}}_\textsf {C}\) in this way, as an operator \({\textit{reset}}\), in order to delimit the scope of quantifiers to finite clause boundaries. At the same time, as he shows, lowering does not affect the capacity of indefinite noun phrases and discourse referents to take scope; their side effects are still potent, as they are represented in terms of the State-transformed Powerset monad. As a consequence, the limited scopal possibilities for quantifiers and the flexible scoping behavior of indefinites and discourse referents may be modeled within the same continuation-based setting.
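The continuation-based composition, lowering, and reset described in this section can be sketched as follows. The sketch assumes finite sets (`frozenset`), tuples as states, and a hypothetical model passed in as arguments; the names `every`, `some_linguist`, and `sentence` are introduced here for illustration only.

```python
# Continuation transformer over the State-transformed Powerset monad.

def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def eta_c(a):
    return lambda c: c(a)

def bind_c(m, k):
    return lambda c: m(lambda x: k(x)(c))

def lift_c(m):
    return lambda k: bind_s(m, k)

def lower_c(m):
    # Apply the meaning to the eta of the underlying monad.
    return m(eta_s)

def reset(m):
    # Lower, then lift back: fixes the scope of any quantifier inside m.
    return lift_c(lower_c(m))

def fa_forward(m, n):
    return bind_c(m, lambda f: bind_c(n, lambda x: eta_c(f(x))))

def fa_backward(m, n):
    return bind_c(m, lambda x: bind_c(n, lambda f: eta_c(f(x))))

def every(restrictor):
    # lambda c, s. {<forall x. P x -> exists s'. <True, s'> in c x s, s>}
    def quantifier(c):
        def run(s):
            truth = all(any(v for (v, _) in c(x)(s)) for x in restrictor)
            return eta_p((truth, s))
        return run
    return quantifier

def some_linguist(linguists):
    return lambda s: frozenset((y, s) for y in linguists)

def sentence(phils, lings, sees):
    # see y x: x sees y; `sees` is a set of (seer, seen) pairs.
    see = lambda y: lambda x: (x, y) in sees
    vp = fa_forward(eta_c(see), lift_c(some_linguist(lings)))
    return fa_backward(every(phils), vp)
```

For example, `lower_c(sentence({"p1", "p2"}, {"l1", "l2"}, {("p1", "l1"), ("p2", "l2")}))(())` yields a singleton set pairing the truth value of the universal statement with the unchanged state.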

2.2 Monads and Compositionality

The above discussion provides only a schematic presentation of the system of Charlow (2014). What we hope to have conveyed, however, is the manner in which the system, as a theory of indefiniteness, anaphora, and quantification, is monotonic and compositional in the senses introduced earlier. The theory of indefiniteness may be stated on its own, in terms of the Powerset monad, and then embedded into the combined theory of indefiniteness and anaphora, using a monad transformer. This combined theory may likewise be embedded into the combined theory of indefiniteness, anaphora, and quantification. To say that the embedding is monotonic and compositional is to say that it constitutes a (monad) homomorphism. Every lifting function \( lift \) has the property, in general, that it preserves the monadic combinators: \( lift \,( \eta a) = \eta a\), and \( lift \,(m \,\star \,k) = lift \,m \,\star \, \lambda x. lift \,(k x)\). Thus the theory stated with respect to the underlying monad is never truly forgotten and may, in fact, be used when convenient, i.e., before applying a \( lift \).
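The two homomorphism equations can be spot-checked on sample inputs. The following sketch does so for \( lift _\textsf {S}\) over the Powerset monad, assuming finite sets and tuple states; since Python functions cannot be compared directly, both sides are compared pointwise on a few hypothetical states. This is a sanity check on examples, not a proof.

```python
# Checking lift (eta a) = eta a and lift (m * k) = lift m * (lift . k)
# for lift_S over the Powerset monad, on sample data.

def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def lift_s(m):
    return lambda s: bind_p(m, lambda x: eta_p((x, s)))

# Hypothetical sample computation, continuation, and states.
m = frozenset({1, 2})
k = lambda x: frozenset({x, x + 10})
states = [(), ("a",), ("a", "b")]

unit_ok = all(lift_s(eta_p(7))(s) == eta_s(7)(s) for s in states)
bind_ok = all(
    lift_s(bind_p(m, k))(s)
    == bind_s(lift_s(m), lambda x: lift_s(k(x)))(s)
    for s in states
)
```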

What we aim to show in this paper is that the algebraic approach that we advocate has this property to an even greater degree. Indeed, the simple monadic approach requires, in many cases, determining a monad ahead of time that combines all of the effects which may occur in a given analysis. For example, say that one wants a theory of quantification on its own, independent of a theory of indefiniteness and anaphora. Then, one may employ the Continuation monad (as in Barker 2002; Barker and Shan 2014); in this case, the definitions of \( \eta \) and \(\,\star \,\) remain identical to those stated above, except for their types: the result type of the continuation is now simply t, so that \(M \alpha = ( \alpha \rightarrow t) \rightarrow t\). In turn, every philosopher may be given its usual generalized-quantifier meaning, i.e., \( \lambda k.\forall x : \textsf {phil} x \rightarrow k x\). Incorporating theories of indefiniteness and anaphora, however, will now prove more difficult. The State monad transformer will provide a monad that takes a type \( \alpha \) onto the type \(s \rightarrow ( \alpha \times s \rightarrow t) \rightarrow t\):

$$\begin{aligned} \textsf {S}_T \textsf {C} \alpha&= s \rightarrow \textsf {C} ( \alpha \times s)\\&= s \rightarrow ( \alpha \times s \rightarrow t) \rightarrow t\\ \eta a&= \lambda s, c.c \langle a, s\rangle \\ m \,\star \,k&= \lambda s, c.m s ( \lambda \langle x, s^\prime \rangle .k x s^\prime c) \end{aligned}$$

Indeed, this result may appear, at first, to be suitable for a combined analysis of quantification and anaphora, but note, for example, that the value returned within the underlying Continuation monad will systematically have the type of a product. As a result, a \({\textit{lower}}\) operation will be required to have the type \((s \rightarrow (t \times s \rightarrow t) \rightarrow t) \rightarrow s \rightarrow t \times s\), but it is not obvious what the appropriate definition of such an operation would be.Footnote 4 Rather, in order to achieve the desired result, it seems that one must start with the Powerset monad, then incorporate anaphora, and then finally, incorporate quantification. Much more generally, the lifting functions \( lift _X\) associated with each monad transformer X are often unidirectional, requiring that a choice of result fixes the underlying monad. Thus, ensuring that the resulting monad has a certain desired behavior will limit the flexibility with which one is able to combine different sources of functionality. In contrast, as we will show, the algebraic approach assigns a type with the minimum required effect to each meaning, and the combination of meanings with different effects systematically computes the appropriate result. There is thus no priority associated with one effect or another.Footnote 5

A second difference between our algebraic approach and the monadic approach is that the types we compute for meanings exhibiting multiple effects are more informative: they yield a linguistically meaningful summary of the effects an expression gives rise to, as we will show.

3 Algebraic Effects via Graded Monads

As a way forward, we propose a double move: to simultaneously make the monadic approach modular and make its types more fine-grained.

First, we propose that semantic side effects be studied algebraically, in terms of equational laws characterizing the individual phenomena, which may then be combined. This move is inspired by Maršík (2016) and Maršík and Amblard (2014, 2016), who develop a typed extension of the \( \lambda \)-calculus to study algebraic effects in semantics.Footnote 6 Unlike the approach of Maršík, we show how effects employed by semanticists—e.g., state and non-determinism—may be recast algebraically (while remaining in a pure setting), leading to more extensible grammars.

Second, we propose to track the relevant effects at the level of types, by using a graded monad.Footnote 7 In contrast to plain monads, graded monads are indexed with an abstraction of the effect that they perform, hereafter referred to simply as the “grade”. The unit \( \eta \) of the monad is associated with the unit grade (1). The grade of the composition of effects under \(\,\star \,\) is the composition of their grades, written with the operator \((\cdot )\) (see Fig. 1). Graded monads have been applied previously in the field of programming language theory to describe the semantics of algebraic effects (Katsumata, 2014; Mycroft et al., 2016; Orchard et al., 2019). In natural language semantics, they have been employed in the analysis of presupposition projection and anaphora (Grove, 2019). In our analysis, different phenomena are assigned different grades independently of each other. This means that the interpretations associated with individual phenomena may be freely composed, in order to yield grammars that combine the relevant effects.

One can then describe the interactions between effects using two sets of laws. The first set concerns the abstract level of grades. The second set concerns the concrete level of \( \lambda \)-terms and operations. These two sets of laws are related: any law between terms generates a corresponding law between grades; that is, any law governing terms is only allowable if there exists a corresponding law governing the behavior of grades. To illustrate, consider the unit and associativity laws on terms, which are part of the definition of a graded monad (Fig. 1). For types to be preserved in the statement of associativity for \(\,\star \,\), the \((\cdot )\) operator must be associative. Likewise, for types to be preserved in the identity laws regulating the behavior of \( \eta \), 1 must be the left and right unit of \((\cdot )\). That is, grades must form a monoid.

Fig. 1 Definition of a graded monad
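Read off from the description above, a graded monad can be summarized by the following standard typing and laws. This is a sketch in the notation of Sect. 2; the layout and the orientation of the laws as equations rather than reductions are assumptions on our part.

$$\begin{aligned} \eta&: \alpha \rightarrow \textsf {M}_1 \alpha \\ (\,\star \,)&: \textsf {M}_p \alpha \rightarrow ( \alpha \rightarrow \textsf {M}_q \beta ) \rightarrow \textsf {M}_{p \cdot q} \beta \\ \eta a \,\star \,k&= k a&\quad \mathrm{(Left\,Identity)}\\ m \,\star \, \eta&= m&\quad \mathrm{(Right\,Identity)}\\ (m \,\star \,k) \,\star \,k^\prime&= m \,\star \, \lambda x.(k x \,\star \,k^\prime )&\quad \mathrm{(Associativity)} \end{aligned}$$

For both sides of each law to receive the same type, the grades must satisfy \(1 \cdot q = q\), \(p \cdot 1 = p\), and \((p \cdot q) \cdot r = p \cdot (q \cdot r)\), which is the monoid requirement stated in the text.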

From now on, we develop not just an equational theory of terms and grades, but a theory of reduction. That is, we use a reduction relation between terms written ‘\(\longrightarrow \)’, and one between grades written ‘\(\leadsto \)’. These relations are the (respective) reflexive transitive congruence of the laws that we list below. By definition, two terms \(t_1\) and \(t_2\) are equal if they are inter-reducible; likewise for grades. At this point, our theory encompasses only the graded monad laws. At the introduction of any new law, we will ensure that the reduction relations on both grades and terms are confluent. In particular, we will ensure that the asserted laws are compatible with associativity; i.e., \(g_1\cdot (g_2\cdot g_3)\) and \((g_1\cdot g_2) \cdot g_3\) should always reduce to the same grade. Similarly at the level of terms: any proposed reduction rule should respect the Associativity law. We further discuss the importance of confluence in Sect. 5.1.

3.1 Compositional Dynamic Semantics

As recalled in Sect. 2, monadic semantics in the style of Shan (2002) aims to augment the interpretation of each syntactic category with an effect. In the present framework, this effect is graded. For example, if a sentence is interpreted as a truth value of type t in a non-effectful semantics, it is interpreted in our framework as a truth value associated with an effect with some grade g, i.e., of type \(\textsf {M}_g t\).

Moreover, whereas in Montague semantics, one uses functional application, we additionally employ the graded applicative functor structure arising from the graded monad, characterized by (either of) the operators \((\triangleright {})\) and \((\triangleleft {})\):

$$\begin{aligned} (\triangleright {})&: \textsf {M}_p ( \alpha \rightarrow \beta ) \rightarrow \textsf {M}_q \alpha \rightarrow \textsf {M}_{p \cdot q} \beta \\ m \triangleright {}n&= m \,\star \, \lambda f.n \,\star \, \lambda x. \eta (f x) \\ (\triangleleft {})&: \textsf {M}_p \alpha \rightarrow \textsf {M}_q ( \alpha \rightarrow \beta ) \rightarrow \textsf {M}_{p \cdot q} \beta \\ m \triangleleft {}n&= m \,\star \, \lambda x.n \,\star \, \lambda f. \eta (f x) \end{aligned}$$
Fig. 2 Rules for forward and backward application

Table 1 A lexicon fragment

For illustration, we present a small applicative categorial grammar fragment in Table 1, and, in Fig. 2, two rules of interpretation corresponding to functional application and two rules which make use of the applicative functor structure of our system. The need for seemingly redundant rules corresponding to simple functional application (above in Fig. 2), in addition to applicative combination (below in Fig. 2), arises from the fact that some meanings manipulate effectful values directly: their type is of the form \(\textsf {M}_p \alpha \rightarrow \textsf {M}_q \beta \), rather than \(\textsf {M}_q ( \alpha \rightarrow \beta )\). As such, they cannot be combined by either of the \(\triangleright {}\) or \(\triangleleft {}\) operators.Footnote 8 We additionally admit a rule (\( \mu \)) which collapses a meaning of type \(\textsf {M}_{g_1} (\textsf {M}_{g_2} \alpha )\) into one of type \(\textsf {M}_{g_1 \cdot g_2} \alpha \) by sequencing it (via \(\,\star \,\)) with the identity function. In the following pages, we write ‘\( \mu \,m\)’ in place of ‘\(m \,\star \, \lambda x.x\)’ to be concise.

Using only the (\(\backslash \)) rule, we may interpret john walks as a value whose grade is 1, i.e., one without any dynamic effect.

figure a

The definition of \((\triangleleft {})\) and the monad laws allow this result to be reduced:

$$\begin{aligned}&( \eta \textsf {j}) \triangleleft {}( \eta \textsf {walk})\\ =\;&\eta \textsf {j} \,\star \, \lambda x. \eta \textsf {walk} \,\star \, \lambda f. \eta (f x)\\ \longrightarrow \;&\eta (\textsf {walk}\,\textsf {j})&\quad \qquad \mathrm{(by\,Left\,Identity)} \end{aligned}$$

3.2 Anaphora

We can extend our analysis to account for anaphora. For any type \( \alpha \), we may posit a grade \(\mathsf {Get}[d : \alpha ]\), along with a new primitive, \(\mathsf {get}_d\):

$$\begin{aligned} \mathsf {get}_d : \textsf {M}_{\mathsf {Get}[d : \alpha ]} \alpha \end{aligned}$$

The purpose of \(\mathsf {get}_d\) is to retrieve a discourse referent d, whose type is \( \alpha \), from the linguistic context.Footnote 9 For instance, one can consider \( \alpha \) to be e, the semantic type of entities, although any semantic type is supported, in principle.

The grade \(\mathsf {Get}[d : \alpha ]\) records that one presupposes the existence of a discourse referent with label d and type \( \alpha \). For example, \(\mathsf {get}_d\) may be used to interpret a pronoun, with the typing \(\mathsf {get}_d : \textsf {M}_{\mathsf {Get}[d : e]} e\). The labels used for discourse referents are equipped with a decidable equality relation, but otherwise, they carry no meaning.Footnote 10 It should be noted that labels occur only inside grades—in Sect. 4, we show how the primitives may be interpreted into a label-free calculus. Finally, thanks to the typing rule for \(\,\star \,\), a phrase which uses some number of discourse referents lists them all in its grade. For example, we might have the type \(\textsf {M}_{\mathsf {Get}[d_{masc} : e] \cdot \mathsf {Get}[d_{fem} : e]} t\) for the sentence he likes her.

Our goal is to formalize how grades interact. Since we do not keep track of the order in which discourse referents are introduced, we have the following equality on grades:

$$\begin{aligned} \mathsf {Get}[d_1: \alpha ] \cdot \mathsf {Get}[d_2: \beta ] = \mathsf {Get}[d_2: \beta ] \cdot \mathsf {Get}[d_1: \alpha ] \end{aligned}$$
(1)

Whenever we assert such a law on grades, it is important to check that it preserves the overall system’s confluence in the presence of the other laws, including the monoid laws. So far, we have asserted only a commutation law, and it is easy to see that no problem arises.

Second, we do not keep track of how many references to a single discourse referent occur. Moreover, if two references to the same discourse referent are made, their types should agree. This is captured by the following law:Footnote 11

$$\begin{aligned} \mathsf {Get}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ] \leadsto \mathsf {Get}[d : \alpha ] \end{aligned}$$
(2)

To complete the formal definition of the treatment of anaphoric expressions, it suffices to state how two instances of \(\mathsf {get}_d\) should interact, as guided by the behavior of their grades. We employ two laws on terms (which we label according to the respective corresponding laws on grades):

$$\begin{aligned} \qquad \qquad \mathsf {get}_{d_1} \,\star \, \lambda x.\mathsf {get}_{d_2} \,\star \, \lambda y. \eta \langle x, y\rangle&= \mathsf {get}_{d_2} \,\star \, \lambda y.\mathsf {get}_{d_1} \,\star \, \lambda x. \eta \langle x, y\rangle&\quad \qquad \qquad {(1^\prime )}\\ \qquad \qquad \mathsf {get}_d \,\star \, \lambda x.\mathsf {get}_d \,\star \, \lambda y. \eta \langle x, y\rangle&\longrightarrow \mathsf {get}_d \,\star \, \lambda x. \eta \langle x, x\rangle&\quad \qquad \qquad {(2^\prime )} \end{aligned}$$

The first law states that references to independent discourse referents commute. This law corresponds to law (1) on grades stating that the order of labels in a grade does not matter. The second law states that two references to the same discourse referent collapse to a single reference. This law corresponds to law (2) on grades, which collapses two associations with the same label. Note that, instead of first presenting the laws on grades, we could have stated the algebraic laws on terms and deduced their typing. Correct typing ensures that the behavior of terms, as captured by the algebraic laws, is mirrored by the behavior of grades, as captured by the grade laws.
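To see how a typing can be deduced from a term law, consider law \((2^\prime )\). Using only the typing of \(\mathsf {get}_d\), \( \eta \), and \(\,\star \,\), its two sides receive the following types (a worked sketch, with the unit law on grades applied in the final step of each line):

$$\begin{aligned} \mathsf {get}_d \,\star \, \lambda x.\mathsf {get}_d \,\star \, \lambda y. \eta \langle x, y\rangle&: \textsf {M}_{\mathsf {Get}[d : \alpha ] \cdot (\mathsf {Get}[d : \alpha ] \cdot 1)} ( \alpha \times \alpha ) = \textsf {M}_{\mathsf {Get}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]} ( \alpha \times \alpha )\\ \mathsf {get}_d \,\star \, \lambda x. \eta \langle x, x\rangle&: \textsf {M}_{\mathsf {Get}[d : \alpha ] \cdot 1} ( \alpha \times \alpha ) = \textsf {M}_{\mathsf {Get}[d : \alpha ]} ( \alpha \times \alpha ) \end{aligned}$$

The reduction \((2^\prime )\) therefore preserves types only if \(\mathsf {Get}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]\) reduces to \(\mathsf {Get}[d : \alpha ]\), which is exactly law (2).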

3.3 Introducing Discourse Referents

As the dual to accessing discourse referents, one can introduce new ones. For this purpose, we add a new grade \(\mathsf {Put}[d : \alpha ]\), along with a new primitive:

$$\begin{aligned} \mathsf {put}_d : \alpha \rightarrow \textsf {M}_{\mathsf {Put}[d : \alpha ]} \diamond \end{aligned}$$

The returned type, \(\diamond \), is the unit type, thus signifying that \(\mathsf {put}_d\) makes no significant contribution at the level of values. In terms of this primitive, one can define an operation \((\cdot )^\blacktriangleright _{d}\), which binds its argument to the discourse referent d. (The notation is inspired by the similar notation of Barker and Shan (2014), as well as of Charlow (2014).) This operation performs the dynamic effects associated with its argument, following which it binds the value returned to d:

$$\begin{aligned} (\cdot )^\blacktriangleright _{d}&: \textsf {M}_g \alpha \rightarrow \textsf {M}_{g \cdot \mathsf {Put}[d: \alpha ]} \alpha \\ m^\blacktriangleright _{d}&= m \,\star \, \lambda x.\mathsf {put}_d x \,\star \, \lambda {\diamond }. \eta x \end{aligned}$$

The ‘\( \lambda {\diamond }.\)’ notation indicates that a value of type \(\diamond \) is expected as an argument to the relevant \( \lambda \)-expression.

Table 2 Adding discourse referents

To illustrate, let us return to our running example, given the updated lexicon in Table 2. We now interpret john walks as follows.

figure b

After unfolding the definitions and \(\beta \)-reducing, we obtain \(\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j})\), whose type is \(\textsf {M}_{\mathsf {Put}[d:e]}t\), thus capturing that the discourse referent d has been introduced.

When considered on its own, \(\mathsf {put}_d\) behaves similarly to \(\mathsf {get}_d\). The order of introduction does not matter:Footnote 12

$$\begin{aligned} \mathsf {Put}[d_1: \alpha ] \cdot \mathsf {Put}[d_2: \beta ] = \mathsf {Put}[d_2: \beta ] \cdot \mathsf {Put}[d_1: \alpha ] \end{aligned}$$
(3)

Consequently, the introductions of two discourse referents commute. We can formalize this as the following equation on terms:

figure c

Although \(\mathsf {get}_d\) and \(\mathsf {put}_d\) arise independently and have interpretations on their own, we can describe their interactions in terms of algebraic laws. We illustrate this fact first on the relations on terms, by adding two laws:

$$\begin{aligned} \mathsf {put}_d a \,\star \, \lambda {\diamond }.\mathsf {get}_d&\longrightarrow ( \eta a)^\blacktriangleright _{d}&\quad \qquad (4^\prime )\\ \mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }.\mathsf {get}_{d_2}&\longrightarrow \mathsf {get}_{d_2} \,\star \, \lambda x.\mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }. \eta x \quad (d_1 \ne d_2)&\quad \qquad (5^\prime ) \end{aligned}$$

These laws ensure that \(\mathsf {get}_d\) uses only the discourse referent d that \(\mathsf {put}_d\) introduces. Assuming that the terms are well typed, the grades on the left should reduce to the grades on the right; consequently, the following laws hold on grades:

$$\begin{aligned} \mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]&\leadsto \mathsf {Put}[d : \alpha ]&\quad \qquad (4)\\ \mathsf {Put}[d_1 : \alpha ] \cdot \mathsf {Get}[d_2 : \beta ]&\leadsto \mathsf {Get}[d_2: \beta ] \cdot \mathsf {Put}[d_1 : \alpha ] \quad (d_1 \ne d_2)&\quad \qquad (5) \end{aligned}$$

The first law finds a satisfying linguistic justification: when a discourse referent is introduced, it is no longer presupposed. The second law ensures that introductions and uses of distinct discourse referents ignore each other.

To illustrate, consider composing the two utterances john walks with he sits. Given the lexicon in Table 2, this miniature discourse receives the following meaning:

$$\begin{aligned}&\llbracket \textit{john walks; he sits}\rrbracket \\ =\;&((\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j})) \triangleleft {} \eta ( \lambda \phi , \psi . \phi \wedge \psi )) \triangleright {}(\mathsf {get}_d \,\star \, \lambda x. \eta (\textsf {sit}\,x))&&\quad ({\hbox {by}}\ / , { \backslash },\ \mu )\\ \longrightarrow \;&(\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }.\mathsf {get}_d )\,\star \, \lambda x. \eta (\textsf {walk}\,\textsf {j} \wedge \textsf {sit}\,x)&&\quad ({\hbox {by Associativity, Left Identity}})\\ \longrightarrow \;&( \eta \textsf {j})^\blacktriangleright _{d} \,\star \, \lambda x. \eta (\textsf {walk}\,\textsf {j} \wedge \textsf {sit}\,x)&&\quad ({\hbox {by law}}\,(4^\prime ))\\ \longrightarrow \;&\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j} \wedge \textsf {sit}\,\textsf {j})&&\quad ({\hbox {by Associativity, Left Identity}}) \end{aligned}$$
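The reduction steps of this derivation can be replayed mechanically. The Python sketch below is a minimal illustration under stated assumptions: terms are encoded as trees (`Ret`, `Put`, `Get`) whose continuations are Python functions, `bind` plays the role of \(\,\star \,\), and `normalize` applies laws \((4^\prime )\) and \((5^\prime )\). All of these names and the tree encoding are hypothetical and are not taken from the framework itself.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Ret:                      # eta v
    value: Any


@dataclass
class Put:                      # put_d a, followed by a continuation awaiting the unit value
    ref: str
    value: Any
    rest: Callable[[], Any]


@dataclass
class Get:                      # get_d, followed by a continuation awaiting the referent
    ref: str
    rest: Callable[[Any], Any]


def bind(m, k):
    """Monadic bind; the three cases mirror Left Identity and Associativity."""
    if isinstance(m, Ret):
        return k(m.value)
    if isinstance(m, Put):
        return Put(m.ref, m.value, lambda: bind(m.rest(), k))
    return Get(m.ref, lambda x: bind(m.rest(x), k))


def normalize(m):
    """Resolve or move every get that immediately follows a put."""
    if isinstance(m, Ret):
        return m
    if isinstance(m, Get):
        return Get(m.ref, lambda x: normalize(m.rest(x)))
    tail = normalize(m.rest())
    if isinstance(tail, Get):
        if tail.ref == m.ref:
            # Law (4'): the get reads the value that was just put.
            return normalize(Put(m.ref, m.value, lambda: tail.rest(m.value)))
        # Law (5'): a get for a different referent moves left of the put.
        return Get(tail.ref,
                   lambda x: normalize(Put(m.ref, m.value, lambda: tail.rest(x))))
    return Put(m.ref, m.value, lambda: tail)


john_walks = Put("d", "j", lambda: Ret(("walk", "j")))
he_sits = Get("d", lambda x: Ret(("sit", x)))
discourse = bind(john_walks,
                 lambda p: bind(he_sits, lambda q: Ret(("and", p, q))))
result = normalize(discourse)
```

Here `result` is a `Put` node for `d` carrying `"j"`, whose continuation returns the conjunction of `("walk", "j")` and `("sit", "j")`; no `Get` node remains.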

The resulting meaning is of type \(\textsf {M}_{\mathsf {Put}[d : e]} t\); it introduces a discourse referent (d), but has no anaphoric presupposition, despite the presence of the pronoun he. That is, the pronoun's reference is resolved. Checking confluence is now a harder exercise than before. We can, however, convince ourselves that it holds by noting that the following re-association is confluent:

3.4 On the State Monad

Both Charlow (2014) and other work in monadic dynamic semantics have employed the state monad to model anaphora (Giorgolo & Unger, 2009; Unger, 2012). The foregoing formalization vindicates some of the state monad laws (laws (2) and (4)), but to get a full specification of the state monad, one additionally needs the following law:

figure d

To preserve types, this law on terms requires the following law to hold on grades:

$$\begin{aligned} \mathsf {Get}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ] = 1 \qquad \qquad (6) \end{aligned}$$

Such a law is problematic, however, as it contravenes confluence:

Thus not all of the state monad laws can be imported into our framework, given how we employ graded types. What is responsible for this difference? The state monad is a theory of memory locations. According to the corresponding model of state, such memory locations pre-exist the lifetime of a program, and can be updated any number of times. Using a state monad to model anaphora would thus require that a constant set of referents be handled by the discourse. In comparison, our encoding of discourse referents is more precise: we record at the level of grades the exact discourse referents either introduced or presupposed. For our purposes, there is a fundamental difference between introducing a discourse referent and not introducing it. We therefore ought to reject the hypothetical law (6), since it implies that using a discourse referent and then introducing it is, in fact, equivalent to doing nothing.
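For comparison, the behavior that law (6) would import can be observed in the ordinary, ungraded state monad, where reading the state and immediately writing it back is indistinguishable from doing nothing. The Python sketch below uses the textbook encoding of a stateful computation as a function from a state to a (value, state) pair; the names `unit`, `bind`, `get`, and `put` are conventional but are assumptions here, not definitions from the framework.

```python
def unit(x):
    """Return x and leave the state untouched."""
    return lambda s: (x, s)


def bind(m, k):
    """Run m, then feed its value to k, threading the state through."""
    def run(s):
        x, s1 = m(s)
        return k(x)(s1)
    return run


def get(s):
    """Read the current state."""
    return (s, s)


def put(v):
    """Overwrite the state with v and return the unit value (None)."""
    return lambda s: (None, v)


# Reading the state and writing it straight back ...
get_then_put = bind(get, put)
# ... behaves exactly like the trivial computation.
noop = unit(None)
```

In this model no record is kept of whether a location was written, which is exactly the information that the grades \(\mathsf {Put}[d: \alpha ]\) and \(\mathsf {Get}[d: \alpha ]\) are meant to track.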

3.5 Quantification

As a further step, we may introduce another grade, \(\mathsf {Scope}\), in order to analyze expressions, such as every, which are commonly taken to denote generalized quantifier meanings. Like those introduced above, this grade is accompanied by its own primitive:

$$\begin{aligned} \mathsf {scope}: ((e \rightarrow t) \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Scope}} e \end{aligned}$$

Thus, given a quantifier q of type \((e \rightarrow t) \rightarrow t\), \(\mathsf {scope}\,q\) allows it to act as an entity at the level of values, i.e., in terms of the variable q binds, given that the primitive’s return type is e.

Indeed, the scopes of natural language quantifiers have been observed to be restricted in certain ways: one common view is that a quantifier cannot take scope outside the smallest finite clause in which it occurs syntactically. For example, some cat fears every dog will chase it can be understood only to imply the existence of a single highly pessimistic cat. To capture the effect of scope islands, we also introduce an operation on grades and a primitive which introduces it:

The intent is that the bracketing primitive allows one to ensure that a value bound in \({\textit{body}}\) using \(\mathsf {scope}\,q\) is not available outside of \({\textit{body}}\). This makes it possible to statically limit the scope of a variable bound by a quantifier.

The modularity provided by our approach allows us to import the laws regulating anaphora into the current setting. At the same time, we may describe the interactions between anaphora and quantification. To that end, we may state the following laws on grades:

$$\begin{aligned} \mathsf {Scope}\cdot \mathsf {Get}[d: \alpha ]&\leadsto \mathsf {Get}[d: \alpha ] \cdot \mathsf {Scope}&\quad \qquad (7) \end{aligned}$$
(8)
(9)
(10)
(11)
(12)

These laws are reflected on terms as follows:

The occurrences of \(\mathsf {Get}[d: \alpha ]\) inside a bracket can be pulled to its left [laws (7) and (12)]. Doing so, moreover, makes it possible for such an occurrence to meet a \(\mathsf {Put}[d: \alpha ]\), which can then eliminate it.

Note that a law commuting \(\mathsf {Put}[d: \alpha ]\) and \(\mathsf {Scope}\) is absent. Indeed, the grade \(\mathsf {Scope}\cdot \mathsf {Put}[d: \alpha ]\) corresponds to introducing an entity which may depend on another entity quantified over. Such a commutation should be rejected, as it would allow the introduced entity to escape its scope. An entity which is introduced inside a bracket, but before any \(\mathsf {Scope}\) introduction, however, can be pulled out of the bracket, as per law (11).

A \(\mathsf {Scope}\) introduced at the rightmost point of the body of a bracket can be reduced (law (9)): the operational interpretation of this law is to apply the quantifier to the returned property. If a discourse referent is introduced at the rightmost point of the body, immediately after \(\mathsf {Scope}\), then the introduction is simply ignored (law (10)). This should remain true for any number of introduced entities, moreover. To avoid introducing a scheme of reduction laws, we may use a law such as the following one, which coalesces indefinitely many introductions into one (or splits them) as needed:

$$\begin{aligned} \mathsf {Put}[d_1: \alpha ] \cdot \mathsf {Put}[d_2:\beta ]&= \mathsf {Put}[\langle d_1, d_2\rangle : \alpha \times \beta ]&\qquad \qquad \qquad (13)\\ \mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }. \mathsf {put}_{d_2} b&= \mathsf {put}_{\langle d_1, d_2\rangle } \langle a, b\rangle&\qquad \qquad \qquad (13^\prime ) \end{aligned}$$
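The operational reading of law (9), on which the quantifier is applied to the returned property, can be illustrated on a finite model. The Python sketch below rests on several assumptions that are not part of the text above: a \(\mathsf {Scope}\) computation is encoded as a pair of a quantifier and a continuation awaiting the bound entity, and the helper names `every`, `scope`, `fmap`, and `close_bracket` are hypothetical.

```python
def every(restrictor, domain):
    """A generalized quantifier of type (e -> t) -> t over a finite domain."""
    return lambda prop: all(prop(x) for x in domain if restrictor(x))


def scope(q):
    """Pair the quantifier with a continuation that simply returns the bound entity."""
    return (q, lambda x: x)


def fmap(f, m):
    """Apply a pure function to the value that the computation will return."""
    q, k = m
    return (q, lambda x: f(k(x)))


def close_bracket(m):
    """Sketch of law (9): apply the stored quantifier to the returned property."""
    q, k = m
    return q(k)


DOMAIN = ["rex", "fido", "tom"]


def dog(x):
    return x in {"rex", "fido"}


def barks(x):
    return x in {"rex", "fido"}


# "every dog barks": the quantifier acts as an entity until the bracket closes.
every_dog_barks = close_bracket(fmap(barks, scope(every(dog, DOMAIN))))
```

The point of the encoding is that `barks` is applied to the bound entity as if it were an ordinary value of type e; the quantificational force is only discharged when the bracket is closed.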

3.6 Indefinites

We now turn to indefinite noun phrases. Here, we pursue the idea of Charlow (2014, 2020a, 2020b) that the meaning of an indefinite noun phrase is to non-deterministically choose an entity from the set defined by its restriction. To do so, we introduce a new grade, \(\mathsf {Choose}[ \alpha ]\), indexed by a type \( \alpha \), and associate it with the following primitive:

$$\begin{aligned} \mathsf {choose }: ( \alpha \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Choose}[ \alpha ]} \alpha \end{aligned}$$

We additionally provide the following law on grades:

$$\begin{aligned} \mathsf {Choose}[ \alpha ] \cdot \mathsf {Choose}[\beta ] \leadsto \mathsf {Choose}[ \alpha \times \beta ] \qquad \qquad (14) \end{aligned}$$

This law is reflected at the level of terms as follows:

$$\begin{aligned}&\mathsf {choose }s_1 \,\star \, \lambda x. \mathsf {choose }(s_2x) \,\star \, \lambda y. \eta \langle x, y\rangle \\ \longrightarrow \;&\mathsf {choose }( \lambda \langle x, y\rangle . s_1x \wedge s_2x y)&\quad \qquad (14^\prime ) \end{aligned}$$

Intuitively, what this law says is that choosing two values in sequence is the same as choosing them simultaneously, as a pair. When it comes to the interaction with other grades, the \(\mathsf {Choose}[ \alpha ]\) grade behaves similarly to \(\mathsf {Put}[d: \alpha ]\): it commutes to the left of \(\mathsf {Put}[d: \alpha ]\) (law (15)), but not to the left of \(\mathsf {Get}[d: \alpha ]\) or \(\mathsf {Scope}\); moreover, it is forgotten once it is sandwiched between \(\mathsf {Scope}\) and the end of a bracket [laws (18) and (19)]; however, it can nevertheless escape on the left of a bracket [law (17)].
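The content of law \((14^\prime )\) can be checked on a small finite model. In the Python sketch below, non-deterministic choice is modeled, as an assumption made only for this illustration, by the list of all values in a finite domain that satisfy the restriction; the names `choose`, `sequential`, and `simultaneous` are hypothetical.

```python
from itertools import product


def choose(pred, domain):
    """All values in the domain that the restriction admits."""
    return [x for x in domain if pred(x)]


def sequential(s1, s2, dom1, dom2):
    """choose s1, then choose (s2 x), returning the pair <x, y>."""
    return [(x, y) for x in choose(s1, dom1) for y in choose(s2(x), dom2)]


def simultaneous(s1, s2, dom1, dom2):
    """choose over pairs, with the conjoined restriction s1 x and s2 x y."""
    pairs = list(product(dom1, dom2))
    return choose(lambda p: s1(p[0]) and s2(p[0])(p[1]), pairs)


NUMBERS = [1, 2, 3, 4]


def is_even(x):
    return x % 2 == 0


def smaller_than(x):
    return lambda y: y < x


left = sequential(is_even, smaller_than, NUMBERS, NUMBERS)
right = simultaneous(is_even, smaller_than, NUMBERS, NUMBERS)
```

Both sides enumerate the same pairs, which is the sense in which choosing in sequence and choosing a pair coincide; note that the second restriction may depend on the first choice, as in `smaller_than`.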

$$\begin{aligned} \mathsf {Put}[d:\beta ] \cdot \mathsf {Choose}[ \alpha ]&\leadsto \mathsf {Choose}[ \alpha ] \cdot \mathsf {Put}[d:\beta ]&\quad \qquad (15)\\ \mathsf {Choose}[ \alpha ] \cdot \mathsf {Get}[d:\beta ]&\leadsto \mathsf {Get}[d:\beta ] \cdot \mathsf {Choose}[ \alpha ]&\quad \qquad (16) \end{aligned}$$
(17)
(18)
(19)

To remain concise, we transcribe only the laws on terms that relate \(\mathsf {choose }\) and \(\mathsf {scope}\) [laws (18) and (19)].

Table 3 Adding indefinites and quantifiers

To illustrate, consider the meaning derived for every dog sees a cat, given the updated lexicon in Table 3.

3.7 Determiners and Donkey Anaphora

The determiner algebra provides a new grade, \(\mathsf {Det}\), from which we define a new primitive, \(\mathsf {det}\), having the following type signature:

$$\begin{aligned} \mathsf {det}: ((e \rightarrow t) \rightarrow (e \rightarrow t) \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Det}}((e \rightarrow t) \rightarrow (e \rightarrow t) \rightarrow t) \end{aligned}$$

\(\mathsf {det}\) introduces a determiner meaning, which it merely returns. The utility of including determiners among the grades is manifest, however, when considering their interactions with other effects; in particular \(\mathsf {Choose}[ \alpha ]\):

$$\begin{aligned} \mathsf {Det}\cdot \mathsf {Get}[d: \alpha ]&\leadsto \mathsf {Get}[d: \alpha ] \cdot \mathsf {Det}&\quad \qquad (20) \end{aligned}$$
(21)
(22)
(23)
(24)

Note that each of these laws has a corresponding law that involves \(\mathsf {Scope}\), rather than \(\mathsf {Det}\). Indeed, the corresponding laws on terms are analogous, except for laws (23) and (24), which are substantively different. Before we demonstrate this, we give the laws on terms for laws (21) and (22), which are realized by feeding a determiner meaning to its continuation:

More interesting are laws (23) and (24), each of which can be realized in two ways. The first gives rise to a “weak” existential reading of donkey sentences, while the second gives rise to a “strong” universal reading. We provide the two laws corresponding to law (23), as those for law (24) are uninterestingly different (i.e., they additionally erase an occurrence of \(\mathsf {put}_d\)).

Table 4 Adding determiners

With the lexicon in Table 4, we may derive the following meaning for every new yorker who sees a dog pets it:

At this point, we have two options, depending on which reduction rule we choose to realize law (24). If we opt for the weak reading, we can continue as follows:

On this reading, every New Yorker who sees a dog pets at least one dog they see. If we opt instead for the strong reading, we can continue as follows:

Now, every New Yorker who sees a dog pets every dog they see; i.e., the reading attributed to donkey sentences by most dynamic semantic accounts.
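The difference between the two readings can be made concrete by evaluating their truth conditions, as paraphrased above, in a small finite model. The Python sketch below is only an illustration: the model, the relation names, and the functions `weak` and `strong` are assumptions introduced here and do not reproduce the reduction rules themselves.

```python
def weak(people, dogs, sees, pets):
    """Every person who sees a dog pets at least one dog they see."""
    return all(
        any(sees(p, d) and pets(p, d) for d in dogs)
        for p in people
        if any(sees(p, d) for d in dogs)
    )


def strong(people, dogs, sees, pets):
    """Every person who sees a dog pets every dog they see."""
    return all(
        all(pets(p, d) for d in dogs if sees(p, d))
        for p in people
        if any(sees(p, d) for d in dogs)
    )


# A hypothetical model: ann sees two dogs but pets only one of them.
PEOPLE = ["ann", "bo"]
DOGS = ["rex", "fido"]
SEES = {("ann", "rex"), ("ann", "fido"), ("bo", "rex")}
PETS = {("ann", "rex"), ("bo", "rex")}


def sees(p, d):
    return (p, d) in SEES


def pets(p, d):
    return (p, d) in PETS


weak_reading = weak(PEOPLE, DOGS, sees, pets)
strong_reading = strong(PEOPLE, DOGS, sees, pets)
```

In this model the weak reading is true and the strong reading is false, since ann sees fido without petting it; the two readings coincide whenever each person sees at most one dog.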

4 Realization in Terms of a Pure Calculus

In this section, we provide meanings to the grades, the operations, and their relation in terms of the simply typed \( \lambda \)-calculus with products (hereafter, STLC). We will only provide proof sketches here, but we note that the contents of this section and Sect. 3 have been formalized using the Agda proof assistant.

Theorem 1

(Coherence of reduction relations) If \(t_1 : \textsf {M}_{g_1} \alpha \), \(t_2 : \textsf {M}_{g_2} \alpha \), and \(t_1 \longrightarrow t_2\), then \(g_1 \leadsto g_2\).

Proof

By case analysis. \(\square \)

Definition 1

(Interpretation of grades) For every graded type \(\textsf {M}_g \alpha \), there is a semantic interpretation \(\llbracket \textsf {M}_g \alpha \rrbracket = S_{g}(\llbracket \alpha \rrbracket )\) as a type in the STLC (or, more generally, in the underlying typed \( \lambda \)-calculus without effects). \(\llbracket \cdot \rrbracket \) preserves STLC types and is defined on graded types as follows.

We stress that this interpretation is entirely modular in the sense that the meanings of the atomic effects are devised independently, without taking into account any interplay between effects. (It is a homomorphism on the grade structure.) As a rule, if the primitive operation associated with an effect takes as input an object of type X, then we take the product with X in the interpretation. Conversely, if such a primitive returns a type Y, then Y is found as the domain of an arrow in the interpretation. A consequence of this modularity is that all the results of this section can be proven in a modular fashion, by case analysis for each atomic grade. For grade composition, a straightforward induction applies.

Lemma 1

\(S\) is a graded monad.

The proof relies on the following facts: (1) each atomic grade is interpreted as a functor; (2) the unit grade is interpreted as the identity functor; (3) the composition of grades is interpreted as functor composition.

Theorem 2

If \(g_1\leadsto \ g_2\), then there is a function \(f : S_{g_1}( \alpha ) \rightarrow S_{g_2}( \alpha )\) for each STLC type \( \alpha \).

Proof

This is a constructive proof done by case analysis. The function f says how (the semantic interpretations of) effects are transformed by reductions. For instance, the law

$$\begin{aligned} \mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ] \leadsto \mathsf {Put}[d : \alpha ] \end{aligned}$$

corresponds to functions \(f : ( \alpha \times ( \alpha \rightarrow \beta )) ~\rightarrow ~ ( \alpha \times \beta )\), which pass the newly introduced value (of type \( \alpha \)) to its continuation, which then uses it. That is, \(f \langle x, k\rangle = \langle x , k x\rangle \). \(\square \)
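The function f given in this proof is small enough to write out directly. In the Python sketch below, pairs are encoded as tuples and continuations as Python functions; the name `put_get_to_put` and the tuple encoding of propositions are assumptions made for illustration.

```python
def put_get_to_put(pair):
    """f <x, k> = <x, k x>: pass the introduced value to the continuation that awaits it."""
    x, k = pair
    return (x, k(x))


# A value of type alpha * (alpha -> beta), loosely modeled on "john walks; he sits":
# the introduced referent j, paired with a continuation awaiting the pronoun's referent.
before = ("j", lambda x: ("and", ("walk", "j"), ("sit", x)))
after = put_get_to_put(before)
```

Applying the function resolves the pending use of the referent while keeping the record of its introduction, so `after` pairs `"j"` with a proposition in which the pronoun has been replaced by `"j"`.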

We call the relation induced by such functions ‘\(\llbracket g_1\leadsto g_2\rrbracket \)’. (That is, \(x\,\llbracket g_1 \leadsto g_2\rrbracket \,y\) iff \(f x = y\), where f is a function provided by Theorem 2.) Finally, it bears repeating that the above construction defines the semantics of the reduction relation, and is thus the keystone of the interpretation.

Definition 2

(Interpretation of terms) For every well-typed term \(t : \textsf {M}_g \alpha \), we define an interpretation \(\llbracket t\rrbracket \) such that \(\llbracket t\rrbracket : S_{g}(\llbracket \alpha \rrbracket )\). The interpretations of \( \eta \) and \(\,\star \,\) are given by the graded monadic structure of \(S\) (Lemma 1). The recipe for interpreting each atomic grade is based straightforwardly on the type of the primitive giving rise to the grade. For example, \(\llbracket \mathsf {get}_d\rrbracket = \lambda x.x\), etc.

Theorem 3

(Adequacy of the interpretation) The interpretation of terms respects the interpretation of grades and the interpretation of reductions as functions. Formally, if \(t_1 : \textsf {M}_{g_1} \alpha \), \(t_2 : \textsf {M}_{g_2} \alpha \), and \(t_1 \longrightarrow t_2\), then \(\llbracket t_1\rrbracket \llbracket {g_1} \leadsto {g_2}\rrbracket \llbracket t_2\rrbracket \).

This theorem essentially tells us that the axiomatization of term reductions exactly fits the interpretations of grades. As a result, if one wishes, one may omit the axiomatization, and use only the interpretation and the corresponding reduction relation. We have chosen to present the axiomatic view to emphasize the operational behavior of terms having effects. If one is interested only in the end product (i.e., pure \( \lambda \)-terms), then one would be better off axiomatizing grades and their relations only. This way, by omitting the axiomatization of operations and algebraic laws, one can describe their compositional meanings (as in Definitions 1 and 2) directly.

5 Related Work

5.1 Effects and Handlers

To improve compositionality, general effects and handlers systems have been proposed for dynamic semantics by Maršík (2016) and Maršík and Amblard (2014, 2016). In these approaches, new operations, such as \(\mathsf {get}\) or \(\mathsf {put}\), can be declared and defined locally in terms of the ambient calculus. These approaches have much in common with ours, insofar as they provide modular interpretations of the effectful operations they employ. Furthermore, while effectful meanings are defined in a typed extension of \( \lambda \)-calculus, they yield terms of a pure \( \lambda \)-calculus once they are handled.

The chief difference between the effects and handlers approach and the one advanced here, which makes algebraic laws central, is that the former approach demands that every occurrence of an operation be interpreted (i.e., handled) independently of the context in which it occurs. This requirement enforces absolute compositionality of interpretation, whereas our method does not. In other words, while our syntax is compositional, the eventual interpretation of a grade may depend on its context. Indeed, our reduction rules are written so that the meaning of an operation can depend on its neighbors. This design allows the interpretation of \(\mathsf {Scope}\), for example, to occur only at the rightmost point in a bracket, where it may receive a function of type \(e \rightarrow t\). Crucially, nevertheless, the results yielded by the applications of laws are compositional: due to associativity and confluence, one may safely apply reduction rules to a term m or a grade g independently of the context in which m or g occurs. When combining m with a continuation k, it suffices to consider their reduced forms: confluence guarantees that the result of \(m \,\star \,k\) is the same, regardless of what reductions occur before their combination.

5.2 The Underlying Calculus

Even though we have assumed the STLC as our ambient calculus, monadic and algebraic effects approaches (and, more generally, approaches based on computational effects) are agnostic as to the type system used by the underlying \( \lambda \)-calculus, be it Martin-Löf Type Theory (Martin-Löf, 1984) or one of its variants, System F (Girard, 1972), Cooper’s TTR (Cooper & Ginzburg, 2015), Asher’s TCL (Asher, 2011), etc. Thus our approach (like others) may be added to such systems without modifying the respective calculi.

5.3 Graded Effects

Our treatment of discourse phenomena in terms of grades is partially inspired by the interpretation of Cooper storage in terms of a graded applicative functor due to Kobele (2018a). Kobele employs grades that correspond to stores of quantifier meanings, in order to encode the types of both stored quantifiers and the variables they bind. We employ somewhat richer grades than Kobele, in order to encode, e.g., discourse referents. Such rich grades allow us to describe linguistically meaningful interactions at the level of types that reflect the algebraic laws that apply at the level of terms.

5.4 Modalities Instead of Graded Monads

Our presentation relies on the standard structure of \( \lambda \)-calculi to encode dynamic effects as monads. This causes a certain amount of notational weight in the axiomatization. Namely, we have to use a family of operators \(\triangleright {}\), \(\triangleleft {}\), \(\,\star \,\), etc., instead of simple functional application.

To avoid this overhead, an alternative presentation could use modalities to represent the combination of dynamic effects associated with a value. Several calculi supporting this kind of modality have been developed recently (Petricek et al. 2014; Orchard et al. 2019; Abel and Bernardy 2020).

6 Conclusion

We have proposed a framework which both unifies and refines approaches to dynamic semantics based on monads. The key idea is to break down effects into atomic grades. The interactions among grades are provided by algebraic laws, which can be presented in a modular fashion. Even though the number of possible laws grows quadratically with the number of possible effects, the number of laws actually needed is far below this theoretical maximum if we exclude the mechanical commutation laws.

The process of applying this refinement reveals possible improvements to earlier analyses, for example regarding the interpretation of anaphora using the state monad (Sect. 3.4). The use of a bracketing operation to delimit scope appears to be new, and is an essential device in the interpretation of quantification effects.

Our framework can either be given a purely axiomatic treatment (Sect. 3), or, like many accounts, be provided as part of a pure \( \lambda \)-calculus (Sect. 4). In future work, we intend to describe more effects within the same framework, including presupposition and conventional implicature.