Algebraic Effects for Extensible Dynamic Semantics

Grove, Julian; Bernardy, Jean-Philippe

doi:10.1007/s10849-022-09378-7

Open access
Published: 26 August 2022

Volume 32, pages 219–245, (2023)
Journal of Logic, Language and Information

Abstract

Research in dynamic semantics has made strides by studying various aspects of discourse in terms of computational effect systems, for example, monads (Shan, 2002; Charlow, 2014), continuations (Barker and Shan, 2014), and algebraic effects (Maršík, 2016). We provide a system, based on graded monads, that synthesizes insights from these programs by formalizing individual discourse phenomena in terms of separate effects, or grades. Included are effects for introducing and retrieving discourse referents, non-determinism for indefiniteness, and generalized quantifier meanings. We formalize the behavior of individual effects, as well as the interactions between effects, in terms of algebraic laws tailored to the relevant discourse phenomena. The system we propose is thus modular and suggests a novel approach to integrating formal accounts of distinct semantic phenomena. Finally, we give an interpretation of the system into pure $ \lambda $-calculus that respects the laws. Future work will aim to integrate more discourse phenomena using the same methodology, for example, presupposition and conventional implicature.

1 Introduction

In the last two decades, research in dynamic semantics has attained a breadth of insights into the relation between compositional semantics and dynamic semantics by viewing prima facie non-compositional phenomena as arising from computational side effects. The effectful approach to meaning allows one to view compositionally recalcitrant features (e.g., discourse referents, intensionality, and scope-taking) as giving rise to a rich, but uniform structure which integrates them with truth-conditional meaning. This integration is done by injecting truth-conditional meaning into the effectful structure in a natural way, such that the structure gives rise to a functor. This pattern of analysis has come in a variety of forms: continuations to study scope (Barker 2002; Barker and Shan 2014; de Groote 2001), graded applicative functors to study quantification (Kobele, 2018a), and monads to study discourse referents, anaphora, and indefiniteness, among other phenomena (Shan, 2002; Giorgolo & Unger, 2009; Giorgolo & Asudeh, 2012; Charlow, 2014, 2020a, 2020b, i.a.).

Progress in understanding dynamic and scopal phenomena in terms of effects, however, has presented two basic methodological questions. On the one hand, given effectful treatments of individual phenomena (say, discourse referents, quantification, and conventional implicature), how does one integrate them into a semantic analysis encompassing them all? On the other hand, how does one study interactions between these phenomena, while simultaneously preserving their individual treatments in the result? In this paper, we address both questions by providing a general framework, based on algebraic effects, for characterizing individual dynamic semantic phenomena, as well as their interactions, in terms of algebraic laws.

Stated in other terms, our goal is to improve the compositionality properties of functor-based theories of dynamic semantics by recasting them as algebraic theories. We do so in two respects:

At a meta-theoretical level, when two phenomena are described by two distinct theories within our framework, we provide a systematic recipe for obtaining a combined theory of both phenomena. The combination is monotonic, in the sense that the predictions of the original theories, regarding either phenomenon, remain unchanged in the combined theory.
In individual analyses, when two syntactically adjacent constituents feature two distinct (yet possibly interacting) phenomena, their meanings may always be combined compositionally, in order to obtain a meaning for their combination.

We begin in Sect. 2 with a background to monadic dynamic semantics, and we present the issue of compositionality that pertains to monads and monad transformers. In Sect. 3, we axiomatize our approach in terms of the meta-language we use to describe algebraic theories, and we show how meanings may be provided in terms of this meta-language. Section 4 provides an interpretation of this axiomatization in terms of a simply typed $ \lambda $-calculus with products. We discuss related work in Sect. 5 before concluding in Sect. 6.

2 Monadic Dynamic Semantics

Since the work of Shan (2002), monads have provided a popular interface for semantic analyses employing computational effects. Monads have been used to study anaphora (Giorgolo & Unger, 2009) and conventional implicature (Giorgolo and Asudeh 2012), and have more recently been taken up by Charlow (2014, 2020a, 2020b) to study the interactions among quantification, anaphora, indefiniteness, and binding in a framework that relies on monad transformers.

We assume a general familiarity with monads, but we briefly remind the reader of their structure, in order to introduce notation. A monad M is an endofunctor that takes a given type $ \alpha $ onto a type $M \alpha $ of computations exhibiting structure that encapsulates some desired side effect, e.g., reading and writing to a store, or non-determinism.^{Footnote 1} Each monad M is associated with two operators, $ \eta $ (‘return’) and $\,\star \,$ (‘bind’), having the following type signatures, for any types $ \alpha $ and $\beta $:

$$\begin{aligned} \eta&: \alpha \rightarrow M \alpha \\ (\,\star \,)&: M \alpha \rightarrow ( \alpha \rightarrow M \beta ) \rightarrow M \beta \end{aligned}$$

The role of $ \eta $ is to inject pure (i.e., non-effectful) values into the structure provided by M, while $\,\star \,$ sequences a computation of type $M \alpha $ with an indexed computation of type $ \alpha \rightarrow M \beta $ to produce a sequenced computation of type $M \beta $.

2.1 Using Monad Transformers: Charlow (2014)

Charlow (2014) introduces a monadic dynamic semantics that combines analyses of anaphora, indefiniteness, and quantification by relying on monad transformers. In particular, Charlow uses a Powerset monad to characterize indefiniteness, and then applies a State monad transformer, in order to obtain a system to characterize both indefiniteness and anaphora in the same grammar. He then applies a Continuation monad transformer, in order to provide a setting to study quantification. Crucially, the analyses that he provides for individual phenomena are extended compositionally to obtain analyses of their combinations with new phenomena.^{Footnote 2}

The Powerset monad $\textsf {P}$ allows one to analyze indefinite noun phrases (and the expressions with which they compose) as denoting sets, encoded as functions of type $ \alpha \rightarrow t$:

$$\begin{aligned} \textsf {P} \alpha&= \alpha \rightarrow t\\ \eta&: \alpha \rightarrow \alpha \rightarrow t\\ \eta a&= \{a\}(= \lambda x.x = a)\\ (\,\star \,)&: ( \alpha \rightarrow t) \rightarrow ( \alpha \rightarrow \beta \rightarrow t) \rightarrow \beta \rightarrow t\\ m \,\star \,k&= \bigcup _{x \in m}k x(= \lambda y.\exists x : m x \wedge k x y) \end{aligned}$$

This way, the noun phrase a linguist, for instance, will denote the set $\{x \ |\ \textsf {ling} x\}$ and may be composed with an intransitive verb such as sleeps by injecting the latter into the monad via $ \eta $: $ \eta \textsf {sleep}$. To compose them, Charlow employs monadic functional application (which he overloads with forward and backward application, to be disambiguated by the types of arguments). Functional application (FA) is defined as follows for an arbitrary monad M:

$$\begin{aligned} {\textbf {FA}}\;:\;&M ( \alpha \rightarrow \beta ) \rightarrow M \alpha \rightarrow M \beta \\ \text {or}\;&M \alpha \rightarrow M ( \alpha \rightarrow \beta ) \rightarrow M \beta \\ {\textbf {FA}}\,m\,n\; =\;&m \,\star \, \lambda f.n \,\star \, \lambda x. \eta (f x)\\ \text {or}\;&m \,\star \, \lambda x.n \,\star \, \lambda f. \eta (f x) \end{aligned}$$

Now, a linguist sleeps may be interpreted as ${\textbf {FA}} \{x \ |\ \textsf {ling} x\} ( \eta \textsf {sleep})$, which can be reduced to $\{\textsf {sleep} x \ |\ \textsf {ling} x\}$; that is, a set of truth values containing ${\textit{True}}$ iff some linguist sleeps.
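The Powerset monad and monadic functional application can be made concrete with a minimal Python sketch. It assumes a finite domain, so that sets are represented as `frozenset` values rather than as characteristic functions of type $ \alpha \rightarrow t$. The individuals `ann` and `bob`, and the extension chosen for `sleep`, are hypothetical toy data rather than part of Charlow's analysis.

```python
# Sketch of the Powerset monad over finite sets (an assumption: the text
# encodes sets as characteristic functions of type alpha -> t).

def eta(a):
    """Return: inject a pure value as the singleton set {a}."""
    return frozenset({a})

def bind(m, k):
    """Bind: the union of k(x) for every x in m."""
    return frozenset(y for x in m for y in k(x))

def fa_forward(m, n):
    """FA : M (alpha -> beta) -> M alpha -> M beta."""
    return bind(m, lambda f: bind(n, lambda x: eta(f(x))))

def fa_backward(m, n):
    """FA : M alpha -> M (alpha -> beta) -> M beta."""
    return bind(m, lambda x: bind(n, lambda f: eta(f(x))))

# Hypothetical toy model.
LINGUISTS = frozenset({"ann", "bob"})
SLEEPERS = frozenset({"ann"})

def sleep(x):
    return x in SLEEPERS

a_linguist = LINGUISTS                                   # {x | ling x}
a_linguist_sleeps = fa_backward(a_linguist, eta(sleep))  # {sleep x | ling x}

# The sentence counts as true iff the set of truth values contains True.
print(True in a_linguist_sleeps)
```

In this toy model the resulting set contains both truth values, since one linguist sleeps and the other does not; the sentence is true because `True` is a member.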

To incorporate anaphora, he invokes the following State monad transformer, which takes an underlying monad M onto a new monad $\textsf {S}_T M$, for some fixed type s of states:

$$\begin{aligned} \textsf {S}_T\,M\, \alpha&= s \rightarrow M ( \alpha \times s)\\ \eta&: \alpha \rightarrow s \rightarrow M ( \alpha \times s)\\ \eta a&= \lambda s. \eta \langle a, s\rangle \\ (\,\star \,)\; :\;&(s \rightarrow M ( \alpha \times s)) \rightarrow \\&( \alpha \rightarrow s \rightarrow M (\beta \times s)) \rightarrow \\&s \rightarrow M (\beta \times s)\\ m \,\star \,k&= \lambda s.m s \,\star \, \lambda \langle x, s^\prime \rangle .k x s^\prime \end{aligned}$$

In this example M will be instantiated to the Powerset monad, and s to the type of lists of individuals. This transformation of the Powerset monad to provide State functionality allows such lists to be accessed and updated throughout semantic composition as lists of discourse referents. To allow the indefinite a linguist to introduce a discourse referent, for example, Charlow defines the following operation, $(\cdot )^\triangleright {}$, for an underlying Powerset monad; we give it here for an arbitrary underlying monad M in the presence of State functionality:

$$\begin{aligned} (\cdot )^\triangleright {}&: \textsf {S}_T M \alpha \rightarrow \textsf {S}_T M \alpha \\ m^\triangleright {}&= m \,\star \, \lambda \langle x, s\rangle . \eta x (x::s) \end{aligned}$$

Here, the operation $::$ conses a new individual onto a list, thus providing it as a discourse referent. Now, one can associate the sentence a linguist sleeps with a discourse referent by having a linguist introduce it (given an updated instance of FA):

$$\begin{aligned}&{\textbf {FA}} ( \lambda s.\{\langle x, s\rangle \ |\ \textsf {ling} x\}^\triangleright {}) ( \eta \textsf {sleep})\\&\quad = \lambda s.\{\langle \textsf {sleep} x, x::s\rangle \ |\ \textsf {ling} x\} \end{aligned}$$

Thus the meaning of a linguist has changed, given our use of the State-transformed Powerset monad. The new meaning is nonetheless straightforward to obtain from the old one, by means of a function lifting values from $M \alpha $ to $\textsf {S}_T M \alpha $:

$$\begin{aligned} lift _\textsf {S}&: M \alpha \rightarrow \textsf {S}_T M \alpha \\ lift _\textsf {S}\, m&= \lambda s.m \,\star \, \lambda x. \eta \langle x, s\rangle \end{aligned}$$

It is in this sense that the addition of State functionality to meanings stated with respect to the Powerset monad is (in principle) compositional. Both the monadic combinators of the Powerset monad, and the meanings it is used to characterize, may be injected into the State setting.
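The State-transformed Powerset monad, the lifting function, and the referent-introducing operation can be sketched in Python as follows. This is an illustration under stated assumptions: sets are finite `frozenset` values, states are tuples of individuals (standing in for lists), and the names `introduce`, `LINGUISTS`, and `SLEEPERS` are hypothetical.

```python
# Underlying Powerset monad over finite sets.
def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

# State transformer: S_T M alpha = s -> M (alpha x s).
def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    # Run m at s, then feed each (value, new state) pair to k.
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def lift_s(m):
    """lift_S : M alpha -> S_T M alpha, leaving the state untouched."""
    return lambda s: bind_p(m, lambda x: eta_p((x, s)))

def introduce(m):
    """The operation written with a triangle in the text: run m, then
    cons the returned value onto the state as a discourse referent."""
    return bind_s(m, lambda x: lambda s: eta_p((x, (x,) + s)))

def fa_backward(m, n):
    return bind_s(m, lambda x: bind_s(n, lambda f: eta_s(f(x))))

# Hypothetical toy model.
LINGUISTS = frozenset({"ann", "bob"})
SLEEPERS = frozenset({"ann"})

def sleep(x):
    return x in SLEEPERS

# The old Powerset meaning is lifted, then made to introduce a referent.
a_linguist = introduce(lift_s(LINGUISTS))
a_linguist_sleeps = fa_backward(a_linguist, eta_s(sleep))

# Each alternative pairs a truth value with the updated referent list.
print(sorted(a_linguist_sleeps(())))
```

Evaluated at the empty state, each alternative carries the linguist it was computed from at the head of its state, mirroring the set $\{\langle \textsf {sleep} x, x::s\rangle \ |\ \textsf {ling} x\}$.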

Charlow uses this strategy to introduce analyses of quantificational noun phrases into the monadic setting. Taking inspiration from the continuation-style treatments of quantifiers of Barker (2002) and Barker and Shan (2014), he employs the following Continuation monad transformer, $\textsf {C}_T$.

$$\begin{aligned} \textsf {C}_T M \alpha&= ( \alpha \rightarrow M t) \rightarrow M t\\ \eta&: \alpha \rightarrow ( \alpha \rightarrow M t) \rightarrow M t\\ \eta a&= \lambda c.c a\\ (\,\star \,)\; :\;&(( \alpha \rightarrow M t) \rightarrow M t) \rightarrow \\&( \alpha \rightarrow (\beta \rightarrow M t) \rightarrow M t) \rightarrow \\&(\beta \rightarrow M t) \rightarrow M t\\ m \,\star \,k\; =\;&\lambda c.m ( \lambda x.k x c) \end{aligned}$$

The underlying monad, in this case, is the State-transformed Powerset monad. Like the State monad transformer, the Continuation monad transformer also comes with a lifting function $ lift _\textsf {C}$:

$$\begin{aligned} lift _\textsf {C}&: M \alpha \rightarrow \textsf {C}_T M \alpha \\ lift _\textsf {C}\, m&= \lambda k.m \,\star \,k \end{aligned}$$

Now, a quantificational noun phrase such as every philosopher can be given the meaning $ \lambda c, s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle {\textit{True}}, s^\prime \rangle \in c x s, s\rangle \}$.^{Footnote 3} Moreover, a sentence such as every philosopher sees a linguist may be composed as follows, given a version of FA appropriate to the Continuation monad:

$$\begin{aligned}&{\textbf {FA}} ( \lambda c, s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle {\textit{True}}, s^\prime \rangle \in c x s, s\rangle \})\\&({\textbf {FA}} ( \eta \textsf {see}) ( lift _\textsf {C} ( \lambda s.\{\langle y, s\rangle \ |\ \textsf {ling} y\})))\\&\quad = \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle {\textit{True}}, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \end{aligned}$$

Finally, as Charlow shows, such meanings of type $\textsf {C}_T (\textsf {S}_T \textsf {P}) t$ may be lowered to ones of type $\textsf {S}_T \textsf {P} t$ by applying them to the $ \eta $ of the State-transformed Powerset monad:

$$\begin{aligned}&{\textit{lower}}_\textsf {C} \left( \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle {\textit{True}}, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \right) \\&\quad = \left( \lambda c, s.\left\{ \langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime :\langle {\textit{True}}, s^\prime \rangle \in \bigcup _{\textsf {ling} y}(c (\textsf {see} y x) s), s\rangle \right\} \right) \eta \\&\quad = \lambda s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists s^\prime : \langle {\textit{True}}, s^\prime \rangle \in \{\langle \textsf {see} y x, s\rangle \ |\ \textsf {ling} y\}, s\rangle \}\\&\quad = \lambda s.\{\langle \forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x, s\rangle \}\\&\quad = \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \end{aligned}$$

Such lowered meanings may, in turn, be lifted back into the Continuation monad, e.g., in order to further compose them with quantificational meanings:

$$\begin{aligned}&lift _\textsf {C} ( \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x))\\&\quad = \lambda k. \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \,\star \,k\\&\quad = \lambda k.k (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x)\\&\quad = \eta (\forall x.\textsf {phil} x \rightarrow \exists y : \textsf {ling} y \wedge \textsf {see} y x) \end{aligned}$$

Note that once a quantifier has been lowered, its scope is fixed. Thus Charlow composes $ lift _\textsf {C}$ and ${\textit{lower}}_\textsf {C}$ in this way, as an operator ${\textit{reset}}$, in order to delimit the scope of quantifiers to finite clause boundaries. At the same time, as he shows, lowering does not affect the capacity of indefinite noun phrases and discourse referents to take scope; their side effects are still potent, as they are represented in terms of the State-transformed Powerset monad. As a consequence, the limited scopal possibilities for quantifiers and the flexible scoping behavior of indefinites and discourse referents may be modeled within the same continuation-based setting.
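The composition of every philosopher sees a linguist, together with lowering and reset, can be traced in a small Python sketch. It assumes finite sets, tuple-valued states, and a hypothetical toy model (`PHILOSOPHERS`, `LINGUISTS`, `SEES`); the helper `every` is an illustrative rendering of the quantifier meaning given above, not Charlow's own implementation.

```python
# Powerset monad over finite sets, and its State transformation.
def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

# Continuation transformer: C_T M alpha = (alpha -> M t) -> M t.
def eta_c(a):
    return lambda c: c(a)

def bind_c(m, k):
    return lambda c: m(lambda x: k(x)(c))

def lift_c(m):
    return lambda k: bind_s(m, k)

def lower_c(m):
    return m(eta_s)  # apply to the eta of the underlying monad

def reset(m):
    return lift_c(lower_c(m))

def fa_forward(m, n):
    return bind_c(m, lambda f: bind_c(n, lambda x: eta_c(f(x))))

def fa_backward(m, n):
    return bind_c(m, lambda x: bind_c(n, lambda f: eta_c(f(x))))

def every(restrictor):
    """lambda c, s. {<forall x: exists s'. <True, s'> in c x s, s>}."""
    def quantifier(c):
        def at_state(s):
            truth = all(any(v for (v, _) in c(x)(s)) for x in restrictor)
            return eta_p((truth, s))
        return at_state
    return quantifier

# Hypothetical toy model; SEES holds (seer, seen) pairs.
PHILOSOPHERS = frozenset({"kant", "hume"})
LINGUISTS = frozenset({"ann", "bob"})
SEES = frozenset({("kant", "ann"), ("hume", "bob")})

def see(y):
    return lambda x: (x, y) in SEES  # see y x: x sees y

def a_linguist(s):
    return frozenset((y, s) for y in LINGUISTS)  # lambda s. {<y, s> | ling y}

verb_phrase = fa_forward(eta_c(see), lift_c(a_linguist))
sentence = fa_backward(every(PHILOSOPHERS), verb_phrase)
lowered = lower_c(sentence)

print(lowered(()))
```

In this model each philosopher sees some linguist, so the lowered meaning returns a single alternative whose value is `True` and whose state is unchanged. Applying `reset` and lowering again yields the same result, which reflects the fact that the quantifier's scope has already been fixed.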

2.2 Monads and Compositionality

The above discussion provides only a schematic presentation of the system of Charlow (2014). What we hope to have conveyed, however, is the manner in which the system, as a theory of indefiniteness, anaphora, and quantification, is monotonic and compositional in the senses introduced earlier. The theory of indefiniteness may be stated on its own, in terms of the Powerset monad, and then embedded into the combined theory of indefiniteness and anaphora, using a monad transformer. This combined theory may likewise be embedded into the combined theory of indefiniteness, anaphora, and quantification. To say that the embedding is monotonic and compositional is to say that it constitutes a (monad) homomorphism. Every lifting function $ lift $ has the property, in general, that it preserves the monadic combinators: $ lift \,( \eta a) = \eta a$, and $ lift \,(m \,\star \,k) = lift \,m \,\star \, \lambda x. lift \,(k x)$. Thus the theory stated with respect to the underlying monad is never truly forgotten and may, in fact, be used when convenient, i.e., before applying a $ lift $.
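The two homomorphism equations for $ lift $ can be checked extensionally on sample inputs. The following Python sketch does so for the State transformer over a finite Powerset monad; the sample states, the set `m`, and the function `k` are arbitrary hypothetical test data, and a check on finitely many inputs is of course not a proof.

```python
# Underlying Powerset monad over finite sets.
def eta_p(a):
    return frozenset({a})

def bind_p(m, k):
    return frozenset(y for x in m for y in k(x))

# State transformer over the Powerset monad.
def eta_s(a):
    return lambda s: eta_p((a, s))

def bind_s(m, k):
    return lambda s: bind_p(m(s), lambda pair: k(pair[0])(pair[1]))

def lift_s(m):
    return lambda s: bind_p(m, lambda x: eta_p((x, s)))

def same(m1, m2, states):
    """Compare two state-dependent computations on the given sample states."""
    return all(m1(s) == m2(s) for s in states)

# Arbitrary sample data for the check.
STATES = [(), ("ann",), ("bob", "ann")]
m = frozenset({1, 2})
k = lambda x: frozenset({x, x * 10})

# lift (eta a) = eta a
unit_preserved = same(lift_s(eta_p(7)), eta_s(7), STATES)

# lift (m * k) = lift m * (lambda x. lift (k x))
bind_preserved = same(
    lift_s(bind_p(m, k)),
    bind_s(lift_s(m), lambda x: lift_s(k(x))),
    STATES,
)

print(unit_preserved, bind_preserved)
```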

What we aim to show in this paper is that the algebraic approach that we advocate has this property to an even greater degree. Indeed, the simple monadic approach requires, in many cases, determining a monad ahead of time that combines all of the effects which may occur in a given analysis. For example, say that one wants a theory of quantification on its own, independent of a theory of indefiniteness and anaphora. Then, one may employ the Continuation monad (akin to the approach of Barker 2002; Barker and Shan 2014); in this case, the definitions of $ \eta $ and $\,\star \,$ remain identical to those stated above, except for their types: the result type of the continuation is now simply t, so that $M \alpha = ( \alpha \rightarrow t) \rightarrow t$. In turn, every philosopher may be given its usual generalized-quantifier meaning, i.e., $ \lambda k.\forall x : \textsf {phil} x \rightarrow k x$. Incorporating theories of indefiniteness and anaphora, however, will now prove more difficult. The State monad transformer will provide a monad that takes a type $ \alpha $ onto the type $s \rightarrow ( \alpha \times s \rightarrow t) \rightarrow t$:

$$\begin{aligned} \textsf {S}_T \textsf {C} \alpha&= s \rightarrow \textsf {C} ( \alpha \times s)\\&= s \rightarrow ( \alpha \times s \rightarrow t) \rightarrow t\\ \eta a&= \lambda s, c.c \langle a, s\rangle \\ m \,\star \,k&= \lambda s, c.m s ( \lambda \langle x, s^\prime \rangle .k x s^\prime c) \end{aligned}$$

Indeed, this result may appear, at first, to be suitable for a combined analysis of quantification and anaphora, but note, for example, that the value returned within the underlying Continuation monad will systematically have the type of a product. As a result, a ${\textit{lower}}$ operation will be required to have the type $(s \rightarrow (t \times s \rightarrow t) \rightarrow t) \rightarrow s \rightarrow t \times s$, but it is not obvious what the appropriate definition of such an operation would be.^{Footnote 4} Rather, in order to achieve the desired result, it seems that one must start with the Powerset monad, then incorporate anaphora, and then finally, incorporate quantification. Much more generally, the lifting functions $ lift _X$ associated with each monad transformer X are often unidirectional, requiring that a choice of result fixes the underlying monad. Thus, ensuring that the resulting monad has a certain desired behavior will limit the flexibility with which one is able to combine different sources of functionality. In contrast, as we will show, the algebraic approach assigns a type with the minimum required effect to each meaning, and the combination of meanings with different effects systematically computes the appropriate result. There is thus no priority associated with one effect or another.^{Footnote 5}

A second difference between our algebraic approach and the monadic approach is that the types we compute for meanings exhibiting multiple effects are more informative: they yield a linguistically meaningful summary of the effects an expression gives rise to, as we will show.

3 Algebraic Effects via Graded Monads

As a way forward, we propose a double move: to simultaneously make the monadic approach modular and make its types more fine-grained.

First, we propose that semantic side effects be studied algebraically, in terms of equational laws characterizing the individual phenomena, which may then be combined. This move is inspired by Maršík (2016) and Maršík and Amblard (2014, 2016), who develop a typed extension of the $ \lambda $-calculus to study algebraic effects in semantics.^{Footnote 6} Unlike the approach of Maršík, we show how effects employed by semanticists—e.g., state and non-determinism—may be recast algebraically (while remaining in a pure setting), leading to more extensible grammars.

Second, we propose to track the relevant effects at the level of types, by using a graded monad.^{Footnote 7} In contrast to plain monads, graded monads are indexed with an abstraction of the effect that they perform, hereafter referred to simply as the “grade”. The unit $ \eta $ of the monad is associated with the unit grade (1). The grade of the composition of effects under $\,\star \,$ is the composition of their grades, written with the operator $(\cdot )$ (see Fig. 1). Graded monads have been applied previously in the field of programming language theory to describe the semantics of algebraic effects (Katsumata, 2014; Mycroft et al., 2016; Orchard et al., 2019). In natural language semantics, they have been employed in the analysis of presupposition projection and anaphora (Grove, 2019). In our analysis, different phenomena are assigned different grades independently of each other. This means that the interpretations associated with individual phenomena may be freely composed, in order to yield grammars that combine the relevant effects.

One can then describe the interactions between effects using two sets of laws. The first set concerns the abstract level of grades. The second set concerns the concrete level of $ \lambda $-terms and operations. These two sets of laws are related: any law between terms generates a corresponding law between grades; that is, any law governing terms is only allowable if there exists a corresponding law governing the behavior of grades. To illustrate, consider the unit and associativity laws on terms, which are part of the definition of a graded monad (Fig. 1). For types to be preserved in the statement of associativity for $\,\star \,$, the $(\cdot )$ operator must be associative. Likewise, for types to be preserved in the identity laws regulating the behavior of $ \eta $, 1 must be the left and right unit of $(\cdot )$. That is, grades must form a monoid.

From now on, we develop not just an equational theory of terms and grades, but a theory of reduction. That is, we use a reduction relation between terms written ‘$\longrightarrow $’, and one between grades written ‘$\leadsto $’. These relations are the (respective) reflexive transitive congruence of the laws that we list below. By definition, two terms $t_1$ and $t_2$ are equal if they are inter-reducible; likewise for grades. At this point, our theory encompasses only the graded monad laws. At the introduction of any new law, we will ensure that the reduction relations on both grades and terms are confluent. In particular, we will ensure that the asserted laws are compatible with associativity; i.e., $g_1\cdot (g_2\cdot g_3)$ and $(g_1\cdot g_2) \cdot g_3$ should always reduce to the same grade. Similarly at the level of terms: any proposed reduction rule should respect the Associativity law. We further discuss the importance of confluence in Sect. 5.1.

3.1 Compositional Dynamic Semantics

As recalled in Sect. 2, monadic semantics in the style of Shan (2002) aims to augment the interpretation of each syntactic category with an effect. In the present framework, this effect is graded. For example, if a sentence is interpreted as a truth value of type t in a non-effectful semantics, it is interpreted in our framework as a truth value associated with an effect with some grade g, i.e., of type $\textsf {M}_g t$.

Moreover, whereas in Montague semantics, one uses functional application, we additionally employ the graded applicative functor structure arising from the graded monad, characterized by (either of) the operators $(\triangleright {})$ and $(\triangleleft {})$:

$$\begin{aligned} (\triangleright {})&: \textsf {M}_p ( \alpha \rightarrow \beta ) \rightarrow \textsf {M}_q \alpha \rightarrow \textsf {M}_{p \cdot q} \beta \\ m \triangleright {}n&= m \,\star \, \lambda f.n \,\star \, \lambda x. \eta (f x) \\ (\triangleleft {})&: \textsf {M}_p \alpha \rightarrow \textsf {M}_q ( \alpha \rightarrow \beta ) \rightarrow \textsf {M}_{p \cdot q} \beta \\ m \triangleleft {}n&= m \,\star \, \lambda x.n \,\star \, \lambda f. \eta (f x) \end{aligned}$$

Table 1 A lexicon fragment


For illustration, we present a small applicative categorial grammar fragment in Table 1, and, in Fig. 2, two rules of interpretation corresponding to functional application and two rules which make use of the applicative functor structure of our system. The need for seemingly redundant rules corresponding to simple functional application (above in Fig. 2), in addition to applicative combination (below in Fig. 2), arises from the fact that some meanings manipulate effectful values directly: their type is of the form $\textsf {M}_p \alpha \rightarrow \textsf {M}_q \beta $, rather than $\textsf {M}_q ( \alpha \rightarrow \beta )$. As such, they cannot be combined by either of the $\triangleright {}$ or $\triangleleft {}$ operators.^{Footnote 8} We additionally admit a rule ($ \mu $) which collapses a meaning of type $\textsf {M}_{g_1} (\textsf {M}_{g_2} \alpha )$ into one of type $\textsf {M}_{g_1 \cdot g_2} \alpha $ by sequencing it (via $\,\star \,$) with the identity function. In the following pages, we write ‘$ \mu \,m$’ in place of ‘$m \,\star \, \lambda x.x$’ to be concise.

Using only the ($\backslash $) rule, we may interpret john walks as a value whose grade is 1, i.e., one without any dynamic effect.

The definition of $(\triangleleft {})$ and the monad laws allow this result to be reduced:

$$\begin{aligned}&( \eta \textsf {j}) \triangleleft {}( \eta \textsf {walk})\\ =\;&\eta \textsf {j} \,\star \, \lambda x. \eta \textsf {walk} \,\star \, \lambda f. \eta (f x)\\ \longrightarrow \;&\eta (\textsf {walk}\,\textsf {j})&\quad \qquad \mathrm{(by\,Left\,Identity)} \end{aligned}$$
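The way grades compose under $\,\star \,$, and the reduction of john walks to a value of grade 1, can be illustrated with a minimal Python sketch. One assumption should be stressed: grades are recorded here as run-time data (tuples of effect names, composed by concatenation), whereas in the text they are indices on types. The effect names `g1` and `g2` and the toy extension of `walk` are hypothetical.

```python
# A computation is a pair (grade, value). Grades form a monoid: tuples
# under concatenation, with the empty tuple playing the role of grade 1.
UNIT = ()

def eta(a):
    """Return: a pure value carries the unit grade."""
    return (UNIT, a)

def bind(m, k):
    """Bind: sequence the computations and compose their grades."""
    g1, x = m
    g2, y = k(x)
    return (g1 + g2, y)

def app_forward(m, n):
    """M_p (alpha -> beta) -> M_q alpha -> M_{p.q} beta."""
    return bind(m, lambda f: bind(n, lambda x: eta(f(x))))

def app_backward(m, n):
    """M_p alpha -> M_q (alpha -> beta) -> M_{p.q} beta."""
    return bind(m, lambda x: bind(n, lambda f: eta(f(x))))

def mu(m):
    """Collapse M_{g1} (M_{g2} alpha) into M_{g1.g2} alpha."""
    return bind(m, lambda x: x)

# Hypothetical toy model.
WALKERS = frozenset({"j"})

def walk(x):
    return x in WALKERS

# (eta j) applied backward to (eta walk) reduces to eta (walk j).
john_walks = app_backward(eta("j"), eta(walk))
print(john_walks == eta(walk("j")))

# With hypothetical effect names, grades compose in order of evaluation.
graded = app_backward((("g1",), "j"), (("g2",), walk))
print(graded)
```

Because concatenation of tuples is associative and has the empty tuple as its unit, the grades in this sketch form a monoid, as the typing of the graded monad laws requires.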

3.2 Anaphora

We can extend our analysis to account for anaphora. For any type $ \alpha $, we may posit a grade $\mathsf {Get}[d : \alpha ]$, along with a new primitive, $\mathsf {get}_d$:

$$\begin{aligned} \mathsf {get}_d : \textsf {M}_{\mathsf {Get}[d : \alpha ]} \alpha \end{aligned}$$

The purpose of $\mathsf {get}_d$ is to retrieve a discourse referent d, whose type is $ \alpha $, from the linguistic context.^{Footnote 9} For instance, one can consider $ \alpha $ to be e, the semantic type of entities, although any semantic type is supported, in principle.

The grade $\mathsf {Get}[d : \alpha ]$ records that one presupposes the existence of a discourse referent with label d and type $ \alpha $. For example, $\mathsf {get}_d$ may be used to interpret a pronoun, with the typing $\mathsf {get}_d : \textsf {M}_{\mathsf {Get}[d : e]} e$. The labels used for discourse referents are equipped with a decidable equality relation, but otherwise, they carry no meaning.^{Footnote 10} It should be noted that labels occur only inside grades—in Sect. 4, we show how the primitives may be interpreted into a label-free calculus. Finally, thanks to the typing rule for $\,\star \,$, a phrase which uses some number of discourse referents lists them all in its grade. For example, we might have the type $\textsf {M}_{\mathsf {Get}[d_{masc} : e] \cdot \mathsf {Get}[d_{fem} : e]} t$ for the sentence he likes her.

Our goal is to formalize how grades interact. Since we do not keep track of the order in which discourse referents are introduced, we have the following equality on grades:

$$\begin{aligned} \mathsf {Get}[d_1: \alpha ] \cdot \mathsf {Get}[d_2: \beta ] = \mathsf {Get}[d_2: \beta ] \cdot \mathsf {Get}[d_1: \alpha ] \end{aligned}$$

(1)

Whenever we assert such a law on grades, it is important to check that it preserves the overall system’s confluence in the presence of the other laws, including the monoid laws. So far, we have asserted only a commutation law, and it is easy to see that no problem arises.

Second, we do not keep track of how many references to a single discourse referent occur. Moreover, if two references to the same discourse referent are made, their types should agree. This is captured by the following law:^{Footnote 11}

$$\begin{aligned} \mathsf {Get}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ] \leadsto \mathsf {Get}[d : \alpha ] \end{aligned}$$

(2)

To complete the formal definition of the treatment of anaphoric expressions, it suffices to state how two instances of $\mathsf {get}_d$ should interact, as guided by the behavior of their grades. We employ two laws on terms (which we label according to the respective corresponding laws on grades):

$$\begin{aligned} \qquad \qquad \mathsf {get}_{d_1} \,\star \, \lambda x.\mathsf {get}_{d_2} \,\star \, \lambda y. \eta \langle x, y\rangle&= \mathsf {get}_{d_2} \,\star \, \lambda y.\mathsf {get}_{d_1} \,\star \, \lambda x. \eta \langle x, y\rangle&\quad \qquad \qquad {(1^\prime )}\\ \qquad \qquad \mathsf {get}_d \,\star \, \lambda x.\mathsf {get}_d \,\star \, \lambda y. \eta \langle x, y\rangle&\longrightarrow \mathsf {get}_d \,\star \, \lambda x. \eta \langle x, x\rangle&\quad \qquad \qquad {(2^\prime )} \end{aligned}$$

The first law states that references to independent discourse referents commute. This law corresponds to law (1) on grades stating that the order of labels in a grade does not matter. The second law states that two references to the same discourse referent collapse to a single reference. This law corresponds to law (2) on grades, which collapses two associations with the same label. Note that, instead of first presenting the laws on grades, we could have stated the algebraic laws on terms and deduced their typing. Correct typing ensures that the behavior of terms, as captured by the algebraic laws, is mirrored by the behavior of grades, as captured by the grade laws.
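Laws (1) and (2) on grades, and their counterparts on terms, can be explored with a short Python sketch. Two assumptions are made purely for illustration: grades are run-time tuples that are normalized by sorting (for commutation) and deduplicating (for collapse), and $\mathsf {get}_d$ is modeled as a lookup in a context dictionary. The latter is a hypothetical reader-style model, not the interpretation given in Sect. 4.

```python
def get_grade(label, ty):
    """The grade Get[label : ty] as a one-element tuple."""
    return (("Get", label, ty),)

def normalize(grade):
    """Pick a canonical form: law (1) lets entries be reordered, and
    law (2) collapses repeated entries with the same label and type."""
    types = {}
    for kind, label, ty in grade:
        if kind != "Get":
            raise ValueError("this sketch only handles Get grades")
        if types.setdefault(label, ty) != ty:
            raise TypeError(f"conflicting types for referent {label}")
    return tuple(("Get", label, ty) for label, ty in sorted(types.items()))

# Hypothetical reader-style terms: a computation maps a context to a value.
def eta(a):
    return lambda ctx: a

def bind(m, k):
    return lambda ctx: k(m(ctx))(ctx)

def get(label):
    return lambda ctx: ctx[label]

# Grade of "he likes her".
he = get_grade("d_masc", "e")
her = get_grade("d_fem", "e")
print(normalize(he + her) == normalize(her + he))   # law (1)
print(normalize(he + he) == he)                     # law (2)

# The corresponding laws on terms, checked on a sample context.
ctx = {"d_masc": "john", "d_fem": "mary"}
pair_12 = bind(get("d_masc"), lambda x: bind(get("d_fem"), lambda y: eta((x, y))))
pair_21 = bind(get("d_fem"), lambda y: bind(get("d_masc"), lambda x: eta((x, y))))
twice = bind(get("d_masc"), lambda x: bind(get("d_masc"), lambda y: eta((x, y))))
once = bind(get("d_masc"), lambda x: eta((x, x)))
print(pair_12(ctx) == pair_21(ctx), twice(ctx) == once(ctx))
```

Normalization also gives the same result regardless of how a product of grades is bracketed, since tuple concatenation is associative before sorting and deduplication are applied.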

3.3 Introducing Discourse Referents

As the dual to accessing discourse referents, one can introduce new ones. For this purpose, we add a new grade $\mathsf {Put}[d : \alpha ]$, along with a new primitive:

$$\begin{aligned} \mathsf {put}_d : \alpha \rightarrow \textsf {M}_{\mathsf {Put}[d : \alpha ]} \diamond \end{aligned}$$

The returned type, $\diamond $, is the unit type, thus signifying that $\mathsf {put}_d$ makes no significant contribution at the level of values. In terms of this primitive, one can define an operation $(\cdot )^\blacktriangleright _{d}$, which binds its argument to the discourse referent d. (The notation is inspired by the similar notation of Barker and Shan (2014), as well as of Charlow (2014).) This operation performs the dynamic effects associated with its argument, following which it binds the value returned to d:

$$\begin{aligned} (\cdot )^\blacktriangleright _{d}&: \textsf {M}_g \alpha \rightarrow \textsf {M}_{g \cdot \mathsf {Put}[d: \alpha ]} \alpha \\ m^\blacktriangleright _{d}&= m \,\star \, \lambda x.\mathsf {put}_d x \,\star \, \lambda {\diamond }. \eta x \end{aligned}$$

The ‘$ \lambda {\diamond }.$’ notation indicates that a value of type $\diamond $ is expected as an argument to the relevant $ \lambda $-expression.

Table 2 Adding discourse referents


To illustrate, let us return to our running example, given the updated lexicon in Table 2. We now interpret john walks as follows.

After unfolding the definitions and $\beta $-reducing, we obtain $\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j})$, whose type is $\textsf {M}_{\mathsf {Put}[d:e]}t$, thus capturing that the discourse referent d has been introduced.

When considered on its own, $\mathsf {put}_d$ behaves similarly to $\mathsf {get}_d$. The order of introduction does not matter:^{Footnote 12}

$$\begin{aligned} \mathsf {Put}[d_1: \alpha ] \cdot \mathsf {Put}[d_2: \beta ]&= \mathsf {Put}[d_2: \beta ] \cdot \mathsf {Put}[d_1: \alpha ]&\quad \qquad (3) \end{aligned}$$

Consequently, two discourse referents commute. We can formalize this as the following equation on terms:
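
By analogy with law (1′), and matching the left-hand side of law (13′), such an equation would plausibly read as follows. This is a sketch in which $a : \alpha $ and $b : \beta $ are arbitrary values and $d_1 \ne d_2$ is assumed:

$$\begin{aligned} \mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }.\mathsf {put}_{d_2} b&= \mathsf {put}_{d_2} b \,\star \, \lambda {\diamond }.\mathsf {put}_{d_1} a \end{aligned}$$

Both sides return $\diamond $, and their grades are the two sides of law (3).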

Although $\mathsf {get}_d$ and $\mathsf {put}_d$ arise independently and have interpretations on their own, we can describe their interactions in terms of algebraic laws. We illustrate this fact first on the relations on terms, by adding two laws:

$$\begin{aligned} \mathsf {put}_d a \,\star \, \lambda {\diamond }.\mathsf {get}_d&\longrightarrow ( \eta a)^\blacktriangleright _{d}&\quad \qquad (4^\prime )\\ \mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }.\mathsf {get}_{d_2}&\longrightarrow \mathsf {get}_{d_2} \,\star \, \lambda x.\mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }. \eta x \quad (d_1 \ne d_2)&\quad \qquad (5^\prime ) \end{aligned}$$

These laws ensure that $\mathsf {get}_d$ uses only the discourse referent d that $\mathsf {put}_d$ introduces. Assuming that the terms are well typed, the grades on the left should reduce to the grades on the right; consequently, the following laws hold on grades:

$$\begin{aligned} \mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]&\leadsto \mathsf {Put}[d : \alpha ]&\quad \qquad (4)\\ \mathsf {Put}[d_1 : \alpha ] \cdot \mathsf {Get}[d_2 : \beta ]&\leadsto \mathsf {Get}[d_2: \beta ] \cdot \mathsf {Put}[d_1 : \alpha ] \quad (d_1 \ne d_2)&\quad \qquad (5) \end{aligned}$$

The first law finds a satisfying linguistic justification: when a discourse referent is introduced, it is no longer presupposed. The second law ensures that introductions and uses of distinct discourse referents ignore each other.
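
The typing claim behind law (4) can be verified directly on law (4′). Assuming $a : \alpha $, $\mathsf {get}_d : \textsf {M}_{\mathsf {Get}[d : \alpha ]} \alpha $, and the usual graded typing of $\star $ and $ \eta $ (grades multiply under $\star $, and $ \eta $ has grade 1), the two sides are typed as follows:

$$\begin{aligned} \mathsf {put}_d a \,\star \, \lambda {\diamond }.\mathsf {get}_d&: \textsf {M}_{\mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]} \alpha \\ ( \eta a)^\blacktriangleright _{d}&: \textsf {M}_{1 \cdot \mathsf {Put}[d : \alpha ]} \alpha = \textsf {M}_{\mathsf {Put}[d : \alpha ]} \alpha \end{aligned}$$

The reduction on terms thus takes the grade $\mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ]$ to $\mathsf {Put}[d : \alpha ]$, which is exactly law (4).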

To illustrate, consider composing the two utterances john walks with he sits. Given the lexicon in Table 2, this miniature discourse receives the following meaning:

$$\begin{aligned}&\llbracket \textit{john walks; he sits}\rrbracket \\ =\;&((\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j})) \triangleleft {} \eta ( \lambda \phi , \psi . \phi \wedge \psi )) \triangleright {}(\mathsf {get}_d \,\star \, \lambda x. \eta (\textsf {sit} x)) \quad \quad \quad ({\hbox {by}}\,\, / , { \backslash }, \ \, \mu )\\ \longrightarrow \;&(\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }.\mathsf {get}_d )\,\star \, \lambda x. \eta (\textsf {walk} \textsf {j} \wedge \textsf {sit} x) \quad \quad \quad \quad ({\hbox {by Associativity, Left Identity}})\\ \longrightarrow \;&( \eta \textsf {j})^\blacktriangleright _{d} \,\star \, \lambda x. \eta (\textsf {walk} \textsf {j} \wedge \textsf {sit} x) \;\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad ({\hbox {by law}}\,(4))\\ \longrightarrow \;&\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk} \textsf {j} \wedge \textsf {sit} \textsf {j}) \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad ({\hbox {by Associativity, Left Identity}}) \end{aligned}$$
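
The final step can be spelled out as follows, writing $k$ (a name introduced here only for brevity) for the continuation $ \lambda x. \eta (\textsf {walk}\,\textsf {j} \wedge \textsf {sit}\,x)$ and unfolding the definition of $(\cdot )^\blacktriangleright _{d}$:

$$\begin{aligned}&( \eta \textsf {j})^\blacktriangleright _{d} \,\star \, k \\ =\;&( \eta \textsf {j} \,\star \, \lambda x.\mathsf {put}_d x \,\star \, \lambda {\diamond }. \eta x) \,\star \, k \quad ({\hbox {by definition of}}\, (\cdot )^\blacktriangleright _{d})\\ \longrightarrow \;&(\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta \textsf {j}) \,\star \, k \quad ({\hbox {by Left Identity}})\\ \longrightarrow \;&\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }.( \eta \textsf {j} \,\star \, k) \quad ({\hbox {by Associativity}})\\ \longrightarrow \;&\mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }. \eta (\textsf {walk}\,\textsf {j} \wedge \textsf {sit}\,\textsf {j}) \quad ({\hbox {by Left Identity and}}\, \beta {\hbox {-reduction}}) \end{aligned}$$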

The resulting meaning is of type $\textsf {M}_{\mathsf {Put}[d : e]} t$; it introduces a discourse referent (d), but has no anaphoric presupposition, despite the presence of the pronoun he. That is, its reference is resolved. Checking confluence is now a harder exercise than before. We can, however, convince ourselves that it holds by noting that the following re-association is confluent:

3.4 On the State Monad

Both Charlow (2014) and other work in monadic dynamic semantics have employed the state monad to model anaphora (Giorgolo & Unger, 2009; Unger, 2012). The foregoing formalization vindicates some of the state monad laws (laws (2) and (4)), but to get a full specification of the state monad, one additionally needs the following law:
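
In the standard equational theory of state, the remaining law says that reading a location and immediately writing the same value back does nothing. Transcribed into the present notation (a sketch; the transcription is an assumption based on that standard law), it would read:

$$\begin{aligned} \mathsf {get}_d \,\star \, \lambda x.\mathsf {put}_d x&= \eta \diamond \end{aligned}$$

Here the left-hand side has type $\textsf {M}_{\mathsf {Get}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ]} \diamond $, while the right-hand side has type $\textsf {M}_{1} \diamond $.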

To preserve types, this law on terms requires the following law to hold on grades:

$$\begin{aligned} \mathsf {Get}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ]&= 1&\quad \qquad (6) \end{aligned}$$

Such a law is problematic, however, as it contravenes confluence:
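
A sketch of how the conflict can arise, reading law (6) from left to right as a simplification and assuming that no other law relates $\mathsf {Put}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ]$ to $\mathsf {Put}[d: \alpha ]$: the grade $\mathsf {Put}[d: \alpha ] \cdot \mathsf {Get}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ]$ can be simplified in two ways, depending on which adjacent pair is rewritten first.

$$\begin{aligned} (\mathsf {Put}[d: \alpha ] \cdot \mathsf {Get}[d: \alpha ]) \cdot \mathsf {Put}[d: \alpha ]&\leadsto \mathsf {Put}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ]&\quad ({\hbox {by law}}\,(4))\\ \mathsf {Put}[d: \alpha ] \cdot (\mathsf {Get}[d: \alpha ] \cdot \mathsf {Put}[d: \alpha ])&= \mathsf {Put}[d: \alpha ] \cdot 1 = \mathsf {Put}[d: \alpha ]&\quad ({\hbox {by law}}\,(6)) \end{aligned}$$

Neither result contains a $\mathsf {Get}$, so laws (4) and (5) no longer apply, and the two paths end in grades that record different numbers of introductions of d.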

Thus not all of the state monad laws can be imported into our framework, given how we employ graded types. What is responsible for this difference? The state monad is a theory of memory locations. According to the corresponding model of state, such memory locations pre-exist the lifetime of a program, and can be updated any number of times. Using a state monad to model anaphora would thus require that a constant set of referents be handled by the discourse. In comparison, our encoding of discourse referents is more precise: we record at the level of grades the exact discourse referents either introduced or presupposed. For our purposes, there is a fundamental difference between introducing a discourse referent and not introducing it. A contrario, we ought to reject the hypothetical law (6), which implies that using a discourse referent and then introducing it is, in fact, equivalent to doing nothing.

3.5 Quantification

As a further step, we may introduce another grade, $\mathsf {Scope}$, in order to analyze expressions, such as every, which are commonly taken to denote generalized quantifier meanings. Like those introduced above, this grade is accompanied by its own primitive:

$$\begin{aligned} \mathsf {scope}: ((e \rightarrow t) \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Scope}} e \end{aligned}$$

Thus given a quantifier q of type $(e \rightarrow t) \rightarrow t$, $\mathsf {scope}\,q$ allows it to act as an entity at the level of values, i.e., in terms of the variable q binds, given that the primitive’s return type is e.^{Footnote 13}

Indeed, the scopes of natural language quantifiers have been observed to be restricted in certain ways: one common view is that a quantifier cannot take scope outside the smallest finite clause in which it occurs syntactically. For example, some cat fears every dog will chase it can be understood only to imply the existence of a single highly pessimistic cat. To capture the effect of scope islands, we also introduce an operation on grades and a primitive which introduces it:

The intent is that this primitive allows one to ensure that a value bound in ${\textit{body}}$ using $\mathsf {scope}\,q$ is not available outside of ${\textit{body}}$. This makes it possible to statically limit the scope of a variable bound by a quantifier.

The modularity provided by our approach allows us to import the laws regulating anaphora into the current setting. At the same time, we may describe the interactions between anaphora and quantification. To that end, we may state the following laws on grades:

$$\begin{aligned} \mathsf {Scope}\cdot \mathsf {Get}[d: \alpha ]&\leadsto \mathsf {Get}[d: \alpha ] \cdot \mathsf {Scope}&\quad \qquad (7) \end{aligned}$$

(8)

(9)

(10)

(11)

(12)

These laws are reflected on terms as follows:^{Footnote 14}

The occurrences of $\mathsf {Get}[d: \alpha ]$ inside a bracket can be pulled to its left [laws (7) and (12)]. Doing so, moreover, allows such an occurrence to meet a $\mathsf {Put}[d: \alpha ]$, which can then eliminate it.

Note that a law commuting $\mathsf {Put}[d: \alpha ]$ and $\mathsf {Scope}$ is absent. Indeed, the grade $\mathsf {Scope}\cdot \mathsf {Put}[d: \alpha ]$ corresponds to introducing an entity which may depend on another entity quantified over. Such a commutation should be rejected, as it would allow the introduced entity to escape its scope. An entity which is introduced inside a bracket, but before any $\mathsf {Scope}$ introduction, however, can be pulled out of the bracket, as per law (11).

A $\mathsf {Scope}$ introduced at the rightmost point of the body of a bracket can be reduced (law (9)): the operational interpretation of this law is to apply the quantifier to the returned property. If a discourse referent is introduced at the rightmost point of the body, immediately after $\mathsf {Scope}$, then the introduction is simply ignored (law (10)). This should remain true for any number of introduced entities, moreover. To avoid introducing a scheme of reduction laws, we may use a law such as the following one, which coalesces indefinitely many introductions into one (or splits them) as needed:

$$\begin{aligned} \mathsf {Put}[d_1: \alpha ] \cdot \mathsf {Put}[d_2:\beta ]&= \mathsf {Put}[\langle d_1, d_2\rangle : \alpha \times \beta ]&\qquad \qquad \qquad (13)\\ \mathsf {put}_{d_1} a \,\star \, \lambda {\diamond }. \mathsf {put}_{d_2} b&= \mathsf {put}_{\langle d_1, d_2\rangle } \langle a, b\rangle&\qquad \qquad \qquad (13^\prime ) \end{aligned}$$

3.6 Indefinites

We now turn to indefinite noun phrases. Here, we pursue the idea of Charlow (2014, 2020a, 2020b) that the meaning of an indefinite noun phrase is to non-deterministically choose an entity from the set defined by its restriction. To do so, we introduce a new grade, $\mathsf {Choose}[ \alpha ]$, indexed by a type $ \alpha $, and associate it with the following primitive:

$$\begin{aligned} \mathsf {choose }: ( \alpha \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Choose}[ \alpha ]} \alpha \end{aligned}$$

We additionally provide the following law on grades:

$$\begin{aligned} \mathsf {Choose}[ \alpha ] \cdot \mathsf {Choose}[\beta ]&\leadsto \mathsf {Choose}[ \alpha \times \beta ]&\quad \qquad (14) \end{aligned}$$

This law is reflected at the level of terms as follows:

$$\begin{aligned}&\mathsf {choose }s_1 \,\star \, \lambda x. \mathsf {choose }(s_2x) \,\star \, \lambda y. \eta \langle x, y\rangle \\ \longrightarrow \;&\mathsf {choose }( \lambda \langle x, y\rangle . s_1x \wedge s_2x y)&\quad \qquad (14^\prime ) \end{aligned}$$
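
For law (14′) to be well typed, the two restrictions must have the types assumed below, with the second restriction depending on the first choice. The variable names are those of the law itself, and the typing of $\star $ is assumed to be the usual graded one:

$$\begin{aligned} s_1&: \alpha \rightarrow t \qquad s_2 : \alpha \rightarrow \beta \rightarrow t \\ \mathsf {choose}\,s_1 \,\star \, \lambda x. \mathsf {choose}\,(s_2 x) \,\star \, \lambda y. \eta \langle x, y\rangle&: \textsf {M}_{\mathsf {Choose}[ \alpha ] \cdot \mathsf {Choose}[\beta ]} ( \alpha \times \beta ) \\ \mathsf {choose}\,( \lambda \langle x, y\rangle . s_1 x \wedge s_2 x y)&: \textsf {M}_{\mathsf {Choose}[ \alpha \times \beta ]} ( \alpha \times \beta ) \end{aligned}$$

The grades of the two sides are the left- and right-hand sides of law (14), respectively.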

Intuitively, what this law says is that choosing two values in sequence is the same as choosing them simultaneously, as a pair. When it comes to the interaction with other grades, the $\mathsf {Choose}[ \alpha ]$ grade behaves similarly to $\mathsf {Put}[d: \alpha ]$: it commutes to the left of $\mathsf {Put}[d: \alpha ]$ (law (15)), but not to the left of $\mathsf {Get}[d: \alpha ]$ or $\mathsf {Scope}$; moreover, it is forgotten once it is sandwiched between $\mathsf {Scope}$ and the end of a bracket [laws (18) and (19)]; however, it can nevertheless escape on the left of a bracket [law (17)].

$$\begin{aligned} \mathsf {Put}[d:\beta ] \cdot \mathsf {Choose}[ \alpha ]&\leadsto \mathsf {Choose}[ \alpha ] \cdot \mathsf {Put}[d:\beta ]&\quad \qquad (15)\\ \mathsf {Choose}[ \alpha ] \cdot \mathsf {Get}[d:\beta ]&\leadsto \mathsf {Get}[d:\beta ] \cdot \mathsf {Choose}[ \alpha ]&\quad \qquad (16) \end{aligned}$$

(17)

(18)

(19)

To remain concise, we transcribe only the laws on terms that relate $\mathsf {choose }$ and $\mathsf {scope}$ [laws (18) and (19)].

Table 3 Adding indefinites and quantifiers


To illustrate, consider the meaning derived for every dog sees a cat, given the updated lexicon in Table 3.

3.7 Determiners and Donkey Anaphora

The determiner algebra provides a new grade, $\mathsf {Det}$, from which we define a new primitive, $\mathsf {det}$, having the following type signature:

$$\begin{aligned} \mathsf {det}: ((e \rightarrow t) \rightarrow (e \rightarrow t) \rightarrow t) \rightarrow \textsf {M}_{\mathsf {Det}}((e \rightarrow t) \rightarrow (e \rightarrow t) \rightarrow t) \end{aligned}$$

$\mathsf {det}$ introduces a determiner meaning, which it merely returns. The utility of including determiners among the grades is manifest, however, when considering their interactions with other effects; in particular $\mathsf {Choose}[ \alpha ]$:

$$\begin{aligned} \mathsf {Det}\cdot \mathsf {Get}[d: \alpha ]&\leadsto \mathsf {Get}[d: \alpha ] \cdot \mathsf {Det}&\quad \qquad (20) \end{aligned}$$

(21)

(22)

(23)

(24)

Note that each of these laws has a corresponding law that involves $\mathsf {Scope}$, rather than $\mathsf {Det}$. Indeed, the corresponding laws on terms are analogous, except for laws (23) and (24), which are substantively different. Before we demonstrate this, we give the laws on terms for laws (21) and (22), which are realized by feeding a determiner meaning to its continuation:^{Footnote 15}

More interesting are laws (23) and (24), each of which can be realized in two ways. The first gives rise to a “weak” existential reading of donkey sentences, while the second gives rise to a “strong” universal reading.^{Footnote 16} We provide the two laws corresponding to law (23), as those for law (24) are uninterestingly different (i.e., they additionally erase an occurrence of $\mathsf {put}_d$).^{Footnote 17}

Table 4 Adding determiners


With the lexicon in Table 4, we may derive the following meaning for every new yorker who sees a dog pets it:

At this point, we have two options, depending on the reduction rule we choose to coincide with law (24). If we opt for the weak reading, we can continue as follows:

On this reading, every New Yorker who sees a dog pets at least one dog they see. If we opt instead for the strong reading, we can continue as follows:

Now, every New Yorker who sees a dog pets every dog they see; i.e., the reading attributed to donkey sentences by most dynamic semantic accounts.

4 Realization in Terms of a Pure Calculus

In this section, we provide meanings to the grades, the operations, and their relation in terms of the simply typed $ \lambda $-calculus with products (hereafter, STLC). We will only provide proof sketches here, but we note that the contents of this section and Sect. 3 have been formalized using the Agda proof assistant.

Theorem 1

(Coherence of reduction relations) If $t_1 : \textsf {M}_{g_1} \alpha $, $t_2 : \textsf {M}_{g_2} \alpha $, and $t_1 \longrightarrow t_2$, then $g_1 \leadsto g_2$.

Proof

By case analysis. $\square $

Definition 1

(Interpretation of grades) For every graded type $\textsf {M}_g \alpha $, there is a semantic interpretation $\llbracket \textsf {M}_g \alpha \rrbracket = S_{g}(\llbracket \alpha \rrbracket )$ as a type in the STLC (or, more generally, in the underlying typed $ \lambda $-calculus without effects). $\llbracket \cdot \rrbracket $ preserves STLC types and is defined on graded types as follows.

We stress that this interpretation is entirely modular in the sense that the meanings of the atomic effects are devised independently, without taking into account any interplay between effects. (It is a homomorphism on the grade structure.) As a rule, if the primitive operation associated with an effect takes as input an object of type X, then we take the product with X in the interpretation. Conversely, if such a primitive returns a type Y, then Y is found as the domain of an arrow in the interpretation. A consequence of this modularity is that all the results of this section can be proven in a modular fashion, by case analysis for each atomic grade. For grade composition, a straightforward induction applies.

Lemma 1

$S$ is a graded monad.

The proof relies on the following facts: (1) each atomic grade is interpreted as a functor; (2) the unit grade is interpreted as the identity functor; (3) the composition of grades is interpreted as functor composition.

Theorem 2

If $g_1 \leadsto g_2$, then there is a function $f : S_{g_1}( \alpha ) \rightarrow S_{g_2}( \alpha )$ for each STLC type $ \alpha $.

Proof

This is a constructive proof done by case analysis. The function f says how (the semantic interpretations of) effects are transformed by reductions. For instance, the law

$$\begin{aligned} \mathsf {Put}[d : \alpha ] \cdot \mathsf {Get}[d : \alpha ] \leadsto \mathsf {Put}[d : \alpha ] \end{aligned}$$

corresponds to functions $f : ( \alpha \times ( \alpha \rightarrow \beta )) ~\rightarrow ~ ( \alpha \times \beta )$, which pass the newly introduced value (of type $ \alpha $) to its continuation, which then uses it. That is, $f \langle x, k\rangle = \langle x , k x\rangle $. $\square $
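
The same recipe applies to law (5). Assuming, as the type of $f$ above suggests, that $S_{\mathsf {Put}[d: \alpha ]}(\gamma ) = \alpha \times \gamma $ and $S_{\mathsf {Get}[d: \beta ]}(\gamma ) = \beta \rightarrow \gamma $, a function witnessing $\mathsf {Put}[d_1 : \alpha ] \cdot \mathsf {Get}[d_2 : \beta ] \leadsto \mathsf {Get}[d_2: \beta ] \cdot \mathsf {Put}[d_1 : \alpha ]$ could be given as follows (the name $f^\prime $ is introduced here only for illustration):

$$\begin{aligned} f^\prime&: ( \alpha \times (\beta \rightarrow \gamma )) \rightarrow (\beta \rightarrow ( \alpha \times \gamma )) \\ f^\prime \langle a, k\rangle&= \lambda y. \langle a, k y\rangle \end{aligned}$$

Here the introduced value $a$ is carried along unchanged, while the presupposed value $y$ is still awaited from the context.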

We call the relation induced by such functions ‘$\llbracket g_1\leadsto g_2\rrbracket $’. (That is, $x\,\llbracket g_1 \leadsto g_2\rrbracket \,y$ iff $f x = y$, where f is a function provided by Theorem 2.)^{Footnote 18} Finally, it bears repeating that the above construction defines the semantics of the reduction relation, and is thus the keystone of the interpretation.

Definition 2

(Interpretation of terms) For every well-typed term $t : \textsf {M}_g \alpha $, we define an interpretation $\llbracket t\rrbracket $ such that $\llbracket t\rrbracket : S_{g}(\llbracket \alpha \rrbracket )$. The interpretations of $ \eta $ and $\,\star \,$ are given by the graded monadic structure of $S$ (Lemma 1). The recipe for interpreting each atomic grade is based straightforwardly on the type of the primitive giving rise to the grade. For example, $\llbracket \mathsf {get}_d\rrbracket = \lambda x.x$, etc.

Theorem 3

(Adequacy of the interpretation) The interpretation of terms respects the interpretation of grades and the interpretation of reductions as functions. Formally, if $t_1 : \textsf {M}_{g_1} \alpha $, $t_2 : \textsf {M}_{g_2} \alpha $, and $t_1 \longrightarrow t_2$, then $\llbracket t_1\rrbracket \llbracket {g_1} \leadsto {g_2}\rrbracket \llbracket t_2\rrbracket $.
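
As a sanity check, the statement can be traced through law (4′) with $a = \textsf {j}$. The computation below is a sketch under assumptions not spelled out above: $S_{\mathsf {Put}[d: e]}(\beta ) = e \times \beta $ and $S_{\mathsf {Get}[d: e]}(\beta ) = e \rightarrow \beta $ (as suggested by the proof of Theorem 2), $\llbracket \mathsf {put}_d \textsf {j}\rrbracket = \langle \textsf {j}, \diamond \rangle $, $\llbracket \eta x\rrbracket = x$, and $\star $ interpreted by applying the continuation under the functor. With $f \langle x, k\rangle = \langle x, k x\rangle $ as in the proof of Theorem 2:

$$\begin{aligned} \llbracket \mathsf {put}_d \textsf {j} \,\star \, \lambda {\diamond }.\mathsf {get}_d\rrbracket&= \langle \textsf {j}, \lambda x.x\rangle \\ f \langle \textsf {j}, \lambda x.x\rangle&= \langle \textsf {j}, ( \lambda x.x)\,\textsf {j}\rangle = \langle \textsf {j}, \textsf {j}\rangle \\ \llbracket ( \eta \textsf {j})^\blacktriangleright _{d}\rrbracket&= \langle \textsf {j}, \textsf {j}\rangle \end{aligned}$$

Under these assumptions, the interpretations of the two sides of law (4′) are related by $\llbracket \mathsf {Put}[d : e] \cdot \mathsf {Get}[d : e] \leadsto \mathsf {Put}[d : e]\rrbracket $, as adequacy requires.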

This theorem essentially tells us that the axiomatization of term reductions exactly fits the interpretations of grades. As a result, if one wishes, one may omit the axiomatization, and use only the interpretation and the corresponding reduction relation. We have chosen to present the axiomatic view to emphasize the operational behavior of terms having effects. If one is interested only in the end product (i.e., pure $ \lambda $-terms), then one would be better off axiomatizing grades and their relations only. This way, by omitting the axiomatization of operations and algebraic laws, one can describe their compositional meanings (as in Definitions 1 and 2) directly.

5 Related Work

5.1 Effects and Handlers

To improve compositionality, general effects and handlers systems have been proposed for dynamic semantics by Maršík (2016) and Maršík and Amblard (2014, 2016). In these approaches, new operations, such as $\mathsf {get}$ or $\mathsf {put}$, can be declared and defined locally in terms of the ambient calculus. These approaches have much in common with ours, insofar as they provide modular interpretations of the effectful operations they employ. Furthermore, while effectful meanings are defined in a typed extension of $ \lambda $-calculus, they yield terms of a pure $ \lambda $-calculus once they are handled.

The chief difference between the effects and handlers approach and the one advanced here, which makes algebraic laws central, is that the former approach demands that every occurrence of an operation be interpreted (i.e., handled) independently of the context in which it occurs. This requirement enforces absolute compositionality of interpretation, whereas our method does not. In other words, while our syntax is compositional, the eventual interpretation of a grade may depend on its context. Indeed, our reduction rules are written so that the meaning of an operation can depend on its neighbors. This design allows the interpretation of $\mathsf {Scope}$, for example, to occur only at the rightmost point in a bracket, where it may receive a function of type $e \rightarrow t$. Crucially, nevertheless, the results yielded by the applications of laws are compositional: due to associativity and confluence, one may safely apply reduction rules to a term m or a grade g independently of the context in which m or g occurs. When combining m with a continuation k, it suffices to consider their reduced forms: confluence guarantees that the result of $m \,\star \,k$ is the same, regardless of what reductions occur before their combination.

5.2 The Underlying Calculus

Even though we have assumed the STLC as our ambient calculus, monadic and algebraic effects approaches (and, more generally, approaches based on computational effects) are agnostic as to the type system used by the underlying $ \lambda $-calculus, be it Martin-Löf Type Theory (Martin-Löf, 1984) or one of its variants, System F (Girard, 1972), Cooper’s TTR (Cooper & Ginzburg, 2015), Asher’s TCL (Asher, 2011), etc. Thus our approach (like others) may be added to such systems without modifying the respective calculi.

5.3 Graded Effects

Our treatment of discourse phenomena in terms of grades is partially inspired by the interpretation of Cooper storage in terms of a graded applicative functor due to Kobele (2018a). Kobele employs grades that correspond to stores of quantifier meanings, in order to encode the types of both stored quantifiers and the variables they bind. We employ somewhat richer grades than Kobele, in order to encode, e.g., discourse referents. Such rich grades allow us to describe linguistically meaningful interactions at the level of types that reflect the algebraic laws that apply at the level of terms.

5.4 Modalities Instead of Graded Monads

Our presentation relies on the standard structure of $ \lambda $-calculi to encode dynamic effects as monads. This causes a certain amount of notational weight in the axiomatization. Namely, we have to use a family of operators $\triangleright {}$, $\triangleleft {}$, $\,\star \,$, etc., instead of simple functional application.

To avoid this overhead, an alternative presentation could use modalities to represent the combination of dynamic effects associated with a value. Several calculi supporting this kind of modality have been developed recently (Petricek et al., 2014; Orchard et al., 2019; Abel & Bernardy, 2020).

6 Conclusion

We have proposed a framework which both unifies and refines approaches to dynamic semantics based on monads. The key idea is to break down effects into atomic grades. The interactions among grades are provided by algebraic laws, which can be presented in a modular fashion. Even though the number of possible laws grows quadratically with the number of possible effects, far fewer laws than this theoretical maximum are needed if we exclude the mechanical commutation laws.

The process of applying this refinement reveals possible improvements to earlier analyses, for example regarding the interpretation of anaphora using the state monad (Sect. 3.4). The use of a bracketing operation to delimit scope appears to be new, and is an essential device in the interpretation of quantification effects.

Our framework can either be given a purely axiomatic treatment (Sect. 3), or, like many accounts, be provided as part of a pure $ \lambda $-calculus (Sect. 4). In future work, we intend to describe more effects within the same framework, including presupposition and conventional implicature.

Notes

The underlying category we employ will invariably be Cartesian closed. One may restrict attention to the category of sets and functions, for example.
We use slightly different terminology and notation than that found in Charlow (2014).
The meanings Charlow provides for quantificational noun phrases headed by every are assembled in terms of more primitive operators which he defines elsewhere. We have thus somewhat simplified his presentation for our purposes.
The most obvious candidate would be to throw out the state returned by the surrounding function on continuations—that is, such a ${\textit{lower}}$ would be defined as:
$$\begin{aligned} {\textit{lower}}\,m = \lambda s.\langle m\,s\,( \lambda \langle a, s^\prime \rangle .a), s\rangle \end{aligned}$$
Such an operation, however, would invariably discard the anaphoric potential of its argument, treating, e.g., quantifiers and proper names alike. (In contrast, choosing State to be the underlying monad would allow anaphoric side effects to survive when they do not arise from bona fide quantifiers.)
One may wonder if the algebraic approach could be recast in terms of monad transformers. An issue which would arise is that the relevant lifting operation depends both on the meaning to which it is applied and on the context. Thus if we have n different atomic effects, we must consider $n \times (n-1)$ combinations of them (one for each pair of effects). Furthermore, monad transformers are ill-equipped to deal with effect bracketing, which we introduce in Sect. 3.5.
We trace the idea of algebraic effects to the work of Kiselyov and Ishii (2015), Plotkin and Power (2001) and Plotkin and Pretnar (2008).
Our approach straightforwardly adapts to the setting of graded applicative functors (Kobele, 2018a). The two variants afford different dimensions of generalization.
Indeed, the choice between the simple and applicative variants of (/) in a derivational step is determined by the semantic types of the arguments being combined. Likewise for the choice between the variants of ($\backslash $). (Semantic types of the form $\textsf {M}_p \alpha \rightarrow \textsf {M}_q \beta $ will be encountered in Sect. 3.5.) The same quirk justifies the presence of the $ \mu $ rule, which we introduce next.
We encode here roughly the notion of discourse referents of Karttunen (1976).
This decision procedure tells whether or not there is co-reference. A possible implementation of it would be to match the properties of referents with predicates associated with anaphoric expressions (Bernardy et al., 2021).
Our framework is, in principle, agnostic about the type system of the underlying $ \lambda $-calculus. For instance, rich types, as proposed by Luo (2012), are supported, as is the simply typed $ \lambda $-calculus. Even though we will avoid rich types in our analysis, we note that they may be particularly beneficial when it comes to tracking discourse referents. For instance, law (2) generalizes as follows:
$$\begin{aligned} \mathsf {Get}[d : \alpha ] \cdot \mathsf {Get}[d : \beta ] \leadsto \mathsf {Get}[d : \alpha \wedge \beta ] \end{aligned}$$
(where ‘$ \alpha \wedge \beta $’ refers to the meet of types $ \alpha $ and $\beta $). Thus if two parts of a phrase refer to the same discourse referent, then the type associated with that discourse referent needs to be the meet of the types found in the parts.
Additionally, complex relations can be captured within the types of discourse referents. For example, the meaning of john sees his dog could be assigned the type $\textsf {M}_{\mathsf {Get}([d:\Sigma (x:\textsf {dog})(\textsf {have}(\textsf {j},x))])}$, which records a presupposition of the existence (via a $\Sigma $ type) of John’s dog. In the presence of rich types, one can additionally expect the types of the discourse referents to play a role in resolving anaphora.
Note that this algebra merely characterizes the logic of discourse referents, saying nothing about their accessibility from a cognitive standpoint.
The ‘$\mathsf {scope}$’ notation is inspired by Maršík (2016) and Maršík and Amblard (2016) though the constructors’ exact purpose and semantics are different between the two approaches.
We leave the proof of confluence to the reader. It relies on checking that appending something to the left-hand side of a reduction yields the same result as appending it to the right-hand side.
We omit the corresponding law for law (20), which is uninterestingly different from its variant involving $\mathsf {Scope}$.
An alternative approach to rendering dynamically potent determiner meanings out of static ones of type $(e \rightarrow t) \rightarrow (e \rightarrow t) \rightarrow t$ is provided by Kobele (2018b). The laws of Kobele, which inspire ours, incorporate the contexts and discourse continuations of de Groote (2006) by relying on $ \lambda $-homomorphisms.
In this framework, these two laws are formally incompatible. We could add non-determinism, but prefer not to, in order to avoid obscuring our main points. In general, however, a full account will provide the conditions under which each reading is available; see, e.g., Kanazawa (1994).
In addition, any two reduction functions associated with grade equality form an isomorphism; i.e., if $g_1= g_2$, then $S_{g_1}( \alpha ) \cong S_{g_2}( \alpha )$ (for any $ \alpha $). This can be seen by noting two facts. First, that the monoid laws regulating grades give rise to identity functions on terms, since 1 is interpreted via the identity functor, and functor composition is associative. Second, that otherwise equivalent grades require merely rearranging either the order of $ \lambda $-abstractions or the components of a tuple in the interpretation. For example, $\mathsf {Put}[d_1: \alpha ] \cdot \mathsf {Put}[d_2:\beta ] = \mathsf {Put}[d_2:\beta ] \cdot \mathsf {Put}[d_1: \alpha ]$ corresponds to the isomorphisms $\llbracket \alpha \rrbracket \times (\llbracket \beta \rrbracket \times \gamma ) \cong \llbracket \beta \rrbracket \times (\llbracket \alpha \rrbracket \times \gamma )$.

References

Abel, A., & Bernardy, J.-P. (2020). A unified view of modalities in type systems. Proceedings of the ACM on Programming Languages, 4, 90:1–90:28. https://doi.org/10.1145/3408972.
Asher, N. (2011). Lexical meaning in context: A web of words. Cambridge University Press. https://www.cambridge.org/core/books/lexical-meaning-in-context/F1D01632AD5B491A94860A350B9E764A.
Barker, C. (2002). Continuations and the nature of quantification. Natural Language Semantics, 10, 211–242. https://doi.org/10.1023/A:1022183511876.
Barker, C., & Shan, C.-C. (2014). Continuations and natural language. Oxford studies in theoretical linguistics (Vol. 53).
Bernardy, J.-P., Chatzikyriakidis, S., & Maskharashvili, A. (2021). A computational treatment of anaphora and its algorithmic implementation. Journal of Logic, Language and Information, 30, 1–29. https://doi.org/10.1007/s10849-020-09322-7.
Charlow, S. (2014). On the semantics of exceptional scope. PhD Thesis, NYU, New York. https://semanticsarchive.net/Archive/2JmMWRjY.
Charlow, S. (2020a). The scope of alternatives: Indefiniteness and islands. Linguistics and Philosophy, 43, 427–472. https://doi.org/10.1007/s10988-019-09278-3.
Charlow, S. (2020b). Static and dynamic exceptional scope. Rutgers University. LingBuzz. https://ling.auf.net/lingbuzz/004650.
Cooper, R., & Ginzburg, J. (2015). Type theory with records for natural language semantics. In The handbook of contemporary semantic theory (pp. 375–407). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118882139.ch12.
de Groote, P. (2001). Type raising, continuations, and classical logic. In van Rooij, R., & Stokhof, M. (Eds.), Proceedings of the 13th Amsterdam colloquium (pp. 97–101). Institute for Logic, Language and Computation, Universiteit van Amsterdam.
de Groote, P. (2006). Towards a Montagovian account of dynamics. Semantics and Linguistic Theory, 16, 1–16. https://journals.linguisticsociety.org/proceedings/index.php/SALT/article/view/2952.
Giorgolo, G., & Asudeh, A. (2012). $\langle {\text{M}}, \eta , \star \rangle $ Monads for conventional implicatures. In A. Aguilar Guevara, A. Chernilovskaya, & R. Nouwen (Eds.), Proceedings of Sinn und Bedeutung 16, MITWPL (pp. 265–278). http://mitwpl.mit.edu/open/sub16/Giorgolo.pdf.
Giorgolo, G., & Unger, C. (2009). Coreference without discourse referents. In B. Plank, E. T. K. Sang, & T. Van de Cruys (Eds.), Proceedings of the 19th meeting of computational linguistics in the Netherlands (pp. 69–81).
Girard, J.-Y. (1972). Interprétation fonctionnelle et élimination des coupures de l’arithmétique d’ordre supérieur. Doctoral Dissertation, Université Paris 7.
Grove, J. (2019). Scope-taking and presupposition satisfaction. PhD Thesis, University of Chicago, Chicago. https://semanticsarchive.net/Archive/TRmOTkzM.
Kanazawa, M. (1994). Weak vs. strong readings of donkey sentences and monotonicity inference in a dynamic setting. Linguistics and Philosophy, 17, 109–158. https://doi.org/10.1007/BF00984775.
Karttunen, L. (1976). Discourse referents. In Notes from the linguistic underground. Syntax and semantics (Vol. 7). Academic Press. http://web.stanford.edu/~laurik/publications/archive/discref.pdf.
Katsumata, S. (2014). Parametric effect monads and semantics of effect systems. In Proceedings of the 41st ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL’14 (pp. 633–645). Association for Computing Machinery. https://doi.org/10.1145/2535838.2535846.
Kiselyov, O., & Ishii, H. (2015). Freer monads, more extensible effects. ACM SIGPLAN Notices, 50, 94–105. https://doi.org/10.1145/2887747.2804319.
Kobele, G. M. (2018a). The Cooper storage idiom. Journal of Logic, Language and Information, 27, 95–131. https://doi.org/10.1007/s10849-017-9263-1.
Kobele, G. M. (2018b). Modularizing semantics. https://home.uni-leipzig.de/gkobele/files/slides/Kobele18Frankfurt.pdf.
Luo, Z. (2012). Common nouns as types. In D. Béchet, & A. Dikovsky (Eds.), Logical aspects of computational linguistics. Lecture notes in computer science (pp. 173–185). Springer.
Martin-Löf, P. (1984). Intuitionistic type theory. Bibliopolis. https://archive-pml.github.io/martin-lof/pdfs/Bibliopolis-Book-retypeset-1984.pdf.
Maršík, J. (2016). Effects and handlers in natural language. PhD thesis, Université de Lorraine. https://hal.inria.fr/tel-01417467.
Maršík, J., & Amblard, M. (2014). Algebraic effects and handlers in natural language interpretation. In V. de Paiva, W. Neuper, P. Quaresma, C. Retoré, L. S. Moss, & J. Saludes (Eds.), Natural language and computer science, volume TR 2014-002 of joint proceedings of the second workshop on natural language and computer science (NLCS’14) & 1st international workshop on natural language services for reasoners (NLSR 2014). Center for Informatics and Systems of the University of Coimbra. https://hal.archives-ouvertes.fr/hal-01079206.
Maršík, J., & Amblard, M. (2016). Introducing a calculus of effects and handlers for natural language semantics. In A. Foret, G. Morrill, R. Muskens, R. Osswald, & S. Pogodalla (Eds.), Formal grammar. Lecture notes in computer science (pp. 257–272). Springer.
Mycroft, A., Orchard, D., & Petricek, T. (2016). Effect systems revisited—Control-flow algebra and semantics. In C. W. Probst, C. Hankin, & R. Rydhof Hansen (Eds.), Semantics, logics, and calculi: Essays dedicated to Hanne Riis Nielson and Flemming Nielson on the occasion of their 60th birthdays. Lecture notes in computer science (pp. 1–32). Springer International Publishing. https://doi.org/10.1007/978-3-319-27810-0_1.
Orchard, D., Liepelt, V.-B., & Eades, H., III. (2019). Quantitative program reasoning with graded modal types. Proceedings of the ACM on Programming Languages, 3, 110:1–110:30. https://doi.org/10.1145/3341714.
Petricek, T., Orchard, D., & Mycroft, A. (2014). Coeffects: A calculus of context-dependent computation. In Proceedings of the 19th ACM SIGPLAN international conference on Functional programming, ICFP’14 (pp. 123–135). Association for Computing Machinery. https://doi.org/10.1145/2628136.2628160.
Plotkin, G., & Pretnar, M. (2008). A logic for algebraic effects. In 2008 23rd Annual IEEE symposium on logic in computer science (pp. 118–129).
Plotkin, G. D., & Power, J. (2001). Adequacy for algebraic effects. In Proceedings of the 4th international conference on foundations of software science and computation structures, FoSSaCS’01 (pp. 1–24). Springer.
Shan, C. (2002). Monads for natural language semantics. arXiv:cs/0205026.
Unger, C. (2012). Dynamic semantics as monadic computation. In M. Okumura, D. Bekki, & K. Satoh (Eds.), New frontiers in artificial intelligence. Lecture notes in computer science (pp. 68–81). Springer.

Funding

Open access funding provided by University of Gothenburg.

Author information

Authors and Affiliations

Centre for Linguistic Theory and Studies in Probability, Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg, Gothenburg, Sweden
Julian Grove & Jean-Philippe Bernardy

Authors

Julian Grove
Jean-Philippe Bernardy

Corresponding author

Correspondence to Julian Grove.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Grove, J., Bernardy, JP. Algebraic Effects for Extensible Dynamic Semantics. J of Log Lang and Inf 32, 219–245 (2023). https://doi.org/10.1007/s10849-022-09378-7

Accepted: 27 July 2022
Published: 26 August 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s10849-022-09378-7
