The following list contains a survey of some important and recent research in modeling face-to-face conversation. The list below is presented as a guide to the literature by topic and date; we include complete citations afterwards in alphabetical order. For brevity, research works are keyed by first author and date only (we use these keys on the slides as well as in this list). Of course, most papers are multiply authored. The list is not intended to be exhaustive. Our primary aim is simply to provide bibliographic information for all the research that we will refer to during the ESSLLI class itself. The entries also provide a sampling from ongoing research projects so that you can get an overall sense of the state of the field and begin to follow up topics of particular interest to you.
Utterances in situated activity are about the world. Theories and systems normally capture this by assuming references must be resolved to real-world entities in utterance understanding. We describe a number of puzzles and problems for this approach, and propose an alternative semantic representation using discourse relations that link utterances to the nonlinguistic context to capture the context-dependent interpretation of situated utterances. Our approach promises better empirical coverage and more straightforward system building. Substantiating these advantages is work in progress.
cal practice: the enterprise of specifying information about the world for use in computer systems. Knowledge representation as a field also encompasses conceptual results that call practitioners’ attention to important truths about the world, mathematical results that allow practitioners to make these truths precise, and computational results that put these truths to work. This chapter surveys this practice and its results, as it applies to the interpretation of natural language utterances in implemented natural language processing systems. For a broader perspective on such technical practice, in all its strengths and weaknesses, see (Agre 1997). Knowledge representation offers a powerful general tool for the science of language. Computational logic, a prototypical formalism for representing knowledge about the world, is also the model for the level of logical form that linguists use to characterize the grammar of meaning (Larson and Segal 1995). And researchers from (Schank and Abelson 1977) to (Shieber 1993) and (Bos to appear) have relied crucially on such representations, and the inference methods associated with them, in articulating accounts of semantic relations in language, such as synonymy, entailment, informativeness and contradiction. The new textbooks (Blackburn and Bos 2002a, Blackburn and Bos 2002b) provide an excellent grounding in this research, and demonstrate how deeply computational ideas from knowledge representation can inform pure linguistic study. In this short chapter, I must leave much of..
This chapter investigates the computational consequences of a broadly Gricean view of language use as intentional activity. In this view, dialogue rests on coordinated reasoning about communicative intentions. The speaker produces each utterance by formulating a suitable communicative intention. The hearer understands it by recognizing the communicative intention behind it. When this coordination is successful, interlocutors succeed in considering the same intentions (that is, the same representations of utterance meaning) as the dialogue proceeds. In this paper, I emphasize that these intentions can be formalized; we can provide abstract but systematic representations that spell out what a speaker is trying to do with an utterance. Such representations describe utterances simultaneously as the product of our knowledge of grammar and as actions chosen for a reason. In particular, they must characterize the speaker’s utterance in grammatical terms, provide the links to the context that the grammar requires, and so arrive at a contribution that the speaker aims to achieve. Because I have implemented this formalism, we can regard it as a possible analysis of conversational processes at the level of computational theory. Nevertheless, this analysis leaves open what the nature of the biological computation involved in inference to intentions is, and what regularities in language use support this computation.
Templates are a widespread natural language technology that achieves believability within a narrow range of interaction and coverage. We consider templates for embodied conversational behavior. Such templates combine a specific pattern of marked-up text, specifying prosody and conversational signals as well as words, with similarly-annotated gaps that can be filled in by rule to yield a coherent contribution to a dialogue with a user. In this paper we argue that templates can give a designer substantial freedom to realize specific combinations of behaviors in interactions with users and thereby to explore the relationships among such factors as emotion, personality, individuality and social role.
We use the interpretation of vague scalar predicates like small as an illustration of how systematic semantic models of dialogue context enable the derivation of useful, fine-grained utterance interpretations from radically underspecified semantic forms. Because dialogue context suffices to determine salient alternative scales and relevant distinctions along these scales, we can infer implicit standards of comparison for vague scalar predicates through completely general pragmatics, yet closely constrain the intended meaning to within a natural range.
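To make the idea of a context-determined standard of comparison concrete, here is a minimal sketch. It is not the method of the paper summarized above: the function names `infer_threshold` and `resolve_small`, and the heuristic of placing the standard in the widest gap between the sizes of salient objects, are assumptions introduced purely for illustration.

```python
def infer_threshold(sizes):
    """Hypothetical heuristic: put the standard of comparison for "small"
    at the midpoint of the widest gap between the salient sizes."""
    ordered = sorted(set(sizes))
    if len(ordered) < 2:
        return None  # no contrast in context, so no standard can be inferred
    # Pair each gap width with the midpoint of that gap.
    gaps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    # Widest gap wins; on equal widths the larger midpoint is chosen.
    return max(gaps)[1]


def resolve_small(objects):
    """Return the names of the objects that count as "small" in this context."""
    threshold = infer_threshold(objects.values())
    if threshold is None:
        return []
    return sorted(name for name, size in objects.items() if size < threshold)


if __name__ == "__main__":
    # Illustrative context: three squares with made-up side lengths.
    context = {"square_a": 2.0, "square_b": 2.5, "square_c": 6.0}
    print(resolve_small(context))
```

The same word picks out different objects as the context changes, which is the context dependence the abstract describes; a fuller treatment would also have to choose among alternative scales.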
This paper presents a simple and versatile tree-rewriting lexicalized grammar formalism, TAGLET, that provides an effective scaffold for introducing advanced topics in a survey course on natural language processing (NLP). Students who implement a strong competence TAGLET parser and generator simultaneously get experience with central computer science ideas and develop an effective starting point for their own subsequent projects in data-intensive and interactive NLP.
We describe a methodology for learning a disambiguation model for deep pragmatic interpretations in the context of situated task-oriented dialogue. The system accumulates training examples for ambiguity resolution by tracking the fates of alternative interpretations across dialogue, including subsequent clarificatory episodes initiated by the system itself. We illustrate with a case study building maximum entropy models over abductive interpretations in a referential communication task. The resulting model correctly resolves 81% of ambiguities left unresolved by an initial handcrafted baseline. A key innovation is that our method draws exclusively on a system’s own skills and experience and requires no human annotation.
In this paper, we introduce a system, Sentence Planning Using Description, which generates collocations within the paradigm of sentence planning. SPUD simultaneously constructs the semantics and syntax of a sentence using a Lexicalized Tree Adjoining Grammar (LTAG). This approach captures naturally and elegantly the interaction between pragmatic and syntactic constraints on descriptions in a sentence, and the inferential and lexical interactions between multiple descriptions in a sentence. At the same time, it exploits linguistically motivated, declarative specifications of the discourse functions of syntactic constructions to make contextually appropriate syntactic choices.
This paper pursues a formal analogy between natural language dialogue and collaborative real-world action in general. The analogy depends on an analysis of two aspects of collaboration that figure crucially in language use. First, agents must be able to coordinate abstractly about future decisions which cannot be made on present information. Second, when agents finally take such decisions, they must again coordinate in order to interpret one another’s actions as collaborative. The contribution of this paper is a general representation of collaborative plans and intentions, inspired by representations of deductions in logics of knowledge, action and time, which supports these two kinds of coordination. Such representations..
Building animated conversational agents requires developing a fine-grained analysis of the motions and meanings available to interlocutors in face-to-face conversation and implementing strategies for using these motions and meanings to communicate effectively. In this paper, we describe our research on signaling uncertainty on an animated face as an end-to-end case study of this process. We sketch our efforts to characterize people’s facial displays of uncertainty in face-to-face conversation in ways that allow us to simulate those behaviors in an animated agent. Our work has led to new insights into the structure, timing, expressive content and communicative function of facial actions that we must take into account to explain our empirical findings and to build agents that reproduce people’s effective use of the face in managing the dynamics of conversation.
In modal subordination, a modal sentence is interpreted relative to a hypothetical scenario introduced in an earlier sentence. In this paper, I argue that this phenomenon reflects the fact that the interpretation of modals is an ANAPHORIC process. Modal morphemes introduce sets of possible worlds, representing alternative hypothetical scenarios, as entities into the discourse model. Their interpretation depends on evoking sets of worlds recording described and reference scenarios, and relating such sets to one another using familiar notions of restricted, preferential quantification. This proposal relies on an extended model of environments in dynamic semantics to keep track of associations between possible worlds and ordinary individuals; it assumes that modal meanings and other lexical meanings encapsulate quantification over possible worlds. These two innovations are required in order for modals to refer to sets of possible worlds directly as static objects in place of the inherently dynamic objects (quite different from the referents of pronouns and tenses) used in previous accounts. The simpler proposal that results offers better empirical coverage and suggests a new parallel between modal and temporal interpretation.
Interdisciplinary investigations marry the methods and concerns of different fields. Computer science is the study of precise descriptions of finite processes; semantics is the study of meaning in language. Thus, computational semantics embraces any project that approaches the phenomenon of meaning by way of tasks that can be performed by following definite sets of mechanical instructions. So understood, computational semantics revels in applying semantics, by creating intelligent devices whose broader behavior fits the meanings of utterances, and not just their form. IBM’s Watson (Ferrucci, Brown, Chu-Carroll, Fan, Gondek, Kalyanpur, Lally, Murdock, Nyberg, Prager, Schlaefer & Welty 2010) is a harbinger of the excitement and potential of this technology.
This paper argues for teaching computer science to linguists through a general course at the introductory graduate level whose goal is to prepare students of all backgrounds for collaborative computational research, especially in the sciences. We describe our work over the past three years in creating a model course in the area, called Computational Thinking. What makes this course distinctive is its combined emphasis on the formulation and solution of computational problems, strategies for interdisciplinary communication, and critical thinking about computational explanations.
We present an algorithm for simultaneously constructing both the syntax and semantics of a sentence using a Lexicalized Tree Adjoining Grammar (LTAG). This approach captures naturally and elegantly the interaction between pragmatic and syntactic constraints on descriptions in a sentence, and the inferential interactions between multiple descriptions in a sentence. At the same time, it exploits linguistically motivated, declarative specifications of the discourse functions of syntactic constructions to make contextually appropriate syntactic choices.
The meanings of donkey sentences cannot be captured using a procedure which, like Montague’s, uses the existential quantifiers of classical logic to translate indefinites and the variables to translate pronouns. The treatment of these examples requires meanings which depend on the context in which sentences appear, and thus necessitates a logic which models this context to some extent. If context is represented as the information conveyed in discourse, and the meanings of pronouns are enriched to depend on this information, the result is the E-Type approach (ETA) adapted by Heim (1990) from proposals in Evans (1980) and Cooper (1979). If the context is represented as a list of potential referents, and the meanings of indefinites are enriched to introduce new referents into this list, the result is a compositional formulation like Groenendijk and Stokhof’s (1990) of the discourse representation theory (DRT) of Kamp (1981) and Heim (1982). Either tack suffices to capture the way in which the referents of he and it systematically correspond to the alternative possibilities described by the antecedent. Disjunction offers a parallel way of introducing alternatives in the antecedent of a conditional, as shown in (2).
We translate sentence generation from TAG grammars with semantic and pragmatic information into a planning problem by encoding the contribution of each word declaratively and explicitly. This allows us to exploit the performance of off-the-shelf planners. It also opens up new perspectives on referring expression generation and the relationship between language and action.
We present a formal analysis of iconic coverbal gesture. Our model describes the incomplete meaning of gesture that’s derivable from its form, and the pragmatic reasoning that yields a more specific interpretation. Our formalism builds reported.
When we wish to frame or to communicate a precise and nuanced argument, we should first clarify whatever meaningful distinctions our reasoning exploits. That’s why every good paper begins by defining its terms. A tiger is a large and ferocious predatory cat, yellow with black stripes. A bachelor is an unmarried man. Freedom is the capacity to choose one’s actions for oneself, independent of causal forces in the outside world. Knowledge is justified true belief. Getting clear on our concepts is the process of analysis. It is such a fundamental part of philosophical practice that the preponderance of contemporary philosophical writing in English today is described as ‘analytic’.
Both formal semantics and cognitive semantics are the source of important insights about language. By developing precise statements of the rules of meaning in fragmentary, abstract languages, formalists have been able to offer perspicuous accounts of how we might come to know such rules and use them to communicate with others. Conversely, by charting the overall landscape of interpretations, cognitivists have documented how closely interpretations draw on the commonsense knowledge that lets us make our way in the world. There is no opposition between these insights. Sooner or later we will have a semantics that responds to both. However, developing such a semantics is profoundly difficult, because there are certain tensions to be overcome in reconciling the two perspectives. For one thing, the overall landscape of meaning does seem to be characterized by a much richer ontology and more dynamic categories than are exhibited by the fragments typically studied in the formal tradition. One sign of strain is the recent tendency to talk of “procedural”, “non-compositional”, or “computational” semantics, as in Hamm, Kamp and van Lambalgen 2006, hereafter HK&vL. We think such locutions can serve as useful reminders to keep semantics fixed on the central question of how language allows us to share information that some have and others need to get. However, there is some danger that formalists will merely be put off by an idea that, taken literally, may not be such a good one. In this short article, we want to explore and defend the traditional realist view attributed by HK&vL to Lewis among others. In fact, this view offers a well-developed, extremely straightforward and robust account of the relation between semantics and cognition.
Moreover, while the realist view has ways of accommodating the representationalist insights of DRT (Lewis 1979; Thomason 1990; Stalnaker 1998), it remains unclear how “computational” semantics can account for the key data for the realist view: cases where we judge interlocutors to be ignorant about aspects of meaning in their native language (Kripke 1972; Putnam 1975; Stalnaker 1979; Williamson 1994).
The mid-twentieth century saw the introduction of a new general model of processes, COMPUTATION, with the work of scientists such as Turing, Chomsky, Newell and Simon. This model so revolutionized the intellectual world that the dominant scientific programs of the day, spearheaded by such eminent scientists as Hilbert, Bloomfield and Skinner, are today remembered as much for the way computation exposed their stark limitations as for their positive contributions. Ever since, the field of Artificial Intelligence (AI) has defined itself as the subfield of computer science dedicated to the understanding of intelligent entities as computational processes. Now, drawing on fifty years of results of increasing breadth and applicability, we can also characterize AI research as a concrete practice: an ENGINEER-.
In abductive planning, plans are constructed as reasons for an agent to act: plans are demonstrations in logical theory of action that a goal will result assuming that given actions occur successfully. This paper shows how to construct plans abductively for an agent that can sense the world to augment its partial information. We use a formalism that explicitly refers not only to time but also to the information on which the agent deliberates. Goals are reformulated to represent the successive stages of deliberation and action the agent follows in carrying out a course of action, while constraints on assumed actions ensure that an agent at each step performs a specific action selected for its known effects. The result is a simple formalism that can directly inform extensions to implemented planners.
The cognitive hierarchy model is an approach to decision making in multi-agent interactions motivated by laboratory studies of people. It bases decisions on empirical assumptions about agents’ likely play and agents’ limited abilities to second-guess their opponents. It is attractive as a model of human reasoning in economic settings, and has proved successful in designing agents that perform effectively in interactions not only with similar strategies but also with sophisticated agents, with simpler computer programs, and with people. In this paper, we explore the qualitative structure of iterating best response solutions in two repeated games, one without messages and the other including communication in the form of non-binding promises and threats. Once the model anticipates interacting with sufficiently sophisticated agents with a sufficiently high probability, reasoning leads to policies that disclose intentions truthfully, and expect credibility from the agents they interact with, even as those policies act aggressively to discover and exploit other agents’ weaknesses and idiosyncrasies. Non-binding communication improves overall agent performance in our experiments.
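The iterated best-response reasoning behind the cognitive hierarchy model can be sketched as follows. This is a minimal illustration of the standard Poisson cognitive hierarchy in a one-shot symmetric matrix game, not the repeated-game setup with promises and threats studied in the paper above. The function names, the example payoff matrix, and the value `tau=1.5` are assumptions chosen for the example.

```python
import math


def poisson_weights(k, tau):
    """Beliefs of a level-k agent over opponent levels 0..k-1:
    Poisson(tau) frequencies, renormalized to sum to one."""
    raw = [math.exp(-tau) * tau ** j / math.factorial(j) for j in range(k)]
    total = sum(raw)
    return [w / total for w in raw]


def cognitive_hierarchy(payoff, max_level, tau=1.5):
    """Return the strategy (a probability vector over actions) of each
    level 0..max_level in a symmetric game.

    payoff[i][j] is a player's payoff for action i against opponent action j.
    """
    n = len(payoff)
    strategies = [[1.0 / n] * n]  # level 0 randomizes uniformly
    for k in range(1, max_level + 1):
        weights = poisson_weights(k, tau)
        # Believed opponent play: a mixture of the lower levels' strategies.
        belief = [sum(w * s[j] for w, s in zip(weights, strategies))
                  for j in range(n)]
        expected = [sum(payoff[i][j] * belief[j] for j in range(n))
                    for i in range(n)]
        # Best response; ties go to the lowest-numbered action.
        best = max(range(n), key=lambda i: expected[i])
        strategies.append([1.0 if i == best else 0.0 for i in range(n)])
    return strategies


if __name__ == "__main__":
    # Illustrative stag hunt: action 0 = stag, action 1 = hare.
    stag_hunt = [[4, 0],
                 [3, 3]]
    for level, strategy in enumerate(cognitive_hierarchy(stag_hunt, 3)):
        print(level, strategy)
```

Each level best-responds only to lower levels, which is how the model captures agents' limited ability to second-guess their opponents.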
Algorithms for NLG. NLG is typically broken down into stages of discourse planning (to select information and organize it into coherent paragraphs), sentence planning (to choose words and structures to fit information into sentence-sized units), and realization (to determine surface form of output, including word order, morphology and final formatting or intonation). The SPUD system combines the generation steps of sentence planning and surface realization by using a lexicalized grammar to construct the syntax and semantics of a sentence simultaneously.
We study prefixed tableaux for first-order multi-modal logic, providing proofs for soundness and completeness theorems, a Herbrand theorem on deductions describing the use of Herbrand or Skolem terms in place of parameters in proofs, and a lifting theorem describing the use of variables and constraints to describe instantiation. The general development applies uniformly across a range of regimes for defining modal operators and relating them to one another; we also consider certain simplifications that are possible with restricted modal theories and fragments.
An essential ingredient of language use is our ability to reason about utterances as intentional actions. Linguistic representations are the natural substrate for such reasoning, and models from computational semantics can often be seen as providing an infrastructure to carry out such inferences from rich and accurate grammatical descriptions. Exploring such inferences offers a productive pragmatic perspective on problems of interpretation, and promises to leverage semantic representations in more flexible and more general tools that compute with meaning.
This paper gives a new, proof-theoretic explanation of partial-order reasoning about time in a nonmonotonic theory of action. The explanation relies on the technique of lifting ground proof systems to compute results using variables and unification. The ground theory uses argumentation in modal logic for sound and complete reasoning about specifications whose semantics follows Gelfond and Lifschitz’s language . The proof theory of modal logic A represents inertia by rules that can be instantiated by sequences of time steps or events. Lifting such rules introduces string variables and associates each proof with a set of string equations; these equations are equivalent to a set of partial-order tree-constraints that can be solved efficiently. The defeasible occlusion of inertia likewise imposes partial-order constraints in the lifted system. By deriving an auxiliary partial order representation of action from the underlying logic, not the input formulas or proofs found, this paper strengthens the connection between practical planners and formal theories of action. Moreover, the general correctness of the theory of action justifies partial-order representations not only for forward reasoning from a completely specified start state, but also for explanatory reasoning and for reasoning by cases.
This paper develops a general approach to contextual reasoning in natural language processing. Drawing on the view of natural language interpretation as abduction (Hobbs et al., 1993), we propose that interpretation provides an explanation of how an utterance creates a new discourse context in which its interpreted content is both true and prominent. Our framework uses dynamic theories of semantics and pragmatics, formal theories of context, and models of attentional state. We describe and illustrate a Prolog implementation.
We use a dynamic, context-sensitive approach to abductive interpretation to describe coordinated processes of understanding, generation and accommodation in dialogue. The agent updates the dialogue uniformly for its own and its interlocutors’ utterances, by accommodating a new context, inferred abductively, in which utterance content is both true and prominent. The generator plans natural and comprehensible utterances by exploiting the same abductive preferences used in understanding. We illustrate our approach by formalizing and implementing some interactions between information structure and the form of referring expressions.
We relate the theory of presupposition accommodation to a computational framework for reasoning in conversation. We understand presuppositions as private commitments the speaker makes in using an utterance but expects the listener to recognize based on mutual information. On this understanding, the conversation can move forward not just through the positive effects of interlocutors’ utterances but also from the retrospective insight interlocutors gain about one another’s mental states from observing what they do. Our title, ENLIGHTENED UPDATE, highlights such cases. Our approach fleshes out two key principles: that interpretation is a form of intention recognition; and that intentions are complex informational structures, which specify commitments to conditions and to outcomes as well as to actions. We present a formalization and implementation of these principles for a simple conversational agent, and draw on this case study to argue that pragmatic reasoning is holistic in character, continuous with common-sense reasoning about collaborative activities, and most effectively characterized by associating specific, reliable interpretive constraints directly with grammatical forms. In showing how to make such claims precise and to develop theories that respect them, we illustrate the general place of computation in the cognitive science of language.
We cannot explain our diverse practices for engaging with imagery through general pragmatic mechanisms. There is no general mechanism behind practices like metaphor and irony. Metaphor works the way it works; irony works the way it works.
The commonplace view about metaphorical interpretation is that it can be characterized in traditional semantic and pragmatic terms, thereby assimilating metaphor to other familiar uses of language. We will reject this view, and propose in its place the view that, though metaphors can issue in distinctive cognitive and discourse effects, they do so without issuing in metaphorical meaning and truth, and so, without metaphorical communication. Our inspiration derives from Donald Davidson’s critical arguments against metaphorical meaning and Richard Rorty’s exploration of the diverse uses of language. But unlike these authors we ground our discussion squarely in distinctions about causal mechanisms in cooperative activity developed by H.P. Grice and others.