Natural Language Semantics and Computability

Abstract

This paper is a reflection on the computability of natural language semantics. It does not contain a new model or new results in the formal semantics of natural language: it is rather a computational analysis, in the context of type-logical grammars, of the logical models and algorithms currently used in natural language semantics, defined as a function from a grammatical sentence to a (non-empty) set of logical formulas (because a statement can be ambiguous, it can correspond to multiple formulas, one for each reading). We argue that as long as we do not explicitly compute the interpretation in terms of possible world models, one can compute the semantic representation(s) of a given statement, including aspects of lexical meaning. This is a very generic process, so the results are, at least in principle, widely applicable. We also discuss the algorithmic complexity of this process.

Notes

  1. The models produced by Montague are uncountably infinite, since they start from countably infinite sets, then use the powerset operation (Montague 1970; Thomason 1974, p. 194).

  2. This is true already for second-order logic and its standard, set-theoretic models (van Dalen 2013; Shapiro 1991). As Shapiro (1991, p. vii) puts it succinctly: “there can be no complete effective deductive system for second-order logic”.

  3. There are exactly two (non-equivalent) proofs of this sentence. The second proof using the same premisses corresponds to the second, more prominent reading of the sentence, whose lambda term is \((every\ kid)\,(\lambda x^e. (a\ cartoon)(\lambda y^e. ((watched\ y)\ x)))\) (see the worked reduction after these notes).

  4. We use the standard convention of translating a term \(((p\, y)\, x)\) into the predicate \(p(x,y)\).

  5. For many algorithms, the complexity is a function of the number of atomic subformulas of the formulas in the sentence. Empirical estimation shows that this number is a bit over twice the number of words in a sentence (Moot 2015b).

  6. For the algorithm of Pentus (2010), the order appears as an exponent in the worst-case complexity: for a grammar of order n there is a multiplicative factor of \(2^{5(n+1)}\); already for order 3 this factor is \(2^{20}\), over a million. So, though polynomial, this algorithm is not necessarily efficient in practice.

  7. The last two items are sometimes stated differently: with the weaker condition of “constant growth” instead of semilinearity, and the stronger condition of polynomial parsing instead of polynomial fixed recognition. Since all other properties are properties of formal languages, we prefer the formal language theoretic notion of polynomial fixed recognition.

  8. We can sidestep the need for a Pentus-like proof by looking only at fragments of order 1, since some order-1 fragments are known to generate mildly context-sensitive languages (de Groote and Pogodalla 2004; Wijnholds 2011).

  9. Of course, when our goal is to generate (subsets of) the n! different proofs rather than a single proof (if one exists), we are no longer in NP. However, it is unknown whether an algorithm exists which produces a sort of shared representation for all such subsets such that (1) the algorithm outputs “no” when the sentence is ungrammatical, (2) there is a fairly trivial procedure (say, of a low-degree polynomial at worst) for recovering all readings from the shared representation, and (3) the shared structure is polynomial in the size of the input.

  10. We need to be careful here: the number of readings for a sentence with n quantifiers is \(\varTheta (n!)\), whereas the maximum number of Lambek calculus proofs is \(O(c_0^{c_2 n}\, C_{c_1 c_2 n})\), for constants \(c_0\), \(c_1\), \(c_2\) which depend on the grammar: \(c_0\) is the maximum number of formulas for a single word, \(c_1\) is the maximum number of (negative) atomic subformulas for a single formula, and \(c_2\) is the minimum number of words needed to add a generalized quantifier to a sentence, i.e. \(c_2 n\) is the number of words required to produce an n-quantifier sentence. The bound \(O(c_0^{c_2 n}\, C_{c_1 c_2 n})\) is in \(o(n!)\) (see the asymptotic estimate after these notes).

  11. The known polynomial fragments of order 1 are in LOGCFL (de Groote and Pogodalla 2004; Wijnholds 2011). A counting argument similar to the one we use for the Lambek calculus can be used to show there are not enough distinct LOGCFL trees to enumerate the different quantifier scope readings required (essentially because we only have a finite number of pointers to handle n quantifiers).

  12. In addition, Ebert (2005) argues that underspecification languages are not expressive enough to capture all possible readings of a sentence in a single structure. So underspecification does not solve the combinatorial problem but, at best, reduces it.

  13. To give an indication, the TLGbank (Moot 2015b) contains more than 14,000 French sentences, with a median of 26 words per sentence; 99% of sentences have fewer than 80 words, with outliers at 190 and at 266 words (the maximum sentence length in the corpus).
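To make notes 3 and 4 concrete, here is a worked reduction. This is a sketch only: it assumes the standard generalized-quantifier terms \(every = \lambda P\lambda Q.\, \forall x.\, (P\, x) \rightarrow (Q\, x)\) and \(a = \lambda P\lambda Q.\, \exists y.\, (P\, y) \wedge (Q\, y)\), which the notes themselves do not spell out. Beta-reducing the lambda term of note 3 and applying the convention of note 4 yields the reading where the universal quantifier takes wide scope; the only other scoping, with lambda term \((a\ cartoon)(\lambda y^e. (every\ kid)(\lambda x^e. ((watched\ y)\ x)))\), yields the reading where the existential takes wide scope:

\[(every\ kid)\,(\lambda x^e. (a\ cartoon)(\lambda y^e. ((watched\ y)\ x))) \twoheadrightarrow_{\beta} \forall x.\, (kid(x) \rightarrow \exists y.\, (cartoon(y) \wedge watched(x,y)))\]

\[(a\ cartoon)\,(\lambda y^e. (every\ kid)(\lambda x^e. ((watched\ y)\ x))) \twoheadrightarrow_{\beta} \exists y.\, (cartoon(y) \wedge \forall x.\, (kid(x) \rightarrow watched(x,y)))\]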
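For the counting argument of note 10, the final step, that \(O(c_0^{c_2 n}\, C_{c_1 c_2 n})\) is in \(o(n!)\), can be filled in as follows. This is a sketch which reads \(C_m\) as the mth Catalan number (cf. Stanley 2015) and uses only the standard bound \(C_m \le 4^m\) together with Stirling's approximation; nothing here is specific to a particular grammar:

\[c_0^{c_2 n}\, C_{c_1 c_2 n} \;\le\; c_0^{c_2 n}\, 4^{c_1 c_2 n} \;=\; d^{\,n} \quad \text{with } d = c_0^{c_2}\, 4^{c_1 c_2} \text{ a constant},\]

and since \(n! \sim \sqrt{2\pi n}\,(n/e)^n\) eventually outgrows any fixed exponential, \(d^{\,n}\), and hence the number of Lambek calculus proofs, is indeed in \(o(n!)\).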

References

  • Asher, N. (2011). Lexical meaning in context: A web of words. Cambridge: Cambridge University Press.

  • Baillot, P., & Mogbil, V. (2004). Soft lambda-calculus: A language for polynomial time computation. In Foundations of software science and computation structures (pp. 27–41). Springer.

  • Barker, C. (2015). Scope. In S. Lappin & C. Fox (Eds.), Handbook of contemporary semantic theory (2nd ed., pp. 40–76). Hoboken: Wiley Blackwell.

  • Bassac, C., Mery, B., & Retoré, C. (2010). Towards a type-theoretical account of lexical semantics. Journal of Logic, Language and Information, 19(2), 229–245. https://doi.org/10.1007/s10849-009-9113-x.

  • Blackburn, P., & Bos, J. (2005). Representation and inference for natural language: A first course in computational semantics. Stanford: CSLI.

  • Bos, J., Clark, S., Steedman, M., Curran, J. R., & Hockenmaier, J. (2004). Wide-coverage semantic representation from a CCG parser. In Proceedings of COLING-2004 (pp. 1240–1246).

  • Buszkowski, W. (1997). Mathematical linguistics and proof theory. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language, chap. 12 (pp. 683–736). Amsterdam: Elsevier.

  • Carpenter, B. (1994). Quantification and scoping: A deductive account. In The proceedings of the 13th west coast conference on formal linguistics.

  • Chatzikyriakidis, S., & Luo, Z. (2014). Natural language inference in Coq. Journal of Logic, Language and Information, 23(4), 441–480.

  • Church, A. (1940). A formulation of the simple theory of types. Journal of Symbolic Logic, 5(2), 56–68.

  • Cooper, R. (1975). Montague’s semantic theory and transformational grammar. Ph.D. thesis, University of Massachusetts.

  • Corblin, F. (2013). Cours de sémantique: Introduction. Paris: Armand Colin.

  • de Groote, P., & Pogodalla, S. (2004). On the expressive power of abstract categorial grammars: Representing context-free formalisms. Journal of Logic, Language and Information, 13(4), 421–438.

  • Ebert, C. (2005). Formal investigations of underspecified representations. Ph.D. thesis, King’s College, University of London.

  • Fox, C., & Lappin, S. (2010). Expressiveness and complexity in underspecified semantics. Linguistic Analysis, 36(1–4), 385–417.

  • Gómez-Rodríguez, C., Alonso, M. A., & Vilares, M. (2006). On the theoretical and practical complexity of TAG parsers. In Proceedings of formal grammar (FG 2006) (pp. 87–101).

  • Hobbs, J. R., & Shieber, S. M. (1987). An algorithm for generating quantifier scopings. Computational Linguistics, 13(1–2), 47–63.

  • Jacobson, P. (2002). The (dis)organization of the grammar: 25 years. Linguistics and Philosophy, 25(5–6), 601–626.

  • Joshi, A. (1985). Tree adjoining grammars: How much context-sensitivity is required to provide reasonable structural descriptions? In D. R. Dowty, L. Karttunen & A. Zwicky (Eds.), Natural language parsing (pp. 206–250). Cambridge: Cambridge University Press.

  • Joshi, A. (1997). Parsing techniques. In R. A. Cole, J. Mariani, H. Uszkoreit, A. Zaenen, & V. Zue (Eds.), Survey of the state of the art in human language technology, chap. 11.4 (pp. 351–356). Cambridge University Press and Giardini.

  • Koller, A., & Thater, S. (2010). Computing weakest readings. In Proceedings of the 48th annual meeting of the association for computational linguistics (pp. 30–39).

  • Kubota, Y., & Levine, R. (2012). Gapping as like-category coordination. In D. Béchet & A. Dikovsky (Eds.), Logical aspects of computational linguistics. Lecture notes in computer science (Vol. 7351, pp. 135–150). Nantes: Springer.

  • Lafont, Y. (2004). Soft linear logic and polynomial time. Theoretical Computer Science, 318(1), 163–180.

  • Langacker, R. (2008). Cognitive grammar: A basic introduction. Oxford: Oxford University Press.

  • Luo, Z. (2012). Formal semantics in modern type theories with coercive subtyping. Linguistics and Philosophy, 35(6), 491–513.

  • Mineshima, K., Martínez-Gómez, P., Miyao, Y., & Bekki, D. (2015). Higher-order logical inference with compositional semantics. In Proceedings of EMNLP (pp. 2055–2061).

  • Montague, R. (1970). English as a formal language. In B. Visentini (Ed.), Linguaggi nella società e nella tecnica (pp. 188–221). Edizioni di Comunità.

  • Montague, R. (1974). The proper treatment of quantification in ordinary English. In R. Thomason (Ed.), Formal philosophy: Selected papers of Richard Montague. New Haven: Yale University Press.

  • Moortgat, M. (1997). Categorial type logics. In J. van Benthem & A. ter Meulen (Eds.), Handbook of logic and language, chap. 2 (pp. 93–177). Amsterdam: Elsevier.

  • Moot, R. (2002). Proof nets for linguistic analysis. Ph.D. thesis, Utrecht Institute of Linguistics OTS, Utrecht University.

  • Moot, R. (2007). Filtering axiom links for proof nets. In L. Kallmeyer, P. Monachesi, G. Penn, & G. Satta (Eds.), Proceedings of formal grammar 2007.

  • Moot, R. (2010). Wide-coverage French syntax and semantics using Grail. In Proceedings of traitement automatique des langues naturelles (TALN), Montreal. System demo.

  • Moot, R. (2015a). Linear one: A theorem prover for first-order linear logic. https://github.com/RichardMoot/LinearOne. Accessed 16 Apr 2019.

  • Moot, R. (2015b). A type-logical treebank for French. Journal of Language Modelling, 3(1), 229–264.

  • Moot, R., & Piazza, M. (2001). Linguistic applications of first order multiplicative linear logic. Journal of Logic, Language and Information, 10(2), 211–232.

  • Moot, R., & Retoré, C. (2012). The logic of categorial grammars: A deductive account of natural language syntax and semantics. Berlin: Springer.

  • Morrill, G. (2019). Parsing/theorem-proving for logical grammar CatLog3. Journal of Logic, Language and Information. https://doi.org/10.1007/s10849-018-09277-w.

  • Morrill, G., & Valentín, O. (2015). Computational coverage of TLG: The Montague test. In Proceedings CSSP 2015 Le onzième Colloque de Syntaxe et Sémantique à Paris (pp. 63–68).

  • Morrill, G., Valentín, O., & Fadda, M. (2011). The displacement calculus. Journal of Logic, Language and Information, 20(1), 1–48.

  • Park, J. C. (1996). Quantifier scope, lexical semantics, and surface structure constituency. Technical report, University of Pennsylvania.

  • Partee, B. (2001). Montague grammar. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopedia of the social and behavioral sciences. Oxford: Pergamon.

  • Pentus, M. (1995). Lambek grammars are context free. In Proceedings of logic in computer science (pp. 429–433).

  • Pentus, M. (2010). A polynomial-time algorithm for Lambek grammars of bounded order. Linguistic Analysis, 36(1–4), 441–471.

  • Pinker, S. (1994). The language instinct. Penguin Science.

  • Retoré, C. (2014). The Montagovian generative lexicon \(\Lambda Ty_n\): A type theoretical framework for natural language semantics. In Proceedings of TYPES (pp. 202–229). https://doi.org/10.4230/LIPIcs.TYPES.2013.202.

  • Sarkar, A. (2000). Practical experiments in parsing using tree adjoining grammars. In Proceeding of TAG+5.

  • Savateev, Y. (2009). Product-free Lambek calculus is NP-complete. In Symposium on logical foundations of computer science (LFCS) (pp. 380–394).

  • Schwichtenberg, H. (1982). Complexity of normalization in the pure typed lambda-calculus. In The L. E. J. Brouwer centenary symposium (pp. 453–457). North-Holland.

  • Shapiro, S. (1991). Foundations without foundationalism: A case for second-order logic. Oxford: Clarendon Press.

  • Shieber, S. (1985). Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8, 333–343.

  • Stanley, R. P. (2015). Catalan numbers. Cambridge: Cambridge University Press.

  • Thomason, R. (Ed.). (1974). Formal philosophy: Selected papers of Richard Montague. New Haven: Yale University Press.

  • Turing, A. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.

  • van Benthem, J. (1986). Categorial grammar. In Essays in logical semantics, chap. 7 (pp. 123–150). Dordrecht: Reidel.

  • van Dalen, D. (2013). Logic and structure (5th ed.). Berlin: Springer.

  • Wijnholds, G. (2011). Investigations into categorial grammar: Symmetric pregroup grammar and displacement calculus. Master’s thesis, Utrecht University.

Cite this article

Moot, R., & Retoré, C. (2019). Natural Language Semantics and Computability. Journal of Logic, Language and Information, 28, 287–307. https://doi.org/10.1007/s10849-019-09290-7
