ABSTRACTS OF CURRENT LITERATURE
Articles, Word Order, and Resource
Control Hypothesis
Janusz S. Bien
Warsaw
In Mey, Jacob L., Ed., Language and
Discourse: Test and Protest, A Festschrift for
Petr Sgall. (Vol. 19, Linguistic and Literary
Studies in Eastern Europe.) John Benjamins
Publishing Company, Amsterdam/
Philadelphia, ! 986.
The paper elaborates the ideas presented in Bien (1983). The definite and
indefinite distinction is viewed as a manifestation of the variable depth of
nominal phrase processings: indefinite phrases are represented by frame
pointers, while definite ones by frame instances incorporating information
found by memory search. In general, the depth of processing is determined by the availability of resources. Different word orders cause different distributions of the parser's processing load and therefore influence
also the depth of processing. Articles and word order appear to be only
some of several resource control devices available in natural languages.
For copies of the following papers from Projekt SEMSYN, please write to
Frau Martin
c / o Projekt SEMSYN
Institut fuer Informatik
Azenbergstr. 12
D-7000 Stuttgart 1
West Germany
or e-mail to: semsyn@ifistg.uucp
The Automated News Agency: SEMTEX A Text Generator for German
Dietmar Roesner
GEOTEX - A System for Verbalizing
Geometric Constructions (in German)
Waiter Kehl
As a by-product of the J a p a n e s e / G e r m a n machine translation project
SEMSYN the SEMTEX text generator for German has been implemented
(in ZetaLISP for SYMBOLICS lisp machines). SEMTEX's first application
has been to generate newspaper stories about job market development.
Starting point for the newspaper application is just the data from the
montl~ly job market report (numbers of unemployed, open jobs . . . . ). A
rudimentary "text planner" takes these data and those of relevant previous
months, checks for changes and significant developments, simulates possible argumentations of various political speakers on these developments and
finally creates a representation for the intended text as an ordered list of
frame descriptions. SEMTEX then converts this list into.a newspaper story
in German using an extended version of the generator of the SEMSYN
project.
The extensions for SEMTEX include:
• Building up a representation for the context during the utterance of
successive sentences that allows for
- avoiding repetitions in wording
- avoiding re-utterance of information still valid
- pronominalization and other types of references.
• Grammatical tense is dynamically derived by checking the temporal
information from the conceptual repr%sentations and relating it to the
time of speech and the time-period focussed by the story.
• When simulating arguments the text planner uses abstract rhetorical
schemata; the generator is enriched with knowledge about various ways
to express such rhetorical structures as German surface texts.
GEOTEX is an application of the SEMTEX text generator for German: The
text generator is combined with a tool for interactively creating geometric
constructions. The latter offers formal commands for manipulating (i.e.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
93
The FINITE STRING Newsletter
Abstracts of Current Literature
creating, naming and - deliberately - deleting) basic objects of Euclidean
geometry. The generator is used to produce descriptive texts - in G e r m a n
- related to the geometric construction:
• descriptions of the geometric objects involved,
• descriptions of the sequence of steps done during a construction.
SEMTEX's context-handling mechanisms have been enriched for GEOTEX:
• Elision is no longer restricted to adjuncts. For repetitive operations, verb
and subject will be elided in subsequent sentences.
• The distinction between known information and new one is exploited to
decide on constituent ordering: the constituent referring to the known
object is "topicalized", i.e. put in front of the sentence.
• The system allows for more ways to refer to objects introduced in the
text: pronouns, textual deixis using demonstrative pronouns, names. The
choice between these variants is done deliberately.
GEOTEX is implemented in ZetaLISP and runs on SYMBOLICS lisp
machines.
The Generation System of the SEMSYN
Project. Towards a Task-Independent
Generator for German
Dietmar Roesner
We report on our experiences from the implementation of the SEMSYN
generator, a system generating German texts from semantic representations, and its application to a variety of different areas, input structures
and generation tasks. In its initial version the SEMSYN generator was used
within a J a p a n e s e / G e r m a n MT project, where it produced G e r m a n equivalents to Japanese titles from scientific papers. Being carefully designed in
object-oriented style (and implemented with the FLAVOR system) the
system proved to be easily adaptable to other semantic representations e.g. output from CMU's Universal Parser - and extensible to other generation tasks: generating G e r m a n news stories, generating descriptive texts
to geometric constructions.
Copies of the following reports on the joint research project WISBER can be ordered free of charge from
Dr. Johannes Arz
Universit~it des Saarlandes
FR. 10 Informatik IV
lm Stadtwald 15
D-6600 Saarbrticken 11
Electronic mail address: wisber% sbsvax.uucp@germany.csnet
Neuere Grammatiktheorien und Grammatikformalismen
H.-U. Block, M. Gehrke, H. Haugeneder,
R. Hunze
Report No. 1
Entwurf eines Erhebungsschemas fiir
Geldanlage
R. Busche, S. op de Hipt, M.-J. Schacter-Radig
Report No. 2
Generierung von Erkl~irungen aus formalen
Wissensrepr~isentationen
H. Riisner
in LDV-Forum, Band 4, Nummer 1,
Juni 1986, pp. 3-19
94
The present paper gives an overview of modern theories of syntax and is
intended to provide insight into current trends in the field of parsing.
The grammar theories treated here are government and binding theory,
generalized phrase structure grammar, and lexical functional grammar, as
these approaches currently appear to be the most promising.
Recent grammar formalisms are virtually all based on unification procedures. Three representatives of this group (functional unification grammar, &patr., and definite clause grammar) are presented.
This report describes the acquisition schema for the knowledge required by
knowledge-based consulting system WISBER, the goal of which consists
in carrying out the process of knowledge acquisition and formalization in a
methodical - i.e., planned and controlled - manner.
The main task involves the design of appropriate acquisition techniques
and their successful application in the domain of investment consulting.
The main topic of this report concerns the generation of natural language
texts. The use of explanation components in expert systems involves
making computer behavior more transparent. This standard can only be
attained if the current stack dump procedure is replaced by procedures in
which user expectations are met with respect to the contents of the, systems
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
Abstracts of Current Liferalrure
Report No. 3
explanation as well as the acceptability of language structure.
This paper reports on work pertaining to an expanded range of explanation components in the Nixdorf exper system shell TwAIce.
A critical account of the position held by grammatical theory in generating natural language at the user level is given, whereby the decision for a
certain theory remains first and foremost pragmatical.
Moreover, a stand is taken concerning scientific experimentation on the
transfer of formal knowledge representation. Practical problems concerning technical technology are pointed out that haven't yet been taken into
account.
Incremental Construction of C- and
F-Structure in an LFG-Parser
In this paper a parser for Lexical Functional G r a m m a r (LFG) which is
characterized by incrementally constructing the c- and f-structure of a
sentence during parsing is presented. Then the possibilities of the earliest
check on consistency, coherence, and completeness are discussed.
Incremental construction of f-structure leads to an early detection and
abortion of incorrect paths and so increases parsing efficiency. Furthermore, those semantic interpretation processes that operate on partial structures can be triggered at an earlier state. This also leads to a considerable
improvement in parsing time. LFG seems to be well suited for such an
approach because it provides for locality principles by the definition of
coherence and completeness.
H.-U. Block, R. Hunze
in Proceedings of the 1 lth International
Conference on Computational Linguistics,
COLING'86, Bonn, pp. 490-493
Report No. 4
The Treatment of Movement Rules in an
LFG-Parser
H.-U. Block, H. Haugender
in Proceedings of the 1 lth International
Conference on Computational Linguistics,
COLlNG'86, Bonn, pp. 482-486
Report No. 5
Morpheme-Based Lexical Analysis
M. Gehrke, H.-U. Block
Report No. 6
Probleme der Wissensrepr~isentation in
Beratungssystemen
H.-U. Block, M. Gehrke, H. Haugender,
R. Hunze
Report No. 7
In this paper a way of treating long-distance movement phenomena as
exemplified in (1) is proposed within the framework of an LFG-based
parser.
(1) Who do you think Peter tried to meet
'You think Peter tried to meet who'
After a short overview of the treatment of general discontinuous dependencies in the Theory of Government and Binding, Lexical Functional
Grammar, and Generalized Phrase Structure Grammar, the so-called whor long-distance movement are concentrated arguing that a general mechanism which is compatible with both the LFG and the GB treatment of
long-distance movement can be found.
Finally, the implementation of such a movement mechanism in an
LFG-parser is presented.
In this paper some aspects of the advantages and disadvantages of a
morpheme-based lexicon with respect to a full lexicon are discussed.
Then a current implementation of an application-independent lexical
access component is presented as well as an implemented formalism for the
inflectional analysis of German.
The present report consists of two main sections. The first part analyzes
individual knowledge sources that require specialization for the consulting
system W1SBER. It should serve as a first approximation to the structural
analysis of all knowledge sources.
In the second part, methods for the representation of knowledge and
languages are examined. Regarding this, KL-ONE, interpreted as an epistemic formal structure of language representation for describing structure
objects, is examined. Supplementing this is an examination of other
systems which, in addition, have significant assertive components such as
KRYPTON and KL-TWO at their disposal.
At the other end of the spectrum lies PEARL, a system that cannot
clearly be semantically and epistemically interpreted as a representational
language as such.
Between these two poles lie, on the one hand, FLR, which, without
guaranteeing the semantic clarity of the grammatical constructions used,
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
95
The FINITE STRING Newsletter
Abstracts of Current Literature
flexibly combines a large number of the ideas previously suggested and, on
the other hand, KRS, representative for a group of hybrid representation
systems which allow a flexible combination of various formal structures of
representation.
Beratung und natiirliehsprachlicher Dialog
- eine Evaluation yon Systemen der
Kiinstlichen Intelligenz
H. Bergmann, M. Gerlach, W. Hoeppner,
H. Marburger
Report No. 8
This report contains an evaluation of Artificial Intelligence systems which
provide the research base for the development of the natural-language
advisory system WISBER.
First, the reasons for selecting the particular systems considered in the
study are given and a set of evaluation criteria emphasizing in particular
pragmatic factors (e.g., dialog phenomena, handling of speech acts, user
modeling) is presented.
The body of the report consists of descriptions and critical evaluations
of the following systems: ARGOT, AYPA, GRUNDY, GUIDON, HAM-ANS,
KAMP, OSCAR, ROMPER, TRACK, UC, VIE-LANG, WIZARD, WUSOR,
XCALIBUR.
The final chapter summarizes the results, concentrating on the possible
utilization of individual system capabilities in the development of WISBER.
Form der Ergebnisse der Wissensakquisition in WISBER-XPS4
M. Fliegner, M.-J. Schachter-Radig
Report No. 9
In this paper fundamental questions are discussed concerning the representation of expert knowledge, exemplified within the area of investment
consulting.
While a written report is appropriate for a general presentation of
results, it neither satisfies the needs of systems development - which of
course must build upon the results of knowledge acquisition - nor can it do
justice to the requirements of knowledge acquisition itself.
On the other hand, epistemologically expressive knowledge representation tools require that conceptual design decisions must be made quite
early on. The tools LOOPS, OPS5, prolog-based shell, and KL-ONE are
dealt with.
The following abstracts are from C O L I N G "86 P R O C E E D I N G S , copies of which are available only from
IKS e.V.
Poppelsdorfer Allee 47
D-5300 Bonn 1
WEST G E R M A N Y
Telephone: + 4 9 / 2 2 8 / 7 3 5 6 4 5
EARN/BITNET:
UPK000@DBNRHRZ1
IN T E R N E T :
UPK000 % D B N R H R Z 1.BITNET @ WlS C V M . W I S C . E D U
The price is 95 DM within Europe and 110 DM for air delivery to non-European countries. Please pay in advance by
check to the address above or by bankers draft to the following account:
Bank for Gemeinwirtschaft Bonn
Account no. 11205 163 900, BLZ 380 101 11
Lexicon-Grammar: The Representation of
Compound Words
Maurice Gross
Universit6 Paris 7
Laboratoire Documentaire et Linguistique
2, place Jussieu
F-75221 Paris CEDEX 05
COLING'86, pp. 1-6
96
The essential feature of a lexicon-grammar is that the elementary unit of
computation and storage is the simple sentence: subject-verb-complement(s). This type of representation is obviously needed for verbs: limiting a verb to its shape has no meaning other than typographic, since a verb
cannot be separated from its subject and essential complements. We have
shown (1975) that given a verb, or equivalently a simple sentence, the set
of syntactic properties that describes its variations is unique: in general,
no other verb has an identical syntactic paradigm. As a consequence,
the properties of each verbal construction must be represented in a lexicon-grammar. The lexicon has no significance taken as an isolated component and the grammar component, viewed as independent of the lexicon,
will have to be limited to certain complex sentences.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
An Empirically Based Approach towards
a System of Semantic Features
Cornelia Zelinsky- Wibbelt
IAI-Eurotra-D
Martin-Luther-StraBe 14
D-6600 Saarbrticken
Abstracts of Current Literature
A major problem in machine translation is the semantic description of lexical units which should be based on a semantic system that is both coherent
and operationalized to the greatest possible degree. This is to guarantee
consistency between lexical units coded by lexicographers. This article
introduces a generating device for achieving well-formed semantic feature
expressions.
COLING'86, pp. 7-12
Concept and Structure of Semantic Markers for Machine Translation in Mu-Project
Yoshiyuki Sakamoto
Electrotechnical Laboratory
Sakura-mura. Niihari-gun.
Ibaraki, Japan
Tetsuya Ishikawa
University of Library & Information Science
Yatabe-machi. Tsukuba-gun.
lbaraki, Japan
Masayuki Satoh
Japan Information Center of Science & Technology. Nagata-cho, Chiyoda-ku
Tokyo, Japan
COLING'86, pp. 13-20
This paper discusses the semantic features of nouns classified into categories in Japanese-to-English translation, and proposes a system for semantic
markers. In our system, syntactic analysis is carried out by checking the
semantic compatibility between verbs and nouns. The semantic structure
of a sentence can be extracted at the same time as its syntactic analysis.
We also use semantic markers to select words in the transfer phase for
translation into English.
The system of the Semantic Markers for Nouns consists of 13 conceptual facets, including one facet for 'Others' (discussed later), and is made up
of 49 filial slots (semantic markers) as terminals. We have tested about
3,000 sample abstracts in science and technological fields. Our research
has revealed that our method is extremely effective in determining the
meanings of Wago verbs (basic Japanese verbs) which have broader
concepts like the English verbs make, get, take, put, etc.
A Theory of Semantic Relations for Large
Scale Natural Language Processing
Hanne Ruus
Institut for nordisk filologi & Eurotra-DK
Ebbe Spang-Hanssen
Romansk institut & Eurotra-DK
University of Copenhagen
Njalsgade 80
DK-2300 Copenhagen S
COLING'86, pp. 20-22
Even a superficial meaning representation of a text requires a system of
semantic labels that characterize the relations between the predicates in
the text and their arguments. The semantic interpretation of syntactic
subjects and objects, of prepositions and subordinate conjunctions has
been treated in numerous books and papers with titles including works like
deep case, case roles, semantic roles, and semantic relations.
In this paper we concentrate on the semantic relations established by
predicates: what are they, what are their characteristics, how do they group
the predicates.
Extending the Expressive Capacity of the
Semantic Component of the OPERA
System
Celestin Sedogbo
Centre de Recherche Bull
68, Route de Versailles
78430 Louveciennes, France
COLING'86, pp. 23-28
OPERA is a natural language question answering system allowing the interrogation of a data base consisting of an extensive listing of operas. The
linguistic front-end of OPERA is a comprehensive grammar of French, and
its semantic component translates the syntactic analysis into logical formulas (first order logic formulas).
However, there are quite a few constructions which can be analyzed
syntactically in the grammar but for which we are unable to specify translations. Foremost among them are anaphoric and elliptic constructions.
Thus this paper describes the extension of OPERA to anaphoric and elliptic
constructions on the basis of the Discourse Segmentation Theory.
User Models: The Problem of Disparity
Sandra Carberry
A significant component of a user model in an information-seeking
dialogue is the task-related plan motivating the information-seeker's
queries. A number of researchers have modeled the plan inference process
and used these models to design more robust natural language interfaces.
However, in each case it has been assumed that the system's context model
and the plan under construction by the information-seeker are never at
variance. This paper addresses the problem of disparate plans. It presents
a four phase approach and argues that handling disparate plans requires an
enriched context model. This model must permit the addition of components suggested by the information-seeker but not fully supported by the
system's domain knowledge, and must differentiate among the components
according to the kind of support accorded each component as a correct
Department of Computer & Information
Science
University of Delaware
Newark, Delaware 19716
COLING'86, pp. 29-34
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
97
The FINITE STRING Newsletter
Abstracts of Current Literature
part of the information-seeker's overall plan. It is shown how a component's support should affect the system's hypothesis about the source of
error once plan disparity is suggested.
Pragmatic Sensitivity in NL Interfaces and
the Structure of Conversation
Tom Wachtel
Scicon Ltd., London and
Research Unit for Information Science & AI,
Hamburg University
COLING'86, pp. 35-41
A T w o - L e v e l Dialogue Representation
Giacomo Ferrari
Department of Linguistics
University of Pisa
Ronan Reilly
Educational Research Center
St. Patrick's College, Dublin 9
COLING'86, pp. 42-45
INTERFACILE: Linguistic Coverage
and Query Reformulation
Yvette Mathieu, Paul Sabatier
CNRS - LADL
Universit~ Paris 7
Tour Centrale 9 E
2 Place Jussieu
75005 Paris
COLING'86, pp. 46-49
Category Cooccurrence Restrictions and
the Elimination of Metarules
James Kilbury
Technical University of Berlin
KIT/NASEV, CIS, Sekr. FRS-8
Franklinstr. 28/29
D-1000 Berlin 10
Germany - West Berlin
COLING'86, pp. 50-55
98
The work reported here is being conducted as part of the LOKI project
(ESPRIT Project 107, " A logic oriented approach to knowledge and data
bases supporting natural user interaction"). The goal of the NL part of the
project is to build a pragmatically sensitive natural language interface to a
knowledge base. By "pragmatically sensitive", we mean that the system
should not only produce well-formed coherent and cohesive language (a
minimum requirement of any NL system designed to handle discourse) but
should also be sensitive to those aspects of user behaviour that humans are
sensitive to over and above simply providing a good response, including
producing output that is appropriately decorated with those minor and
semantically inconsequential elements of language that make the difference
between natural language and natural natural language.
This paper concentrates on the representation of the structure of
conversation in our systems, we will first outline the representation we use
for dialogue moves, and then outline the nature of the definition of wellformed dialogue that we are operating with. Finally, we will note a few
extensions to the representation mechanism.
In this paper a two-level dialogue representation system is presented. It is
intended to recognize the structure of a large range of dialogues including
some nonverbal communicative acts which may be involved in an interaction. It provides a syntactic description of a dialogue which can be
expressed in terms of re-writing rules. The semantic level of the proposed
representation system is given by the goal and subgoal structure underlying
the dialogue syntactic units. Two types of goals are identified; goals which
relate to the content of the dialogue, and those which relate to communicating the content.
The experience we have gained in designing and using natural language
interfaces has led us to develop a general language system, INTERFACILE,
involving the following principles:
- The linguistic coverage must be elementary but must include phenomena that allow a rapid, concise, and spontaneous interaction, such as
anaphora (ellipsis, pronouns, etc.).
- The linguistic competence and limits of the interface must be easily and
rapidly perceived by the user.
- The interface must be equipped with strategies and procedures for leading the user to adjust his linguistic competence to the capacities of the
system.
We have illustrated these principles in an application: a natural language
(French) interface for acquiring the formal commands of some operating
system languages. (The examples given here concern DCL of Digital
Equipment Company.)
This paper builds upon and extends certain ideas developed within the
framework of Generalized Phrase Structure G r a m m a r (GPSG). A new
descriptive device, the Category Cooccurrence Restriction (CCR), is introduced in analogy to existing devices of GPSG in order to express constraints on the cooccurrence of categories within local trees (i.e., trees of
depth one) which at present are stated with Immediate Dominance &idp.
rules and metarules. In addition to providing a uniform format for the
statement of such constraints, CCRs permit generalizations to be
expressed which presently cannot be captured in GPSG.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
Abstracts of CurrentLiterature
Sections 1.1 and 1.2 introduce CCRs and presuppose only a general
familiarity with GPSG. The ideas do not depend on details of GPSG and
can be applied to other grammatical formalisms.
Sections 1.3-1.5 discuss CCRs in relation to particular principles of
GPSG and assume familiarity with Gazdar et al. (1985) (henceforth abbreviated as GKPS). Finally, section 2 contains proposals for using CCRs to
avoid the analyses with metarules given for English in GKPS
Testing the Projectivity Hypothesis
Vladimir Pericliev
Mathematical Linguistics Dept.
Institute of Mathematics with Comp Centre
1113 Sofia, bl.8, Bulgaria
llarion llarionov
Mathematics Dept.
Higher Inst of English & Building
Sofia, Bulgaria
COLING'86, pp. 56-58
The empirical validity of the projectivity hypothesis for Bulgarian is tested.
It is shown that the justification of the hypothesis presented for other
languages suffers serious methodological deficiencies. Our automated testing, designed to evade such deficiencies, yielded results falsifying the
hypothesis for Bulgarian: the non-projective constructions studied were in
fact grammatical rather than ungrammatical, as implied by the projectivity
thesis. Despite this, the projectivity/non-projectivity distinction itself has
to be retained in Bulgarian syntax and, with some provisions, in the
systems for automatic processing as well.
Particle Homonymy and Machine
Translation
Kdroly Fdbricz
JATE University of Szeged
Egyetem u. 2.
Hungary H - 6722
COLING'86, pp. 59-61
The purpose of this contribution is to formulate ways in which the homonymy of so-called 'Modal Particles' and the etymons can be handled. Our
aim is to show that not only a strategy for this type of h o m o n y m y can be
worked out, but also a formalization of information beyond propositional
content can be introduced with a view to its MT application.
Plurals, Cardinalities, and Structures of
Determination
This paper presents an approach for processing incomplete and inconsistent knowledge. Basis for attaching these problems are 'structures of
determination', which are extensions of Scott's approximation lattices
taking into consideration some requirements from natural language processing and representation of knowledge. The theory developed is exemplified with processing plural noun phrases referring to objects which have to
be understood as classes or sets. Referential processes are handled by
processes on 'Referential Nets', which are a specific knowledge structure
developed for the representation of object-oriented knowledge. Problems
of determination with respect to cardinality assumptions are emphasized.
Christopher U. Habel
Universitat Hamburg, Fachbereich Informatik
SchlOterstr. 70
D-1000 Hamburg 13
COLING'86, pp. 62-64
Processing Word Order Variation within a
Modified I D / L P Framework
Pradip Dey
University of Alabama at Birmingham
Birmingham, AL 35294
COLING'86 pp. 65-67
From a well represented sample of world languages, Steel (1978) shows
that about 7 8 % of languages exhibit significant word order variation.
Only recently has this wide-spread phenomenon been drawing appropriate
attention. Perhaps ID/LP (Immediate Dominance and Linear Precedence)
framework is the most debated theory in this area. We point out some
difficulties in processing standard ID/LP grammar and present a modified
version of the grammar. In the modified version, the right-hand side of
phrase structure rules is treated as a set or partially-ordered set. An
instance of the framework is implemented.
Sentence Adverbials in a System of Question Answering without a Prearranged Data
Base
In the present paper we provide a report on a joint approach to the computation treatment of sentence adverbials (such as surprisingly, presumably,
or probably) and focussing adverbials (such as only or at least, including
negation (not) and some other adverbial expressions, such as for example
or inter alia) within a system of question answering
without a prearranged data base (TIBAQ).
This approach is based on a joint theoretical account of the expressions
in question in the framework of a functional description of language; we
argue that in the primary case, the expressions in question occupy, in the
underlying topic-focus articulation of a sentence, the focus-initial position,
Eva Koktova
Hamburg, West Germany
COLING'86 pp. 68-73
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
99
The FINITE STRING Newsletter
Abstracts of Current Literature
extending their scope over the focus, or the new information, of a
sentence, thus specifying, in a broad sense of the word, how the next information of a sentence holds. On the surface the expressions in question are
usually moved to scope-ambiguous positions, which can be analyzed by
means of several general strategies.
D-PATR: A Development Environment
for Unification-Based Grammars
Lauri Kartunnen
Artificial Intelligence Center
SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025
a n d Center for the Study of Language and
Information, Stanford University
COLING'86, pp. 74-80
Structural Correspondence Specification
Environment
Yongfeng Yah
Groupe d'Etudes pour la Traduction
Automatique (GETA)
B.P. 68
University of Grenoble
38402 Saint Martin d'H6res, France
D-PATR is a development environment for unification-based grammars on
Xerox 1100 series work stations. It is based on the PATR formalism developed at SRI International. This formalism is suitable for encoding a wide
variety of grammars. At one end of this range are simple phrase-structure
grammars with no feature augmentations. The PATR formalism can also be
used to encode grammars that are based on a number of current linguistic
theories, such as lexical-functional grammar (Bresnan and Kaplan), headdriven phrase structure grammar (Pollard and Sag), and functional unification grammar (Kay). At the other end of the range covered by D-PATR are
unification-based categorial grammars (Klein, Steedman, Uszkoreit,
Wittenberg) in which all the syntactic information is incorporated in the
lexicon and the remaining few combinatorial rules that build phrases are
function application and composition. Definite-clause grammars (Pereira
and Warren) can also be encoded in the PATR formalism.
This article presents the Structural Correspondence Specification Environment (SCSE) being implemented at GETA.
The SCSE is designed to help linguists to develop, consult, and verify
the SCS grammars (SCSG) which specify linguistic models. It integrates
the techniques of data bases, structure editors, and language interpreters.
We argue that formalisms and tools of specification are as important as the
specification itself.
COLING'86, pp. 81-84
Conditioned Unification for Natural
Language Processing
Kditi Hasida
Electrotechnical Laboratory
Umezono 1-1-4, Sakura-Mura, Niibari-Gun
Ibaraki, 305 Japan
COLING'86, pp. 85-87
Methodology and Verifiability in Montague
Grammar
Seiki Akama
Fujitsu Ltd.
2-4-19, Sin-Yokohama
Yokohama, 222, Japan
This paper presents what we call a conditional unification, a new method
of unification for processing natural languages. The key idea is to annotate
the patterns with a certain sort of conditions, so that they carry abundant
information. This method transmits information from one pattern to
another more efficiently than procedure attachments, in which information
contained in the procedure is embedded in the program rather than directly
attached to patterns. Coupled with techniques in formal linguistics, moreover, conditioned unification serves most types of operations for natural
language processing.
Methodological problems in Montague G r a m m a r are discussed. Our
observations show that a mode-theoretic approach to natural language
semantics is inadequate with respect to its verifiability from a logical point
of view. But, the formal attitudes seem to be of use for the development in
computational linguistics.
COLING'86, pp. 88-90
Towards a Dedicated Database Management System for Dictionaries
Marc Domenig, Patrick Shann
lnstitut Dalle Molle pour les Etudes
Semantiques et Cognitives &isscop.
Route des Acacias 54
1227 Geneva, Switzerland
COLING'86 pp. 91-96
100
This paper argues that a lexical data base should be implemented with a
special kind of database management system (DBMS) and outlines the
design of such a system. The major difference between this proposal and a
general purpose DBMS is that its data definition language (DDL) allows
the specification of the entire morphology, which turns the lexical data
base from a mere collection of 'static' data into a real-time word-analyzer.
Moreover, the dedication of the system conduces to the feasibility of user
interfaces with very comfortable monitor and manipulation functions.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
The Transfer Phase of the Mu Machine
Translation System
Makoto Nagao, Jun-ichi Tsujii
Department of Electrical Engineering
Kyoto University
Kyoto, Japan 606
COLING'86, pp. 97-103
Lexical Transfer: A Missing Element in
Linguistics Theories
Alan K. Melby
Brigham Young University
Department of Linguistics
Provo, Utah 84602
COLING'86, pp. 104-106
Idiosyncratic Gap: A Tough Problem to
Structure-Based Machine Translation
Yoshihiko Nitta
Advanced Research Laboratory
Hitachi Ltd.
Kokubunji, Tokyo 185 Japan
COLING'86, pp. 107-111
Lexicai-Functional Transfer: A Transfer
Framework in a Machine-Translation
System Based on LFG
Ikuo Kudo
CSK Research Institute
3-22-17 Higashi-Ikebukuro, Toshima-ku
Tokyo, 170, Japan
Hirosato Nomura
NTT Basic Research Laboratories
Musashino-shi, Tokyo, 180, Japan
COLING'86, pp. 112-114
Abstracts of Current Literature
The interlingual approach to MT has been repeatedly advocated by
researchers originally interested in natural language understanding who
take machine translation to be one possible application. However, not
only the ambiguity but also the vagueness which every natural language
inevitably has leads this approach into essential difficulties. In contrast,
our project, the Mu-project, adopts the transfer approach as the basic
framework of MT. This paper describes the detailed construction of the
transfer phase of our system from Japanese to English, and gives some
examples of problems which seem difficult to treat in the interlingual
approach.
Some of the design principles relevant to the topic of this paper are:
• Multiple Layer of Grammars
• Multiple Layer Presentation
• Lexicon Driven Processing
• Form-Oriented Dictionary Description
This paper also shows how these principles are realized in the current
system.
One of the necessary tasks of a machine translation system is lexical transfer. In some cases there is a one-to-one mapping from source language
word to target language word. What theoretical model is followed when
there is a one-to-many mapping? Unfortunately, none of the linguistic
models that have been used in machine translation include a lexical transfer component. In the absence of a theoretical model, this paper will
suggest a new way to test lexical transfer systems. This test is being
applied to an MT system under development. One possible conclusion may
be that further effort should be expended developing models of lexical
transfer.
Current practical machine translation systems, which are designed to deal
with a huge amount of documents, are generally structure-based. That is,
the translation process is done based on the analysis and transformation of
the structure of the source sentence, not on the understanding and paraphrasing of the meaning of that sentence. But each language has its own
syntactic and semantic idiosyncrasy, and on this account, without understanding the total meaning of the source.sentence, it is often difficult for
MT to bridge properly the idiosyncratic gap between source and target
language. A somewhat new method call "Cross Translation Test" is
presented that reveals the detail of idiosyncratic gap together with the
so-so satisfiable possibility of MT The usefulness of the sublanguage
approach in reducing the idiosyncratic gap between source and target
languages is also mentioned.
This paper presents a transfer framework called LFT (Lexical-Functional
Transfer) for a machine translation system based on LFG (Lexical-Functional Grammar). The translation process consists of subprocesses of analysis, transfer, and generation. We adopt the so-called f-structures of LFG
as the intermediate representations or interfaces between those subprocesses, thus the transfer process converts a source f-structure into a target
f-structure. Since LFG is a grammatical framework for sentence structure
analysis of one language, for the purpose, we propose a new framework for
specifying transfer rules with LFG schemata, which incorporates corresponding lexical functions of two different languages into an equational
representation. The transfer process, therefore, is to solve equations called
target f-descriptions derived from the transfer rules applied to the source
f-structure and then to produce a target f-structure.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
101
The FINITE STRING Newsletter
Transfer and MT Modularity
Pierre Isabelle, Elliott Macklovitch
Canadian Workplace Automation Research
Center
1575 Chomedey Boulevard
Laval, Quebec, Canada H7V 2X2
COLING'86, pp. 115-117
The Need for MT-Oriented Versions of
Case and Valency in MT
Harold L. Somers
Centre for Computational Linguistics
University of Manchester
Institute of Science and Technology
COLING'86, pp. 118-123
A Parametric NL Translator
Randall Sharp
Dept. of Computer Science
University of British Columbia
Vancouver, Canada
Abstracts of Current Literature
The transfer components of typical second generation (G2) MT systems
do not fully conform to the principles of G2 modularity, incorporating
extensive target language information while failing to separate translation
facts from linguistic theory. The exclusion from transfer of all non-contrastive information leads us to a system design in which the three major
components operate in parallel rather than in sequence. We also propose
that MT systems be designed to allow translators to express their knowledge in natural metalanguage statements.
This paper looks at the use in machine translation systems of the linguistic
models of Case and Valency. It is argued that neither of these models was
originally developed with this use in mind, and both must be adapted
somewhat to meet this purpose. In particular, the traditional Valency
distinction of complements and adjuncts leads to conflicts when valency
frames in different languages are compared: a finer but more flexible
distinction is required. Also, these concepts must be extended beyond the
verb, to include the noun and adjective as valency bearers. As far as Case
is concerned, too narrow an approach has traditionally been taken: work in
this field has been too concerned only with cases for arguments in verb
frames; case label systems for non-valency bound elements and also for
elements in nominal groups must be elaborated. The paper suggests an
integrated approach specifically oriented towards the particular problems
found in MT.
This report outlines a machine translation system whose linguistic component is based on principles of G o v e r n m e n t and Binding. A "universal
g r a m m a r " is defined, together with parameters of variation for specific
languages. The system, written in Prolog, parses, generates, and translates
between English and Spanish (both directions).
COLING'86, pp. 124-126
Lexicase Parsing: A Lexicon-Driven
Approach to Syntactic Analysis
Stanley Starosta
University of Hawaii Social Science Research
Institute and Pacific International Center
for High Technology Research
Honolulu, Hawaii 96822
Hirosato Nomura
NTT Basic Research Laboratories
Musashino-shi, Tokyo, 180, Japan
COLING'86, pp. 127-132
Solutions for Problems of MT Parser
Methods used in Mu-Machine Translation
Project
Jun-ichi Nakamura, Jun-ichi Tsujii,
Makoto Nagao
Dept. of Electrical Engineering
Kyoto University
Sakyo, Kyoto 606, Japan
COLING'86, pp. 133-135
102
This paper presents a lexicon-based approach to syntactic analysis, Lexicase, and applies it to a lexicon-driven computational parsing system. The
basic descriptive mechanism in a Lexicase grammar is lexical features. The
properties of lexical items are represented by contextual and non-contextual features, and generalizations are expressed as relationships among sets
of these features and among sets of lexical entries. Syntactic tree structures are represented as networks of pairwise dependency relationships
among the words in a sentence. Possible dependencies are marked as
contextual features on individual lexical items, and Lexicase parsing is a
process of picking out words in a string and attaching dependents to them
in accordance with their contextual features. Lexicase is an appropriate
vehicle for parsing because Lexicase analyses are monostratal, flat, and
relatively non-abstract, and it is well suited to machine translation because
grammatical representations for corresponding sentences in two languages
will be very similar to each other in structure and inter-constituent
relations, and thus far easier to interconvert.
A parser is a key component of a machine translation system. If it fails in
parsing an input sentence, the MT system cannot output a complete translation. A parser of a practical MT system must solve many problems
caused by the varieties of characteristics of natural languages. Some problems are caused by the incompleteness of grammatical rules and dictionary
information, and some by the ambiguity of natural languages. Others are
caused by various types of sentence constructions, such as itemization,
insertion by parentheses, and other typographical conventions that cannot
be naturally captured by ordinary linguistic rules.
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
Abstracts of CurrentLiterature
The authors of this paper have been developing MT systems between
Japanese and English (in both directions) under the Mu-machine translation project. In the system's development, several methods have been
implemented with grammar writing language GRADE to solve the problems
of the MT parser. In this paper, first the characteristics of GRADE and the
Mu-MT parser are briefly described. Then, methods to solve the MT parsing problems that are caused by the varieties of sentence constructions and
the ambiguities of natural languages are discussed from the viewpoint of
efficiency and maintainability.
Strategies and Heuristics in the Analysis of
a Natural Language in Machine Translation
Zaharin Yusoff
Groupe d'Etudes pour la Traduction
Automatique
BP no. 68
Universit6 de Grenoble
38402 Saint-Martin-d'H~res, France
The analysis phase in an indirect, transfer, and global approach to machine
translation is studied. The analysis conducted can be described as exhaustive (meaning with backtracking), depth-first, and strategically and heuristically driven, while the grammar used is an augmented context free
grammar. The problem areas, being pattern matching, ambiguities,
forward propagation, checking for correctness, and backtracking, are highlighted. Established results found in the literature are employed whenever
adaptable, while suggestions are given otherwise.
COLING'86, pp. 136-139
Parsing in Parallel
Xiuming Huang, Louise Guthrie
Computing Research Laboratory
New Mexico State University
Las Cruces, NM 88003
COLING'86, pp. 140-145
Computational Comparative Studies on
Romance Languages: A Linguistic
Comparison of Lexicon-Grammars
Annibale Elia
lstituto di Linguistica Universit~t di Salerno
Yvette Mathieu
Laboratoire d'Automatique Documentaire et
Linguistique
C.N.R.S. - Universit6 de Paris 7
The paper is a description of a parallel model for natural language parsing,
and a design for its implementation on the Hypercube multiprocessor. The
parallel model is based on the Semantic Definite Clause G r a m m a r formalism and integrates syntax and semantics through the communication of
processes. The main processes, of which there are six, contain either purely syntactic or purely semantic information, giving the advantage of simple;
transparent algorithms dedicated to only one aspect of parsing. Communication between processes is used to impose semantic constraints on the
syntactic processes.
What we present here is an application on the basis of the Italian and
French linguistic data bank assembled by the Istituto di Linguistica of
Salerno University (Italy) and the Laboratoire Automatique Documentaire
et Linguistique (C.N.R.S.-France). These two research centers have been
working for years to the constitution of formalized grammars of the
respective languages. The composition of lexicon-grammars is the first
stage of this project.
COLING'86, pp. 146-150
A Stochastic Approach to Parsing
Geoffrey Sampson
Department of Linguistics and Phonetics
University of Leeds
COLING'86, pp. 151-155
Parsing Without (Much) Phrase Structure
Michael B. Kac
Department of Linguistics
University of Minnesota
Simulated annealing is a stochastic computational technique for finding
optimal solutions to combinatorial problems for which the combinatorial
explosion phenomenon rules out the possibility of systematically examining
each alternative. It is currently being applied to the practical problem of
optimizing the physical design of computer circuitry, and to the theoretical
problems of resolving patterns of auditory and visual stimulation into
meaningful arrangements of phonemes and three-dimensional objects.
Grammatical parsing - resolving unanalyzed linear sequences of words into
meaningful grammatical structures - can be regarded as a perception problem logically analogous to those just cited, and simulated annealing holds
great promise as a parsing technique.
Approaches to NL syntax conform in varying degrees to the older relational/dependency model (essentially that assumed in traditional grammar), which treats a sentence as a group of words united by various relations, and the newer constituent model . . . . In computational linguistics
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
103
The FINITE STRING Newsletter
Minneapolis, MN 55455
Alexis Manaster-Ramer
Program in Linguistics
University of Michigan
Ann Arbor, MI 48109
COLING'86, pp. 156-158
Reconnaissance-Attack Parsing
Michael B. Kac, Tom Rindflesch
Department of Linguistics
University of Minnesota
Minneapolis, MN 55455
Karen L. Ryna
Computer Sciences Center
Honeywell, Inc.
Minneapolis, MN 55427
Abstracts of Current Literature
there is a strong (if not universal) reliance on phrase structure as the medium via which to represent syntactic structure; call this the consensus view.
... In its strongest form, the consensus view says that the recovery of a
fully specified parse tree is an essential step in computational language
processing, and would, if correct, provide important support for the
constituent model. In this paper, we shall critically examine the rationale
for this view, and will sketch (informally) an alternative view which we
find more defensible. The actual position we shall take for this discussion,
however, is conservative in that we will not argue that there is no place
whatever for constituent analysis in parsing or in syntactic analysis generally. What we argue is that phrase structure is at least partly redundant in
that a direct leap to the composition of some semantic units is possible
from a relatively underspecified syntactic representation (as opposed to a
complete parse tree).
In this paper we will describe an approach to parsing, one major component of which is a strategy called RECONNAISSANCE-ATTACK. Under
this strategy, no structure building is attempted until after completion of a
preliminary phase designed to exploit low-level information to the fullest
possible extent. This first pass then defines a set of constraints that restrict
the set of available options when structure building proper begins. R-A
parsing is in principle compatible with a variety of different views regarding the nature of syntactic representation, though it fits more comfortably
with some than with others.
COLING'86, pp. 159-160
Panel: Natural Language Interfaces Ready for Commercial Success?
Wolfgang Wahlster (Chair)
Department of Computer Science
University of Saarbrticken
D-6600 Saarbrucken 11
Fed. Rep. of Germany
COLING'86 p. 161
STATEMENT BY THE CHAIR (abridged) The goal of this panel is to
evaluate three natural language interfaces which were introduced to the
commercial market in 1985 (cf. Carnegie Group 1985, Kamins 1985,
Texas Instruments 1985) and to relate them to current research in computational linguistics. Each of the commercial systems selected as a starting
point for the discussion (see Wahlster 1986 for a functional comparison)
was developed by a well-known scientist with considerable research experience in NL processing: LanguageCraft 1 by Carnegie Group (designed
under the direction of J. Carbonell), NLMenu by Texas Instruments
(designed under the direction of H. Tennant), and Q & A 2 by Symantec
(designed under the direction of G. Hendrix).
1 Trademark of Carnegie-Group, Inc.
2 Trademark of Symantec Corporation
Requirements for Robust Natural
Language Interfaces: The LanguageCraft
and XCALIBUR Experiences
Jaime G. Carbonell
Carnegie-Mellon University
and Carnegie-Group, Inc.
Pittsburgh, PA 15213
COLING'86, pp. 162-163
104
PANELIST STATEMENT (abridged): Natural Language interfaces to
data bases and expert systems require the investigation of several crucial
capabilities in order to be judged habitable by their end users and productive by the developers of applications. User habitability is measured in
terms of linguistic coverage, robustness of behavior and speed of response,
whereas implementer activity is measured by the amount of effort required
to connect the interface to a new application, to develop its syntactic and
semantic grammar, and to test and debug the resultant system assuring
a certain level of performance. These latter criteria have not been
addressed directly by natural language researchers in pure laboratory
settings, with the exception of user-defined extensions to an existing interface (e.g., NanoKLAUS, VOX). But, in order to amortize the cost of developing practical, robust, and efficient interfaces over multiple applications,
the implementer productivity requirements are as important as user habitability. We treat each set of criteria in turn, drawing from our experience in
XCALIBUR, and in LanguageCraft a commercially available environment
and run-time module for rapid development of domain-oriented natural
language interfaces. In our discussion we distill the general lessons accrued
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
The FINITE STRING Newsletter
Abstracts of CurrentLiterature
from several years of experience using these systems, and conducting
several small-scale user studies.
Q&A: Already a Success?
(Responses to moderator's question based on Q&A.)
Gary G. Hendrix
Symantec Corporation
Cupertino, CA 95014
COLING'86, pp. 164-166
The Commercial Application of Natural
Language Interfaces
Harry Tennant
Computer Science Center
Texas Instruments
Dallas, Texas
COLING'86 p. 167
PANELIST STATEMENT (abridged): I don't think that natural language
interfaces are a very good idea. By that I mean conventional natural
language interfaces - the kind where the user types in a question and the
system tries to understand it. Oh sure, when (if?) computers have world
knowledge that is comparable to what humans need to communicate with
each other, natural language interfaces will be easy to build and, depending
on what else is available, might be a good way to communicate with
computers. But today we are soooo far away from having that much
knowledge in a system, conventional natural language interfaces don't
make sense.
There is something different that makes more sense - NLMenu. It is a
combination of menu technology with natural language understanding
technology, and it eliminates many of the deficiencies one finds with
conventional natural language interfaces while retaining the important
benefits.
...end o f p a n e L .
The Role of Inversion and PP-Fronting in
Relating Discourse Elements
Mark Vincent LaPolla
The Artificial Intelligence Laboratory and
The Department of Linguistics
University of Texas at Austin
Austin, Texas
70LING'86, pp. 168-173
Situational Investigation of Presupposition
Seiki Akama
Fujitsu Ltd.
2-4-19 ShinYokohama
Yokohama, Japan
Masahito Kawamori
This paper will explore and discuss the less obvious ways syntactic structure is used to convey information and how this information could be used
by a natural language database system as a heuristic to organize and search
a discourse space.
The primary concern of this paper will be to present a general theory of
processing which capitalizes on the information provided by such non-SVO
word orders as inversion, (wh) clefting, and prepositional phrase (PP)
fronting.
This paper gives a formal theory of presupposition using situation semantics developed by Barwise and Perry. We will slightly modify Barwise and
Perry's original theory of situation semantics so that we can deal with nonmonotonic reasonings which are very important for the formalization of
presupposition in natural language. This aspect is closely related to the
formulation of incomplete knowledge in artificial intelligence.
Sophia University
7 Kioicho, Chiyodaku
Tokyo, Japan
COLING'86, pp. 174-176
Linking Propositions
D.S. Brde, R.A. Smit
Rotterdam School of Management
Erasmus University
P.O.B. 1738
NL-3000 DR Rotterdam, The Netherlands
COLING'86, pp. 177-180
The function words of a language provide explicit information about how
propositions are to be related. We have examined a subset of these function words, namely the subordinating conjunctions which link propositions
within a sentence, using sentences taken from corpora stored on magnetic
tape. On the basis of this analysis, a computer program for Dutch language generation and comprehension has been extended to deal with the
subordinating conjunctions. We present an overview of the underlying dimensions that were used in describing the semantics and pragmaties of the
Dutch subordinating conjunctions. We propose a Universal set of Linking
Dimensions, sufficient to specify the subordinating conjunctions in any
language. This ULD is a first proposal for the representation required for a
Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987
105
The FINITESTRING Newsletter
Abstracts of Current Literature
computer program to understand or translate the subordinating conjunctions of any natural language.
Discourse and Cohesion in Expository
Text
Alien B. Tucker, Sergei Nirenburg
Department of Computer Science
Colgate University
Victor Raskin
Department of English
Purdue University
COLING'86, pp. 181-183
D e g r e e s of Understanding
Eva Haj~ovd, Petr Sgall
Faculty of Mathematics and Physics
Charles University
Malostransk6 n. 25
Prague 1, Czechoslovakia
COLING'86, pp. 184-186
Categorial Unification Grammars
Hans Uszkoreit
Artificial Intelligence Center, SRI International and Center for the Study of Languages and Information, Stanford University
COLING'86, pp. 187-194
106
This paper discusses the role of discourse in expository text, text which
typically comprises published scholar papers, textbooks, proceedings of
conferences, and other highly stylized documents. Our purpose is to examine the extent to which those discourse-related phenomena that generally
assist the analysis of dialogue text - where speaker, hearer, and speech-act
information are more actively involved in the identification of plans and
goals - can be used to help with the analysis of expository text. In particular, we make the optimistic assumption that expository text is strongly
connected, i.e., that all adjacent pairs of clauses in such a text are connected by "cohesion markers", both explicit and implicit. We investigate
the impact that this assumption may have on the depth of understanding
that can be achieved, the underlying semantic structures, and the supporting knowledge base for the analysis. An application of this work in designing the AI-based machine translation model, TRANSLATOR, is discussed in
Nirenburg et al. (page 627 of these Proceedings).
Along with "static" or "declarative" descriptions of the language system, models of language use (the regularities of communicative competence) are constructed. One of the outstanding aspects of this transfer of attention consists in the effort devoted to the automatic comprehension of natural language, which, since Winograd's SHRDLU, has been presented in many different contexts. One speaks about understanding, or comprehension,
although it may be noticed that the term is used in different, and often
unclear, meanings. In machine translation systems, as the late B.
Vauquois pointed out (see now Vauquois and Boitet, 1985), a flexible
system combining different levels of automatic analysis is necessary (i.e.,
the transfer component should be able to operate at different levels). The
human factor cannot be completely dispensed with; it seems inevitable to
include post-edition, or such a division of labor as that known from the
system METEO. Not only should the semantico-pragmatic items present in
the source language structure be reflected but also certain aspects of factual knowledge (see Slocum 1985: 16). It was pointed out by Kirschner
(1982: 18) that, to a certain degree, this requirement can be met by means
of a system of semantic features. For NL comprehension systems the automatic formulation of a partial image of the world often belongs to the core
of the system; such a task certainly goes far beyond pure linguistic analysis
and description.
Winograd (1976: 269,275) claims that a linguistic description should
handle "the entire complex of the goals of the speaker". It is then possible
to ask what are the main features relevant for the patterning of this
complex and what are the relationships between understanding all the
goals of the speaker and having internalized the system of a natural
language. It seems to be worthwhile to reexamine the different kinds and
degrees of understanding.
Categorial unification grammars (CUGs) embody the essential properties of
both unification and categorial grammar formalisms. Their efficient and
uniform way of encoding linguistic knowledge in well-understood and
widely-used representations makes them attractive for computational applications and for linguistic research.
In this paper, the basic concepts of CUGs and simple examples of their
application will be presented. It will be argued that the strategies and
potentials of CUGs justify their further exploration in the wider context of
research on unification grammars. Approaches to selected linguistic
phenomena such as long-distance dependencies, adjuncts, word order, and
extraposition are discussed.
Dependency Unification Grammar
Peter Hellwig
University of Heidelberg
D-6900 Heidelberg, West Germany
COLING'86, pp. 195-198
The Weak Generative Capacity of Parenthesis-Free Categorial Grammars
Joyce Friedman, Dawei Dai, Weiguo Wang
Computer Science Department
Boston University
111 Cummington Street
Boston, MA 02215
COLING'86, pp. 199-201
Tree Adjoining and Head Wrapping
K. Vijay-Shanker, David J. Weir,
Aravind K. Joshi
Department of Computer and Information
Science
University of Pennsylvania
Philadelphia, PA 19104
This paper describes the analysis component of the language processing
system PLAIN from the viewpoint of unification grammars. The principles
of Dependency Unification Grammars (DUGs) are discussed. The computer
language DRL (Dependency Representation Language) is introduced in
which DUGs can be formulated. A unification-based parsing procedure is
part of the formalism. PLAIN is implemented at the universities of Heidelberg, Bonn, Flensburg, Kiel, Zurich, and Cambridge, U.K.
We study the weak generative capacity of a class of parenthesis-free categorial grammars derived from those of Ades and Steedman by varying the
set of reduction rules. With forward cancellation as the only rule, the
grammars are weakly equivalent to context-free grammars. When a backward combination rule is added, it is no longer possible to obtain all the
context-free languages. With suitable restriction of the forward partial
rule, the languages are still context-free and a push-down automaton can
be used for recognition. Using the unrestricted rule of forward partial
combination, a context-sensitive language is obtained.
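By way of illustration only (an editorial sketch, not the authors' implementation), the two rule types can be written in a few lines of Python, encoding a parenthesis-free category such as S/NP/NP as the tuple ('S', 'NP', 'NP') with the outermost sought argument last:

    def forward_cancellation(left, right):
        # X/Y  Y  =>  X, where the right-hand category is basic.
        if len(right) == 1 and len(left) > 1 and left[-1] == right[0]:
            return left[:-1]
        return None

    def forward_partial(left, right):
        # X/Y  Y/Z1.../Zn  =>  X/Z1.../Zn (unrestricted forward partial combination).
        if len(left) > 1 and left[-1] == right[0]:
            return left[:-1] + right[1:]
        return None

    # A transitive verb S/NP/NP consumes its object NP:
    assert forward_cancellation(('S', 'NP', 'NP'), ('NP',)) == ('S', 'NP')
    # Partial combination composes two incomplete categories:
    assert forward_partial(('S', 'NP'), ('NP', 'PP')) == ('S', 'PP')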
In this paper we discuss the formal relationship between the classes of
languages generated by Tree Adjoining Grammars and Head Grammars.
In particular, we show that Head Languages are included in Tree Adjoining Languages and that Tree Adjoining Grammars are equivalent to a
modification of Head Grammars called Modified Head Grammars. The
inclusion of MHL in HL, and thus the equivalence of HGs and TAGs, in the
most general case remains to be established.
COLING'86, pp. 202-207
Categorial Grammars for Strata of Non-CF Languages and their Parsers
Michal P. Chytil
Charles University
Malostranské nám. 25
118 00 Praha 1, Czechoslovakia
Hans Karlgren
KVAL
Södermalmstorg 8
116 45 Stockholm, Sweden
COLING'86, pp. 208-210
We introduce a generalization of categorial grammar extending its descriptive power, and a simple model of a categorial grammar parser. Both tools can be adjusted to particular strata of languages by restricting grammatical or computational complexity.
A Simple Reconstruction of GPSG
Stuart M. Shieber
Artificial Intelligence Center, SRI International and Center for the Study of Language and Information, Stanford University
COLING'86, pp. 211-215
Kind Types in Knowledge Representation
K. Dahlgren
IBM Los Angeles Scientific Center
11601 Wilshire Blvd.
Los Angeles, CA 90025
Like most linguistic theories, the theory of generalized phrase structure
grammar (GPSG) has described language axiomatically, that is, as a set of
universal and language-specific constraints on the well-formedness of
linguistic elements of some sort. The coverage and detailed analysis of
English grammar in the ambitious recent volume by Gazdar, Klein, Pullum,
and Sag entitled Generalized Phrase Structure Grammar, are impressive, in
part because of the complexity of the axiomatic system developed by the
authors. In this paper, we examine the possibility that simpler descriptions
of the same theory can be achieved through a slightly different, albeit still
axiomatic, method. Rather than characterize the well-formed trees directly, we progress in two stages by procedurally characterizing the well-formedness axioms themselves, which in turn characterize the trees.
J. McDowell
Department of Linguistics
University of Southern California
Los Angeles, CA 90089
This paper describes Kind Types (KT), a system which uses commonsense knowledge to reason about natural language text. KT encodes some of the knowledge underlying natural language understanding, including category distinctions and descriptions differentiating real-world objects, states, and
events. It embeds an ontology reflecting the ordinary person's top-level
cognitive model of real-world distinctions and a data base of prototype
descriptions of real-world entities. KT is transportable, empirically-based
and constrained for efficient reasoning in ways similar to human reasoning
processes.
COLING'86, pp. 216-221
DCKR - Knowledge Representation in
Prolog and Its Application to Natural
Language Processing
Hozumi Tanaka
Tokyo Institute of Technology
Department of Computer Science
O-okayama, 2-12-1, Meguro-ku
Tokyo, Japan
COLING'86, pp. 222-225
Conceptual Lexicon Using an Object-Oriented Language
Shoichi Yokoyama
Electrotechnical Laboratory
Tsukuba, Ibaraki, Japan
Kenji Hanakata
Universität Stuttgart
Stuttgart, F.R. Germany
COLING'86, pp. 226-228
Elementary Contracts as a Pragmatic Basis
of Language Interaction
E.L. Pershina
AI Laboratory, Computer Center
Siberian Division of the USSR Ac. Sci.
Novosibirsk 630090, USSR
COLING'86, pp. 229-231
Communicative Triad as a Structural
Element of Language Interaction
F. G. Dinenberg
AI Laboratory, Computer Center
Siberian Division of the USSR Ac. Sci.
Novosibirsk 630090, USSR
COLING'86, pp. 232-234
TBMS: Domain Specific Text Management and Lexicon Development
Semantic processing is one of the important tasks for natural language
processing. Basic to semantic processing are descriptions of lexical items. The most frequently used form of description of lexical items is probably Frames or Objects. Therefore, the form in which Frames or Objects are expressed is a key issue for natural language processing. A method of Object representation in Prolog called DCKR will be introduced. It will be seen that if part of general knowledge and a dictionary are described in DCKR, part of context processing and the greater part of semantic processing can be left to the functions built into Prolog.
This paper describes the construction of a lexicon representing abstract concepts. The lexicon is written in an object-oriented language, CTALK, and forms a dynamic network system controlled by object-oriented mechanisms. The content of the lexicon is constructed using a Japanese dictionary. First, entry words and their definition parts are derived from the dictionary. Second, syntactic and semantic information is analyzed from these parts. Finally, superconcepts are assigned to the superconcept part of an object, static properties to the slot values, and dynamic operations to the message parts. One word has one object in a world, but through the superconcept part and slot part it connects to the subconcepts of other words and worlds. When relative concepts are accumulated, the result will be a model of human thought, with conscious and unconscious parts.
Language interaction (LI) as a part of interpersonal communication is
considerably influenced by psychological and social roles of the partners
and their pragmatic goals. These aspects of communication should be
accounted for while elaborating advanced user-computer dialogue systems
and developing formal models of LI. We propose here a formal description
of communicative context of LI-situation, namely, a system of indices of LI
agents' interest in achieving various pragmatic purposes and a system of
contracts which reflect social and psychological roles of the LI agents and
conventionalize their "rights" and "duties" in the LI-process. Different
values of these parameters of communication allow us to state possibility
and/or necessity of certain types of speech acts under certain conditions of
LI-situation.
Research on dialogue natural-language interaction with intelligent "human-computer" systems is based on models of "human-to-human" language interaction, these models representing descriptions of communication laws. One aspect of developing language interaction models is the investigation of dialogue structure. In the paper, a notion of elementary communicative triad (SR-triad) is introduced to model the "stimulus-reaction" relation between utterances in a dialogue. The SR-triad apparatus allows us to represent the scheme of any dialogue as a triad structure. Since SR-triad structure is inherent in both natural and programming language dialogues, the SR-system is claimed to be necessary for developing dialogue processors.
S. Goeser, E. Mergenthaler
University of Ulm
Federal Republic of Germany
COLING'86, pp. 235-240
The definition of a Text Base Management System is introduced in terms of software engineering. This gives a basis for discussing practical text administration, including questions on corpus properties and appropriate retrieval criteria. Finally, strategies for the derivation of a word data base from an actual TBMS will be discussed.
Text Analysis and Knowledge Extraction
Fujio Nishida, Shinobu Takamatsu,
Tadaaki Tani, Hiroji Kusaka
Department of Electrical Engineering
Faculty of Engineering
University of Osaka Prefecture
Sakai, Osaka, 591 Japan
COLING'86, pp. 241-243
Context Analysis System for Japanese
Text
Hitoshi Isahara, Shun Ishizaki
Electrotechnical Laboratory
1-1-4, Umezono, Sakura-mura, Niihari-gun
Ibaraki, Japan 305
COLING'86 pp. 244-246
Disambiguation and Language Acquisition
through the Phrasal Lexicon
Uri Zernik, Michael G. Dyer
Artificial Intelligence Laboratory
Computer Science Department
3531 Boelter Hall
University of California
Los Angeles, CA 90024
COLING'86, pp. 247-252
Linguistic Knowledge Extraction from Real
Language Behavior
K. Shirai, T. Hamada
Department of Electrical Engineering
Waseda University
3-4-1 Ohkubo, Shinjuku-ku, Tokyo, Japan
COLING'86, pp. 253-255
The study of text understanding and knowledge extraction has been actively pursued by many researchers. The authors have also studied a method of structured information extraction from texts without a global text analysis. That method is suitable for comparatively short texts such as a patent claim clause or the abstract of a technical paper.
This paper describes the outline of a method of knowledge extraction
from a longer text which needs a global text analysis. The kinds of texts
are expository texts or explanation texts. Expository texts described here
mean those which have various hierarchical headings such as a title, a
heading of each section and sometimes an abstract. In this definition, most
texts, including technical papers, reports, and newspapers, are expository.
Texts of this kind disclose the main knowledge in a top-down manner and
show not only the location of an attribute value in a text but also several
key points of the content. This property of expository texts contrasts with
that of novels and stories in which an unexpected development of the plot
is preferred.
This paper pays attention to such characteristics of expository texts and
describes a method of analyzing texts by referring to information
contained in the intersentential relations and the headings of texts and then
extracting requested knowledge such as a summary from texts in an efficient way.
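As a rough editorial sketch of heading-guided extraction (the document structure and the preference for certain headings below are hypothetical, not taken from the paper):

    document = {
        'title': 'An Example Expository Text',
        'sections': [
            {'heading': 'Abstract', 'sentences': ['The main result is stated here.']},
            {'heading': 'Details',  'sentences': ['Supporting material follows.']},
        ],
    }

    def extract_summary(doc, preferred=('Abstract', 'Conclusion')):
        # Expository texts disclose main knowledge top-down, so sentences
        # under preferred headings are taken as summary candidates first.
        picked = [s for sec in doc['sections'] if sec['heading'] in preferred
                  for s in sec['sentences']]
        return picked or doc['sections'][0]['sentences'][:1]

    print(extract_summary(document))   # ['The main result is stated here.']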
A natural language understanding system is described which extracts contextual information from Japanese texts. It integrates syntactic, semantic, and contextual processing serially. The syntactic analyzer obtains
rough syntactic structures from the text. The semantic analyzer treats
modifying relations inside noun phrases and case relations among verbs
and noun phrases. Then, the contextual analyzer obtains contextual information from the semantic structure extracted by the semantic analyzer.
Our system understands the context using precoded contextual knowledge
on terrorism and plugs the event information in input sentences into the
contextual structure.
The phrase approach to language processing emphasizes the role of the
lexicon as a knowledge source. Rather than maintaining a single generic
lexical entry for each word, e.g., take, the lexicon contains many phrases,
e.g., take on, take to the streets, take to swimming, take over, etc. Although
this approach proves effective in parsing and in generation, there are two
acute problems which still require solutions. First, due to the huge size of
the phrase lexicon, especially when considering subtle meanings and idiosyncratic behavior of phrases, encoding of lexical entries cannot be done
manually. Thus phrase acquisition must be employed to construct the lexicon. Second, when a set of phrases is morpho-syntactically equivalent,
disambiguation must be performed by semantic means. These problems
are addressed in the program RINA.
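A minimal sketch of phrase-first lookup (the entries and glosses are hypothetical, and RINA's acquisition and disambiguation mechanisms are not reproduced):

    phrasal_lexicon = {
        'take': [('take to the streets', 'demonstrate'),
                 ('take to', 'begin a habit'),
                 ('take over', 'assume control'),
                 ('take', 'obtain')],
    }

    def lookup(words, i):
        # Match the longest phrasal entry that starts at position i.
        for phrase, meaning in sorted(phrasal_lexicon.get(words[i], []),
                                      key=lambda e: -len(e[0].split())):
            n = len(phrase.split())
            if ' '.join(words[i:i + n]) == phrase:
                return meaning, n
        return None, 1

    print(lookup('they take to the streets'.split(), 1))   # ('demonstrate', 4)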
An approach to extracting linguistic knowledge from real language behavior is described. The method depends on the extraction of word relations, patterns of which are obtained by structuring the dependency relations in sentences, called Kakari-Uke relations in Japanese. As the first step of this approach, an experiment in word classification utilizing those patterns was carried out on 4178 sentences of real language data. A system was built to analyze the dependency structure of sentences utilizing the knowledge base obtained through this word classification, and the effectiveness of the knowledge base was evaluated. To develop this approach further, a relation matrix which captures multiple interactions of words is proposed.
Tailoring Importance Evaluation to Reader's Goals: A Contribution to Descriptive
Text Summarization
Danilo Fum, Giorgio Guida, Carlo Tasso
Istituto di Matematica, Informatica e
Sistemistica
Università di Udine, Italy
COLING'86, pp. 256-259
Domain Dependent Natural Language
Understanding
Klaus Heje Munch
Department of Computer Science
Technical University of Denmark
DK-2800 Lyngby, Denmark
COLING'86, pp. 260-262
Morphological Analysis for a German
Text-to-Speech System
Amanda Pounder, Markus Kommenda
Institut für Nachrichtentechnik und
Hochfrequenztechnik
Technische Universität Wien
Gusshausstrasse 25, A-1040 Wien, Austria
COLING'86, pp. 263-268
Synergy of Syntax and Morphology in
Automatic Parsing of French Language
with a Minimum of Data
Jacques Vergne, Pascale Pagès
Inalco Paris
This paper deals with a new approach to importance evaluation of descriptive texts developed in the framework of SUSY, an experimental system in
the domain of text summarization. The problem of taking into account the
reader's goals in evaluating importance of different parts of a text is first
analyzed. A solution to the design of a goal interpreter capable of computing a quantitative measure of the relevance degree of a piece of text according to a given goal is then proposed, and an example of goal interpreter operation is provided.
A natural language understanding system for a restricted domain of
discourse - thermodynamic exercises at an introductory level - is
presented. The system transforms texts into a formal meaning representation language based on cases. The semantic interpretation of sentences and phrases is controlled by case frames formulated around verbs and surface grammatical roles in noun phrases. During the semantic interpretation of a text, semantic constraints may be imposed on elements of the
text. Each sentence is analyzed with respect to context, making the system
capable of solving anaphoric references such as definite descriptions,
pronouns, and elliptic constructions.
The system has been implemented and successfully tested on a selection
of exercises.
A central problem in speech synthesis with unrestricted vocabulary is the
automatic derivation of correct pronunciation from the graphemic form of
a text. The software module GRAPHON was developed to perform this
conversion for German and is currently being extended by a morphological analysis component. This analysis is based on a morph lexicon and a set of rules and structural descriptions for German word-forms. It provides
each text input item with an individual characterization such that the
phonological, syntactic, and prosodic components may operate upon it.
This systematic approach thus serves to minimize the number of wrong
transcriptions and at the same time lays the foundation for the generation
of stress and intonation patterns, yielding more intelligible, natural-sounding, and generally acceptable synthetic speech.
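For illustration (an editorial sketch with toy data, not GRAPHON's actual morph lexicon or rule set), a greedy longest-match decomposition of a word-form into known morphs might look like this:

    morphs = {'haus': 'haws', 'tuer': 'ty:6'}   # morph -> rough pronunciation

    def decompose(word):
        # Greedy longest-match split of a word-form into lexicon morphs.
        result, i = [], 0
        while i < len(word):
            for j in range(len(word), i, -1):
                if word[i:j] in morphs:
                    result.append(word[i:j])
                    i = j
                    break
            else:
                return None        # unknown substring: fall back to rules
        return result

    print(decompose('haustuer'))   # ['haus', 'tuer']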
In this paper we present a parsing method for French whose particular features are: a multi-level approach, with syntax and morphology working simultaneously; the use of string pattern matching; and the absence of a dictionary. Our aim here is to evaluate the feasibility of the method rather than to present an operational system.
COLING'86, pp. 269-271
A Morphological Recognizer with Syntactic and Phonologic Rules
John Bear
Artificial Intelligence Center
SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025
This paper describes a morphological analyzer which, when parsing a word, uses two sets of rules: rules describing the syntax of words, and rules describing facts about orthography.
COLING'86, pp. 272-276
A Dictionary and Morphological Analyser
for English
G.J. Russell, S.G. Pulman
Computer Laboratory
University of Cambridge
G.D. Ritchie, A.W. Black
Department of Artificial Intelligence
University of Edinburgh
COLING'86, pp. 277-279
This paper describes the current state of a three-year project aimed at the development of software for use in handling large quantities of dictionary information within natural language processing systems. The project ... is one of three closely related projects funded under the Alvey IKBS Programme (Natural Language Theme); a parser is under development at Edinburgh by Henry Thompson and John Phillips, and a sentence grammar is being devised by Ted Briscoe and Claire Grover at Lancaster and Bran Boguraev and John Carroll at Cambridge. It is intended that the software and rules produced by all three projects will be directly compatible and capable of functioning in an integrated system.
A Kana-Kanji Translation System for
Non-Segmented Input Sentences based on
Syntactic and Semantic Analysis
Masahiro Abe, Yoshimitsu Ooshima,
Katsuhiko Yuura, Nobuyuki Takeichi
Central Research Laboratory
Hitachi, Ltd.
Kokubunji, Tokyo, Japan
COLING'86, pp. 280-285
A Compression Technique for Arabic
Dictionaries: The Affix Analysis
Abdelmajid Ben Hamadou
Department of Computer Science, FSEG Faculty
B.P. 69 - Route de l'aéroport
SFAX, Tunisia
COLING'86, pp. 286-288
Machine Learning of Morphological Rules
by Generalization and Analogy
Klaus Wothke
Arbeitsstelle Linguistische Datenverarbeitung
Institut für Deutsche Sprache
Mannheim, West Germany
COLING'86, pp. 289-293
Linguistic Developments in Eurotra since
1983
Lieven Jaspaert
This paper presents a disambiguation approach for translating non-segmented Kana into Kanji. The method consists of two steps. In the first
step, an input sentence is analyzed morphologically and ambiguous
morphemes are stored in a network form. In the second step, the best
path, which is a string of morphemes, is selected by syntactic and semantic
analysis based on case grammar. In order to avoid the combinatorial
explosion of possible paths, the following heuristic search method is
adopted. First, a path that contains the smallest number of weighted morphemes is chosen as the quasi-best path by a best-first-search technique.
Next, the restricted range of morphemes near the quasi-best path is
extracted from the morpheme network to construct preferential paths.
An experimental system incorporating large dictionaries has been developed and evaluated. A translation accuracy of 90.5% was obtained. This can be improved to about 95% by optimizing the dictionaries.
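A minimal sketch of this kind of best-first lattice search (the lattice format and the weights are hypothetical, not the paper's):

    import heapq

    def quasi_best_path(lattice, length):
        # Queue entries: (accumulated weight, end position, morpheme path).
        queue = [(0.0, 0, [])]
        while queue:
            cost, pos, path = heapq.heappop(queue)
            if pos == length:
                return path, cost          # lightest complete segmentation
            for morpheme, span, weight in lattice.get(pos, []):
                heapq.heappush(queue, (cost + weight, pos + span, path + [morpheme]))
        return None, float('inf')

    # Toy lattice over a 4-character input with two competing segmentations.
    lattice = {0: [('AB', 2, 1.0), ('A', 1, 0.8)],
               1: [('BCD', 3, 1.5)],
               2: [('CD', 2, 1.2)]}
    print(quasi_best_path(lattice, 4))     # (['AB', 'CD'], 2.2)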
In every application that concerns the automatic processing of natural language, the problem of dictionary size arises. In this paper we propose a dictionary compression algorithm based on an affix analysis of non-diacritical Arabic. It consists of decomposing a word into its first elements, taking into account the different linguistic transformations that can affect the morphological structures. This work has been carried out as part of a study of the automatic detection and correction of spelling errors in non-diacritical Arabic texts.
This paper describes an experimental procedure for the inductive automated learning of morphological rules from examples. At first an outline
of the problem is given. Then a formalism for the representation of
morphological rules is defined. This formalism is used by the automated
procedure, whose anatomy is subsequently presented. Finally, the
performance of the system is evaluated and the most important unsolved
problems are discussed.
I wish to put the theory and metatheory currently adopted in the Eurotra
project into a historical perspective, indicating where and why changes to
its basic design for a transfer-based MT (TBMT) system have been made.
Katholieke Universiteit Leuven
Belgium
COLING'86, pp. 294-296
The <C,A> Framework in Eurotra: A
Theoretically Committed Notation for MT
D.J. Arnold
University of Essex
Colchester, Essex CO4 3SQ, UK
S. Krauwer, L. des Tombe
University of Utrecht
Trans 14, 3512 JK
Utrecht, The Netherlands
This paper describes a model for MT, developed within the Eurotra MT
project, based on the idea of compositional translation, by describing a
basic, experimental notation which embodies the idea. The introduction
provides background, section 1 introduces the basic ideas and the notation,
and section 2 discusses some of the theoretical and practical implications of
the model, including some concrete extensions, and some more speculative
discussion.
M. Rosner
ISSCO
54, Route des Acacias
1227 Geneva, Switzerland
G.B. Varile
Commission of the European Communities
L-2928 Luxembourg
COLING'86, pp. 297-303
Generating Semantic Structures in
Eurotra-D
Erich Steiner
IAI - Eurotra - D
Martin-Luther-Strasse 14
D-6600 Saarbrücken, West Germany
COLING'86, pp. 304-306
Valency Theory in a Stratificational MT
System
Paul Schmidt
IAI Eurotra-D
Martin-Luther-Strasse 14
D-6600 Saarbrücken, West Germany
COLING'86 pp. 307-312
A Compositional Approach to the Translation of Temporal Expressions in the
Rosetta System
Lisette Appelo
Philips Research Laboratories
Eindhoven, The Netherlands
The following paper is based on work done in the multi-lingual MT project Eurotra, an MT project of the European Community.
Analysis and generation of clauses within the Eurotra framework
proceeds through the levels of (at least) Eurotra constituent structure
(ECS), Eurotra relation structure (ERS), and interface structure (IS).
At IS, labelling of nodes consists of labellings for time, modality, semantic features, semantic relations, and others. In this paper, we shall be
concerned exclusively with semantic relations (SRs), to which we shall also
refer as "participant roles" (PR).
According to current Eurotra legislation, these SRs are assigned to
dictionary entries of verbs (and other word classes, which will be disregarded in this paper) by coders, and through these entries to clauses in a
pattern matching process.
This approach, while certainly valid in principle, leads to the problem of
inter-coder consistency, at least as long as the means for identifying SRs
are paraphrase tests for SRs. In Eurotra-D, we have for some time now
been experimenting with a set of SRs, or PRs, which are identified with the
help of syntactic criteria. This approach will be outlined in this paper.
This paper tries to investigate valency theory as a linguistic tool in machine
translation. There are three main areas in which major questions arise:
(1) Valency theory itself. I sketch a valency theory in linguistic terms
which includes the discussion of the nature of dependency representation
as an interface for semantic description.
(2) The dependency representation in the translation process. I try to
sketch the different roles of dependency representation in analysis and
generation.
(3) The implementation of valency theory in an MT system. I give a few examples of how a valency description could be implemented in the Eurotra formalism.
This paper discusses the translation of temporal expressions, in the framework of the machine translation system Rosetta. The translation method
of Rosetta, the "isomorphic grammar method", is based on Montague's
Compositionality Principle. It shows that a compositional approach leads
to a transparent account of the complex aspects of time in natural language
and can be used for the translation of temporal expressions.
COLING'86, pp. 313-318
Idioms in the Rosetta Machine Translation
System
André Schenk
Philips Research Laboratories
Eindhoven, The Netherlands
COLING'86, pp. 319-324
NARA: A Two-Way Simultaneous Interpretation System between Korean and
Japanese - A Methodological Study
Hee Sung Chung, Tosiyasu L. Kunii
Department of Information Science
Faculty of Science, University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113 Japan
This paper discusses one of the problems of machine translation, namely the translation of idioms. The paper describes a solution to this problem
within the theoretical framework of the Rosetta machine translation system. Rosetta is an experimental translation system which uses an intermediate language and translates between Dutch, English, and, in the
future, Spanish.
This paper presents a new computing model for constructing a two-way
simultaneous interpretation system between Korean and Japanese. We
also propose several methodological approaches to the construction of a
two-way simultaneous interpretation system, and realize the two-way
interpreting process as a model unifying both linguistic competence and
linguistic performance. The model is verified theoretically and through
actual applications.
COLING'86 pp. 325-328
Strategies for Interactive Machine Translation: The Experience and Implications of
the UMIST Japanese Project
P.J. Whitelock, M. McGee Wood,
B.J. Chandler, N. Holden, H.J. Horsfall
Centre for Computational Linguistics
University of Manchester Institute of Science
and Technology
PO Box 88, Manchester M60 1QD UK
COLING'86 pp. 329-334
Pragmatics in Machine Translation
Annely Rothkegel
Universität Saarbrücken
Sonderforschungsbereich 100
Elektronische Sprachforschung
D-6600 Saarbrücken, West Germany
COLING'86, pp. 335-337
A Metric for Computational Analysis of
Meaning: Toward an Applied Theory of
Linguistic Semantics
Sergei Nirenburg
Department of Computer Science
Colgate University
Hamilton, NY 13346
SERGEC@IOLGATE
Victor Raskin
Department of English
Purdue University
West Lafayette, IN 47907
JHP@PURDUE-ASC.CSNET
At the Centre for Computational Linguistics, we are designing and implementing an English-to-Japanese interactive machine translation system.
The project is funded jointly by the Alvey Directorate and International
Computers Limited (ICL). The prototype system runs on the ICL PERQ,
though much of the development work has been done on a VAX 11/750.
It is implemented in Prolog, in the interests of rapid prototyping, but
intended for optimization. The informing principles are those of modern complex-feature-based linguistic theories, in particular Lexical-Functional Grammar (Bresnan (ed.) 1982, Kaplan and Bresnan 1982), and Generalized Phrase Structure Grammar (Gazdar et al. 1985).
For development purposes we are using an existing corpus of 10,000
words of continuous prose from the PERQ's graphics documentation; in the
long term, the system will be extended for use by technical writers in fields
other than software. At the time of writing, we have well-developed
system development software, user interface, and grammar and dictionary
handling facilities. The English analysis grammar handles most of the
syntactic structures of the corpus, and we have a range of formats for
output of linguistic representations and Japanese text. A transfer grammar
for English-Japanese has been prototyped, but is not yet fully adequate to
handle all constructions in the corpus; a facility for dictionary entry in
Kanji is incorporated. The aspect of the system we will focus on in the
present paper is its interactive nature, discussing the range of different
types of interaction which are provided or permitted for different types of
users.
TEXAN is a system of transfer-oriented text analysis. Its linguistic concept
is based on a communicative approach within the framework of speech act
theory. In this view texts are considered to be the result of linguistic
actions. It is assumed that they control the selection of translation equivalents. The transition of this concept of linguistic actions (text acts) to the
model of computer analysis is performed by a context-free illocution grammar processing categories of actions and a propositional structure of states
of affairs. The grammar, which is related to a text lexicon, provides the connection between these categories and the linguistic surface units of a single language.
A metric for assessing the complexity of semantic (and pragmatic) analysis
in natural language processing is proposed as part of a general applied
theory of linguistic semantics for NLP. The theory is intended as a
complete projection of linguistic semantics onto NLP and is designed as an
exhaustive list of possible choices among strategies of semantic analysis at
each level, from the word to the entire text. The alternatives are summarized in a chart, which can be completed for each existing or projected NLP
system. The remaining components of the applied theory are also outlined.
COLING'86, pp. 338-340
Collative Semantics
Dan Fass
Computing Research Laboratory
This paper introduces Collative Semantics (CS), a new domain-independent semantics for natural language processing (NLP) which addresses the
problems of lexical ambiguity, metonymy, various semantic relations (conventional relations, redundant relations, contradictory relations, metaphorical relations, and severely anomalous relations), and the introduction of new information. We explain the two techniques CS uses for matching together knowledge structures (KSs) and why semantic vectors, which record the results of such matches, are informative enough to tell apart semantic relations and be the basis for lexical disambiguation.
New Mexico State University
Las Cruces, NM 88003
COLING'86, pp. 341-343
A Logical Formalism for the Representation of Determiners
Barbara Di Eugenio, Leonardo Lesmo, Paolo
Pogliano, Pietro Torasso, Francesco Urbano
Dipartimento di Informatica, Università di Torino
Via Valperga Caluso 37, 10125 Torino, Italy
Determiners play an important role in conveying the meaning of an utterance, but they have often been disregarded, perhaps because it seemed
more important to devise methods to grasp the global meaning of a
sentence, even if not in a precise way. Another problem with determiners
is their inherent ambiguity.
In this paper we propose a logical formalism, which, among other things,
is suitable for representing determiners without forcing a particular interpretation when their meaning is still not clear.
COLING'86, pp. 344-346
A Compositional Semantics for Directional
Modifiers - Locative Case Reopened
Erhard W. Hinrichs
Bolt Beranek & Newman Laboratories
10 Moulton Street
Cambridge, MA 02238
This paper presents a model-theoretic semantics for directional modifiers
in English. The semantic theory presupposed for the analysis is that of
Montague Grammar (cf. Montague 1970, 1973) which makes it possible to
develop a strongly compositional treatment of directional modifiers. Such
a treatment has significant computational advantages over case-based
treatments of directional modifiers that are advocated in the AI literature.
COLING'86, pp. 347-349
Temporal Relations in Texts and Time
Logical Inferences
Jürgen Kunze
Central Institute of Linguistics
Academy of Sciences of GDR
DDR-1100 Berlin
COLING'86, pp. 350-352
Linguistic Bases for Machine Translation
Christian Rohrer
Institut für Linguistik
Universität Stuttgart
Keplerstraße 17
7000 Stuttgart 1
A calculus is presented which allows an efficient treatment of the following
components: Tenses, temporal conjunctions, temporal adverbials (of
"definite" type), temporal quantifications, and phases. The phases are a
means for structuring the set of time-points t where a certain proposition is
valid. For one proposition, there may exist several "phase"-perspectives.
The calculus has integrative properties, i.e., all five components are represented by the same formal means. This renders possible a rather easy
combination of all information and conditions coming from the aforesaid
components.
My aim in organizing this panel is to stimulate the discussion between
researchers working on MT and linguists interested in formal syntax and
semantics. I am convinced that a closer cooperation will be fruitful for
both sides. I will be talking about experimental MT or MT as a research
project and not as a development project.
COLING'86, pp. 353-355
Combining Deictic Gestures and Natural
Language for Referent Identification
Alfred Kobsa, Jürgen Allgayer, Carola Reddig,
Norbert Reithinger, Dagmar Schmauks, Karin
Harbusch, Wolfgang Wahlster
SFB 314: AI - Knowledge-Based Systems
University of Saarbrücken
D-6600 Saarbrücken 11, West Germany
COLING'86, pp. 356-361
In virtually all current natural-language dialog systems, users can only refer
to objects by using linguistic descriptions. However, in human face-to-face
conversation, participants frequently use various sorts of deictic gestures as
well. In this paper, we will present the referent identification component
of XTRA, a system for natural-language access to expert systems. XTRA
allows the user to combine NL input together with pointing gestures on the
terminal screen in order to refer to objects on the display. Information
about the location and type of this deictic gesture, as well as about the
linguistic description of the referred object, the case frame, and the dialog
memory are utilized for identifying the object. The system is tolerant with respect to imprecision in both the deictic and the natural-language input.
The user can thereby refer to objects more easily, avoid referential failures,
and employ vague everyday terms instead of precise technical notions.
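To make the combination concrete, here is an editorial sketch (the display objects, coordinates, and selection rule are invented, not XTRA's):

    import math

    objects = [
        {'name': 'field_amount', 'type': 'field',  'pos': (120, 40)},
        {'name': 'field_date',   'type': 'field',  'pos': (120, 80)},
        {'name': 'button_ok',    'type': 'button', 'pos': (200, 80)},
    ]

    def identify(point, described_type):
        # Filter candidates by the linguistic description, then rank the
        # remaining ones by proximity to the (possibly imprecise) gesture.
        candidates = [o for o in objects if o['type'] == described_type]
        return min(candidates, key=lambda o: math.dist(o['pos'], point))

    # "this field" accompanied by pointing near (118, 78):
    print(identify((118, 78), 'field')['name'])   # field_date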
An Approach to Non-Singular Terms in
Discourse
Tomek Strzalkowski
School of Computing Science
Simon Fraser University
Burnaby, B.C. Canada V5A 1S6
COLING'86, pp. 362-364
Processing Clinical Narratives in
Hungarian
Gábor Prószéky
National Education Library and Museum
Computer Department
Honvéd u. 19
H-1055 Budapest, Hungary
COLING'86, pp. 365-367
Definite Noun Phrases and the Semantics
of Discourse
Manfred Pinkal
c/o Fraunhofer-Institute IAO
Holzgartenstrasse 17,
D 7000 Stuttgart 1
and Institut für Linguistik
Universität Stuttgart
COLING'86, pp. 368-373
Learning the Space of Word Meanings for
Information Retrieval Systems
Koichi Hori, Seinosuke Toda, Hisashi Yasunaga
National Institute of Japanese Literature
1-16-10 Yutakacho, Shinagawa-ku
Tokyo 142 Japan
COLING'86, pp. 374-379
On the Use of Term Associations in Automatic Information Retrieval
Gerard Salton
Department of Computer Science
Cornell University
Ithaca, NY 14853
COLING'86, pp. 380-386
Abstracts of Current Literature
A new Theory of Names and Descriptions that offers a uniform treatment
for many types of non-singular concepts found in natural language
discourse is presented. We introduce a layered model of the language
denotational base (the universe) in which every world object is assigned a
layer (level) reflecting its relative singularity with respect to other objects
in the universe. We define the notion of relative singularity of world
objects as an abstraction class of the layer-membership relation.
This paper describes a system that extracts information from Hungarian descriptive texts in the medical domain. Texts of clinical narratives define a sublanguage that uses limited syntax but retains the main characteristics of the language, namely free word order and rich morphology. We offer a fairly general parsing method for free word order languages and show how to use it for parsing Hungarian clinical texts. The system can handle simple
cases of ellipses, anaphora, unknown words, and typical abbreviations of
clinical practice. The system translates texts of anamneses, patient visits,
laboratory tests, medical examinations, and discharge summaries into an
information format usable for a medical expert system. Like this expert system, the information formatting program has been written in the MPROLOG language, and its experimental version runs on PROPER-16, a
Hungarian-made (IBM-XT compatible) microcomputer.
In this talk I will first give a short overview of the basic Discourse Representation Theory system (Kamp 1981), and sketch Kamp's proposal for
the treatment of definite noun phrases. Then I will indicate how the basic
reference establishing function and the "side-effects" of different types of
definite NPs can be described in more detail. In doing this, I will refer to
the work about anaphora done in the NLP area (especially by Barbara
Grosz, Candy Sidner, and Bonnie Webber), integrating some of their
assumptions into the DRT framework, and critically commenting on some
others.
Several methods to represent the meanings of words have been proposed. However, they are not useful for information retrieval systems, because they cannot deal with entities that cannot be universally represented by symbols.
In this paper we propose the notion of a semantic space. The semantic space is a Euclidean space in which words and entities are placed. A word is one point in the space, and the meanings of the word are represented by the configuration of the space around it. Entities that cannot be represented by symbols can be identified in the space by the locations they would occupy. We also give a learning mechanism for the space. We demonstrate the
effectiveness of the proposed method by an experiment on information
retrieval for the study of Japanese literature.
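A minimal sketch of the semantic-space idea (the coordinates and the neighborhood query below are illustrative; the paper's learning mechanism is not reproduced):

    import math

    space = {                      # each word is one point in a Euclidean space
        'novel': (0.9, 0.1),
        'poem':  (0.8, 0.3),
        'sword': (0.1, 0.9),
    }

    def nearest_words(point, k=2):
        # An entity with no symbol of its own is identified by the words
        # around the location it would occupy in the space.
        return sorted(space, key=lambda w: math.dist(space[w], point))[:k]

    print(nearest_words((0.88, 0.15)))   # ['novel', 'poem']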
It has been recognized that single words extracted from natural language
texts are not always useful for the representation of information content.
Associated or related terms, and complex content identifiers derived from
thesauruses and knowledge bases, or constructed by automatic word
grouping techniques, have therefore been proposed for text identification
purposes.
The area of associative content analysis and information retrieval is
reviewed in this study. The available experimental evidence shows that
none of the existing or proposed methodologies are guaranteed to improve
retrieval performance in a replicable manner for document collections in
different subject areas. The associative techniques are most valuable for
restricted environments covering narrow subject areas, or in iterative
search situations where user inputs are available to refine previously available query formulations and search output.
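One classical associative technique of the kind reviewed can be sketched with toy data (the collection and the Dice-style coefficient here are illustrative, not drawn from the article):

    from collections import Counter
    from itertools import combinations

    docs = [{'grammar', 'parser', 'lexicon'},
            {'grammar', 'parser'},
            {'lexicon', 'retrieval'}]

    pair_counts, term_counts = Counter(), Counter()
    for doc in docs:
        term_counts.update(doc)
        pair_counts.update(combinations(sorted(doc), 2))

    def association(t1, t2):
        # Dice-style score: 2 * co-occurrences / (freq(t1) + freq(t2)).
        pair = tuple(sorted((t1, t2)))
        return 2 * pair_counts[pair] / (term_counts[t1] + term_counts[t2])

    print(association('grammar', 'parser'))   # 1.0: the terms always co-occur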
Towards the Automatic Acquisition of
Lexical Data
H. Trost, E. Buchberger
Department of Medical Cybernetics and
Artificial Intelligence
University of Vienna, Austria
COLING'86, pp. 387-389
PeriPhrase: Lingware for Parsing and
Structural Transfer
Kenneth R. Beesley, David Hefner
A.L.P. Systems
190 West 800 North
Provo, UT 84604
Creating a knowledge base has always been a bottleneck in the implementation of AI systems. This is also true for Natural Language Understanding
systems, particularly for data-driven ones. While a perfect system for
automatic acquisition of all sorts of knowledge is still far from being realized, partial solutions are possible. This holds especially for lexical data.
Nevertheless, the task is not trivial, in particular when dealing with
languages rich in inflectional forms like German. Our system is to be used
by persons with no specific linguistic knowledge, thus linguistic expertise
has been put into the system to ascertain correct classification of words.
Classification is done by means of a small rule based system with lexical
knowledge and language-specific heuristics. The key idea is the identification of three sorts of knowledge which are processed distinctly and the
optimal use of knowledge already contained in the existing lexicon.
PeriPhrase is a high-level computer language developed by A.L.P. Systems
to facilitate parsing and structural transfer. It is designed to speed the
development of computer-assisted translation systems and grammar checkers. We describe the syntax and semantics of this tool, its integrated development environment, and some of our experience with it.
COLING'86, pp. 390-392
SCSL: A Linguistic Specification Language
for MT
Rémi Zajac
GETA, BP 68
Université de Grenoble
38402 Saint-Martin-d'Hères, France
COLING'86, pp. 393-398
A User Friendly ATN Programming Environment (APE)
Hans Haugeneder, Manfred Gehrke
Siemens AG, ZT ZTI INF
West Germany
Nowadays, MT systems grow to such a size that a first specification step is necessary if we want to be able to master their development and maintenance, for the software part as well as for the linguistic part ("lingwares"). Advocating a clean separation between linguistic tasks and programming tasks, we first introduce a specification/implementation/validation framework for NLP, and then SCSL, a language for the specification of analysis and generation modules.
APE is a workbench to develop ATN grammars based on an active chart
parser. It represents the networks graphically and supports the grammar
writer by window- and menu-based debugging techniques.
COLING'86, pp. 399-401
A Language for Transcriptions
Yves LePage
GETA, BP 68
Université Scientifique et Médicale de
Grenoble
38402 Saint-Martin-d'Hères, France
To deal with specific alphabets is a necessity in natural language processing. In Grenoble, this problem is solved with the help of transcriptions.
Here we present a language (LT) designed for the rapid writing of passages from one transcription to another (transducers), and give some examples of its use.
COLING'86, pp. 402-404
Variables et Catégories Grammaticales
dans un Modèle Ariane
Jean-Philippe Guilbaud
GETA, BP 68
Université Scientifique et Médicale de
Grenoble
38402 Saint-Martin-d'Hères, France
COLING'86, pp. 405-407
All the grammatical categories used in an Ariane translation model are formalized and coded mnemonically as variables and variable values. The set of variables of a given model constitutes the vocabulary of the metalanguage in which the source and target languages of that model are described.
The system's data structure is a tree, each node of which carries a decoration. Decorations contain the variables declared for the system, assigned certain values. The variables also appear in the analysis, transfer, and generation grammars, in the monolingual analysis and generation dictionaries and the bilingual lexical transfer dictionaries, as well as in the specifications of the linguistic model (static grammars).
Déduction Automatique et Systèmes
Transformationnels
J. Chauché
C.E.L.T.A., 23, Boulevard Albert 1er
54000 - Nancy, France
Transformational systems use deductive processes with an approach different from that of the systems used in artificial intelligence. Through a comparison of the Prolog language and the Sygmart language, it is shown how applications using reasoning and knowledge bases can be realized in transformational systems.
COLING'86, pp. 408-411
CRITAC - A Japanese Text Proofreading
System
Koichi Takeda, Tetsunosuke Fujisaki,
Emiko Suzuki
Japan Science Institute
IBM Japan, Ltd.
5-19 Sanban-cho, Chiyoda-ku,
Tokyo 102, Japan
CRITAC (CRITiquing using ACcumulated knowledge) is an experimental expert system for proofreading Japanese text. It detects mistypes, Kana-to-Kanji misconversions, and stylistic errors. This system combines Prolog-coded heuristic knowledge with conventional Japanese text processing techniques which involve heavy computation and access to large language data bases.
COLING'86, pp. 412-417
Storing Text using Integer Codes
Raja Noor Ainon
Computer Centre
University of Malaya
59100 Kuala Lumpur, Malaysia
COLING'86, pp. 418-420
BetaText: An Event Driven Text Processing and Text Analyzing System
Benny Brodda
Department of Linguistics
University of Stockholm
S-106 91 Stockholm, Sweden
COLING'86, pp. 421-422
Toward Integrated Dictionaries for M(a)T
Ch. Boitet, N. Nedobejkine
GETA, BP 68
Université de Grenoble
38402 Saint-Martin-d'Hères, France
COLING'86, pp. 423-428
Indexage Lexical au GETA
Jedrzej Bukowski
Traditionally, text is stored on computers as a stream of characters. The
goal of this research is to store text in a form that facilitates word manipulation whilst reducing storage space. A word list with syntactic linear
ordering is stored and words in a text are given two-byte integer codes that
point to their respective positions in this list. The implementation of the
encoding scheme is described and the performance statistics of this encoding scheme are presented.
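A minimal sketch of such a scheme (an editorial illustration; the paper's list construction and ordering are not reproduced):

    import struct

    def build_word_list(text):
        return sorted(set(text.split()))       # one entry per distinct word

    def encode(text, word_list):
        # Each word becomes a two-byte index into the word list.
        index = {w: i for i, w in enumerate(word_list)}
        return b''.join(struct.pack('>H', index[w]) for w in text.split())

    def decode(blob, word_list):
        codes = struct.unpack('>%dH' % (len(blob) // 2), blob)
        return ' '.join(word_list[c] for c in codes)

    words = build_word_list('the cat saw the dog')
    packed = encode('the cat saw the dog', words)
    assert decode(packed, words) == 'the cat saw the dog'
    # Two-byte codes limit the word list to 65536 entries.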
BetaText can be described as an event driven production system, in which
(combinations of) text events lead to certain actions, such as the printing
of sentences that exhibit certain, say, syntactic phenomena. The analysis
mechanism used allows for arbitrarily complex parsing, but is particularly
suitable for finite state parsing. A careful investigation of what is actually
needed in linguistically relevant text processing resulted in a rather small
but carefully chosen set of "elementary actions" to be implemented.
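An editorial sketch of the event-to-action idea (the rule format below is invented for illustration):

    import re

    rules = [
        # (pattern defining a "text event", action triggered by the event)
        (re.compile(r'\b\w+ing\b'), lambda m: print('event:', m.group(0))),
    ]

    def process(sentence):
        for pattern, action in rules:
            for match in pattern.finditer(sentence):
                action(match)

    process('The parser is running while the tagger is idling.')
    # prints: event: running / event: idling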
In the framework of Machine (aided) Translation systems, two types of
lexical knowledge are used, "natural" and "formal", in the form of on-line
terminological resources for human translators or revisors and of coded
dictionaries for Machine Translation proper.
A new organization is presented, which allows one to integrate both
types in a unique structure, called "fork" integrated dictionary, or FID. A
given FID is associated with one natural language and may give access to
translations into several other languages.
The FIDs associated with languages L1 and L2 contain all information
necessary to generate coded dictionaries of M(a)T systems translating from
L1 into L2 or vice-versa. The skeleton of a FID may be viewed as a classical bilingual dictionary. Each item is a tree structure, constructed by
taking the "natural" information (a tree) and "grafting" onto it some
"formal" information.
Various aspects of this design are refined and illustrated by detailed
examples, several scenarios for the construction of FIDs are presented, and
some problems of organization and implementation are discussed. A
prototype implementation of the FID structure is underway in Grenoble.
GETA, BP 68
Université Scientifique et Médicale de Grenoble
38402 Saint-Martin-d'Hères, France
COLING'86, pp. 429-431
The lexicographic aspect of computer-aided translation is presented and illustrated with examples of Russian-to-French translation carried out by GETA in Grenoble.
Experiments with an MT-Directed Lexicai
Knowledge Bank
A crucial test for any MT system is its power to solve lexical ambiguities.
The size of the lexicon, its structural principles, and the availability of
extra-linguistic knowledge are the most important aspects in this respect.
This paper outlines the experimental development of the SWESIL system: a
structured lexicon-based word expert system designed to play a pivotal role in
the process of Distributed Language Translation which is being developed
in the Netherlands. It presents SWESIL's organizing principles, gives a
short description of the present experimental set-up, and shows how
SWESIL is being tested at this moment.
B.C. Papegaaij, V. Sadler, A.P.M. Witkam
BSO/Research
Bureau voor Systeemontwikkeling
P.O. Box 8348
3503 RH Utrecht, The Netherlands
COLING'86, pp. 432-434
A Word Database for Natural Language
Processing
Brigitte Barnett, Hubert Lehmann,
Magdalena Zoeppritz
IBM Scientific Center
Tiergartenstraße 15
6900 Heidelberg,
Federal Republic of Germany
COLING'86, pp. 435-440
Lexical Database Design: The Shakespeare Dictionary Model
H. Joachim Neuhaus
The paper describes the design of a fair-sized lexical data base that is to be used with a natural-language-based expert system with German as the
language of interaction. Sources for entries and tools for constructing and
maintaining the database are discussed, as well as the information needed
in the lexicon for the purposes of syntactic and semantic processing.
This paper describes the data and presents some preliminary design considerations along with a sample schema.
Westfälische Wilhelms-Universität, FB 12
D-4400 Münster, West Germany
COLING'86, pp. 441-444
An Attempt to Automatic Thesaurus
Construction from an Ordinary Japanese
Language Dictionary
Hiroaki Tsurumaru
Department of Electronics
Nagasaki University
Nagasaki 852, Japan
How to obtain hierarchical relations (e.g., superordinate-hyponym relation,
synonym relation) is one of the most important problems for thesaurus
construction. A pilot system for extracting these relations automatically
from an ordinary Japanese language dictionary (Shinmeikai Kokugojiten,
published by Sansei-do, in machine readable form) is given. The features
of the definition sentences in the dictionary, the mechanical extraction of
the hierarchical relations, and the estimation of the results are discussed.
Toru Hitaka, Sho Yoshida
Department of Electronics
Kyushu University 36
Fukuoka 812, Japan
COLING'86, pp. 445-447
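The core of the extraction can be sketched as follows (English-glossed toy entries; the actual system analyzes Japanese definition sentences):

    definitions = {
        'sparrow': 'a small brown bird',
        'oak':     'a large deciduous tree',
    }

    def hypernym(word):
        # Take the head noun of the definition (here: its final word)
        # as the superordinate term.
        return definitions[word].split()[-1]

    hierarchy = {w: hypernym(w) for w in definitions}
    print(hierarchy)   # {'sparrow': 'bird', 'oak': 'tree'}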
Acquisition of Knowledge Data by
Analyzing Natural Language
Yasuhito Tanaka
Himeji College
1-1-12 Shinzaike Honmachi
Himeji City Hyogoken
670 Japan
Automatic identification of homonyms in kana-to-kanji conversion systems and of multivocal words in machine translation systems cannot be sufficiently implemented by the mere combination of grammar and word dictionaries. This calls for a new concept of knowledge data. What this new knowledge data is and how it can be acquired are described in the paper. In natural language research, there has been active discussion within the framework of knowledge and samples of knowledge.
Sho Yoshida
Kyushu University
6-10-1 Hakozaki Higashiku
Fukuoka City Fukuokaken
812 Japan
COLING'86, pp. 448-450
Model for Lexical Knowledge Base
Michio Isoda, Hideo Aiso
Faculty of Science and Technology
Keio University
Noriyuki Kamibayashi, Yoshifumi Matsunaga
System Technology Laboratory
Fuji Xerox Co., Ltd.
COLING'86, pp. 451-453
User Specification of Syntactic Case
Frames in TELI, A Transportable, User-Customized Natural Language Processor
Bruce W. Ballard
AT&T Bell Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974
COLING'86, pp. 454-460
Functional Structures for Parsing Dependency Constraints
H. Jäppinen, A. Lehtola, K. Valkonen
SITRA Foundation
P.O. Box 329
Helsinki, Finland
and Helsinki University of Technology
This paper describes a model for a lexical knowledge base (LKB). An LKB
is a knowledge base management system (KBMS) which stores various
kinds of dictionary knowledge in a uniform framework and provides multiple viewpoints to the stored knowledge.
KBMSs for natural language knowledge will be fundamental components
of knowledgeable environments where non-computer professionals can use
various kinds of support tools for document preparation or translation.
However, basic models for such KBMSs have not been established yet.
Thus, we propose a model for an LKB focusing on dictionary knowledge
such as that obtained from machine-readable dictionaries.
When an LKB is given a key from a user, it accesses the stored knowledge associated with that key. In addition to conventional direct retrieval,
the LKB has a more intelligent access capability to retrieve related knowledge through relationships among knowledge units. To represent complex
and irregular relationships, we employ the notion of implicit relationships.
In contrast to conventional database models where relationships between
data items are statically defined at data generation time, the LKB extracts
relationships dynamically by interpreting the contents of stored knowledge
at run time. This makes the LKB more flexible; users can add new functions or new knowledge incrementally at any time. The LKB also has the
capability to define and construct new virtual dictionaries from existing
dictionaries. Thus users can define their own customized dictionaries suitable for their specific purposes.
The proposed model provides a logical foundation for building flexible
and intelligent LKBs.
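To make the notion of implicit relationships concrete, the following minimal Python sketch (all names and the toy data are illustrative, not taken from the paper) shows relationships extracted by interpreting stored entries at run time, and a virtual dictionary defined as a view over an existing one:

    class LKB:
        """Toy lexical knowledge base: entries are definition texts."""
        def __init__(self):
            self.entries = {}                    # headword -> definition text

        def add(self, word, definition):
            self.entries[word] = definition      # new knowledge, added any time

        def lookup(self, key):
            """Conventional direct retrieval."""
            return self.entries.get(key)

        def related(self, key):
            """Implicit relationships: computed at run time by interpreting
            the stored definition text, not stored as static links."""
            definition = self.entries.get(key, "")
            return [w for w in self.entries if w != key and w in definition]

    def virtual_dictionary(lkb, predicate):
        """A user-defined dictionary constructed as a view over an existing one."""
        view = LKB()
        for word, definition in lkb.entries.items():
            if predicate(word, definition):
                view.add(word, definition)
        return view

    lkb = LKB()
    lkb.add("parser", "a program that assigns structure to a sentence")
    lkb.add("sentence", "a unit of language built from words")
    print(lkb.related("parser"))                 # ['sentence'], found dynamically
    short = virtual_dictionary(lkb, lambda w, d: len(d) < 50)

Because links are recomputed from entry contents, a newly added entry participates in retrieval immediately, with no re-indexing step, which is the flexibility the abstract claims.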
In this paper, we present methods that allow the users of a natural
language processor (NLP) to define, inspect, and modify any case frame
information associated with the words and phrases known to the system.
An implementation of this work forms a critical part of the Transportable
English-Language Interface (TELI) system. However, our techniques have
enabled customization capabilities largely independent of the specific NLP
for which information is being acquired.
The primary goal of the syntactic acquisitions of TELI is to redress the
fact that many NL prototypes have failed (1) to make known to users
exactly what inputs are allowed (e.g., what words and phrases are defined)
and (2) to meet the needs of a given user or group of users (e.g., appropriate vocabulary, syntax, and semantics). Experience has shown that neither
users nor system designers can predict in advance all the words, phrases,
and associated meanings that will arise in accessing a given data base (cf.,
Tennant 1977). Thus, we have chosen to make TELI "transportable" in an
extreme sense, where customizations may be performed (1) by end users,
as opposed to computer professionals, and (2) at any time during English
processing.
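As a hedged illustration only (the record format below is invented here, not TELI's), a user-editable syntactic case frame might look like this in Python:

    # Hypothetical case-frame record a user could define, inspect, and modify;
    # all field names are invented for illustration.
    case_frame = {
        "head": "assign",
        "category": "verb",
        "slots": [
            {"role": "agent", "marker": None, "category": "NP", "required": True},
            {"role": "theme", "marker": None, "category": "NP", "required": True},
            {"role": "goal",  "marker": "to", "category": "NP", "required": False},
        ],
    }

    def describe(frame):
        """Show end users exactly which inputs are currently allowed."""
        slots = ", ".join(
            "%s(%s %s)" % (s["role"], s["marker"] or "unmarked", s["category"])
            for s in frame["slots"])
        return "%s: %s" % (frame["head"], slots)

    print(describe(case_frame))
    # Customization at any time: an end user adds an optional benefactive slot.
    case_frame["slots"].append(
        {"role": "benefactive", "marker": "for", "category": "NP", "required": False})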
This paper outlines a high-level language FUNDPL for expressing functional structures for parsing dependency constraints. The goal of the
language is to allow a grammar writer to pin down his or her grammar with
minimal commitment to control. The FUNDPL interpreter has been implemented on top of a lower-level language, DPL, which we had implemented
earlier.
COLING'86, pp. 461-463
Controlled Active Procedures as a Tool for
Linguistic Engineering
Heinz-Dirk Luckhardt, Manfred Thiel
Sonderforschungsbereich 100
Controlled active procedures are productions that are grouped under and
activated by units called "scouts". Scouts are controlled by units called
"missions", which also select relevant sections from the data structure for
rule application. Following the problem reduction method, the parsing
"Elektronische Sprachforschung"
Universität des Saarlandes
D-6600 Saarbrücken 11
Bundesrepublik Deutschland
COLING'86, pp. 464-469
A New Predictive Analyzer of English
Hiroyuki Musha
Department of Information Science
Tokyo Institute of Technology
Ohokayama, Meguro-ku, Tokyo 152, Japan
COLING'86, pp. 470-472
Generalized Memory Manipulating Actions
for Parsing Natural Language
Irina Prodanof
Istituto di Linguistica Computazionale
CNR-Pisa
Giacomo Ferrari
problem is subdivided into ever smaller subproblems, each one of which is
represented by a mission. The elementary problems are represented by
scouts. The CAP grammar formalism is based on experience gained with
natural language (NL) analysis and translation by computer in the Sonderforschungsbereich at the University of Saarbrücken over the past twelve
years and dictated by the wish to develop an efficient parser for random
NL texts on a sound theoretical basis. The idea has ripened in discussions
with colleagues from the EUROTRA project and is based on what Heinz-Dieter Maas has developed in the framework of the SUSY-II system.
In the present paper, CAP is introduced as a means of linguistic engineering (cf., Simmons 1985), which covers aspects like rule writing, parsing strategies, syntactic and semantic representation of meaning,
representation of lexical knowledge, etc.
Aspects of syntactic predictions made during the recognition of English
sentences are investigated. We reinforce Kuno's original predictive analyzer by introducing five types of predictions. For each type of prediction, we
discuss its necessity, its description method, and its recognition
mechanism. We make use of three kinds of stacks whose behavior is
specified by grammar rules in an extended version of Greibach normal
form. We also investigate other factors that affect the predictive recognition process, i.e., preferences among syntactic ambiguities and the necessary
amount of lookahead. These factors, as well as the proposed handling
mechanisms of predictions, are tested by analyzing two kinds of articles. In
our experiment, more than seventy percent of sentences are recognized
and looking two words ahead seems to be the critical length for the predictive recognition.
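For readers unfamiliar with predictive analysis, the following minimal Python sketch shows only the core stack discipline with rules in Greibach normal form; the grammar is a toy, and the paper's five prediction types, multiple stacks, and lookahead are not modelled here.

    # Toy predictive analysis: rules in Greibach normal form, A -> a B1 ... Bk,
    # indexed here by (category, word); a real analyzer is non-deterministic.
    GRAMMAR = {
        ("S", "the"):    ["N", "VP"],   # S  -> the N VP
        ("N", "dog"):    [],            # N  -> dog
        ("VP", "barks"): [],            # VP -> barks
        ("VP", "sees"):  ["NP"],        # VP -> sees NP
        ("NP", "the"):   ["N"],         # NP -> the N
    }

    def recognize(words):
        stack = ["S"]                   # current predictions, topmost first
        for w in words:
            if not stack:
                return False
            top = stack.pop(0)
            expansion = GRAMMAR.get((top, w))
            if expansion is None:
                return False            # prediction not confirmed by the word
            stack = expansion + stack   # new predictions replace the old one
        return not stack                # success iff every prediction was met

    print(recognize("the dog sees the dog".split()))   # True
    print(recognize("the dog the".split()))            # False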
Current (computational) linguistic theories have developed specific formalisms for representing linguistic phenomena such as unbounded dependencies, relative clauses, etc. In this contribution we present a model of the
storing and accessing of linguistic structures, which accounts for the same phenomena
in a procedural way. Such a model has been implemented in the framework of
an ATN parser.
Department of Linguistics
University of Pisa
COLING'86, pp. 473-475
Distributed Memory: A Basis for Chart
Parsing
Jon M. Slack
Human Cognition Research Laboratory
Open University
Milton Keynes, MK7 6AA England
COLING'86, pp. 476-481
The properties of distributed representations and memory systems are
explored as a potential basis for non-deterministic parsing mechanisms.
The structure of a distributed chart parsing representation is outlined.
Such a representation encodes both immediate-dominance and terminal
projection information on a single composite memory vector. A parsing
architecture is described which uses a permanent store of context-free rule
patterns encoded as split composite vectors, and
two interacting working memory units. These latter two units encode
vectors which correspond to the active and inactive edges of an active
chart parsing scheme. This type of virtual parsing mechanism is compatible with both a macro-level implementation based on standard sequential
processing and a micro-level implementation using a massively parallel
architecture.
The research to be discussed here differs from previous work in that it
explores the properties of distributed representations as a basis for
constructing parallel parsing architectures. Rather than being represented
by localized networks of processing units, the grammar rules are encoded
as patterns which have their effect through simple, yet well-specified,
forms of interaction. The aim of the research is to devise a virtual machine
for parsing context-free languages based on the mutual interaction of relatively simple memory components.
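The paper's encoding is not reproduced here; the sketch below only illustrates, in Python, the generic distributed-memory operations such work builds on (random high-dimensional symbol vectors, binding by circular convolution, superposition on one composite vector), with all names our own.

    import numpy as np

    N = 2048
    rng = np.random.default_rng(0)

    def symbol():                       # a random high-dimensional code vector
        return rng.normal(0, 1 / np.sqrt(N), N)

    def bind(a, b):                     # binding by circular convolution
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    def unbind(c, a):                   # correlate with the approximate inverse
        a_inv = np.roll(a[::-1], 1)
        return bind(c, a_inv)

    NP_, VP, S = symbol(), symbol(), symbol()
    left, right = symbol(), symbol()

    # One composite vector carries a rule S -> NP VP as superposed bindings.
    rule = bind(left, NP_) + bind(right, VP)

    # Retrieval is approximate: unbinding yields a noisy version of VP, which
    # is identified by comparison against known symbols (clean-up memory).
    noisy = unbind(rule, right)
    for name, v in [("NP", NP_), ("VP", VP), ("S", S)]:
        print(name, round(float(noisy @ v), 2))   # VP scores highest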
The Treatment of Movement Rules in an
LFG Parser
Hans-Ulrich Block, Hans Haugeneder
Siemens AG, München
ZT ZT1 INF, West Germany
COLING'86, pp. 482-486
A Concept of Derivation for LFG
Jürgen Wedekind
Department of Linguistics
University of Stuttgart
West Germany
COLING'86, pp. 487-489
Incremental Construction of C- and
F-Structure in an LFG Parser
Hans-Ulrich Block, Rudolf Hunze
ZTI INF 3
Siemens AG
München, West Germany
COLING'86, pp. 490-493
Getting Things Out of Order
Klaus Netter
Department of Linguistics
University of Stuttgart
West Germany
COLING'86, pp. 494-496
In this paper we propose a way of treating long-distance movement
phenomena as exemplified in (1) in the framework of an LFG-based
parser.
(1) Who do you think Peter tried to meet
'You think Peter tried to meet who'
We therefore concentrate first on the theoretical status of so-called wh- or long-distance movement in Lexical Functional Grammar (LFG) and in
the Theory of Government and Binding (GB), arguing that a general mechanism that is compatible with both the LFG and GB treatments of long-distance
movement can be found. Finally, we present the implementation of such a
movement mechanism in an LFG parser.
In this paper a version of LFG will be developed which has only one level
of representation and is equivalent to the modified version of Kaplan,
presented in Bresnan (1982) and Kaplan and Zaenen (1986). The structures of this monostratal version are f-structures, augmented by additional
information about the derived symbols and their linear order. For these
structures it is possible to define an adequate concept of direct derivability
by which the derivation process becomes more efficient, as the f-description solution algorithm is directly simulated during the derivation of these
structures, instead of being postponed. Apart from this, it follows from
this reducibility that LFG as a theory in its present form does not make use
of the c-structure information that goes beyond the mere linear order of
the derived symbols.
In this paper we present a parser for Lexical Functional Grammar (LFG)
which is characterized by incrementally constructing the c- and f-structure
of a sentence during the parse. We then discuss the possibilities of the
earliest check on consistency, coherence, and completeness. Incremental
construction of f-structure leads to an early detection and abortion of
incorrect paths and so increases parsing efficiency. Furthermore those
semantic interpretation processes that operate on partial structures can be
triggered at an earlier stage. This also leads to a considerable improvement
in parsing time. LFG seems to be well suited for such an approach because
it provides for locality principles by the definition of coherence and
completeness.
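A minimal sketch of the early-abort idea, under invented names and a toy feature representation (nothing below is the paper's implementation): partial f-structures are unified as constituents are attached, and a parse path dies at the first consistency clash, before any complete c-structure is built.

    def unify(f1, f2):
        """Unify two feature dicts; return None on a consistency clash."""
        out = dict(f1)
        for attr, val in f2.items():
            if attr not in out:
                out[attr] = val
            elif isinstance(out[attr], dict) and isinstance(val, dict):
                sub = unify(out[attr], val)
                if sub is None:
                    return None
                out[attr] = sub
            elif out[attr] != val:
                return None              # clash: abandon this parse path early
        return out

    # Incremental parse of "the boys sleeps": the number clash between the
    # subject and the verb is detected as soon as the verb is attached.
    partial = {}
    for contribution in [
            {"SUBJ": {"PRED": "boy", "NUM": "pl"}},    # from the parsed NP
            {"PRED": "sleep", "SUBJ": {"NUM": "sg"}},  # from the verb's entry
    ]:
        partial = unify(partial, contribution)
        if partial is None:
            print("path aborted early")  # no need to finish the c-structure
            break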
One of the most characteristic features of German word order seems to be
a contrast between fixed ordering rules concerning the order of verbal
elements and a much more variable ordering of their corresponding nominal arguments. As a consequence, German word order seems to yield a
large number of phenomena that may be classified as "unbounded" or
"long-distance dependencies", without necessarily involving wh-constituents or "movement" across sentence boundaries. Whereas in traditional
LFG long-distance dependencies are treated by means of constituent
control, we will follow a recent proposal by Kaplan and Zaenen (1986) to
give up the constraint known as "functional locality" and instead allow
regular expressions to appear as functional schemata annotated to c-structure rules. Exploiting the principles of completeness and coherence we will
thus be able to cope even with absolutely free word order without the need
of generating empty terminal nodes at all. The empirical assumption,
underlying the proposed analysis in its most radical form, is the hypothesis
that (with very few exceptions) the nominal arguments have to appear on
the left of the verb by which they are assigned case. We will restrict the
discussion to sentences with one finite verb as well as to subcategorized
nominal arguments, largely ignoring ADJuncts.
TOPIC Essentials
Udo Hahn, Ulrich Reimer
Universität Konstanz
Informationswissenschaft
Postfach 5560
D-7750 Konstanz, F.R.G.
COLING'86, pp. 497-503
Towards Discourse-Oriented Nonmonotonic System
Barbara Dunin-Keplicz, Witold Lukaszewicz
Institute of Informatics
Warsaw University
P.O. Box 1210
00-901 Warszawa, Poland
An overview of TOPIC is provided, a knowledge-based text information
system for the analysis of German-language texts. TOPIC supplies text
condensates (summaries) at variable degrees of generality and makes
available facts acquired from the texts. The presentation focuses on the
major methodological principles underlying the design of TOPIC: a frame
representation model that incorporates various integrity constraints, text
parsing with focus on text cohesion and text coherence properties of expository texts, a lexically distributed semantic text grammar in the format
of word experts, a model of partial text parsing, and text graphs as appropriate representation structures for text condensates.
The purpose of this paper is to analyse the phenomenon of nonmonotonicity in natural language and to formulate a number of general principles
which should be taken into consideration when constructing a discourse-oriented nonmonotonic formalism.
COLING'86, pp. 504-506
Japanese Honorifics and Situation
Semantics
R. Sugimura
Institute for New Generation Computer
Technology (ICOT) Japan
COLING'86, pp. 507-510
Two Approaches to Commonsense Inferencing for Discourse Analysis
Marc Dymetman
Université Scientifique et Médicale de
Grenoble
Groupe d'Etudes pour la Traduction
Automatique B.P. 68
38042 Saint Martin d'Hères, France
COLING'86, pp. 511-514
Speech Acts of Assertion in Cooperative
Informational Dialogue
I.S. Kononenko
AI Laboratory, Computer Center
Siberian Division of the USSR Ac. Sci
Novosibirsk 630090, USSR
COLING'86, pp. 515-519
Pragmatic Considerations in Man-Machine
Discourse
A model of Japanese honorific expressions in situation semantics is
proposed. Situation semantics provides considerable power for analyzing
the complicated structure of Japanese honorific expressions. The main
feature of this model is a set of basic rules for context switching in honorific sentences. Mizutani's theory of Japanese honorifics is presented and
incorporated in the model which has been used to develop an experimental
system capable of analyzing honorific context. Some features of this
system are described.
The dominant philosophy regarding the formalization of Commonsense
Inferencing in the physical domain consists in the exploitation of the
"tarskian" scheme axiomatization < - > interpretation borrowed from
mathematical logic. The commonsense postulates constitute the axiomatization, and the real world provides the " m o d e l " for this axiomatization.
The observation of the effective activity of linguistic communication and
of the commonsense inferencing processes which are involved in it show
the unacceptability of this scheme.
An alternative is proposed, where the notion of "conceptual category"
plays a principal role, and where the principle of logical adequation of an
axiomatization to a model is replaced by a notion of "projection" of a
conceptual structure onto the observed reality.
Dialogue systems should provide a cooperative informational dialogue
aimed at knowledge sharing. In the paper speech acts of assertion (SAA)
are assumed to be the means of achieving this goal. A typology of SAAs is
proposed which reflects certain cognitive aspects of the communicative situation at different stages of the mutual informing process. Information constituents of the assertion types are formally described to represent a current
cognitive state of the speaker's knowledge base, each proposition in it
being characterized by a subjective verisimilitude evaluation. The general
scheme of information flow in the cooperative dialogue is considered.
With regard to this scheme the dialogue functions of SAAs are discussed.
This paper presents nothing that has not been noted previously by research
in Artificial Intelligence but seeks to gather together various ideas that
Walther v. Hahn
Research Unit for Information Science and
Artificial Intelligence
University of Hamburg
D-2000 Hamburg 13, West Germany
have arisen in the literature. It collects those arguments which are in my
view crucial for further progress and is intended only as a reminder of
insights which might have been forgotten for some time.
COLING'86, pp. 520-527
Formal Specification of Natural Language
Syntax using Two-Level Grammar
Barrett R. Bryant, Dale Johnson,
Galanjaninath Edupuganty
Department of Computer and Information
Science
The University of Alabama
Birmingham, Alabama 35294
COLING '86, pp. 527-533
On Formalizations of Marcus's Parser
R. Nozohoor-Farshi
The two-level grammar is investigated as a notation for giving formal specification of the context-free and context-sensitive aspects of natural
language syntax. In this paper, a large class of English declarative
sentences, including post-noun-modification by relative clauses, is formalized using a two-level grammar. The principal advantages of two-level
grammar are: 1) it is very easy to understand and may be used to give a
formal description using a structured form of natural language; 2) it is
formal with many well-known mathematical properties; and 3) it is directly
implementable by interpretation. The significance of the latter fact is that
once we have written a two-level grammar for natural language syntax, we
can derive a parser automatically without writing any additional specialized
computer programs. Because of the ease with which two-level grammars
may express logic and their Turing computability, we expect that they will
also be very suitable for future extensions to semantics and knowledge
representation.
LR(k,t), BCP(m,n), and LRRL(k) grammars, and their relations to Marcus
parsing are discussed.
Department of Computing Science
University of Alberta
Edmonton, Canada T6G 2H1
COLING'86, pp. 533-535
A Grammar Used for Parsing and Generation
Jean-Marie Lancel, Nathalie Simonin
CAP Sogeti Innovation
129, rue de l'Université
75007 Paris, France
François Rousselot
University of Strasbourg II
22, rue Descartes
67084, Strasbourg, France
This text presents the outline of a system using the same grammar for parsing and generating sentences in a given language. This system has been
devised for a "multilingual document generation" project.
The Functional Grammar notation described here allows a full symmetry between parsing and generating. Such a grammar may be read easily
from the point of view of parsing and from the point of view of
generation. This allows one to write only one grammar of a language,
which minimizes the linguistic costs in a multilingual scheme.
COLING'86, pp. 536-539
BUILDRS: An Implementation of DR
Theory and LFG
Hajime Wada
Department of Linguistics
Nicholas Asher
Department of Philosophy, Center for
Cognitive Science
The University of Texas at Austin
COLING'86, pp. 540-545
A Prolog Implementation of Government-Binding Theory
Robert J. Kuhns
Artificial Intelligence Center
This paper examines a particular Prolog implementation of Discourse
Representation theory (DR theory) constructed at the University of Texas.
The implementation also contains a Lexical Functional Grammar parser
that provides f-structures: these f-structures are then translated into the
semantic representations posited by DR theory, structures which are
known as Discourse Representation Structures (DRSs). Our program
handles some linguistically interesting phenomena in English such as (i)
scope ambiguities of singular quantifiers, (ii) functional control phenomena, and (iii) long-distance dependencies. Finally, we have implemented an
algorithm for anaphora resolution. Our goal is to use purely linguistically
available information in constructing a semantic representation of
discourse as far as is feasible and to forego appeals to world knowledge.
A parser founded on Chomsky's Government-Binding Theory and implemented in Prolog is described. By focussing on systems of constraints as
proposed by this theory, the system is capable of parsing without an elaborate rule set or subcategorization features on lexical items. In addition to
Arthur D. Little, Inc.
Cambridge, MA 02140
the parse, theta, binding, and control relations are determined simultaneously.
COLING'86, pp. 546-550
A Lexical Functional Grammar System in
Prolog
Andreas Eisele, Jochen Dörre
Department of Linguistics
University of Stuttgart
West Germany
COLING'86, pp. 551-553
Knowledge Structures for Natural
Language Generation
Paul S. Jacobs
Knowledge-Based Systems Branch
General Electric Corporate Research and
Development
Schenectady, NY 12301
COLING'86, pp. 554-559
Semantic-Based Generation of Japanese
German Translation System
K. Hanakata
Institut f. Informatik
University of Stuttgart
Herdweg 51
D-7000 Stuttgart 1, F.R. Germany
This paper describes a system in Prolog for the automatic transformation
of a grammar, written in LFG formalism, into a DCG-based parser. It
demonstrates the main principles of the transformation, the representation
of f-structures and constraints, the treatment of long-distance dependencies, and left recursion.
Finally some problem areas of the system and possibilities for overcoming them are discussed.
The development of natural language interfaces to Artificial Intelligence
systems is dependent on the representation of knowledge. A major impediment to building such systems has been the difficulty in adding sufficient
linguistic and conceptual knowledge to extend and adapt their capabilities.
This difficulty has been apparent in systems which perform the task of
language production, i.e., the generation of natural language output to
satisfy the communicative requirements of a system.
The Ace framework applies knowledge representation fundamentals to
the task of encoding knowledge about language. Within this framework,
linguistic and conceptual knowledge are organized into hierarchies, and
structured associations are used to join knowledge structures that are metaphorically or referentially related. These structured associations permit
specialized linguistic knowledge to derive partially from more abstract
knowledge, facilitating the use of abstractions in generating specialized
phrases. This organization, used by a generator called KING (Knowledge
INtensive Generator), promotes the extensibility and adaptability of the
generation system.
Project SEMSYN has reached a state where a prototype system generates
German texts on the basis of the semantic representation produced from
Japanese texts by ATLAS/II of Fujitsu Laboratory. This paper describes
some problems that are specific to our semantic-based approach and some
results of the evaluation study that has been made by the Germanist group.
A. Lesniewski
Standard Elektrik Lorenz AG
Ostendstrasse 3
D-7530 Pforzheim, F.R. Germany
S. Yokoyama
Electrotechnical Laboratory
Umezono, Sakuramura, Nihari
Ibaraki 305, Japan
COLING'86, pp. 560-562
Synthesizing Weather Forecasts from
Formatted Data
R. Kittredge, A. Polguère
Département de Linguistique
Université de Montréal
E. Goldberg
Atmospheric Environment Service
Environment Canada, Toronto
This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears
feasible in certain natural sublanguages with stereotyped text structure.
RAREAS draws on several kinds of linguistic and non-linguistic knowledge
and mirrors a forecaster's apparent tendency to ascribe less precise temporal adverbs to more remote meteorological events. The approach can easily be adapted to synthesize bilingual or multi-lingual texts.
COLING'86, pp. 563-565
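As an invented miniature of one reported tendency (this is not RAREAS itself, and the thresholds and wording are ours): temporal adverbs become less precise as the described event becomes more remote.

    def temporal_adverb(hours_ahead):
        """Less precise wording for more remote events (invented thresholds)."""
        if hours_ahead <= 6:
            return "this afternoon"
        if hours_ahead <= 24:
            return "tonight"
        return "later in the period"    # remote: deliberately vague

    def verbalize(event, hours_ahead):
        return "%s %s." % (event, temporal_adverb(hours_ahead))

    for event, h in [("Winds increasing to 25 knots", 4),
                     ("Rain beginning", 18),
                     ("Fog patches dissipating", 36)]:
        print(verbalize(event, h))
    # Winds increasing to 25 knots this afternoon.
    # Rain beginning tonight.
    # Fog patches dissipating later in the period.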
From Structure to Process
Michael Zock, Gérard Sabah
This paper describes an implemented tutoring system designed to help
students to generate clitic-constructions in French. While showing various
LIMSI - Langues Naturelles
B.P. 30 - Orsay Cédex / France
Christophe Alviset
INSEE - 3, av. P. Larousse
94241 Malakoff - France
COLING'86, pp. 566-569
Generating a Coherent Text Describing a
Traffic Scene
Hans-Joachim Novak
Fachbereich Informatik
Universität Hamburg
D-2000 Hamburg 13, West Germany
COLING '86, pp. 570-575
Generating Natural Language Text in a
Dialog System
Mare Koit
Department of Programming
2 Juhan Liivi Street
Madis Saluveer
Artificial Intelligence Laboratory
78 Tiigi Street
Tartu State University
202400 Tartu, Estonia, USSR
ways of converting a given meaning structure into its corresponding
surface expression, the system helps not only to discover what data to process but also how this information processing should take place. In other
words, we are concerned with efficiency in verbal planning (performance).
Recognizing that the same result can be obtained by various methods,
the student should find out which one is best suited to the
circumstances (what is known, task demands, etc.). Informational states,
hence the processor's needs, may vary to a great extent, as may his strategies or cognitive styles. In consequence, in order to become an efficient
processor, the student has to acquire not only STRUCTURAL or RULE
KNOWLEDGE but also PROCEDURAL KNOWLEDGE (skill).
With this in mind we have designed three modules in order to foster a
reflective, experimental attitude in the learner, helping him to discover
insightfully the most efficient strategy.
If a system that embodies a reference semantics for motion verbs and prepositions is to generate a coherent text describing the recognized motions, it
needs a decision procedure to select the events. In NAOS event selection is
done by use of a specialization hierarchy of motion verbs. The strategy of
anticipated visualization is used for the selection of optional deep cases.
The system exhibits low-level strategies which are based on verb inherent
properties that allow the generation of a coherent descriptive text.
The paper deals with generation of natural language text in a dialog
system. The approach is based on principles underlying the dialog system
TARLUS under development at Tartu State University. The main problems
concerned are the architecture of a dialog system and its knowledge base.
Much attention is devoted to problems which arise in answering the user
queries - the problems of planning an answer, the non-linguistic and
linguistic phases of generating an answer.
COLING '86, pp. 576-580
Generating English Paraphrases from
Formal Relational Calculus Expressions
A.N. De Roeck, B.G.T. Lowden
University of Essex
Wivenhoe Park
Colchester, United Kingdom
COLING'86, pp. 581-583
The Computational Complexity of
Sentence Derivation in Functional
Unification Grammar
Graeme Ritchie
Department of Artificial Intelligence
University of Edinburgh
Edinburgh EH1 1HN
This paper discusses a system for producing English descriptions (or
"paraphrases") of the content of formal relational calculus formulae
expressing a database query. It explains the underlying design motivations
and describes a conceptual model and focus selection mechanism necessary
for delivering coherent paraphrases. The general paraphrasing strategy is
discussed, as are the notions of "desirable" paraphrase and "paraphrasable
query". Two examples are included. The system was developed and
implemented in Prolog at the University of Essex under a grant from ICL.
Functional unification (FU) grammar is a general linguistic formalism based
on the merging of feature-sets. An informal outline is given of how the
definition of derivation within FU grammar can be used to represent the
satisfiability of an arbitrary logical formula in conjunctive normal form.
This suggests that the generation of a structure from an arbitrary FU grammar is NP-hard, which is an undesirably high level of computational
complexity.
COLING'86, pp. 584-586
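The abstract does not spell out the construction; the following is a hedged reconstruction of the usual shape of such a reduction (the paper's own encoding may differ in detail). Read conjunction as unification of feature sets and disjunction as FU alternation:

    \[
    \phi = \bigwedge_{i=1}^{m} C_i, \qquad
    C_i = \ell_{i1} \lor \cdots \lor \ell_{ik_i}
    \]
    \[
    G_\phi = \bigwedge_{i=1}^{m} \bigl( d(\ell_{i1}) \lor \cdots \lor d(\ell_{ik_i}) \bigr),
    \qquad d(x_j) = \{x_j = \mathrm{t}\}, \quad d(\lnot x_j) = \{x_j = \mathrm{f}\}.
    \]

Deriving a structure from $G_\phi$ forces one alternative per clause to be chosen, and all chosen feature sets must unify; unification fails exactly when some $x_j$ receives both $\mathrm{t}$ and $\mathrm{f}$, so a structure is derivable iff $\phi$ is satisfiable, which yields the NP-hardness claim.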
Parsing Spoken Language: a Semantic
Caseframe Approach
Philip J. Hayes, Alexander G. Hauptmann,
Jaime G. Carbonell, Masaru Tomita
Computer Science Department
Parsing spoken input introduces serious problems not present in parsing
typed natural language. In particular, indeterminacies and inaccuracies of
acoustic recognition must be handled in an integral manner. Many techniques for parsing typed natural language do not adapt well to these extra
demands. This paper describes an extension of semantic caseframe parsing
Carnegie-Mellon University
Pittsburgh, PA 15213
COLING'86, pp. 587-592
Divided and Valency-Oriented Parsing in
Speech Understanding
Gerh. Th. Niedermair
ZT ZTI INF, Siemens AG
Otto-Hahn-Ring 6
8000 München 83
COLING'86, pp. 593-595
The Role of Semantic Processing in an
Automatic Speech Understanding System
Astrid Brietzmann, Ute Ehrlich
Lehrstuhl fuer Informatik 5
(Mustererkennung)
Universitaet Erlangen-Nuernberg
Martensstr. 3, 8520 Erlangen, F.R. Germany
COLING'86, pp. 596-598
Synthesis of Spoken Message from
Semantic Representations
Laurence Danlos, Eric Laporte
Laboratoire d'Automatique Documentaire et
Linguistique
Université de Paris 7
2, place Jussieu
75251 Paris Cedex 05
to restricted-domain spoken input. The semantic caseframe grammar
representation is the same as that used for earlier work on robust parsing
of typed input. Due to the uncertainty inherent in speech recognition, the
caseframe grammar is applied in a quite different way, emphasizing island
growing from caseframe headers. This radical change in application is
possible due to the high degree of abstraction in the caseframe representation. The approach presented was tested successfully in a preliminary
implementation.
A parsing scheme for spoken utterances is proposed that deviates from
traditional " o n e go" left to right sentence parsing in that it divides the
parsing process first into two separate parallel processes. Verbal constituents and nominal phrases (including prepositional phrases) are treated
separately and only brought together in an utterance parser. This allows
especially the utterance parser to draw on valency information right from
the beginning when amalgamating the nominal constituents to the verbal core
by means of binary sentence rules. The paper also discusses the problem of
representing the valency information in case-frames arising in a spoken
language environment.
We present the semantics component of a speech understanding and
dialogue system that was developed at our institute. Due to pronunciation
variabilities and vagueness of the word recognition process, semantics in a
speech understanding system has to resolve additional problems. Its main
task is not only to build up a representation structure for the meaning of an
utterance, as in a system for written input; semantic knowledge is also
employed to decide between alternative word hypotheses, to judge the
plausibility of syntactic structures, and to guide the word recognition process by expectations resulting from partial analyses.
A semantic-representation-to-speech system communicates orally the
information given in a semantic representation. Such a system must integrate a text generation module, a phonetic conversion module, a prosodic
module, and a speech synthesizer. We will see how the syntactic information elaborated by the text generation module is used for both phonetic
conversion and prosody, so as to produce the data that must be supplied to
the speech synthesizer, namely a phonetic chain including prosodic information.
Françoise Emerard
Centre National d'Etudes des
Télécommunications
22301 Lannion Cedex
COLING'86, pp. 599-604
The Procedure to Construct a Word
Predictor in a Speech Understanding
System from a Task-Specific Grammar
Defined in a CFG or a DCG
Yasuhisa Niimi, Shigeru Uzuhara,
Yutaka Kobayashi
This paper describes a method for converting a task-dependent grammar
into a word predictor of a speech understanding system. Since the word
prediction is a top-down operation, left-recursive rules induce infinite
looping. We have solved this problem by applying an algorithm for
bottom-up parsing.
Department of Computer Science
Kyoto Institute of Technology
Matsugasaki, Sakyo-ku, Kyoto 606, Japan
COLING'86, pp. 605-607
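A sketch of the looping problem and of the general bottom-up (left-corner style) remedy, with a toy grammar in Python; this illustrates the idea, not necessarily the paper's algorithm.

    # Toy grammar with a left-recursive rule: naive top-down prediction of the
    # first word of an NP would expand NP -> NP PP forever. Computing the
    # left-corner relation bottom-up, as a fixpoint, terminates regardless.
    GRAMMAR = {
        "NP": [["NP", "PP"], ["det", "noun"]],   # left-recursive first rule
        "PP": [["prep", "NP"]],
    }

    def predictable_first_categories(cat):
        """All left corners of cat, computed bottom-up; terminals only."""
        corners, changed = {cat}, True
        while changed:
            changed = False
            for lhs, rules in GRAMMAR.items():
                if lhs in corners:
                    for rhs in rules:
                        if rhs[0] not in corners:
                            corners.add(rhs[0])
                            changed = True
        return {c for c in corners if c not in GRAMMAR}

    print(predictable_first_categories("NP"))    # {'det'}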
The Role of Phonology in Speech
Processing
Richard Wiese
Seminar für Allgemeine Sprachwissenschaft
In this paper, I discuss the role of phonology in the modelling of speech
processing. It will be argued that recent models of nonlinear representation in phonology should be put to use in speech processing systems
(SPS). Models of phonology aim at the reconstruction of the phonological
Universität Düsseldorf
D-4000 Düsseldorf, FRG
COLING '86, pp. 608-611
Computational Phonology: Merged, not
Mixed
Egon Berendsen
Department of Phonetics
University of Utrecht
The Netherlands
Simone Langeweg
Phonetics Laboratory
University of Leyden
The Netherlands
knowledge that speakers possess and utilize in speech processing. The most
important function of phonology in SPS is, therefore, to put constraints on
what can be expected in the speech stream. A second, more specific function relates to the particular emphasis of the phonological models
mentioned above and outlined in section 4: It has been realized that many
SPSs do not make sufficient use of the suprasegmental aspects of the
speech signal. But it is precisely in the domain of prosody that nonlinear
phonology has made important progress in our insight into the phonological component of language.
From the phonetic point of view, phonological knowledge is higher-level
knowledge, just like syntactic or semantic information. But since phonological knowledge is in an obvious way closer to the phonetic domain than
syntax or semantics, it is all the more surprising that phonological knowledge
has rarely been applied systematically in SPSs.
Research into text-to-speech systems has become a rather important topic
in the areas of linguistics and phonetics. Particularly for English, several
text-to-speech systems have been established (cf., for example, Hertz
1982, Klatt 1976). For Dutch, text-to-speech systems are being developed
at the University of Nijmegen (cf. Wester 1984) and at the Universities of
Utrecht and Leyden and at the Institute of Perception Research Eindhoven
as well. In this paper we will be concerned with the grapheme-to-phoneme
conversion component as part of the Dutch text-to-speech system which is
being developed in Utrecht, Leyden, and Eindhoven.
Hugo van Leeuwen
Institute of Perception Research
Eindhoven, The Netherlands
COLING'86, pp. 612-614
Phonological Pivot Parsing
Grzegorz Dogil
Universität Bielefeld
Fakultät für Linguistik und
Literaturwissenschaft
D-4800 Bielefeld, West Germany
COLING'86, pp. 615-617
A Description of the VESPRA Speech
Processing System
Rolf Haberbeck
FU Berlin, FB Germanistik
D-1000 Berlin 33
TU Berlin, FB Informatik
There are two basic mysteries about natural language: the speed and ease
with which it is acquired by a child, and the speed and ease with which it is
processed. Similarly to language acquisition, language processing faces a
strong input-data-deficiency problem. When we speak we alter a great
deal in the idealized phonological and phonetic representations. We delete
whole phonemes, we radically change allophones, we shift stresses, we
break up intonational patterns, we insert pauses at the most unexpected
places, etc. If to this crippled "phonological string" we add all the noise
from the surroundings which does not help comprehension either, it is
bewildering that the parser is supposed to recognize anything at all.
However, even in the most difficult circumstances (foreign accent, loud
environment, being drunk, etc.), we do comprehend speech quickly and
efficiently. There must be then some signals in the phonetic string which
are particularly easy to grasp and to process. I call these signals pivots and
call the parsers working with these signals pivot parsers.
I present here an idea of what a fast parser which requires the minimum
of phonologically invariant information might look like. This parser works
in a sequentially-looping manner and the decisions it makes are non-deterministic. It is universally applicable, it is faster, and it seems to be no less
efficient than other phonological parsers that have been proposed.
The VESPRA system is designed for the processing of chains of (not
connected utterances of) wordforms. These strings of wordforms correspond to sentences except that they are not realized in connected speech.
VESPRA means: Verarbeitung und Erkennung gesprochener Sprache
(processing and recognition of speech). VESPRA will be used to control
different types of machines by voice input (for instance: non critical
D-1000 Berlin 10
COLING'86, pp. 618-620
Translation by Understanding: A Machine
Translation System LUTE
Hirosato Nomura, Shozo Naito,
Yasuhiro Katagiri, Akira Shimazu
NTT Basic Research Laboratories
Musashino-shi
Tokyo, 180, Japan
COLING'86, pp. 621-626
On Knowledge-Based Machine Translation
Sergei Nirenburg
Colgate University
Victor Raskin
Purdue University
Allen Tucker
Colgate University
COLING'86, pp. 627-632
Another Stride towards Knowledge-Based
Machine Translation
Masaru Tomita, Jaime Carbonell
Computer Science Department
Carnegie-Mellon University
Pittsburgh, PA 15213
COLING'86, pp. 633-638
English - Malay Translation System:
a Laboratory Prototype
Loon-Cheong Tong
Computer Aided Translation Project
School of Mathematical and Computer
control functions in cars and in trucks, voice box in digital telephone
systems, text processing systems, different types of office workstations).
This paper presents a linguistic model for language understanding and
describes its application to an experimental machine translation system
called LUTE. The language understanding model is an interactive model
between the memory structure and a text. The memory structure is hierarchical and represented in a frame-network. Linguistic and non-linguistic
knowledge is stored and the result of understanding the text is assimilated
into the memory structure. The understanding process is interactive in that
the text invokes knowledge and the understanding procedure interprets the
text by using that knowledge. A linguistic model, called the Extended Case
Structure model, is defined by adopting three kinds of information: structure, relation, and concept. These three are used recursively and iteratively
as the basis for memory organization. These principles are applied to the
design and implementation of LUTE, which translates Japanese into
English and vice versa.
This paper describes the design of the knowledge representation medium
used for representing concepts and assertions, respectively, in a subworld
chosen for a knowledge-based machine translation system. This design is
used in the TRANSPORTATION machine translation project. The knowledge representation language, or interlingua, has two components, DIL
and TIL. DIL stands for 'dictionary of interlingua' and describes the semantics of a subworld. TIL stands for 'text of interlingua' and is responsible
for producing an interlingua text, which represents the meaning of an input
text in the terms of the interlingua. We maintain that involved analysis of
various types of linguistic and encyclopedic meaning is necessary for the
task of automatic translation. The mechanisms for extracting, manipulating, and reproducing the meaning of texts will be reported in detail elsewhere. The linguistic (including the syntactic) knowledge about source
and target languages is used by the mechanisms that translate texts into
and from the interlingua. Since interlingua is an artificial language, we can
(and do, through TIL) control the syntax and semantics of the allowed
interlingua elements. The interlingua suggested for TRANSPORTATION
has a broader coverage than other knowledge representation schemata for
natural language. It involves the knowledge about discourse, speech acts,
focus, time, space, and other facets of the overall meaning of texts.
Building on the well-established premise that reliable machine translation
requires a significant degree of text comprehension, this paper presents a
recent advance in multi-lingual knowledge-based machine translation
(KBMT). Unlike previous approaches, the current method provides for
separate syntactic and semantic knowledge sources that are integrated
dynamically for parsing and generation. Such a separation enables the
system to have syntactic grammars, language specific but domain general,
and semantic knowledge bases, domain specific but language general.
Subsequently, grammars and domain knowledge are precompiled automatically in any desired combination to produce very efficient and very thorough real-time parsers. A pilot implementation of our KBMT architecture
using functional grammars and entity-oriented semantics demonstrates the
feasibility of the new approach.
This paper presents the results obtained by an English to Malay computer
translation system at the level of a laboratory prototype. The translation
output obtained for a selected text (secondary school chemistry textbook)
is evaluated using a grading scheme based on ease of post-editing. The
effect of a change in area and typology of text is investigated by comparing
Science
Universiti Sains Malaysia
11800 Penang, Malaysia
COLING'86, pp. 639-642
A Prototype Machine Translation based on
Extracts from Data Processing
E. Luctkens, Ph. Fermont
Department of Information Science and
Documentation
Free University of Brussels
Belgium
COLING '86, pp. 643-645
A Prototype English-Japanese Machine
Translation System for Translating IBM
Manuals
Taijiro Tsutsumi
Natural Language Processing
Science Institute, IBM Japan, Ltd.
5-19, Sanban-cho, Chiyoda-ku
Tokyo 102, Japan
with the translation output obtained for a university-level computer science
text. An analysis of the problems which give rise to incorrect translations
is presented. This paper also provides statistical information on the English
to Malay translation system and concludes with an outline of further work
being carried out on this system with the aim of attaining an industrial
prototype.
The following article presents a prototype for the machine translation of
English into French.... The prototype aims to provide a diagnostic
study that lays the foundations for further development rather than immediately producing an accurate but limited realization.
By way of experiment, the corpus for translation was based on selected
extracts from computer systems manuals. After studying the basic material, as well as assessing the various decision criteria, it was decided to
construct a prototype made up of three components: analysis, transfer, and
generation.
Although the prototype was designed with multilingual applications in
mind, it appeared preferable at this stage not to set up a system with interlingua since the elaboration of the interlingua alone would have taken up a
disproportionate amount of time, thus handicapping the development of
the prototype itself.
This paper describes a prototype English-Japanese machine-translation
(MT) system developed at the Science Institute of IBM Japan, Ltd. This
MT system currently aims at the translation of IBM computer manuals. It is
based on a transfer approach in which the transfer phase is divided into
two sub-phases: English transformation and English-Japanese conversion. An outline of the system and a detailed description of the English-Japanese transfer method are presented.
COLING'86, pp. 646-648
Construction of a Modular and Portable
Translation System
Fujio Nishida, Yoneharu Fujita,
Shinobu Takamatsu
Department of Electrical Engineering
Faculty of Engineering
University of Osaka Prefecture
Sakai, Osaka, Japan 591
COLING'86, pp. 649-651
When Mariko Talks to Siegfried
Dietmar Rösner
Projekt SEMSYN, Institut für Informatik
Universität Stuttgart, Herdweg 51
D-7000 Stuttgart 1, West Germany
This paper has two purposes. One of them is to show a method of
constructing an MT system on a library module basis with the aid of a
program construction system called L-MAPS. The MT system can be
written in any programming language designated by a user if an appropriate data base and the appropriate functions are implemented in advance.
For example, it can be written in a compiled language like C,
which is preferable for a workstation with a relatively slow
machine speed.
The other purpose is to give a brief introduction to a program generating system called Library-Module Aided Program Synthesizing system
(abbreviated to L-MAPS) running on a library module basis. L-MAPS
permits us to write program specifications in a restricted natural language
like Japanese and converts them to formal specifications. It refines the
formal specifications using the library modules and, optionally, generates at each
refinement a readable comment on the refined specification, written in the
above natural language. The conversion between formal expressions
and natural language expressions is performed efficiently on a case grammar basis.
In this paper we will report on our experiences from a two and a half
year project that designed and implemented a prototypical Japanese-to-German
translation system for titles of Japanese papers.
COLING'86, pp. 652-654
Future Directions of Machine Translation
Jun-ichi Tsujii
Department of Electrical Engineering
Kyoto University
Sakyo-ku, Kyoto 606, Japan
COLING'86, pp. 655-668
Discourse, Anaphora, and Parsing
Mark Johnson
Center for the Study of Language and Information and Department of Linguistics
Stanford University
Ewan Klein
Centre for Cognitive Science
Edinburgh University
In this paper, we will discuss several problems concerned with "understanding and translation", especially how we can integrate the two lines
of research, with their different histories and different techniques, into
unified frameworks, and the difficulties we might encounter in attempting such an integration. The discussion will reveal some of the reasons why MT researchers are so separated from research in the other
application fields of NLP. We will also list some of the key problems, both
linguistic and computational, which we encountered during the development of our MT systems, and whose resolutions we consider to be of
essential importance for future MT research and development.
Discourse Representation Theory, as formulated by Hans Kamp and
others, provides a model of inter- and intra-sentential anaphoric dependencies in natural language. In this paper, we present a reformulation of the
model which, unlike Kamp's, is specified declaratively. Moreover, it uses
the same rule formalism for building both syntactic and semantic structures. The model has been implemented in an extension of Prolog, and
runs on a VAX 11/750 computer.
COLING'86, pp. 669-675
Selected Dissertation Abstracts
Compiled by:
Susanne M. Humphrey, National Library of Medicine, Bethesda, MD 20209
Bob Krovetz, University of Massachusetts, Amherst, MA 01002
The following are citations selected by title and abstract as being related to computational linguistics or knowledge
representation, resulting from a computer search, using the BRS Information Technologies retrieval service, of the
Dissertation Abstracts International (DAI) data base produced by University Microfilms International.
Included are the title; author; university, degree, and, if available, number of pages; DAI subject category chosen by
the author of the dissertation; an abstract; and the UM order number and year-month of entry into the data base.
References are sorted first by DAI subject category and second by author. Citations denoted by an MAI reference do
not yet have abstracts in the data base and refer to abstracts in the published Masters Abstracts International.
Unless otherwise specified, paper or microform copies of dissertations may be ordered from
University Microfilms International
Dissertation Copies
Post Office Box 1764
Ann Arbor, MI 48106
telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042
for Canada: 1-800-268-6090.
Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate
source for copies is sometimes provided at the end of the abstract.
The dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright 1986 by University Microfilms International), and
may not be reproduced without their prior permission.
The Effect of Knowledge Representation
and Psychological Type on Human Understanding in the Human-Computer Interface
People need to understand the logic of an information system because they
must tell the system what to do, how to do it, and determine what the
system did. Because of the limits of human memory, the logic of the
system must be represented in a form that both people and computers can
use.
Wallace Irving Castle
The University of Texas at Arlington Ph.D.
1985, 192 pages
Business Administration, General
University Microfilms International
ADG86-07478
One hundred thirty graduate and undergraduate business students
participated in an experiment to evaluate the effect on human understanding of representation type and psychological type. A story represented in
English was compared to a story represented in predicate logic. Psychological type was measured with the Myers-Briggs Type Indicator. Human
understanding was measured with an inference recognition test. The
hypothesis was that sensing psychological types perform better with predicate logic than with English while intuitive psychological types perform
better with English than with predicate logic.
The results of the experiment supported the hypothesis that representation type and psychological type interact to affect human understanding;
however, both sensing and intuitive psychological types performed better
with predicate logic. The interaction occurred because the sensing psychological types performed as well with English as with predicate logic while
the intuitive psychological types performed well with predicate logic but
poorly with English.
English is not the best representation type for helping the intuitive
psychological type to understand the logic of an information system. The
results of this experiment show that it is not correct to assume that English
is always superior to any other representation regardless of the people
using the system.
In designing the human-computer interface, the alternative representations should be evaluated using an experimental design to determine their
effect on different psychological types of users. The results of an experiment may show that the representation that is easy for the computer may
also be the best for the people using the system.
Towards a Representation of Lisp
Semantics
Nizar Mohammed Awartani
Lehigh University Ph.D. 1986, 89 pages
Computer Science
University Microfilms International
ADG86-16151
Debugging Programs in a Distributed
System Environment
Peter Charles Bates
University of Massachusetts Ph.D. 1986,
239 pages
Computer Science
University Microfilms International
ADG86-12013
In this dissertation we chose to examine the semantics of a subset of RLISP
and give its formal specifications. In this subset we do not include loops or
recursive procedures.
We defined the language elements which are fundamental to the statements within the language such as symbolic expressions, variables, symbolic expressions with quotation marks and a few others. By this we gain
precision and completeness in RLISP specification at the fundamental level.
We described RLISP syntactic constructs in a consistent way on a single
logical level. The dissertation presents a system of formal rules that permit
the establishment of rigorous proofs using only the uninterpreted program
text. The method we used depended on repeated substitutions for occurrences of expressions in a given RLISP program.
To explore the subtlety of RLISP we included an informal description of
the rules and provided several examples illustrating them.
Debugging is an activity that attempts to locate the sources of errors in the
specification and coding of a software system and to suggest possible
repairs that might be made to correct the errors. Debugging complex
distributed programs is a frustrating and difficult task. This is due primarily to the predominance of a low-level, computation-unit view of systems.
This extant perspective is necessarily detail intensive and offers little aid in
dealing with the higher level operational characteristics of a system or the
complexities inherent in distributed systems.
In this dissertation we develop a high-level debugging approach in which
debugging is viewed as a process of creating models of actual behavior
from the activity of the system and comparing these to models of expected
system behavior. The differences between the actual and expected models
can be used to characterize errorful behavior. The basis for the approach
is viewing the activity of a system as consisting of a stream of significant,
distinguishable events that may be abstracted into high-level models of
system behavior. An example is presented to demonstrate the use of event
based model building to investigate an error in a distributed program.
Behavior abstraction and system understanding are characterized as
problems in pattern recognition that must operate in a noisy, uncertain
environment. Pattern recognition in support of behavioral abstraction is
thus shown to be more than a simple parsing exercise. A formal model is
developed for event based behavioral abstraction which provides a basis
for rigorous discussions of debugging as behavior modelling and forms a
guide for implementing tools to support debugging in terms of events and
higher level abstractions of system behavior.
A prototype distributed behavior recognition system which has been
constructed to demonstrate and evaluate the feasibility of the EBBA
approach is described. The prototype toolset identifies a range of debugging tools useful for distributed systems. Remote debugging, filtered
remote debugging with preset actions, simple cooperative debugging, and
distributed debugging progressively increase the power of debugging
agents at individual nodes by reducing communication requirements,
increasing overall transparency of the debugging tools, and distributing
debugging tool functionality throughout the system.
Semantic Query Optimization in Deductive
Data Bases. (Volumes I and II)
Upendranath Sharma Chakravarthy
University of Maryland Ph.D. 1985,
304 pages
Computer Science
University Microfilms International
ADG86-08788
This thesis addresses the problem of efficient query evaluation over a
deductive data base and proposes several methods to optimize the evaluation of a query. The problems addressed in this thesis and the solutions
proposed, under the central theme of query optimization, can be discussed
under (i) Techniques for interfacing PROLOG with relational data bases,
(ii) A formalism for semantic query optimization using integrity constraints, and (iii) Multiple query evaluation in deductive data bases.
We propose several ways in which a PROLOG interpreter can be modified so that it can be interfaced effectively with a database system. Three
solutions, namely, a simple modification to the PROLOG query evaluation
strategy to accomplish the compiled approach, a meta-level interpreter
without any modifications to PROLOG, and a set evaluation strategy using
tables, are proposed in this thesis.
A general framework in which domain specific knowledge - in the form
of integrity constraints - is used to transform a query, is proposed in this
thesis and is termed semantic query optimization. The process of semantic
query optimization is carried out in two phases. Initially, the axioms of a
data base are semantically compiled, wherein integrity constraints are integrated into the axioms in a suitable manner. Semantic compilation is
performed only once prior to the submission of any query. Subsequently,
the compiled axioms are utilized for query transformation at the time of
query evaluation. The transformed query has restrictions imposed on it by
the integrity constraints and hence it may be evaluated more efficiently
over the data base than the original query.
Multiple queries arise in several contexts. In the case of deductive data
bases, a single query on an intensional predicate may result in several
disjunctive queries which may have overlapping computations.
We extend the connection graph decomposition algorithm to generate a
single plan for a set of disjunctive queries. A multi-query graph is used as
a non-procedural representation for a set of queries. The algorithm
proposed in this thesis minimizes the number of accesses to the secondary
storage where the relations are physically stored as well as the total
number of joins.
Visual Programming with Icons
Olivier Bernard Clarisse
Illinois Institute of Technology Ph.D. 1985,
256 pages
Computer Science
University Microfilms International
ADG86-06485
This dissertation describes the design approach of an iconic system on a
modern Lisp machine. The proposed system has applications in image
information systems, visual programming, computer-aided design (CAD),
and multimedia communications. Potential applications include computer
vision systems, visual languages, and iconic expert systems.
A generalized definition of an icon is proposed: a visual representation
of an object (physical or abstract) which has relational dependencies with
other icons. An experimental iconic system has been designed around an
Icon Manager and an Icon Editor. The Icon Manager includes facilities for
icon creation, icon interpretation, icon exploration, icon saving (on files),
and an interactively programmable menu system; it provides basic support
to create icons and relate them to other icons. The Icon Editor supports
several methods to interactively edit an icon from sketch representations or
icon structure representations. The overall system allows the creation and
organization of flat pictorial objects of any shape in a two-and-a-half-dimensional space.
The image processing tools required by the iconic system include a
general technique for halftone transformation of images, a general region
growing technique, and methods for progressive transmission of images.
Applications of this iconic programming environment to visual programming, to program design and electronic circuit design (from icon selection
and editing), to knowledge systems based on icon graph matching and to
multimedia communications are studied in detail. Finally, a possible hardware structure to support an icon management system is proposed, based on a
pyramid-of-microprocessors architecture.
Hierarchical Reasoning: Simulating
Complex Processes over Multiple Levels
of Abstraction
Paul Anthony Fishwick
University of Pennsylvania Ph.D. 1986,
196 pages
Computer Science
University Microfilms International
ADG86-14793
This thesis describes a method for simulating processes over multiple levels
of abstraction. There has been recent work with respect to data, object,
and problem-solving abstraction; however, abstraction in simulation has
not been adequately explored. We define a process as a hierarchy of
distinct production rule sets that interface to each other so that abstraction
levels may be bridged where desired. In this way, the process may be
studied at abstraction levels that are appropriate for the specific task:
notions of qualitative and quantitative simulation are integrated to form a
complete process description. The advantages to such a description are
increased control, computational efficiency, and selective reporting of
simulation results. Within the framework of hierarchical reasoning, we will
concentrate on presenting the primary concept of process abstraction.
A Common Lisp implementation of the hierarchical reasoning theory
called HIRES is presented. HIRES allows the user to reason in a hierarchical fashion by relating certain facets of the simulation to levels of
abstraction specified in terms of actions, objects, reports, and time. The
user is free to reason about a process over multiple levels by weaving
through the levels either manually or via automatically controlled specifications. Capabilities exist in HIRES to facilitate the creation of graph-based
abstraction levels. For instance, the analyst can create continuous system
models (CSMP), Petri net models, scripts, or generic graph models that
define the process model at a given level. We present a four-level elevator
system and a two-level "dining philosophers" simulation to demonstrate
the efficacy of process abstraction.
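A minimal sketch of the process-abstraction idea, with invented rule sets for the elevator domain: each abstraction level is a separate production-rule set, and a bridging step expands a coarse event into a finer rule set only to the depth requested.

# Each level is its own rule set; a coarse event is bridged down to a
# finer level only when more detail is wanted. Rules are illustrative.
HIGH = {"call(floor)": ["move_to(floor)"]}            # qualitative level
LOW  = {"move_to(floor)": ["accelerate", "cruise", "decelerate", "stop"]}

def simulate(event, levels, depth):
    """Expand `event` through up to `depth` levels of rule sets."""
    if depth == 0:
        return [event]
    for rules in levels:
        if event in rules:
            return [e for sub in rules[event]
                    for e in simulate(sub, levels, depth - 1)]
    return [event]                       # no finer description available

print(simulate("call(floor)", [HIGH, LOW], 1))   # coarse report
print(simulate("call(floor)", [HIGH, LOW], 2))   # bridged to fine level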
Linguistic Solid Modeling using Graph Grammars
Patrick Arthur Fitzhorn
Colorado State University Ph.D. 1985,
105 pages
Computer Science
University Microfilms International
ADG86-07641
The goal of this work is to develop formal relationships between language
theory and topologically correct computer representations of objects in
Euclidean three-space (E3), that is, physical solids. Thus the concern is to
generate grammars whose languages can be interpreted as classes of
representations of possibly proper subsets of physical solids. A methodology is then studied for the implementation of the developed grammars.
The grammars of interest are variants of graph grammars whose languages are sets of directed graphs with node and edge labels, and whose
productions rewrite graphs into other graphs. Graph grammars are of
interest here since they generate structures similar to plane models (topological representations of the class of 3D solids). Since it can be shown
that plane models are sufficient representations of the topology of E3 polytopes, a class of graph grammars that generate all such models should be of
interest. The strings generated by these grammars will then be representations of physical solids that, although based on formal topological guarantees, can be manipulated with formal language theory.
Computer implementation of the grammars is considered, and it is
shown that a natural method that encompasses storage of the representation, as well as the grammar itself, is one based on the predicate calculus.
In this implementation, the vertices and edges of a representation are
stored as facts in a logic data base, while the grammar that rewrites subsets
of the graph with other graphs becomes a set of relations on graphs. The
programming language PROLOG is used for implementation, since it is
based closely on the first order predicate calculus.
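As a rough illustration of this implementation style (the thesis uses PROLOG; the Python below, and the particular production, are stand-ins), a labeled graph can be stored as a set of facts and a production written as a function that rewrites a matched subgraph:

# Facts are (label, source, target): "edge" facts link a topological
# edge to its end vertices, "face" facts link a face to its edges.
graph = {("face", "f1", "e1"), ("edge", "e1", "v1"), ("edge", "e1", "v2")}

def split_edge(graph, edge, new_vertex):
    """One production: subdivide `edge` at `new_vertex`, preserving
    face incidences -- a rewrite of a matched subgraph."""
    ends = sorted(t for (lbl, s, t) in graph if lbl == "edge" and s == edge)
    if len(ends) != 2:
        return graph                       # left-hand side does not match
    v1, v2 = ends
    out = {f for f in graph if edge not in (f[1], f[2])}
    out |= {("edge", edge + "a", v1), ("edge", edge + "a", new_vertex),
            ("edge", edge + "b", new_vertex), ("edge", edge + "b", v2)}
    for (lbl, s, t) in graph:              # faces bordered by the old edge
        if lbl == "face" and t == edge:    # now border both halves
            out |= {("face", s, edge + "a"), ("face", s, edge + "b")}
    return out

print(sorted(split_edge(graph, "e1", "v3")))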
In conclusion, it is shown that the current, major representations of
physical solids have analogs in the developed graph grammars to the same
level of representation validity. That being the case, graph grammars can
replace current, heuristic implementations of physical solid representations
with formal methods from language theory.
A Logic Data Model for the Machine
Representation of Knowledge
Randolph George Goebel
The University of British Columbia (Canada)
Ph.D. 1985
Computer Science
This item is not available from University
Microfilms International.
ADG05-58418
DLOG is a logic-based data model developed to show how logic programming can combine contributions of Data Base Management (DBM) and
Artificial Intelligence (AI). The DLOG data model is based on a logical
formulation that is a superset of the relational data model (Reiter83), and
uses Bowen and Kowalski's notion of an amalgamated meta and object
language (Bowen82) to describe the relationship between data model
objects. The DLOG specification includes a language syntax, a proof (or
query evaluation) procedure, a description of the language's semantics, and
a specification of the relationships between assertions, queries, and application data bases.
DLOG's basic data description language is the Horn clause subset of
first order logic (Kowalski79, Kowalski81), together with embedded
descriptive terms and non-Horn integrity constraints. The embedded terms
are motivated by Artificial Intelligence representation language ideas,
specifically, the descriptive terms of the KRL language (Bobrow77). A
similar facility based on logical descriptions is provided in DLOG. The
DLOG language permits the use of definite and indefinite descriptions of
individuals and sets in both queries and assertions.
The meaning of DLOG's extended language is specified by writing Horn
clauses that describe the relation between the basic language and the
extensions. The experimental implementation is the appropriate Prolog
program derived from that specification.
The DLOG implementation relies on an extension to the standard Prolog
proof procedure. This includes a "unification" procedure that matches
embedded terms by recursively invoking the DLOG proof procedure (cf.
LOGLISP, Robinson82). The experimental system includes logic-based
implementations of traditional database facilities (e.g., transactions, integrity constraints, data dictionaries, data manipulation language facilities),
and an idea for using logic as the basis for heuristic interpretation of
queries. This heuristic uses a notion of partial match or sub-proof to
produce assumptions under which plausible query answers can be derived.
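A miniature of the matching idea, with an invented fact base: an embedded definite description such as "the head of CS" is resolved by a recursive query against the data base during matching, and is defined only when exactly one individual satisfies it.

# Sketch only: DLOG's real proof procedure extends Prolog unification;
# this miniature just shows the recursive query during matching.
facts = {
    ("head", "cs"): {"smith"},            # smith heads computer science
    ("teaches", "cs101"): {"smith"},
}

def referent(description):
    """Resolve ('the', predicate, arg): defined only if exactly one
    individual satisfies the description."""
    candidates = facts.get(description[1:], set())
    return next(iter(candidates)) if len(candidates) == 1 else None

def prove(predicate, arg, term):
    """Does predicate(arg, term) hold, where term may be an individual
    or an embedded description resolved by a recursive lookup?"""
    if isinstance(term, tuple) and term[0] == "the":
        term = referent(term)             # recursive query evaluation
    return term in facts.get((predicate, arg), set())

# "Does the head of CS teach CS101?"
print(prove("teaches", "cs101", ("the", "head", "cs")))   # True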
The experimental DLOG database (or "knowledge base") management
system is exercised by describing an undergraduate degree program. The
example application database is a description of the Bachelor of Computer
Science degree requirements at The University of British Columbia. This
application demonstrates how DLOG's embedded terms provide a concise
description of degree program knowledge, and how that knowledge is used
to specify student programs, and select program options.
A Fully Lazy Higher Order Purely Functional Programming Language with
Reduction Semantics
Kevin John Greene
Syracuse University Ph.D. 1985, 262 pages
Computer Science
University Microfilms International
ADG86-03760
In the first third of this thesis, three well-known reduction calculi - A.
Church's λ-calculus, M. Schonfinkel's SKI-calculus, and C. P. Wadsworth's graph oriented λ-calculus - are defined. Schonfinkel's classic
transformation of λ-calculus well-formed formulas (wffs) into variable-free
SKI-calculus wffs is also presented. A new notion, lazy-normal form, a
generalization of the SKI-calculus concept of normal form, is then defined
and compared with Wadsworth's concept of head-normal form. Head-normal form is a generalized notion of normal form in the λ-calculus. It is
demonstrated that a SKI-calculus wff in lazy-normal form is an outline of
the wff's normal form (if one exists) - i.e., its normal form will have the
same initial atom and the same number of arguments. Other results relating λ-calculus wffs in head-normal form to SKI-calculus wffs in lazy-normal form are stated and proved.
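To make the objects under discussion concrete, here is a minimal SKI-calculus reducer (applications as pairs, atoms as strings) that rewrites leftmost-outermost redexes; a wff it can no longer rewrite at the head is irreducible under this strategy. The LFN-calculus's structure sharing, garbage nodes, and forwarding arcs are not modeled.

# S x y z -> x z (y z);  K x y -> x;  I x -> x.
def step(t):
    """Perform one outermost reduction, or return None if none applies."""
    spine, args = t, []
    while isinstance(spine, tuple):            # unwind the application spine
        spine, arg = spine
        args.insert(0, arg)
    if spine == "I" and len(args) >= 1:
        out, rest = args[0], args[1:]
    elif spine == "K" and len(args) >= 2:
        out, rest = args[0], args[2:]
    elif spine == "S" and len(args) >= 3:
        x, y, z = args[:3]
        out, rest = ((x, z), (y, z)), args[3:]
    else:
        return None
    for a in rest:                             # reapply remaining arguments
        out = (out, a)
    return out

def normalize(t, limit=1000):
    while limit and (nxt := step(t)) is not None:
        t, limit = nxt, limit - 1
    return t

# S K K behaves like I:  ((S K K) a)  reduces to  a
print(normalize(((("S", "K"), "K"), "a")))     # 'a'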
The ideas behind M. Schonfinkel's SKI-calculus, C. P. Wadsworth's
graph oriented λ-G-calculus, and D. A. Turner's SASL implementation are
combined with the concept of lazy-normal form to produce a new deterministic combinator based graph and machine oriented reduction calculus:
the LFN-calculus. The LFN-calculus is equivalent in power to the
λ-calculus et al., but is much more directly and efficiently implementable.
This is due primarily to the structure sharing properties of the LFN-calculus
wffs. Both garbage nodes and forwarding arcs (indirection pointers),
concepts that are usually relegated to a calculus's implementation, are
given formal definitions in this calculus.
The design and experimental Lisp Machine implementation of LFN, a
fully lazy higher order purely functional programming language with
reduction semantics, are discussed. The LFN compiler transforms high
level expressions into representations of LFN-calculus wffs. LFN's runtime
system, a direct realization of the LFN-calculus's "is reducible to" relation,
takes as input LFN-calculus wffs and produces irreducible wffs (wffs in
lazy-normal form) as result. The thesis ends with brief discussions of
alternate approaches to functional programming language compilation and
runtime system organization.
Learning by Understanding Analogies
Russell Greiner
Stanford University Ph.D. 1985, 417 pages
Computer Science
University Microfilms International
ADG86-02479
The phenomenon of learning has intrigued scholars for ages; this fascination is reflected in Artificial Intelligence, which has always considered
learning to be one of its major challenges. This dissertation provides a
formal account of one mode of learning, learning by analogy. In particular, it defines the useful analogical inference process (UAI), which uses a
given analogical hint of the form " A is like B" and a particular target problem to map known facts about B onto proposed conjectures about A. UAI
only considers conjectures which are useful to the target problem; that is,
the conjectures must lead to a plausible solution to that problem.
To construct a procedure that can effectively find these useful analogies,
we use two sets of heuristics to refine the general UAI process. The first set
is based on the intuition that useful analogies often correspond to
"coherent" clusters of facts. This suggests that UAI seeks only the analogies that correspond to common abstractions, where abstractions are
relations that encode solution methods to past problems. The other set of
rules embody the claim that "better analogies impose fewer constraints on
the world". Basically, these rules prefer the analogies which require the
fewest additional conjectures.
This dissertation also describes a running program, NLAG, which implements this model of analogy. It is then used in a battery of tests, designed
to empirically validate our claim that UAI is an effective technique for
acquiring new facts. This data also demonstrates that the heuristics are
effective, and suggests why.
In summary, the primary contributions of this research are (1) a formal
definition of UAI, described semantically (using a new variant of Tarskian
semantics), syntactically and operationally; (2) a collection of heuristics
which efficiently guide this process towards useful analogies; and (3) various empirical results, which illustrate the source of power underlying this
approach.
Pattern-Based and Knowledge-Directed Query Compilation for Recursive
Data Bases
Jiawei Han
The University of Wisconsin - Madison Ph.D.
1985, 216 pages
Computer Science
University Microfilms International
ADG86-01539
Expert database systems (EDSs) comprise an interesting class of computer
systems which represent a confluence of research in artificial intelligence,
logic, and database management systems. They involve knowledge-directed
processing of large volumes of shared information and constitute a new
generation of knowledge management systems.
Our research is on the deductive augmentation of relational database
systems, especially on the efficient realization of recursion. We study the
compilation and processing of recursive rules in relational database
systems, investigating two related approaches: pattern-based recursive rule
compilation and knowledge-directed recursive rule compilation and planning.
Pattern-based recursive rule compilation is a method of compiling and
processing recursive rules based on their recursion patterns. We classify
recursive rules according to their processing complexity and develop three
kinds of algorithms for compiling and processing different classes of recursive rules: transitive closure algorithms, SLSR wavefront algorithms, and
stack-directed compilation algorithms. These algorithms, though distinct,
are closely related. The more complex algorithms are generalizations of
the simpler ones, and all apply the heuristics of performing selection first
and utilizing previous processing results (wavefronts) in reducing query
processing costs. The algorithms are formally described and verified, and
important aspects of their behavior are analyzed and experimentally tested.
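The SLSR wavefront algorithms themselves are not spelled out in the abstract, but the simplest member of the family, transitive closure, can be sketched in the standard semi-naive ("wavefront") style, where each pass joins only the newly derived tuples against the base relation and previous results are never recomputed:

def transitive_closure(edges):
    closure = set(edges)
    wavefront = set(edges)
    while wavefront:
        # join only the new tuples (the wavefront) against the base relation
        new = {(x, z) for (x, y) in wavefront
               for (y2, z) in edges if y == y2} - closure
        closure |= new
        wavefront = new
    return closure

# ancestor(X, Z) :- parent(X, Z).
# ancestor(X, Z) :- ancestor(X, Y), parent(Y, Z).
parent = {("ann", "bob"), ("bob", "cal"), ("cal", "dee")}
print(sorted(transitive_closure(parent)))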
To further improve search efficiency, a knowledge-directed recursive
rule compilation and planning technique is introduced. We analyze the
issues raised for the compilation of recursive rules and propose to deal with
them by incorporating functional definitions, domain-specific knowledge,
query constants, and a planning technique. A prototype knowledge-directed relational planner, RELPLAN, which maintains a high level user view
and query interface, has been designed and implemented, and experiments
with the prototype are reported and illustrated.
A Theory of Scalar Implicature
Julia Bell Hirschberg
University of Pennsylvania Ph.D. 1985,
230 pages
Computer Science
University Microfilms International
ADG86-03648
Speakers may convey many sorts of 'meaning' via an utterance. While
each of these contributes to the utterance's overall communicative effect,
many are not captured by a truth-functional semantics. One class of non-truth-functional, context-dependent meanings has been identified by
Grice (1975) as conversational implicatures. This thesis presents a formal
account of one type of conversational implicature, termed here scalar implicature, identified from a study of a large corpus of naturally occurring data
collected by the author and others from 1982 through 1985. Scalar implicatures rely for their generation and interpretation upon the assumption
that cooperative speakers will say as much as they truthfully can that is
relevant to a conversational exchange. For example, B's utterance of (1a):
(1) A: How was the party last night?
    a. B: Some people left early.
    b. Not all people left early.
may convey to A that, as far as B knows, (1b) also holds - even though
the truth of (1b) clearly does not follow from the truth of (1a).
Scalar implicatures may be distinguished from other conversational
implicatures in that their generation and interpretation is dependent upon
the identification of some salient relation that orders a concept referred to
in an utterance with other concepts. In (1), for example, the salience of an
inclusion relation between 'some people' and 'all people' in the discourse is
prerequisite to B's implicating that (1b) - and to A's understanding that
(1b) has in fact been implicated.
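A sketch of the scale-based computation such a theory formalizes (the scales and their salience are placeholders here; the thesis derives the ordering relation from the discourse context): asserting a weaker value on a salient scale implicates that, for all the speaker knows, each stronger value does not hold.

SCALES = [("some", "many", "all"), ("possible", "likely", "certain")]

def scalar_implicatures(asserted):
    """Expressions whose truth the speaker is implicated not to know."""
    out = []
    for scale in SCALES:
        if asserted in scale:
            i = scale.index(asserted)
            out += [f"not (as far as the speaker knows) '{w}'"
                    for w in scale[i + 1:]]
    return out

# "Some people left early" implicates "not all people left early".
print(scalar_implicatures("some"))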
To illustrate potential applications of the theory presented, a module of
a natural-language interface, QUASI, is described. QUASI calculates scalar
implicatures that might be licensed by simple direct responses to yes-no
questions. Where licensable implicatures are not consistent with the
system's knowledge base, QUASI proposes alternative responses. This
system demonstrates how natural language interfaces can use the calculation
of implicit meanings to avoid conveying misinformation and to
convey desired information more succinctly.
A Knowledge-based Approach to
Language Production
Paul Schafran Jacobs
University of California, Berkeley Ph.D.
1985, 278 pages
Computer Science
University Microfilms International
ADG86-10067
The development of natural language interfaces to Artificial Intelligence
systems is dependent on the representation of knowledge. A major impediment to building such systems has been the difficulty in adding sufficient
linguistic and conceptual knowledge to extend and adapt their capabilities.
This difficulty has been apparent in systems which perform the task of
language production, i.e., the generation of natural language output to
satisfy the communicative requirements of a system.
The problem of extending and adapting linguistic capabilities is
rooted in the problem of integrating abstract and specialized knowledge
and applying this knowledge to the language processing task. Three
aspects of a knowledge representation system are highlighted by this problem: hierarchy, or the ability to represent relationships between abstract
and specific knowledge structures; explicit referential knowledge, or knowledge about relationships among concepts used in referring to concepts;
and uniformity, the use of a common framework for linguistic and conceptual knowledge. The knowledge-based approach to language production
addresses the language generation task from within the broader context of
the representation and application of conceptual and linguistic knowledge.
This knowledge-based approach has led to the design and implementation of a knowledge representation framework, called Ace, geared towards
facilitating the interaction of linguistic and conceptual knowledge in
language processing. Ace is a uniform, hierarchical representation system,
which facilitates the use of abstractions in the encoding of specialized
knowledge and the representation of the referential and metaphorical
relationships among concepts.
A general-purpose natural language generator, KING (Knowledge
INtensive Generator), has been implemented to apply knowledge in the
Ace form. The generator is designed for knowledge-intensivity and incrementality, to exploit the power of the Ace knowledge in generation. The
generator works by applying structured associations, or mappings, from
conceptual to linguistic structures, and combining these structures into
grammatical utterances. This has proven to be a simple but powerful
mechanism, easy to adapt and extend, and has provided strong support for
the role of conceptual organization in language generation.
A Knowledge-based System for Debugging Concurrent Software
Carol Helfgott LeDoux
University of California, Los Angeles Ph.D.
1985, 322 pages
Computer Science
University Microfilms International
ADG86-03965
The recent development of high-level concurrent programming languages
has emphasized the problem of limited debugging tools to support the
development of applications using these languages. A new approach is
necessary to improve the efficacy of debugging tools and to adapt them to
the framework of a concurrent software environment.
A knowledge-based debugging approach is presented that aids diagnosis
of a variety of run-time errors that can occur in concurrent programs written in the Ada 1 programming language. In this approach, an event stream
of program activity is captured in an historical database and accessed using
Prolog-based queries constrained by temporal-logic primitives. Diagnosis
is aided by applying rule-based descriptions of some common classes of
software errors and by matching program specifications against the trace
data base.
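The abstract does not show the query language; the following sketch (with invented event fields) conveys the flavor of querying an event history under a temporal constraint to surface a deadness symptom:

trace = [
    {"time": 1, "task": "T1", "op": "call",   "entry": "E"},
    {"time": 2, "task": "T2", "op": "accept", "entry": "E"},
    {"time": 5, "task": "T1", "op": "call",   "entry": "E"},
]

def before(a, b):
    return a["time"] < b["time"]          # temporal-logic primitive

def unanswered_calls(trace):
    """Calls never followed by a matching accept: a deadness symptom."""
    return [c for c in trace if c["op"] == "call" and not any(
        e["op"] == "accept" and e["entry"] == c["entry"] and before(c, e)
        for e in trace)]

print(unanswered_calls(trace))            # the call at time 5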
This approach was used in building a prototype debugger, called Your
Own Debugger for Ada (YODA). The design of YODA is described and
analyses of several sample Ada programs are presented to illustrate diagnosis of errors associated with concurrency, including deadness errors and
misuse of shared data.
1Ada is a registered trademark of the U.S. Government - Ada Joint
Program Office.
Plan Recognition and Discourse Analysis:
an Integrated Approach for Understanding
Dialogues
Diane Judith Litman
The University of Rochester Ph.D. 1986,
197 pages
Computer Science
University Microfilms International
ADG86-10863
One promising computational approach to understanding dialogues has
involved modeling the goals of the speakers in the domain of discourse. In
general, these models work well as long as the topic follows the goal structure closely, but they have difficulty accounting for interrupting subdialogues such as clarifications and corrections. Furthermore, such models
are typically unable to use many processing clues provided by the linguistic
phenomena of the dialogues.
This dissertation presents a computational theory and partial implementation of a discourse level model of dialogue understanding. The theory
extends and integrates plan-based and linguistic-based approaches to
language processing, arguing that such a synthesis is needed to computationally handle many discourse level phenomena present in naturally occurring dialogues. The simple, fairly syntactic results of discourse analysis
(for example, explanations of phenomena in terms of very local discourse
contexts as well as correlations between syntactic devices and discourse
function) will be input to the plan recognition system, while the more
complex inferential processes relating utterances have been totally reformulated within a plan-based framework. Such an integration has led to a
new model of plan recognition, one that constructs a hierarchy of domain
and meta-plans via the process of constraint satisfaction. Furthermore, the
processing of the plan recognizer is explicitly coordinated with a set of
linguistic clues. The resulting framework handles a wide variety of difficult
linguistic phenomena (for example, interruptions, fragmental and elliptical
utterances, and presence as well as absence of syntactic discourse clues),
while maintaining the computational advantages of the plan-based
approach. The implementation of the plan recognition aspects of this
framework also addresses two difficult issues of knowledge representation
inherent in any plan recognition task.
Correcting Object-Related Misconceptions
Kathleen Filliben McCoy
University of Pennsylvania Ph.D. 1985,
166 pages
Computer Science
University Microfilms International
ADG86-03674
Analysis of a corpus of naturally occurring data shows that users conversing with a database or expert system are likely to reveal misconceptions
about the objects modelled by the system. Further analysis reveals that
the sort of responses given when such misconceptions are encountered
depends greatly on the discourse context. This work develops a context-sensitive method for automatically generating responses to object-related
misconceptions with the goal of incorporating a correction module in the
front-end of a database or expert system. The method is demonstrated
through the ROMPER system (Responding to Object-related Misconceptions using PERspective), which is able to generate responses to two
classes of object-related misconceptions: misclassifications and misattributions.
The transcript analysis reveals a number of specific strategies used by
human experts to correct misconceptions, where each different strategy
refutes a different kind of support for the misconception. In this work
each strategy is paired with a structural specification of the kind of support
it refutes. ROMPER uses this specification, and a model of the user, to
determine which kind of support is most likely. The corresponding
response strategy is then instantiated.
The above process is made context sensitive by a proposed addition to
standard knowledge-representation systems termed object perspective.
Object perspective is introduced as a method for augmenting a standard
knowledge-representation system to reflect the highlighting effects of
previous discourse. It is shown how this resulting highlighting can be used
to account for the context-sensitive requirements of the correction process.
Inferring Domain Plans in Question-Answering
Martha Elizabeth Pollack
University of Pennsylvania Ph.D. 1986,
191 pages
Computer Science
University Microfilms International
ADG86-14850
The importance of plan inference in models of conversation has been
widely noted in the computational-linguistics literature, and its incorporation in question-answering systems has enabled a range of cooperative
behaviors. The plan inference process in each of these systems, however,
has assumed that the questioner (Q) whose plan is being inferred and the
respondent (R) who is drawing the inference have identical beliefs about
the actions in the domain. I demonstrate that this assumption is too strong,
and often results in failure not only of the plan inference process but also
of the communicative process that plan inference is meant to support. In
particular, it precludes the principled generation of appropriate responses
to queries that arise from invalid plans. I present a model of plan inference
in conversation that distinguishes between the beliefs of the questioner and
the beliefs of the respondent. This model rests on an account of plans as
mental phenomena: "having a plan" is analyzed as having a particular
configuration of beliefs and intentions. Judgments that a plan is invalid are
associated with particular discrepancies between the beliefs that R ascribes
to Q, when R believes Q has some particular plan, and the beliefs R herself
holds. I define several types of invalidities from which a plan may suffer,
relating each to a particular type of belief discrepancy, and show that the
types of any invalidities judged to be present in the plan underlying a query
can affect the content of a cooperative response. The plan inference
model has been implemented in SPIRIT - a System for Plan Inference that
Reasons about Invalidities Too - which reasons about plans underlying
queries in the domain of computer mail.
Rational Interaction: Cooperation among
Intelligent Agents
Jeffrey Solomon Rosenschein
Stanford University Ph.D. 1986, 145 pages
Computer Science
University Microfilms International
ADG86-08219
The development of intelligent agents presents opportunities to exploit
intelligent cooperation. Before this can occur, however, a framework must
be built for reasoning about interactions. This dissertation describes such a
framework, and explores strategies of interaction among intelligent agents.
The formalism that has been developed removes some serious restrictions that underlie previous research in distributed artificial intelligence, particularly the assumption that the interacting agents have identical
or non-conflicting goals. The formalism allows each agent to make various
assumptions about both the goals and the rationality of other agents. A
hierarchy of rationality assumptions is presented, along with an analysis of
the consequences that result when an agent believes a particular level in
the hierarchy describes other agents' rationality.
In addition, the formalism presented allows the modeling of restrictions
on communication and the modeling of binding promises among agents.
Computation on the part of each individual agent can often obviate the
need for inter-agent communication. However, when communication and
promises are allowed, fewer assumptions need be made about the rationality of other agents when choosing one's own rational course of action.
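One rung of the idea can be sketched as follows: what an agent should do depends on the level of rationality it ascribes to the other agent, with level k best-responding to a level k-1 opponent. The payoff matrix (a prisoner's-dilemma-like interaction) and the particular hierarchy are illustrative, not the dissertation's formalism.

# payoff[a_move][b_move] = (payoff to A, payoff to B)
payoff = {
    "c": {"c": (3, 3), "d": (0, 5)},
    "d": {"c": (5, 0), "d": (1, 1)},
}

def best_response(me, expected_other):
    """Move maximizing my payoff if the other plays expected_other."""
    moves = payoff if me == "A" else payoff["c"]
    def mine(m):
        a, b = (payoff[m][expected_other] if me == "A"
                else payoff[expected_other][m])
        return a if me == "A" else b
    return max(moves, key=mine)

def choose(me, level, other_default="c"):
    """Choose a move assuming the other agent reasons at level-1."""
    if level == 0:
        return best_response(me, other_default)
    other = "B" if me == "A" else "A"
    return best_response(me, choose(other, level - 1, other_default))

print(choose("A", 2))   # 'd': defection survives in this payoff matrix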
Recursions and Rule Selections on a High Level Relation Processor for
Knowledge-Base Machine
Dongpil Shin
The University of Oklahoma Ph.D. 1986,
132 pages
Computer Science
University Microfilms International
ADG86-13735
For the development of knowledge-based systems, various ways of incorporating a relational database system into a PROLOG-based question-answering system have been investigated. To improve the performance
of a knowledge-based system, a deductive search involving a large set of
facts is performed separately by a relational database subsystem. The correctness property of an inference performed by such a system is formally
studied.
Problems associated with the scheme, such as determining a termination
in a recursion, incorporating "cut" operations, and the capabilities of
relation processors of performing these operations, are studied. To solve
these problems, first, database queries are classified into six levels based on
their operations so that queries involving recursions and "cut" can be identified. Next, relation processors are also classified so that corresponding
query expressions can be evaluated. Then, a high-level machine, Data flow
Relation Processor (DFRP), is designed so that all the problems defined
previously can be solved with this machine.
The closed-connection graph is introduced so that a recursion can be
visualized. The conditions for terminating a recursion are defined in terms
of the closed-connection graph. A procedure that synthesizes a level-5
query is developed. Finally, functional definitions of DFRP are studied to
evaluate a recursive query, and its simulation model is built. The major
results of the simulation are (i) a binary join takes a constant time regardless of the cardinality of relations; (ii) n-ary join takes O(n) time, where n
is the number of relations involved; (iii) DFRP is 10⁴ times faster in
performing a binary join operation than Δ, which is the intermediate result
of Japan's fifth-generation computer project; (iv) a knowledge-based
system with DFRP is 10 to 20 times faster than MPDC in performing
PROLOG queries of 2 to 30 subgoals, each involving 4096 facts.
Controlling Inference
David Eugene Smith
Stanford University Ph.D. 1985, 237 pages
Computer Science
University Microfilms International
ADG86-02539
Effective control of inference is a fundamental problem in Artificial Intelligence. Unguided inference leads to a combinatorial explosion of facts or
subgoals for even simple domains. To overcome this problem, expert
systems have used powerful domain-dependent control information in
conjunction with syntactic domain-independent methods like depth-first
backward chaining. While this is possible for some applications, it is not
always feasible or appropriate for problem solvers that must solve a wide
variety of different problems. In this dissertation I argue that a kind of
semi-independent control is essential for problem solvers that must face a
wide variety of different problems.
Semi-independent control is based on the idea that there is an underlying
domain-independent rationale behind any good control decision. This
rationale takes the form of simple utility theory applied to the expected
cost and probability of success of different inference steps and strategies.
These basic principles are domain-independent, but their application to any
particular problem relies on global information about the nature and extent
of facts and rules in the problem solver's data base.
This approach to control is used in the solution of four different control
problems: halting inference when all answers to a query have been found,
halting recursive inference, ordering conjunctive queries when no inference
is involved, and choosing the best inference step for problems where only a
single answer is required. The first two control problems are cases of
recognizing redundant portions of a search space, while the final two cases
involve computing the expected cost for alternative strategies to a problem.
Several novel theorems about control (for specific situations) are developed in these case studies.
The issue of efficiency is also addressed. Semi-independent control
often involves considerable computation, and may not be cost-effective for
the majority of problems encountered in a particular domain. Interleaving
of inference and control is proposed as a means of making this kind of
control practical.
The Essence of Rum: a Theory of the
Intensional and Extensional Aspects of
Lisp-type Computation
Carolyn L. Talcott
Stanford University Ph.D. 1985, 248 pages
Computer Science
University Microfilms International
ADG86-02549
Rum is a theory of applicative, side-effect free computations over an algebraic data structure. It goes beyond a theory of functions computed by
programs, treating both intensional and extensional aspects of computation. Powerful programming tools such as streams, object-oriented
programming, escape mechanisms, and co-routines can be represented.
Intensional properties include the number of multiplications executed, the
number of context switches, and the maximum stack depth required in
a computation. Extensional properties include notions of equality for
streams and co-routines and characterization of functionals implementing
strategies for searching tree-structured spaces. Precise definitions of
informal concepts such as stream and co-routine are given and their mathematical theory is developed. Operations on programs treated include
program transformations which introduce functional and control
abstractions; a compiling morphism that provides a representation of
control abstractions as functional abstractions; and operations that transform intensional properties to extensional properties. The goal is not only
to account for programming practice in Lisp, but also to improve practice
by providing mathematical tools for developing programs and building
programming systems.
Rum views computation as a process of generating computation structures - trees for context-independent computations and sequences for
context-dependent computations. The recursion theorem gives a fixed-point function that computes computationally minimal fixed points. The
context insensitivity theorem says that context-dependent computations
are uniformly parameterized by the calling context and that computations
in which context dependence is localized can be treated like context-independent computations. Rum machine structure and morphism are introduced to define and prove properties of compilers. The hierarchy of
comparison relations on programs ranges from intensional equality to
maximum approximation and equivalence relations that are extensional.
The fixed-point function computes the least fixed point with respect to the
maximum approximation. Comparison relations, combined with the interpretation of programs using computation structures, provide operations on
programs both with meanings to preserve and meanings to transform.
INT-AID: The Intelligent Aid for Relational Database Construction
Mustafa Mahmoud
Lehigh University Ph.D. 1986, 214 pages
Computer Science
University Microfilms International
ADG86-16179
In this dissertation we developed a system, INT-AID: The INTelligent AID
for relational database construction. It is an intelligent interactive system
that aids relational database designers in constructing a good
design. We defined a workable methodology by integrating a wide variety
of algorithms, theories, and techniques in one system. The system uses a
set of Functional Dependencies (FDs) to construct relations in Third
Normal Form (3NF) following the Synthetic Approach in relational database design.
We proposed a novel methodology in deriving and generating a set of
functional dependencies. Unlike the standard conventional method which
uses structured analysis techniques to build a set of FDs, our approach is
an unstructured method that does not depend on structured analysis. This
new method suits the evolutionary approach in database design. In our
approach, we deal with an incoherent body of data that contains many
unrelated and unstructured facts. From this body of data we want to let
the natural relationships emerge dynamically, rather than imposing unnatural relationships on the data. As a result of this we might uncover some
hidden relationships that were not known before. This new approach is an
attempt towards the establishment of causal effects. We developed a
formula using mathematical induction, to give the total number of what we
called proposed FDs (their validity is yet to be determined).
The Synthetic Approach, which we followed in our system to construct
3NF relations, does not always produce relations that are lossless with
respect to the join operation. To overcome this shortcoming, we added an
algorithm to check whether the synthesized relations are lossless with
respect to join or not.
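One standard form such a losslessness check can take (whether it is the thesis's exact test is not stated in the abstract) is attribute closure: a decomposition synthesized from FDs is lossless with respect to join if some synthesized relation contains a key of the universal schema.

def closure(attrs, fds):
    """Attribute closure of `attrs` under fds = [(lhs, rhs), ...]."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(relations, fds, universe):
    """True if some relation's attributes functionally determine all."""
    return any(closure(r, fds) >= set(universe) for r in relations)

fds = [("A", "B"), ("B", "C")]
relations = [{"A", "B"}, {"B", "C"}]
print(is_lossless(relations, fds, "ABC"))   # True: {A, B} is a key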
The Interactive Effects of Micro- and Macrostructural Processing during
Text Comprehension
Nicholas Geleta
The Catholic University of America Ph.D.
1986, 170 pages
Education, Psychology
University Microfilms International
ADG86-13460
This research investigated the text comprehension model first described in
Kintsch and van Dijk (1978) and later elaborated by van Dijk and Kintsch
(1983). The central research question involved determining if certain
micro- and macropropositions were accessible to the reader/comprehender
in working-memory during on-going reading of prose passages. Sixty
subjects read texts presented one processing cycle at a time on the display
screen of an Apple IIe microcomputer. While reading, they were interrupted by a probe to a specific Micro- (Mi) or Macroproposition (Ma)
that varied in terms of the processing cycle in which it appeared before
presentation of the probe (i.e., Next-To-Last Cycle (NTLC) or Prior Cycle
(PC)). The response time, measured in milliseconds, of a subject's judgement as to whether the probe's informational content was consistent with
that of the passage was recorded. It was hypothesized that NTLC/Mi's,
NTLC/Ma's and PC/Ma's would be responded to more rapidly than PC/Mi's
since the propositions were predicted to be resident in working-memory.
The results supported this prediction. It was concluded that good readers
construct micro- and macrostructural representations of the text on-line.
Time Course of Activation for High and
Low Centrality Nouns in Scripts
Jacqueline Sullivan Gorski
The Catholic University of America Ph.D.
1986, 206 pages
Education, Psychology
University Microfilms International
ADG86-03279
This study investigated whether scripts, such as eating in a restaurant, are
prestored or consciously constructed long term memory units and whether
centrality is an organizing dimension. Five models for the activation of
script nouns were proposed. These were differentiated by their predictions
for the time course of activation for high and low centrality script nouns at
each of three intervals.
Sixty subjects generated script nouns. Twenty rated them on centrality.
Forty-eight subjects participated in a computerized lexical decision task.
Primes were script names and neutral XXXs. Targets were high and low
centrality script nouns and nonwords. When the prime was a word and the
target was a word, the prime either named the same script from which the
word was taken or a different script. The interval between prime and
target was varied between 250, 500, and 750 msec. The dependent variable was time to respond word or nonword to the target.
Scores indicating whether same or different script primes facilitated or
inhibited responding were computed by subtracting response times after
script primes from response times after XXX primes. Facilitation was indicated by positive and inhibition by negative values. T-tests were
conducted on mean scores for high and low centrality script nouns at each
interval to determine the type (automatic or conscious) and extent of activation as indicated by the observed facilitation. Analyses of variance were
conducted separately on scores for same and different script primes to
identify possible effects due to list, time interval, and centrality.
Results supported the Prestorage and Computation model. Same script
primed responses to high centrality nouns were facilitated at all three intervals, while those for low centrality nouns were facilitated only at the longest. This suggested that highly important script concepts form a prestored
unit which is automatically activated, while less important concepts must
be consciously activated. An associative network theory of script memory
representation accounted well for the data (cf. Yekovich & Walker 1985,
in press). Suggestions for teachers and computer tutorial designers
included cuing learners to consciously activate low level domain knowledge
and providing adequate time to do so.
The Effects of Expertise and Sentence Form on Reading Rate and
Vocalization Latency
Ann Lamiell Landy
The Pennsylvania State University Ph.D.
1986, 187 pages
Education, Psychology
University Microfilms International
ADG86-15210
Two experiments were carried out to test the effects of knowledgeability
and technical vocabulary on processing speed for sentences from familiar
and unfamiliar technical domains. Using a priming paradigm, reading rate
for sentence stems and vocalization latencies for target words that
followed the stems were obtained for technically worded and simplified
sentences.
In the first experiment, biochemists' reading rate and vocalization latencies were compared for familiar (biochemistry) technical and simplified
sentences, unfamiliar (psychopathology) technical and simplified
sentences, and general expository sentences. In the second experiment,
the relationship of distances within semantic networks to processing speed
was explored by obtaining vocalization latencies for target words that
followed related, neutral, and unrelated sentence stems with familiar
(psychopathology) and unfamiliar (biochemistry) content. Clinical
psychologists were subjects.
For both groups of experts, knowledgeability played a greater role than
vocabulary in sentence processing speed. Familiar sentences were processed faster than unfamiliar sentences regardless of the wording. Knowledgeability also interacted with vocabulary. Familiar technical sentences
were processed at the same rate as familiar simplified sentences while
unfamiliar simplified sentences were processed faster than unfamiliar technical sentences. The results are interpreted as supporting spreading activation models of memory organization and retrieval.
Parallel Processing of Combinatorial
Search Problems
Guo-Jie Li
Purdue University Ph.D. 1985, 208 pages
Engineering, Electronics and Electrical
University Microfilms International
ADG86-06575
The search for solutions in a combinatorially large problem space is a
major problem in artificial intelligence and operations research. Parallel
processing of combinatorial searches has become a key issue in designing
new generation computer systems. The research gives a theoretical foundation of parallel processing of various combinatorial searches upon which
the architectures are based. In this thesis, parallel processing of searching
AND trees (graphs), OR trees (graphs), and AND/OR trees (graphs) is
investigated, and different functional requirements of the architecture are
identified.
Some of the difficulties in building parallel computers for searching arise
from the inability to predict the performance of the resulting systems. One
important issue in implementing AND-tree searches is to determine the
granularity of parallelism. In this thesis, the optimal granularity of
AND-tree searches is found and analyzed. Another important result of this
research is in finding the bounds of performance of parallel OR-trees
searches and a variety of conditions to cope with anomalies of parallel
OR-tree searches that involve approximations and dominance tests. In
contrast to previous results, our theoretical analysis and simulations show
that a near-linear speedup can be achieved with respect to a large number
of processors.
Logic programming, one of the foundations of new generation computers, can be represented as searching AND/OR trees. In this research, an
optimal search strategy that minimizes the expected overhead of searching
AND/OR trees is found. An efficient heuristic search strategy for evaluating logic programs, which can be implemented on a multiprocessor architecture (MANIP-2), is proposed.
Dynamic programming problems, a class of problems that can be formulated in multiple ways and solved by different architectures, are used to
illustrate the results obtained on graph and tree searches. Dynamic
programming formulations are classified into four types and various parallel processing schemes for implementing different formulations of dynamic
programming problems are presented. In particular, efficient systolic
arrays for solving monadic-serial dynamic programming problems are
developed.
Boundaries and the Treatment of Control
Robin Lee Clark
University of California, Los Angeles Ph.D.
1985, 401 pages
Language, Linguistics
University Microfilms International
ADG86-03940
The unifying theme of the dissertation is that properties of lexical argument structure "drive" the syntax in a number of interesting ways. First,
lexical argument structure plays an important role in the determination of
extraction possibilities in the syntax. Second, lexical properties are important in determining a number of phenomena at Logical Form; in particular,
lexical semantics plays an important role in determining the interpretation
of structures of "arbitrary" control.
Chapters two and three of the dissertation deal with boundaries to
extraction, particularly the phenomena subsumed under the Subject Condition and the Constraint on Extraction Domains. Chapter two focuses on a
restricted class of nominals in English. The main puzzle addressed is the
ability to strand prepositions in these nominals but not in other sorts of
nominals. In chapter three, the ability to extract from a constituent is
related to the thematic relations which the constituent in question enters
into. It is further demonstrated that the Boundary Condition allows us to
abstract away from details of tree configuration in providing an account of
these island phenomena.
Chapters four and five develop an account of control based on recent
research on Non-overt Operators. Particular attention is paid to so-called
arbitrary control, and it is shown that arbitrary control differs from obligatory
control only insofar as the former is a property of Logical Form while the
latter is an S Structure property. Particular attention is given to the nature
of Logical Form, how implicit arguments are realized at that level and how
adverbs of quantification enter into control relations. The treatment of
control is shown to bear a strong relationship to such diverse structures as
purposive clauses, parasitic gaps, infinitival relatives, "tough" movement
constructions and certain sentential predicates.
The Effect of Kind of Anaphor on the
Accessibility of Antecedent Information
Marylene Cloitre
Columbia University Ph.D. 1985, 114 pages
Language, Linguistics. Psychology, General
University Microfilms International
ADG86-04609
These studies compare processing differences during the resolution of
anaphoric relationships for two types of anaphors: pronouns and repeated
nouns. The initial studies show that subjects responded to antecedent-related information more rapidly following a pronoun than a noun-anaphor.
Further investigation, using a levels-of-processing methodology, suggests
that the pronominal advantage is largely derived from a difference in the
level of representation at which the initial interpretation of antecedent
information occurs. Specifically, the data suggest that pronouns directly
access the conceptual representation of their antecedent while noun-anaphors initially access a more superficial form of representation.
In each of five experiments, subjects were presented with a probe word
following a sentence-final anaphor. The probe word was always an adjective which had modified the antecedent noun. The results in both a listening and a reading situation (Experiments 1 and 3) showed that subjects not
only recognized the probe adjective faster following the pronoun than the
noun-anaphor but also responded differentially to type of adjective.
Subjects showed more rapid responses to concrete than abstract adjectives
following the pronoun but showed little differential response following the
noun-anaphor. The facilitation observed following the noun-anaphor, not
associated with a differential response, was hypothesized as sensitivity to a
more superficial level of analysis of antecedent information. In a delayed
probe study (Experiment 2), the differential response to abstract/concrete information was observed following the noun-anaphor, providing
evidence for the hypothesis that the noun-anaphor eventually shows sensitivity to the conceptual aspects of its antecedent though involved in some
preliminary, perhaps 'surface', level analysis of antecedent information.
A direct investigation of the nature of the initial processing activities for
each kind of anaphor was undertaken using the task-oriented methodology
of the Levels-of-Processing paradigm. In a Lexical Decision Task, subjects
showed greater response facilitation following the noun-anaphor than the
pronoun. Subjects' sensitivity to 'surface' information during the lexical
decision task following the noun-anaphor was suggested not only by
response facilitation when the probe was a real (antecedent-related) word
but also by the response inhibition following nonword probes easily
confusable with the real word probes. Responses following pronouns did
not show either of these effects. In contrast, subjects showed much stronger facilitation following pronouns than noun-anaphors in a Category
Decision Task.
Pronoun Resolution in Two-Clause
Sentences
Alison Matthews
City University of New York Ph.D. 1986,
147 pages
Language, Linguistics
University Microfilms International
ADG86-14690
This dissertation examines the resolution of anaphoric pronoun references
in two-clause sentences with the pronoun in the second clause and potential antecedents in the first. Evidence suggests that pronoun resolution
involves a search of short-term memory. Experiments were performed to
evaluate the predictions of linear, hierarchical, and parallel function
searches in a word-by-word reading comprehension task. The results of
Experiment 1 showed that when gender cues are present, pronoun coreference is resolved more quickly than when the cues are absent, and in their
absence there were strong effects of left-right position of the antecedent
on comprehension time. Experiment 2 varied the linear position and
syntactic level of embedding of the antecedents in order to test the linear
and hierarchical search models. Results were most consistent with a left-to-right, top-down breadth-first search such as that proposed by Hobbs
(1978). Main/subordinate clause order had no effect. Experiment 3 tested
the predictions of the parallel function model using pronouns that had the
same grammatical role as the contextually appropriate antecedent or a
different grammatical role. Results indicated no significant effect of parallel function, although the positional differences found in Experiments 1
and 2 were once again obtained. The failure to find an effect of pronoun
position suggests that the search may begin at the topmost node of the
preceding clause rather than at the pronoun, requiring a modification of
Hobbs's model. Experiment 4 examined the psychological mechanism
underlying the search for antecedents. Work by Holmes and Forster
(1979) and Mehler et al. (1978) indicates that the memory strength of
adjectives and adverbs in a sentence may be related to their position and
level of embedding. Experiment 4 used a rapid serial visual presentation
task to measure memory for nouns as a function of these variables.
Results for nouns showed significant effects of position and level of embedding on probability of recall. This suggests that the left-to-right, top-down breadth-first
search order may simply reflect the memory strength of the noun phrases
which are potential antecedents for an anaphoric pronoun.
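The search order the results favor can be sketched as a left-to-right, top-down, breadth-first walk of the preceding clause's parse tree collecting noun phrases as candidate antecedents (after Hobbs 1978, much simplified; the real algorithm also enforces syntactic constraints on coreference):

from collections import deque

# node: (category, [children]); leaves: (category, word)
tree = ("S", [("NP", "John"),
              ("VP", [("V", "told"),
                      ("NP", "Bill"),
                      ("PP", [("P", "about"), ("NP", "Sue")])])])

def candidate_antecedents(clause):
    """NPs of the clause in breadth-first, left-to-right order."""
    found, queue = [], deque([clause])
    while queue:
        cat, body = queue.popleft()
        if cat == "NP" and isinstance(body, str):
            found.append(body)
        if isinstance(body, list):
            queue.extend(body)
    return found

# Earlier, higher NPs are tried first when resolving a later pronoun.
print(candidate_antecedents(tree))   # ['John', 'Bill', 'Sue']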
Teaching Discourse Organization to the
Deaf: the Use of Case-Role Detection in
Text Analysis
Stephanie Ruth Polowe
The University of Rochester Ed.D. 1985,
214 pages
Language, Linguistics
University Microfilms International
ADG86-01324
Charles Fillmore is credited with bringing the concept of "case-roles", with
their importance for semantic interpretation, into the field of post-Chomskyan linguistics. What Fillmore and his colleague, Wallace Chafe,
have argued is that the notion of "case", as we have it from our study of
"case-based" European languages, provides a "grammar" of semantic
theory, a map for deep structure, which organizes the concepts used in
lexical notation.
Case grammar also makes the assumption that transformations do not
retain the deep structure; that, while much of the deep structure may
remain on the sentential level, more than a stylistic choice is being made as
different transformations are chosen. Several theoretical investigations
(e.g., Sidner, 1980) have shown that the choice of a transformation is
made on the basis of the discourse context of the utterance.
Five case roles which seem critical to meaning interpretation of
sentences are the Agent (the actor, the subject of an active sentence), the
Neutral or Patient (the direct object of an active sentence), the Experiencer (the psychological recipient of the action of a verb), the Benefactive
(the animate recipient of the Neutral object), and the Locative (an element
in the determination of source, goal and state). Sidner in her work with
natural language computer processing, has made the claim that the
discourse "Focus" is placed in the neutral case, if other "focus-specifying"
structures are absent. Focus specifiers are present in sentences which
originally specify the topic (focus) of a discourse. Where these structures
are absent, the neutral argument is the default choice for focus detection.
Where the neutral case is absent, a hierarchy of candidate antecedents
applies. The neutral argument of a neutral complement is a traditional
focus specifier of choice.
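Sidner's default rule lends itself to a compact statement. The sketch below is an editorial illustration, not Sidner's implementation; the case-frame representation and the ordering of roles after the neutral case are assumptions.

```python
# Hypothetical case frame for one sentence: a mapping from case-role
# names to their fillers.  Only "neutral first" is given by the abstract;
# the rest of the ordering is an illustrative assumption.
FOCUS_HIERARCHY = ["neutral", "experiencer", "benefactive", "agent", "locative"]

def default_focus(case_frame, focus_specifier=None):
    """Pick the discourse focus for a sentence: an explicit
    focus-specifying structure wins; otherwise the neutral (patient)
    argument is the default; in its absence a hierarchy of candidate
    roles applies."""
    if focus_specifier is not None:
        return focus_specifier
    for role in FOCUS_HIERARCHY:
        if role in case_frame:
            return case_frame[role]
    return None

print(default_focus({"agent": "the teacher", "neutral": "a story"}))  # -> a story
```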
In this study, a test was developed to assess the subjects' use of functional (case) roles of sentence constituents in language processing.
Instructional materials for teaching functional role detection and focus
specification were written and used as an experimental curriculum. Analysis of pre- and post-test data indicates that making functional role
detection explicit may contribute to the language proficiency of deaf
students.
Computer Thought: Propositional Attitudes
and Meta-knowledge
Eric Stanley Dietrich
The University of Arizona Ph.D. 1985,
220 pages
Philosophy, Computer Science
University Microfilms International
ADG86-03337
Conventions and Speech Acts
Seumas Roderick Macdonald Miller
University of Melbourne (Australia) Ph.D.
1985, 401 pages
Philosophy
University Microfilms International
ADG86-08982
Reference and Intentions to Refer: an
Analysis of the Role of Intentions to
Refer in a Theory of Reference
Corliss Gayda Swain
Loyola University of Chicago Ph.D. 1986,
275 pages
Philosophy
University Microfilms International
ADG86-05559
Though artificial intelligence scientists frequently use words such as
"belief" and "desire" when describing the computational capacities of their
programs and computers, they have completely ignored the philosophical
and psychological theories of belief and desire. Hence, their explanations
of computational capacities which use these terms are frequently little
better than folk-psychological explanations. Conversely, though philosophers and psychologists attempt to couch their theories of belief and desire
in computational terms, they have consistently misunderstood the notions
of computation and computational semantics. Hence, their theories of
such attitudes are frequently inadequate.
A computational theory of propositional attitudes (belief and desire) is
presented here. It is argued that the theory of propositional attitudes put
forth by philosophers and psychologists entails that propositional attitudes
are a kind of abstract data type. This refined computational view of
propositional attitudes bridges the gap between artificial intelligence,
philosophy and psychology.
Lastly, it is argued that this theory of propositional attitudes has consequences for meta-processing and consciousness in computers.
Conventions play a large part in our lives. Our mode of dress, manner of
eating, and linguistic performances, for example, are all governed by
conventions. In Parts A and B of the thesis, a theory of convention is
provided. In Part C the primary concern is with the question of the
conventionality of speech acts.
Part C includes a discussion of the convention to truth-tell, and an
attempt to develop a theory of assertion taking H. P. Grice's account of
speaker-meaning as a starting point.
The theory of convention put forward in Parts A and B arises out of a
detailed treatment of David Lewis's book, Convention. Lewis's
theory analyses conventions in terms of preferences and expectations. For
example, I drive on the left because I prefer to do so, given others do
so, and I expect others to do so. In Parts A and B it is argued that: (1)
Lewis' preference structures need replacement. (2) The notion of a collective end needs to be introduced. (3) Convention followers' expectations
depend on their having acquired "standing procedures" to conform. An
important characteristic of such procedures is that if an agent A has a
standing procedure to X, then there is a presumption in favour of A's
X-ing.
This dissertation challenges the claim that reference is determined by
intentions to refer by using a 'divide and conquer' strategy. The claim that
reference is determined by intentions to refer is divided into two claims:
one is a claim about how reference is disambiguated; the other is about
how expressions in a language get their reference potential. By dividing
the claims in this way, we can see in what contexts, and to what extent,
reference is determined by intentions.
The first claim, that reference is disambiguated by what a speaker
intends to refer to, is the more plausible one. Part I of the dissertation
clarifies and defends this claim. It rules out non-intentionalist accounts,
which try to explain disambiguation in terms of non-intentional contextual
factors alone, because the features of the context to which these accounts
appeal are themselves ambiguous. Nonetheless, it argues that contextual
features are important non-linguistic determinants of reference. Part I
concludes that the speaker's intentions do play a role in determining reference, but it also concludes that when linguistic as well as non-linguistic
determinants of reference are taken into account, the role that speakers'
intentions play in determining reference turns out to be quite small.
Part II of the dissertation refutes the claim that the set of possible referents for an expression in a language is determined by what some group of
people (either a majority of them or the 'experts') intend to refer to with
that expression. Part II argues that such accounts are either circular - they explain semantic reference in terms of speaker's reference while basing speaker's reference on semantic reference - or they presuppose an untenable view of the way minds are related to the world.
Case-Based Reasoning: a Computer Model
of Subjective Assessment
William Michael Bain
Yale University Ph.D. 1986, 324 pages
Computer Science
DAI V47(08), SecB, pp3427
University Microfilms International
ADG86-27257
Consolidation: a Method for Reasoning
about the Behavior of Devices
Thomas Clare Bylander
The Ohio State University Ph.D. 1986,
194 pages
Computer Science
DAI V47(07), SecB, pp2993
University Microfilms International
ADG86-25188
People tend to improve their abilities to reason about situations by amassing experiences in reasoning. The more situations a person knows about,
the more he can account for feature differences between new data and old
knowledge. Resorting to previous instances of similar situations for guidance is known as case-based reasoning. A computer program that can
improve its ability to reason must also have access to situations which it
has previously reasoned about. Previous experiences thus require some
mechanism for orderly storage and retrieval. The inability to save and
modify reasoning chains for future use represents a serious shortcoming of
most, if not all, rule-based expert systems.
This research has involved modelling by computer the behavior of judges who sentence criminals. We have viewed this task as one in which
people learn empirically from the process of producing relative assessments
of input situations with respect to several concerns. What differentiates
this task from many other reasoning tasks is that it provides little external
feedback. People can perform such subjective tasks by at least trying to
keep their assessments consistent; as a result, they often resort to using
case-based reasoning. For assessment tasks, this reasoning style involves
comparing a previous similar situation with an input one, and then extracting an assessment for the new input, based on both the assessment previously assigned to the older example, and differences found between them.
The JUDGE system is an implementation of a case-based reasoning model
for sentencing criminal cases in this manner. The system also stores input
items to reflect their relationships to situations already contained in memory.
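The assessment cycle described here (retrieve the most similar prior case, extract an adjusted assessment, store the new case) can be sketched in a few lines of Python. This is an editorial toy, not the JUDGE system; the offense features, the similarity measure, and the linear adjustment rule are invented for illustration.

```python
# A stored case: a feature vector describing the offense plus the
# sentence (in months) previously assessed for it.  All feature names
# and numbers are illustrative assumptions.
CASE_MEMORY = [
    ({"violence": 3, "premeditation": 2, "harm": 4}, 36),
    ({"violence": 1, "premeditation": 0, "harm": 1}, 6),
]

def similarity(a, b):
    return -sum(abs(a[f] - b[f]) for f in a)   # smaller distance, more similar

def assess(new_case):
    """Retrieve the most similar prior case and extract an assessment for
    the input from the old assessment plus the feature differences."""
    old_features, old_sentence = max(
        CASE_MEMORY, key=lambda c: similarity(c[0], new_case))
    # Toy adjustment rule: each point of extra severity adds three months.
    delta = sum(new_case[f] - old_features[f] for f in new_case)
    new_sentence = max(0, old_sentence + 3 * delta)
    CASE_MEMORY.append((new_case, new_sentence))  # store input for later reuse
    return new_sentence

print(assess({"violence": 2, "premeditation": 2, "harm": 4}))  # -> 33
```

Because every assessed input is itself stored, later judgments are kept consistent with earlier ones, which is the point of the case-based style for tasks with little external feedback.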
Research on Naive Physics attempts to answer the questions: How do people reason about physical phenomena? How can computers be
endowed with similar facilities? Artificial Intelligence research on Naive
Physics concentrates on the second question, and by doing so, also seeks to
achieve significant insight on the first.
This research addresses one problem of Naive Physics, that of deriving
the "potential behavior" of a device given the structure of the device and
the potential behavior of its parts. The potential behavior of a physical
object describes the object's behavioral characteristics without making
assumptions about the behavior of other objects external to that object.
The reasoning process that this research proposes is based on two strategies. The consolidation strategy is to select a "composite component"
consisting of two components and infer the potential behavior of the
composite from the potential behavior of its subcomponents. Successful
application of consolidation on increasingly larger composite components
results in inferring the potential behavior of the whole device.
The other strategy is to represent potential behavior with a small
number of "types of behavior" that allow behavioral interactions to be
described by rules of composition. A type of behavior is an action on a
substance at some location or on some path. The rules of composition,
called "causal patterns," describe how one type of behavior can arise from
a structural combination of other types of behavior. For example, the
"pump move" causal pattern states that a " m o v e " behavior can arise from
an "allow" behavior and a " p u m p " behavior if both behaviors are on the
same path and if the path goes from a potential source to a potential sink.
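A causal pattern of this kind is essentially a guarded rule of composition over behavior types. The Python sketch below is an editorial rendering of the "pump move" pattern as just described; the Behavior representation and the way source-to-sink paths are supplied are assumptions, not the dissertation's formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Behavior:
    """A type of behavior: an action on a substance along some path."""
    action: str       # e.g. "allow", "pump", "move"
    substance: str
    path: str

def pump_move(b1, b2, source_to_sink_paths):
    """The "pump move" causal pattern: a "move" behavior arises from an
    "allow" behavior and a "pump" behavior on the same path, provided
    the path runs from a potential source to a potential sink."""
    if ({b1.action, b2.action} == {"allow", "pump"}
            and b1.path == b2.path
            and b1.substance == b2.substance
            and b1.path in source_to_sink_paths):
        return Behavior("move", b1.substance, b1.path)
    return None

allow = Behavior("allow", "water", "pipe1")
pump = Behavior("pump", "water", "pipe1")
print(pump_move(allow, pump, {"pipe1"}))
# -> Behavior(action='move', substance='water', path='pipe1')
```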
These two strategies are incorporated into an overall framework for
representing simple devices and reasoning about their potential behavior.
In addition to introducing the consolidation framework and presenting
examples of applying it, this dissertation also discusses the kinds of Artificial Intelligence theories that are appropriate for Naive Physics, carefully
compares consolidation with qualitative simulation, and lists several areas
for future research, including suggestions on how to overcome shortcomings of the proposed consolidation framework.
Temporal Imagery: an Approach to
Reasoning about Time for Planning and
Problem Solving
Thomas Linas Dean
Yale University Ph.D. 1986, 299 pages
Computer Science
DAI V47(08), SecB, pp3428
University Microfilms International
ADG86-27245
Refinement of Expert System Knowledge
Bases: a Metalinguistic Framework for
Heuristic Analysis
Allen Ginsberg
Rutgers University, the State U. of New Jersey
(New Brunswick) Ph.D. 1986, 276 pages
Computer Science
DAI V47(06), SecB, pp2509
University Microfilms International
ADG86-20034
Reasoning about time typically involves drawing conclusions on the basis
of incomplete information. Uncertainty arises in the form of ignorance,
indeterminacy, and indecision. Despite the lack of complete information a
problem solver is continually forced to make predictions in order to pursue
hypotheses and plan for the future. Such predictions are frequently
contravened by subsequent evidence. This dissertation presents a computational approach to temporal reasoning that directly confronts these
issues. The approach centers around techniques for managing a data base
of assertions corresponding to the occurrence of events and the persistence
of their effects over time. The resulting computational framework
performs the temporal analog of (static) reason maintenance (Doyle 1979)
by keeping track of dependency information involving assumptions about
the truth of facts spanning various intervals of time.
The system developed in this dissertation extends classical predicate-calculus data bases, such as those used by Prolog (Brown 1981), to deal with time in an efficient and natural manner. The techniques presented here constitute a solution to the problem of updating a representation of the world
changing over time as a consequence of various processes, otherwise
known as the frame problem (McCarthy 1969). These techniques
subsume the functionality of current approaches to dealing with time in
planning (e.g., Sacerdoti 1977, Tate 1977, Vere 1983, Allen 1983).
Applications in robot problem solving are stressed, but examples drawn
from other application areas are used to demonstrate the generality of the
techniques. The issues involved in processing temporal queries, propagating metric constraints, noticing the invalidation of default assumptions, and
reasoning with incomplete knowledge are discussed in conjunction with the
presentation of algorithms.
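The persistence bookkeeping described in the first paragraph of this abstract can be caricatured with a toy time map in which an event's effect persists until a contravening event clips it. This editorial Python sketch stands in for, and greatly simplifies, the dissertation's dependency-tracking machinery.

```python
import math

class TimeMap:
    """Toy time map: each fact persists from the event that caused it
    until it is explicitly clipped by later contravening evidence."""
    def __init__(self):
        self.tokens = []                    # mutable [fact, start, end] triples

    def assert_effect(self, fact, start):
        # By default an effect persists indefinitely (until clipped).
        self.tokens.append([fact, start, math.inf])

    def clip(self, fact, time):
        # A contravening event ends the persistence of an earlier assertion.
        for token in self.tokens:
            if token[0] == fact and token[1] <= time < token[2]:
                token[2] = time

    def holds(self, fact, time):
        return any(f == fact and s <= time < e for f, s, e in self.tokens)

tm = TimeMap()
tm.assert_effect("door_open", 1)   # effect of an opening event at t=1
tm.clip("door_open", 5)            # a closing event at t=5 contravenes it
print(tm.holds("door_open", 3), tm.holds("door_open", 7))   # True False
```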
Knowledge base refinement involves the generation, testing, and possible
incorporation of plausible refinements to the rules in a knowledge base
with the intention of thereby improving the empirical adequacy of an
expert or knowledge-based system, i.e., its ability to correctly diagnose or
classify the cases in its domain of expertise.
The research presented in this thesis contributes to the development of
useful knowledge base refinement systems both at the concrete level of
system design, implementation, and testing, and also at the "meta-level" of
development of tools and methodologies for pursuing research in this area.
Relative to the former level, the following contributions have been made:
(1) the empirically-grounded heuristic approach to refinement generation
developed by Politakis and Weiss has been generalized and extended, i.e.,
the approach has been made applicable to a more powerful rule representation language, and heuristics encompassing a larger class of refinement
operations have been incorporated, (2) an automatic refinement system
utilizing this approach has been implemented and, based upon preliminary
testing, has been shown to be capable of generating effective refinements.
Relative to the level of tools and methodology, a high-level Refinement
Metalanguage, RM, allowing for the specification of a wide variety of alternative refinement concepts, heuristics, and strategies, has been designed
and implemented. In addition to allowing for the growth of refinement
systems by facilitating experimental research, RM also provides a means
for refinement system customization and possible enhancement through
the incorporation of domain-specific metaknowledge. The incorporation
of a formal metalanguage for knowledge base refinement represents an
extension of the traditional model of an expert system framework, and is a
step in the direction of more powerful, robust, and self-improving expert
system technology.
Syntactic Extensions in the Programming
Language Lisp
Eugene Edmund Kohlbecker Jr.
Indiana University Ph.D. 1986, 228 pages
Computer Science
DAI V47(08), SecB, pp3430
University Microfilms International
ADG86-27998
A Non-cognitive Formal Approach to
Knowledge Representation in Artificial
Intelligence
Jim A. McMannama
Air Force Institute of Technology Ph.D. 1986,
309 pages
Computer Science
DAI V47(05), SecB, pp2060
University Microfilms International
ADG86-17749
The traditional macro processing systems used in Lisp-family languages
have a number of shortcomings. We identify five problems with the declaration tools customarily available to programmers. First, the declarations
themselves are hard to read and write. Second, the declarations provide
little explicit information about the form macro calls are to take. Third,
syntactic checking of macro calls is usually ignored. Fourth, the notion of
a macro binding for an identifier gives rise to a poor understanding of what
macros really should be. Fifth, the unrestricted capabilities of the language
used to declare macros cause some to take advantage of macros in ways
inconsistent with their role as textual abstractions. Furthermore, the
conventional algorithm used for the expansion of macro calls within Lisp
often causes the inadvertent capture of an identifier appearing within the
macro call by a macro-generated, binding instance of the same identifier.
Lisp programmers have developed a few techniques for avoiding this problem, but they all have depended upon the macro writer taking some sort of
special preventative action.
We examine several existing macro processors, both inside and outside
of the Lisp-family. We then enumerate a set of design principles for macro
processing systems. These principles are general enough that they apply to
the organization of macro processing systems for a large number of high-level languages. Taking our principles as guidelines, we design a new
macro processing system for Lisp. The new macro declaration tool
addresses each of the five problems from which the traditional tools suffer.
A description of the use of our tool and an annotated presentation of its
implementation are provided. We also develop a new macro expansion
algorithm that eliminates the capturing problem. The macro expander has
the responsibility for avoiding the unwanted capture of identifiers appearing within macro calls.
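The capture problem, and the renaming discipline that eliminates it, can be illustrated outside Lisp with naive template substitution. In this editorial Python sketch the classic swap macro is expanded as text; the template, the tmp identifier, and the gensym-style renaming are illustrative stand-ins for the dissertation's expansion algorithm.

```python
import itertools

# Naive expansion of a hypothetical "swap" macro that introduces a
# temporary binding named "tmp":
#   (swap a b)  =>  (let ((tmp a)) (setq a b) (setq b tmp))
TEMPLATE = "(let ((tmp {x})) (setq {x} {y}) (setq {y} tmp))"

def naive_expand(x, y):
    return TEMPLATE.format(x=x, y=y)

# If the caller's own variable is named "tmp", it is captured by the
# macro-generated binding instance of the same identifier:
print(naive_expand("tmp", "b"))
# (let ((tmp tmp)) (setq tmp b) (setq b tmp))   -- wrong: no swap occurs

counter = itertools.count()

def hygienic_expand(x, y):
    # The expander, not the macro writer, renames the generated binding
    # so it can never capture an identifier from the macro call.
    fresh = f"tmp#{next(counter)}"
    return TEMPLATE.replace("tmp", fresh).format(x=x, y=y)

print(hygienic_expand("tmp", "b"))
# (let ((tmp#0 tmp)) (setq tmp b) (setq b tmp#0))   -- correct swap
```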
With the entry of Artificial Intelligence (AI) into real-time applications, a
rigorous analysis of AI expert systems is required in order to validate them
for operational use. To satisfy this requirement for analysis of the associated knowledge representations, the techniques of formal language theory
are used. A combination of theorems, proofs, and problem-solving techniques from formal language theory is employed to analyze language
equivalents of the more commonly used AI knowledge representations of
production rules (excluding working memory or situation data) and semantic networks.
Using formal language characteristics, it is shown that no single
support-tool or automatic-programming tool can ever be constructed that
can handle all possible production-rule or semantic-network variations.
Additionally, it is shown that the entire set of finite production-rule languages can be stored in and retrieved from finite semantic-network languages. In effect, the semantic-network structure is shown to be a
viable candidate for a centralized database of knowledge.
Compiling Queries in Indefinite Deductive
Databases under the Generalized Closed
World Assumption
Hyung-Sik Park
Northwestern University Ph.D. 1986,
118 pages
Computer Science
DAI V47(08), SecB, pp3433
University Microfilms International
ADG86-27386
This research report presents several fundamental results on compiling
queries that will correctly answer "true", indefinite, and "false" in IDDB
(Indefinite Deductive Databases) under the GCWA (Generalized Closed
World Assumption). IDDB does not allow function symbols, but does
allow non-Horn clauses. Further, although the GCWA is used to derive
negative assumptions, we also allow negative clauses to occur explicitly.
Our goal is to develop effective techniques for compiling queries in such IDDB.
We show a fundamental relationship between indefiniteness and inference engines in IDDB under the GCWA. We introduce two basic notions
of NH (Non-Horn) and PSUB (Potential Subsumption) sets providing a
basis for compilation, and consider three representation alternatives to
separate the CDB (Clausal DB) from the RDB (Relational DB). We introduce a saturated resolution method to compile unit queries on CDB and
evaluate them through the RDB in non-recursive IDDB, and develop five
primitive NH-reduction rules and two NH-inheritance rules. We also present a basic idea on compiling unit queries in recursive IDDB by the pattern
generation method. Finally, we introduce the decomposition and evaluation theorems to evaluate disjunctive and conjunctive queries by decomposing them into their unit subqueries and utilizing the compiled
information for such subqueries.
Towards a Natural Language Interface for
Computer Aided Design
Tariq Samad
Carnegie-Mellon University Ph.D. 1986,
137 pages
Computer Science
DAI V47(05), SecB, pp2062
University Microfilms International
ADG86-16520
We propose a natural language interface as part of the solution to the
problems posed by the continuing increase in the number and sophistication of CAD tools. The advantages of a natural language interface for
CAD are numerous, but the complexity and the scope of the CAD domain
renders most previous work in natural language interfaces of limited utility;
an approach of much greater generality and power is required. We
describe a natural language interface (named Cleopatra) that we have
developed for the sub-domain of circuit-simulation post-processing. Cleopatra is the first step in a research program the ultimate goal of which
is the development of a natural language interface for an integrated design
environment. Cleopatra significantly extends what is in essence a lexically-driven case-frame parser by incorporating a couple of novel features:
high degrees of flexibility and parallelism. The flexibility of our approach
enables the representation of constraints that, for instance, cannot be
represented by semantic-grammar-based systems, and it also enables the
specification of arbitrary and idiosyncratic actions to guide the parsing
process. The parallelism, which is supplemented with a notion of
"confidence-levels", enables straightforward treatment of most kinds of
ambiguity. Cleopatra can handle simple nominal coordination, substitutional ellipsis, some kinds of subordinate clauses, there-insertion sentences
and wh-frontings, and its abilities make it a useful CAD tool in its own
right, as well as demonstrating the feasibility of our ultimate goal. Extending Cleopatra's linguistic coverage, as well as extending Cleopatra to other
sub-domains of CAD, should be greatly facilitated by the generality and
power of our approach.
Formalization and Representation of
Expert Systems
Miriam R. Tausner
Stevens Institute of Technology Ph.D. 1986,
182 pages
Computer Science
DAI V47(07), SecB, pp2825
University Microfilms International
ADG86-241 76
Based on a critical analysis of the canonical forms of expert systems, definitions of classical forward-chaining production rule based expert systems
and classical backward-chaining production rule based expert systems are
isolated as the fundamental basic definitions of an expert system. Two
representation theorems are presented for the two fundamental types of
expert systems defined. Both the classical forward-chaining production
rule based expert systems and classical backward-chaining production rule
based expert systems are shown to be representable as type 3 languages
(finite state automata). The pragmatic usefulness of the finite state representation of an expert system is established in the design of a multilevel
expert system for general systems problems solving. A type 3 language is
used to encapsulate the knowledge base and reasoning strategies of the
front-end of the expert system, thus representing the front-end as a deterministic finite automaton. This provides a new approach to the problem of
interfacing multilevel expert systems.
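The intuition behind the representation theorems can be illustrated editorially: over a fixed finite set of facts, a forward-chaining rule base has only finitely many working-memory configurations, so a run of the inference engine is a walk through the states of a finite automaton. The rule format and the first-applicable-rule firing policy in this Python sketch are assumptions, not the dissertation's definitions.

```python
# Rules over a finite fact set: (conditions, conclusion) pairs.
RULES = [
    ({"a"}, "b"),          # if a then b
    ({"a", "b"}, "c"),     # if a and b then c
]

def fire_once(memory):
    """One automaton transition: fire the first applicable rule; if no
    rule applies, the state is a halt (fixed-point) state."""
    for conditions, conclusion in RULES:
        if conditions <= memory and conclusion not in memory:
            return memory | {conclusion}
    return memory

def run(initial):
    """Walk the finite state space: each state is a frozenset of facts."""
    state = frozenset(initial)
    while True:
        nxt = frozenset(fire_once(set(state)))
        if nxt == state:
            return state               # fixed point reached
        state = nxt

print(sorted(run({"a"})))              # -> ['a', 'b', 'c']
```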
A Structured Memory Access Architecture
for Lisp
Matthew Jacob Thazhuthaveetil
The University of Wisconsin - Madison Ph.D.
1986, 186 pages
Computer Science
DAI V47(07), SecB, pp3004
University Microfilms International
ADG86-18296
A Speech Error Correction Algorithm for
Natural Language Input Processing
Peter James Wetterlind
Texas A&M University Ph.D. 1986, 119 pages
Computer Science
DAI V47(07), SecB, pp3004
University Microfilms International
ADG86-25455
Lisp has been a popular programming language for well over 20 years.
The power and popularity of Lisp are derived from its extensibility and
flexibility. These two features also contribute to the large semantic gap
that separates Lisp from the conventional von Neumann machine, typically
leading to the inefficient execution of Lisp programs. This dissertation
investigates how the semantic gap can be bridged.
We identify function calling, environment maintenance, list access, and
heap maintenance as the four key run-time demands of Lisp programs, and
survey the techniques that have been developed to meet them in current
Lisp machines. Previous studies have revealed that Lisp list access streams
show spatial locality as well as temporal locality of access. While the presence of temporal locality suggests the use of fast buffer memories, the
spatial locality displayed by a Lisp program is implementation dependent
and hence difficult for a computer architect to exploit. We introduce the
concept of structural locality as a generalization of spatial locality, and
describe techniques that were used to analyse the structural locality shown
by the list access streams generated from a suite of benchmark Lisp
programs. This analysis suggests architectural features for improved Lisp
execution.
The SMALL Lisp machine architecture incorporates these features. It
partitions functionality across two specialised processing elements whose
overlapped execution leads to efficient Lisp program evaluation. Trace-driven simulations of the SMALL architecture reveal the advantages of this
partition. In addition, SMALL appears to be a suitable basis for the development of a multi-processing Lisp system.
Computerized processing of human speech input may be accomplished by
(1) recognizing the phoneme sounds in the speech signals, (2) correctly
identifying the words in each spoken sentence, (3) interpreting the meaning of the sentence, and (4) generating proper responses for each utterance.
Individual speakers talk differently, and even an individual's enunciation
patterns change with differing environments and discourse domains. These
differences are called speaker idiosyncrasies. Regional speech dialects are included as a speaker idiosyncrasy. The source of such speaker differences
has been identified as an individual's pronunciations of the vowel
phonemes. Computerized speech processors treat these speaker idiosyncrasies as errors when the input phoneme sounds are unrecognizable. A
generalized speech recognition system must accommodate such speech
errors and, more particularly, ignore the speaker dependent pronunciations
of phonemes. This implies that vowel phoneme pronunciations, the source
of speaker idiosyncrasies and speech processing errors, should be overlooked during recognition of vocalized sentences.
The research experiment consisted of construction of a system for identifying a natural language sentence using only speaker independent
phonemes as the input. The motivating hypothesis for the experiment is
that spoken sentences can be recognized from limited phoneme input. The
research system accepts only strings of consonant phonemes, which are
recognizable in a speaker independent environment. The original "spoken"
sentence is reproduced from the consonant phonemes and formatted as a
word sequence for subsequent transmission to a natural language processing system. The system uses a vocabulary of general words and an
expandable dictionary of domain specific words during the sentence reconstruction process.
The research conclusions are that such a system can be built, and that
the useful vocabulary must be expandable as the recognition system
becomes more frequently used.
The research system is intended as an interface between existing acoustic phoneme recognizers and existing natural language processors. The
system accomplishes word recognition using only the consonant phonemes
from continuous speech sentences, and generates word sequences in
sentence form for output to an existing natural language processor. The
domain specific vocabulary subsets used by the system facilitate its use as a
sentence pre-processor especially with natural language understanding
systems which rely on scripts, and the associated domain specific vocabularies, for semantic processing of topic oriented sentence groups.
EPILOG: a Parallel Interpreter for Logic
Programs
Michael J. Wise
University of New South Wales (Australia)
Ph.D. 1985
Computer Science
DAI V47(05), SecB, pp2063
This item is not available from University
Microfilms International.
ADG05-58882
Knowledge Representation using Linguistic
Fuzzy Relations
Wen-Ran Zhang
University of South Carolina Ph.D. 1986,
118 pages
Computer Science
DAI V47(08), SecB, pp3437
University Microfilms International
ADG86-26303
Through combining the logic programming language Prolog with a data
driven execution mechanism we may be closer to solving the problems
encountered when designing tightly coupled multiprocessors involving
more than a trivial number of processor elements. This is the central idea
around which the work is constructed.
The report begins with a review of current multiprocessors and a description of one of the more attractive alternative models - the data-flow
model of computation. Among the early chapters there is also an informal
introduction to Prolog seen from the point of view of the unification algorithm. Discussion then moves on to the substantive issues of the thesis - a
critique of the problems found in the data-flow model and the properties of
Prolog that make it an attractive solution. The synthesis of these two
concepts is the EPILOG model which, stated simply, substitutes breadth
first execution for Prolog's depth first pattern, and then provides mechanisms for controlling the abundant parallelism that would otherwise lead
to combinatorial explosion. The EPILOG model is described in detail. This
EPILOG model is, however, just the first, more abstract stage - the so
called "basic" model. The next stage is to fit the basic model onto specific
architectures. This is done via a simulation written in Pascal, which is
described together with the set of underlying assumptions about the architectures being simulated. Results are then presented for the first set of
experiments and some tentative conclusions drawn. Finally, the related
work of other authors is reviewed in the light of the earlier discussion of
EPILOG.
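The substitution of breadth-first for depth-first search can be illustrated on a propositional Horn program. This editorial Python sketch omits unification and EPILOG's parallelism-control mechanisms; the toy program and goal names are invented.

```python
from collections import deque

# A propositional Horn program: head -> list of alternative bodies.
# Facts are heads with an empty body; "edge_ab" has no clause at all.
PROGRAM = {
    "path_ab": [["edge_ab"], ["edge_ac", "path_cb"]],
    "path_cb": [["edge_cb"]],
    "edge_ac": [[]],
    "edge_cb": [[]],
}

def bfs_solve(goal):
    """Explore the SLD tree breadth first: every frontier node is a goal
    list, and all alternative clauses for the first goal are expanded as
    siblings before any deeper node is visited (Prolog would instead
    dive depth first into the first alternative)."""
    frontier = deque([[goal]])
    while frontier:
        goals = frontier.popleft()
        if not goals:
            return True                    # empty goal list: success
        first, rest = goals[0], goals[1:]
        for body in PROGRAM.get(first, []):
            frontier.append(body + rest)   # all alternatives, side by side
    return False

print(bfs_solve("path_ab"))                # -> True
```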
This dissertation presents a theoretical framework for semantic representation, linguistic computation, knowledge representation, and approximate
reasoning about object relations in knowledge engineering. The notions of
term sets are extended; the notions of Linguistic Fuzzy Relation (LFR),
Linguistic Fuzzy Similarity Relation (LSR), and Linguistic Transitive
Closure (LTC) are proposed based on the theory of numerical fuzzy relation, numerical similarity relation, the extension principle, and the extended term set definitions.
Theorems are given that provide conditions for the existence and
uniqueness of the LTCs of an LFR under three different operations of
extended max-min, extended max-product, and extended max-A; two algorithms for obtaining the LTCs are presented, and some interesting features
of different LTCs are identified and illustrated by numerical examples.
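For the numerical base case on which the linguistic extensions rest, a max-min transitive closure can be computed by iterating composition to a fixed point. The following editorial Python sketch handles only numerical fuzzy relations, not the linguistic (extended) operations of the dissertation.

```python
def max_min_compose(r, s):
    """Max-min composition of fuzzy relations given as square matrices:
    (r o s)[i][j] = max over k of min(r[i][k], s[k][j])."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def transitive_closure(r):
    """Iterate r <- r union (r o r) until a fixed point: the max-min
    transitive closure of a numerical fuzzy relation."""
    n = len(r)
    closure = [row[:] for row in r]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [[max(closure[i][j], composed[i][j]) for j in range(n)]
                   for i in range(n)]
        if updated == closure:
            return closure
        closure = updated

R = [[1.0, 0.8, 0.0],
     [0.0, 1.0, 0.6],
     [0.0, 0.0, 1.0]]
print(transitive_closure(R))   # entry (0,2) becomes min(0.8, 0.6) = 0.6
```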
POOL - a semantic model for approximate reasoning - is proposed
based on the theory of LFRs. A prototype system has been implemented
in Franz Lisp under the UNIX(1) Operating System (Berkeley 4.2 bsd) on a SUN 2/120 workstation. Results confirm that the proposed model can
provide knowledge-based systems with both representational and inferential power.
(1) UNIX is a trademark of AT&T Bell Laboratories.
Computer Generation of Meta-technical
Utterances in Tutoring Mathematics
Ingrid Zukerman
University of California, Los Angeles Ph.D.
1986, 162 pages
Computer Science
DAI V47(06), SecB, pp2516
University Microfilms International
ADG86-21162
Instantiating Maps and Text
R. Robert Abel
Arizona State University Ph.D. 1986,
156 pages
Education, Psychology
DAI V47(05), SecA, pp1651
University Microfilms International
ADG86-16447
A technical discussion often contains conversational expressions like:
"however", "as I have stated before", "next", etc. These expressions,
denoted Meta-Technical Utterances (MTUs), carry important information
which the listener uses to speed up the comprehension process. The goal
of this research is to understand the semantics of text containing MTUs, the
mechanisms by which people generate them, and the processes required
for generating them mechanically. To achieve this goal, we model the meaning
of MTUs in terms of their anticipated effect on the listener's comprehension,
and use these predictions to select MTUs and embed them in a computer
generated discourse. This paradigm was implemented in a computer
system called FIGMENT, which generates commentaries on the solution of
algebraic equations.
We classify MTUs according to their function, as seen by the speaker, in
transmitting the subject matter to the listener, and distinguish among three
main types of MTUs: (1) Knowledge Organization, (2) Knowledge Acquisition, and (3) Affect Maintenance. Knowledge-Organization MTUs
reflect the organization of the material in the speaker's mind (e.g.,
"however," "in order to"), Knowledge-Acquisition MTUs provide information that enables the listener to prepare adequate knowledge-assimilating
facilities (e.g., "we shall now introduce," "as I have stated before"), and
Affect-Maintenance MTUs convey the affective impact of an event
(e.g., "fortunately"), and foster a positive attitude in the listener (e.g., "I
shall go over this explanation again").
This classification governs the generation of MTUs in the following
manner: Knowledge-Organization and some Affect-Maintenance MTUs
are generated directly from the organization of the system's knowledge of
the subject matter; Knowledge-Acquisition MTUs and the majority of
Affect-Maintenance MTUs are generated by consulting simplified models
of some mental processes which the user presumably activates upon
encountering a technical message. For example, determining the context in
which a technical message should be processed, building up motivation to
attend to the next item of discourse, and so on.
The main contribution of this dissertation is the presentation of an
explicit model for the generation of MTUs. This model can be incorporated into a text-generation facility to enable the generation of fluent and
coherent discourse.
The purpose of this research was to investigate the conjoint retention of
spatial and linguistic information in an instructional context. In Experiment 1, subjects wrote either physical descriptions, fictional narratives, or
personalized narratives while processing a reference map. Results indicated that both types of narrative processing significantly increased overall
map recall, supporting the prediction that richer semantic processing leads
to more effective spatial learning. However, the act of personalizing the
map space and concomitant semantic processing was, in fact,
detrimental to locational memory. Experiment 2 showed maps capable of
improving learning of general or abstract text, and lent further substance to
the notion that, under correct learning conditions, map features and related
text are stored as conjoint units in memory. Subjects producing idiosyncratic map features and placing them on the map as exemplars of text
concepts were much more likely to retain both perceptual map information
and conceptual text information. Data on order of recall indicated that
subjects who viewed intact maps as they processed text were able to rely
on both the map structure and the serial order of the original passage as a
guide for free recall. However, subjects viewing lists of identical map
features displayed only the use of passage structure for text recall. This
difference appears attributable to the availability of map images for
subjects viewing maps at encoding, and the lack of imaginal support for the
list group. Finally, generating features and their spatial locations led to
superior locational accuracy at recall. Both studies support the conjoint
retention hypothesis which states that probability of recalling information
from maps or related text is predictable directly from the trace strength of
jointly encoded imaginal and verbal memory representations.
Assessing Lexical Knowledge
Merlynn Rosell Bergen
Stanford University Ph.D. 1986, 294 pages
Education, Psychology
DAI V47(06), SecA, pp2082
University Microfilms International
ADG86-19716
A Conceptual Database Design and Analysis Methodology (Volumes I and II).
Rob H. Rucker
Arizona State University Ph.D. 1986,
583 pages
Engineering, Industrial
DAI V47(06), SecB, pp2574
University Microfilms International
ADG86-22037
The words we use are hypothesized to lie within a richly interlinked semantic network. To answer the question "What does a student know when he
or she knows the meaning of a word?" the present study used a structured
interview to provide a frame within which 48 4th- and 6th-grade students
gave detailed descriptions of their semantic networks for 32 words from
their basal readers.
A child learns the meanings of most words from daily experiences. Only
later, as a result of schooling, does he or she learn the more formal uses of
language. A lexical model that predicts separability of natural and formal
lexical knowledge was introduced and the two aspects differentially measured.
The purposes were: (a) to assess the lexical knowledge of students who
varied in vocabulary ability, grade, and gender; (b) to determine performance differences when difficulty of the task was manipulated through the
use of word (form class, difficulty, word origin), presentation (modality,
context), and task factors; and (c) to test the Lexical Model.
The results were: (a) students' formal lexical knowledge was below
mastery for these known words, but student protocols indicated a rich web
of natural lexical knowledge (high ability students reliably outperformed
average ability students, grade differences were significant only for the
formal word knowledge measures, and gender was never a significant
source of variance); (b) factors designed to vary the difficulty of the task
did not yield consistently significant performance differences; and (c)
factor analyses of the dependent measures indicated separability of natural
and formal word knowledge as had been predicted by the Lexical Model.
Measuring lexical knowledge in the detailed fashion of the present study
has been neglected in earlier research. The hypothesis that natural word
knowledge skills are separable from formal has not previously been tested
in this way. As all of the students provided evidence of a rich web of
natural word knowledge, deliberately building formal lexical skills from
within this semantically-interlinked base, rather than separately from it,
would appear to be a useful pedagogical strategy.
This dissertation presents a methodology, and describes an environment,
useful for conceptual data base design at the requirements analysis level.
The methodology is called DREAMERS. This is an acronym for Domain,
Relation, Entity, Attribute, Mathematical Modeling with Expert System
Support. The methodology allows for the rapid prototyping of a proposed skeleton conceptual design via an implementation within a selected relational data base management system. The methodology includes design,
implementation and analysis components. The sequence of steps in the
methodology proceeds from system specifications, to high level abstract
data/transactions models, to an entity-relation type digraph model, to a
relational conceptual design, to a relational implementation and thence to
analysis of the operational skeleton prototype data base.
Underlying the analysis portion of the methodology is the use of the
mathematics of categories, digraphs, lattices, simplicial complexes and
predicate logic as well as a large scale relational database to aid in the
development process itself. The categorial portion of the mathematics is
used to first classify the models used in the methodology, and then, the
graphical theories are used to derive canonical structures for analysis and
implementation.
Considering the transactions and their associated data structures, doubly
ordered lattices have been constructed (Transaction/Data Maps) that
provide additional graphical insight to the analyst. Other graphical techniques have been employed that create binary matrices from relational
tables or skeleton digraphs via an operation called "complexing". From a
binary matrix, at least three graphical representations - digraphs, simplexes
and lattices - may be derived.
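The "complexing" step can be illustrated editorially: a relational table yields a binary incidence matrix, from which graphical structures are derived. The toy table and the shared-item adjacency rule in this Python sketch are invented for illustration.

```python
# A toy relational table linking transactions to the data items they touch.
TABLE = [("t1", "customer"), ("t1", "order"), ("t2", "order"), ("t2", "part")]
pairs = set(TABLE)

rows = sorted({r for r, _ in TABLE})   # transactions
cols = sorted({c for _, c in TABLE})   # data items

# "Complexing": the binary incidence matrix derived from the table.
M = [[1 if (r, c) in pairs else 0 for c in cols] for r in rows]

# One derivable graphical representation: connect two transactions
# whenever they share a data item (nonzero entries of M * M-transpose).
edges = [(rows[i], rows[j])
         for i in range(len(rows))
         for j in range(i + 1, len(rows))
         if any(M[i][k] and M[j][k] for k in range(len(cols)))]

print(M)       # [[1, 1, 0], [0, 1, 1]]
print(edges)   # [('t1', 't2')]
```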
The environment for the database development discussed here has been
constructed by the author by using the SQL/DS database management
system product together with an interactive interface based on the
language APL2. These two major products are supplemented with the
expert system shell ESE/VM together with additional support from the
packages GRAPHPAK, and REXX. This developmental environment aids
the designer by providing dialog management, graphics, analysis, expert
system, communications, and database management system services.
Parallel Processing of Natural Language
Hui Olivia Chang
Northwestern University Ph.D. 1986,
208 pages
Language, Linguistics
DAI V47(08), SecA, pp3020
University Microfilms International
ADG86-27337
Linguistics and Translation: Some Semantic Problems in Arabic-English Translation
Ahmed Mouakket
Georgetown University Ph.D. 1986,
250 pages
Language, Linguistics
DAI V47(07), SecA, pp2565
University Microfilms International
ADG86-22332
Two types of parallel natural language processing are studied in this work:
(1) the parallelism between syntactic and non-syntactic processing and, (2)
the parallelism within syntactic processing. It is recognized that a syntactic
category can potentially be attached to more than one node in the syntactic
tree of a sentence. Even if all the attachments are syntactically well-formed, non-syntactic factors such as semantic and pragmatic considerations may require one particular attachment. Syntactic processing must
synchronize and communicate with non-syntactic processing. Two syntactic processing algorithms are proposed for use in a parallel environment:
Earley's algorithm and the LR(k) algorithm. Conditions are identified to detect syntactic ambiguity, and the algorithms are augmented accordingly. It is shown that by using non-syntactic information during syntactic
processing, backtracking can be reduced, and the performance of the
syntactic processor is improved.
For the second type of parallelism, it is recognized that one portion of a
grammar can be isolated from the rest of the grammar and be processed by
a separate processor. A partial grammar of a larger grammar is defined.
Parallel syntactic processing is achieved by using two processors concurrently: the main processor (mp) and the auxiliary processor (ap). The
auxiliary processor processes/accepts a substring in the input that is generated by the partial grammar. The main processor is responsible for processing the rest of the input and for interprocessor communication. The
LR(k) algorithm is augmented to the effect that the main processor can
take advantage of the processing result of the auxiliary processor. It is
shown that the performance of the proposed parallel processing is
supported by many of the syntactic constraints in natural languages. In
addition, by recognizing the divisibility of the grammar, parallel parsing
supports partial semantic interpretation during the course of the processing
and is useful for constructing fault-tolerant NLP systems.
Translating from Arabic into English involves certain morphological,
syntactic and semantic problems. To understand these problems, one has
to return to the cultural and social backgrounds of the Arabic language and
try to discover how these may affect the process of translating into English. It is also essential to note that Arabic is a VSO, non-Indo-European
language whose speakers differ in cultural and social behavior from speakers of Western languages.
The problem, therefore, will be threefold: (a) to look into the cultural
and social backgrounds of Arabic and discover the basic elements which
affect the process of translation, (b) to account for the peculiarities of
Arabic lexicon and structure by examining some Arabic texts that have
been translated into the English language by native speakers of English,
and (c) to relate the above descriptions and findings to a theory of translation in the light of what Nida termed "dynamic equivalence." It is on the
areas of cross-cultural communication, connotative meanings, intersentential levels and textual levels that this study will focus most.
Furthermore, recent developments in the fields of theoretical linguistics
and the research done in the fields of Case Grammar and semantics have
paved the way toward a deeper understanding of underlying structures. In
this study, the works of Fillmore (1968, 1971), Chafe (1970), and Cook's
(1979) Matrix Model will furnish some basic concepts of semantic representations in order to account for the analysis of certain data.
The study shows the pragmatic aspects of translating certain Arabic
texts into English. It also gives a short account of the use and application
of translation in some educational areas. Finally, it provides implications
for an interpretable theory which considers translation both an art and a
science.
Computer Assisted Dialect Adaptation: the
Tucanoan Experiment
Robert Bruce Reed
The University of Texas at Arlington Ph.D.
1986, 272 pages
Language, Linguistics
DAI V47(06), SecA, pp2146
University Microfilms International
ADG86-21742
This dissertation provides the theoretical basis for a computer program that adapts textual material from one language of the Tucanoan family to another. Tucanoan languages are spoken by small groups living in southeastern Colombia, northwestern Brazil, northern Peru, and northern Ecuador.
This work represents the first attempt to apply principles of machine translation and computational linguistics to indigenous languages of Colombia. It discusses aspects of translation theory relevant to machine translation. Some features of the Tucanoan languages relevant to the adaptation process are discussed in depth, including differences in suffix systems marking case, noun classifiers, and the evidential systems of the various languages. Of particular interest for automated parsing is the problem of null allomorphs of certain morphemes.
The Semantics of Anaphora in Discourse
Rebecca Louis Root
The University of Texas at Austin Ph.D.
1986, 175 pages
Language, Linguistics
DAI V47(05), SecA, pp1716
University Microfilms International
ADG86-18574
The syntactic variety found in anaphoric relationships in discourses of
more than one sentence is quite great. This is particularly true when plural
anaphors are involved. In addition to the expected forms, there are links
from anaphors to discontinuous antecedents, as in "John found a piano teacher. They are both fond of Mozart", and links from plural anaphors to singular antecedents in distributive contexts, as in "Every girl brought a cake. Coincidentally, they were all chocolate." Despite the surface variation, it is argued here that a uniform account of these constructions is
possible. An analysis of the semantic properties of discourse anaphora is
presented here which offers a unified explanation of both the truth conditions and the acceptability conditions of this phenomenon. The analysis is
framed within the context of the Discourse Representation Theory of Hans
Kamp. In this theory, each sentence contributes to the construction of a
representation of the meaning of the discourse. This is done in part
through the introduction of "reference markers" for each noun phrase.
Truth is defined for such a representation in terms of an embedding of the
representation in a model. In this account, an anaphoric link is truthful if
the set of individuals to which the anaphor's reference marker is mapped,
is the same as the set of individuals to which the reference markers of the
antecedent(s) are mapped in the course of finding an embedding for the
representation. This definition, formalized, together with the assumptions that the discourse is true and that there is a principle of semantic number agreement, constitutes the analysis presented here. In addition to the
formal explanations, it is argued that this approach is attractive from a
computational point of view, and an implementation of the leading ideas is
presented.
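The truth condition just stated can be rendered as a small check over an embedding. This editorial Python sketch invents the reference markers and the model; it illustrates only the set-identity condition and semantic number agreement, not Kamp's construction algorithm.

```python
# A toy embedding maps reference markers to sets of model individuals.
embedding = {
    "x1": {"john"},                        # "John"
    "x2": {"piano_teacher_7"},             # "a piano teacher"
    "x3": {"john", "piano_teacher_7"},     # "they" (discontinuous antecedent)
}

def link_is_truthful(anaphor, antecedents, emb):
    """An anaphoric link is truthful iff the anaphor's marker is mapped
    to the same set of individuals as the union of the sets assigned to
    its antecedents' markers."""
    target = set().union(*(emb[a] for a in antecedents))
    return emb[anaphor] == target

def number_agrees(anaphor_plural, antecedents, emb):
    # Semantic number agreement: a plural anaphor needs a plural target.
    size = len(set().union(*(emb[a] for a in antecedents)))
    return anaphor_plural == (size > 1)

print(link_is_truthful("x3", ["x1", "x2"], embedding))   # -> True
print(number_agrees(True, ["x1", "x2"], embedding))      # -> True
```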
A Prolegomenon to Theory of Translation
Robert Darrell Firmage
The University of Utah Ph.D. 1986,
361 pages
Philosophy
DAI V47(07), SecA, pp2612
University Microfilms International
ADG86-24441
As its title indicates, this dissertation seeks both to determine the scope
and nature of theory of translation, and to provide a basis for future
attempts to produce such a theory. Its general strategy is to clear the
ground for an analysis of the nature of the equivalence obtaining between
a translation and its original, by grounding it in an adequate theory of
language. Thus, its primary focus is on existing theories of meaning and of
truth, particularly as enunciated by contemporary representatives of the
analytic tradition of philosophy.
It is divided into three chapters. Chapter I centers about the problem of
indeterminacy of translation, as introduced by Quine, and serves mainly as
a prospectus of current philosophic discussion involving the notion of
translation. It attempts both to enunciate the major problems and to
review and criticize various significant viewpoints concerning them. Since
theory of translation is shown to involve theory of meaning, Chapter II
attempts to adumbrate a theory of meaning adequate to the needs and
practices of translation. Such a theory, in turn, is shown to involve the
notion of truth in relation to human practice, and hence Chapter III is
devoted to theory of truth. In a short Postscript, the results of these
discussions are refocused on the problem of translational equivalence, in
the endeavor to provide an heuristic for subsequent analysis.
Although it cannot presume to have provided an adequate theory of
translation, this dissertation claims to have sketched the basis for such a
theory, by virtue of having provided an account of the workings of interlinguistic exchanges, and language in general, from the perspective of the
actual practice of translation, rather than from the typical "armchair
linguistics" of most philosophic theory. In the process, it has provided a
perspective on the ancient problems of meaning and truth, which, if
correct, would necessitate a thorough revamping of most of the traditional
approaches to those problems. Although centered about the notion of
translation, owing to the ramifications of this notion, it could perhaps as
easily be seen as a prolegomenon to theory of knowledge.
A Theory of Events
Kathleen Gill
Indiana University Ph.D., 270 pages
Philosophy
DAI V47(05), SecA, pp1750
An account of events is developed in which events are characterized as a
series of momentary states of affairs. This characterization is motivated by
a study of the structural features required to capture our notion of an
event. Events have structure in the sense that they involve objects and
properties, and, since they necessarily occur over an interval of time,
events have a transtemporal structure. This latter feature is used to
account for a variety of relationships between events, as well as
distinctions between states, processes, and completal events. Special attention is given to the problem of event identity. Some progress is made on
this issue by laying the groundwork which is necessary for its resolution.
This consists, first of all, in sorting out various cases of identity, e.g. distinguishing between the problems of adverbial modification and property
identity, and, secondly, in providing a metaphysical framework within
which to interpret the problem of event identity.
Natural Language Semantics and Guise
Theory
Francesco Orilia
Indiana University Ph.D. 1986, 263 pages
Philosophy
DAI V47(08), SecA, pp3069
University Microfilms International
ADG86-28010
I assume that the task of natural language semantics is to provide an unambiguous logical language into which natural language can be translated in
such a way that the translating expressions display a structure which is
isomorphic to the meaning of the translated expressions. Since language is
a means of thinking and communicating mental contents, the meanings of
singular terms cannot be the individuals of the substratist tradition,
because such individuals are not cognizable entities. Thus I propose that
the logical language be based on Castaneda's guise theory, according to
which singular terms always denote guises, i.e., roughly, (finite) bundles of
properties. This, I argue, would result in a semantics which is in accordance with projects such as Lakoff's natural logic or Fodor's methodological
solipsism.
I first propose a formal system, GCC, which tries to be as faithful as
possible to Castaneda's informal presentation of guise theory. It is therefore characterized by different forms of predication and a distinction
between a level of property composition and a level of proposition composition. Such a distinction is dropped in a second system, GF, which
presents a more traditional Fregean representation of predication. Yet, GF
endorses essential assumptions of guise theory such as the existence of
different sameness relations that can provide various interpretations for the
English "is". I claim that GF provides more theoretical simplicity than
GCC.
Finally, I show the fruitfulness of the present approach by applying GF
to a vast collection of linguistico-philosophical puzzles which includes but
is not restricted to those that guise theory was originally designed to
address: various versions of Frege's paradox, the paradox of analysis,
Quine's puzzle on the number of planets, issues of reidentification and
intentional identity, the anaphoric "it" of sentences such as "the lizard's
tail fell off but then it grew back," problems connected with the use of
"knowing-who (-which)," proper names, indexicals and demonstratives.
The Influence of Domain-Specific Knowledge
on Processing Resources During
the Comprehension of Domain-Related
Information
Timothy Andrew Post
University of Pittsburgh Ph.D. 1986, 86 pages
Psychology, Developmental
DAI V47(06), SecB, pp2646
University Microfilms International
ADG86-20226
158
Three experiments were designed to test assumptions regarding domain-specific knowledge and the allocation of processing load. A general model
of comprehension was used to frame the current effort. This model specifies that comprehenders are able to instantiate stored memories of events,
referred to as substructures, that are related to contextual situations. Once
instanfiated, a substructure does not require processing load and facilitates
the ifitegration of domain-specific materials. Four general findings are
described: Sequences of statements about an event in a domain are processed with a large initial reading time, followed by a decline; unusual
events can cause changes in processing load allocation, but only to
an initial degree and not as a function of the degree of unusualness; reading goals, i.e., the intent that one has in comprehending a portion of text,
may be changed as a function of importance and unusualness at the
sentence level; and domain-specific knowledge may be viewed as a cognitive adaptation that primarily reduces the processing demands of text
comprehension. A model is sketched that describes how these four
phenomena may occur during reading.