ABSTRACTS OF CURRENT LITERATURE

Articles, Word Order, and Resource Control Hypothesis
Janusz S. Bien
Warsaw
In Mey, Jacob L., Ed., Language and Discourse: Test and Protest. A Festschrift for Petr Sgall. (Vol. 19, Linguistic and Literary Studies in Eastern Europe.) John Benjamins Publishing Company, Amsterdam/Philadelphia, 1986.

The paper elaborates the ideas presented in Bien (1983). The definite/indefinite distinction is viewed as a manifestation of the variable depth of nominal phrase processing: indefinite phrases are represented by frame pointers, while definite ones are represented by frame instances incorporating information found by memory search. In general, the depth of processing is determined by the availability of resources. Different word orders cause different distributions of the parser's processing load and therefore also influence the depth of processing. Articles and word order appear to be only some of several resource control devices available in natural languages.

For copies of the following papers from Projekt SEMSYN, please write to
Frau Martin
c/o Projekt SEMSYN
Institut fuer Informatik
Azenbergstr. 12
D-7000 Stuttgart 1
West Germany
or e-mail to: semsyn@ifistg.uucp

The Automated News Agency: SEMTEX - A Text Generator for German
Dietmar Roesner

As a by-product of the Japanese/German machine translation project SEMSYN, the SEMTEX text generator for German has been implemented (in ZetaLISP for SYMBOLICS lisp machines). SEMTEX's first application has been to generate newspaper stories about job market development. The starting point for the newspaper application is just the data from the monthly job market report (numbers of unemployed, open jobs, ...). A rudimentary "text planner" takes these data and those of relevant previous months, checks for changes and significant developments, simulates possible argumentations of various political speakers on these developments, and finally creates a representation for the intended text as an ordered list of frame descriptions. SEMTEX then converts this list into a newspaper story in German using an extended version of the generator of the SEMSYN project. The extensions for SEMTEX include:
• Building up a representation for the context during the utterance of successive sentences that allows for
- avoiding repetitions in wording
- avoiding re-utterance of information still valid
- pronominalization and other types of references.
• Grammatical tense is dynamically derived by checking the temporal information from the conceptual representations and relating it to the time of speech and the time period focussed by the story.
• When simulating arguments the text planner uses abstract rhetorical schemata; the generator is enriched with knowledge about various ways to express such rhetorical structures as German surface texts.
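A minimal sketch, in Python, of the kind of "rudimentary text planner" described in the SEMTEX abstract above: it compares the current month's job-market figures with those of the previous month, flags significant developments, and emits an ordered list of frame descriptions for a downstream generator. The field names, sample figures, and 5% significance threshold are assumptions of this illustration only, not part of SEMTEX (which was written in ZetaLISP).

    def plan_report(current, previous, threshold=0.05):
        """Return an ordered list of frame descriptions for the text generator."""
        frames = [{"frame": "report-header", "month": current["month"]}]
        for key in ("unemployed", "open_jobs"):
            old, new = previous[key], current[key]
            change = (new - old) / old
            frame = {"frame": "development", "topic": key,
                     "old": old, "new": new, "change": round(change, 3)}
            if abs(change) >= threshold:
                # a "significant development": the planner would now simulate
                # argumentations of various political speakers about it
                frame["significant"] = True
                frame["direction"] = "rise" if change > 0 else "drop"
            frames.append(frame)
        return frames

    if __name__ == "__main__":
        previous = {"month": "1986-04", "unemployed": 2210000, "open_jobs": 154000}
        current = {"month": "1986-05", "unemployed": 2090000, "open_jobs": 161000}
        for frame in plan_report(current, previous):
            print(frame)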
GEOTEX - A System for Verbalizing Geometric Constructions (in German)
Walter Kehl

GEOTEX is an application of the SEMTEX text generator for German: the text generator is combined with a tool for interactively creating geometric constructions. The latter offers formal commands for manipulating (i.e., creating, naming, and - deliberately - deleting) basic objects of Euclidean geometry. The generator is used to produce descriptive texts - in German - related to the geometric construction:
• descriptions of the geometric objects involved,
• descriptions of the sequence of steps done during a construction.
SEMTEX's context-handling mechanisms have been enriched for GEOTEX:
• Elision is no longer restricted to adjuncts. For repetitive operations, verb and subject will be elided in subsequent sentences.
• The distinction between known and new information is exploited to decide on constituent ordering: the constituent referring to the known object is "topicalized", i.e., put in front of the sentence.
• The system allows for more ways to refer to objects introduced in the text: pronouns, textual deixis using demonstrative pronouns, and names. The choice between these variants is made deliberately.
GEOTEX is implemented in ZetaLISP and runs on SYMBOLICS lisp machines.

The Generation System of the SEMSYN Project: Towards a Task-Independent Generator for German
Dietmar Roesner

We report on our experiences from the implementation of the SEMSYN generator, a system generating German texts from semantic representations, and its application to a variety of different areas, input structures, and generation tasks. In its initial version the SEMSYN generator was used within a Japanese/German MT project, where it produced German equivalents of Japanese titles of scientific papers. Being carefully designed in object-oriented style (and implemented with the FLAVOR system), the system proved to be easily adaptable to other semantic representations - e.g., output from CMU's Universal Parser - and extensible to other generation tasks: generating German news stories, and generating descriptive texts for geometric constructions.

Copies of the following reports on the joint research project WISBER can be ordered free of charge from
Dr. Johannes Arz
Universität des Saarlandes
FR. 10 Informatik IV
Im Stadtwald 15
D-6600 Saarbrücken 11
Electronic mail address: wisber%sbsvax.uucp@germany.csnet

Neuere Grammatiktheorien und Grammatikformalismen (Recent grammar theories and grammar formalisms)
H.-U. Block, M. Gehrke, H. Haugeneder, R. Hunze
Report No. 1

The present paper gives an overview of modern theories of syntax and is intended to provide insight into current trends in the field of parsing. The grammar theories treated here are government and binding theory, generalized phrase structure grammar, and lexical functional grammar, as these approaches currently appear to be the most promising. Recent grammar formalisms are virtually all based on unification procedures. Three representatives of this group (functional unification grammar, PATR, and definite clause grammar) are presented.

Entwurf eines Erhebungsschemas für Geldanlage (Design of an acquisition schema for financial investment)
R. Busche, S. op de Hipt, M.-J. Schachter-Radig
Report No. 2

This report describes the acquisition schema for the knowledge required by the knowledge-based consulting system WISBER, the goal of which consists in carrying out the process of knowledge acquisition and formalization in a methodical - i.e., planned and controlled - manner. The main task involves the design of appropriate acquisition techniques and their successful application in the domain of investment consulting.

Generierung von Erklärungen aus formalen Wissensrepräsentationen (Generating explanations from formal knowledge representations)
H. Rösner
in LDV-Forum, Band 4, Nummer 1, Juni 1986, pp. 3-19
Report No. 3

The main topic of this report concerns the generation of natural language texts. The use of explanation components in expert systems involves making computer behavior more transparent.
This standard can only be attained if the current stack-dump procedure is replaced by procedures in which user expectations are met with respect to both the contents of the system's explanations and the acceptability of their language structure. This paper reports on work pertaining to an expanded range of explanation components in the Nixdorf expert system shell TWAICE. A critical account of the position held by grammatical theory in generating natural language at the user level is given, whereby the decision for a certain theory remains first and foremost pragmatic. Moreover, a stand is taken concerning scientific experimentation on the transfer of formal knowledge representation. Practical problems concerning the technology that have not yet been taken into account are pointed out.

Incremental Construction of C- and F-Structure in an LFG-Parser
H.-U. Block, R. Hunze
in Proceedings of the 11th International Conference on Computational Linguistics, COLING'86, Bonn, pp. 490-493
Report No. 4

In this paper a parser for Lexical Functional Grammar (LFG) is presented which is characterized by incrementally constructing the c- and f-structure of a sentence during parsing. The possibilities of the earliest possible check on consistency, coherence, and completeness are then discussed. Incremental construction of f-structure leads to early detection and abortion of incorrect paths and so increases parsing efficiency. Furthermore, those semantic interpretation processes that operate on partial structures can be triggered at an earlier stage. This also leads to a considerable improvement in parsing time. LFG seems to be well suited for such an approach because it provides for locality principles through the definition of coherence and completeness.
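The following Python fragment is a minimal, hedged sketch of the general mechanism behind incremental f-structure construction as described in the abstract of Report No. 4: partial f-structures (represented here simply as nested dictionaries) are unified as soon as constituents are combined, and a feature clash (a consistency violation) aborts the parse path immediately rather than after a complete analysis has been built. It illustrates the idea only and is not the parser of Block and Hunze.

    class Clash(Exception):
        """Raised when two f-structures assign incompatible values to a feature."""

    def unify(f1, f2):
        """Unify two partial f-structures, raising Clash on inconsistency."""
        result = dict(f1)
        for attr, value in f2.items():
            if attr not in result:
                result[attr] = value
            elif isinstance(result[attr], dict) and isinstance(value, dict):
                result[attr] = unify(result[attr], value)
            elif result[attr] != value:
                raise Clash(f"{attr}: {result[attr]} vs. {value}")
        return result

    if __name__ == "__main__":
        subject = {"SUBJ": {"NUM": "sg", "PERS": 3}}
        verb = {"SUBJ": {"NUM": "pl"}, "TENSE": "present"}
        try:
            print(unify(subject, verb))
        except Clash as clash:
            # the incorrect path is abandoned as soon as the clash is detected
            print("path abandoned early:", clash)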
The Treatment of Movement Rules in an LFG-Parser
H.-U. Block, H. Haugeneder
in Proceedings of the 11th International Conference on Computational Linguistics, COLING'86, Bonn, pp. 482-486
Report No. 5

In this paper a way of treating long-distance movement phenomena as exemplified in (1) is proposed within the framework of an LFG-based parser.
(1) Who do you think Peter tried to meet
    'You think Peter tried to meet who'
After a short overview of the treatment of general discontinuous dependencies in the Theory of Government and Binding, Lexical Functional Grammar, and Generalized Phrase Structure Grammar, the paper concentrates on so-called wh- or long-distance movement, arguing that a general mechanism can be found which is compatible with both the LFG and the GB treatment of long-distance movement. Finally, the implementation of such a movement mechanism in an LFG-parser is presented.

Morpheme-Based Lexical Analysis
M. Gehrke, H.-U. Block
Report No. 6

In this paper some aspects of the advantages and disadvantages of a morpheme-based lexicon with respect to a full-form lexicon are discussed. Then a current implementation of an application-independent lexical access component is presented, as well as an implemented formalism for the inflectional analysis of German.

Probleme der Wissensrepräsentation in Beratungssystemen (Problems of knowledge representation in consulting systems)
H.-U. Block, M. Gehrke, H. Haugeneder, R. Hunze
Report No. 7

The present report consists of two main sections. The first part analyzes the individual knowledge sources that require specialization for the consulting system WISBER. It should serve as a first approximation to the structural analysis of all knowledge sources. In the second part, methods for the representation of knowledge and languages are examined. In this connection, KL-ONE, interpreted as an epistemic formal structure of language representation for describing structured objects, is examined. Supplementing this is an examination of other systems which, in addition, have significant assertive components at their disposal, such as KRYPTON and KL-TWO. At the other end of the spectrum lies PEARL, a system that cannot clearly be semantically and epistemically interpreted as a representational language as such. Between these two poles lie, on the one hand, FLR, which, without guaranteeing the semantic clarity of the grammatical constructions used, flexibly combines a large number of the ideas previously suggested, and, on the other hand, KRS, representative of a group of hybrid representation systems which allow a flexible combination of various formal structures of representation.

Beratung und natürlichsprachlicher Dialog - eine Evaluation von Systemen der Künstlichen Intelligenz (Consulting and natural-language dialogue - an evaluation of Artificial Intelligence systems)
H. Bergmann, M. Gerlach, W. Hoeppner, H. Marburger
Report No. 8

This report contains an evaluation of Artificial Intelligence systems which provide the research base for the development of the natural-language advisory system WISBER. First, the reasons for selecting the particular systems considered in the study are given, and a set of evaluation criteria emphasizing in particular pragmatic factors (e.g., dialog phenomena, handling of speech acts, user modeling) is presented. The body of the report consists of descriptions and critical evaluations of the following systems: ARGOT, AYPA, GRUNDY, GUIDON, HAM-ANS, KAMP, OSCAR, ROMPER, TRACK, UC, VIE-LANG, WIZARD, WUSOR, XCALIBUR. The final chapter summarizes the results, concentrating on the possible utilization of individual system capabilities in the development of WISBER.

Form der Ergebnisse der Wissensakquisition in WISBER-XPS4 (Form of the results of knowledge acquisition in WISBER-XPS4)
M. Fliegner, M.-J. Schachter-Radig
Report No. 9

In this paper fundamental questions are discussed concerning the representation of expert knowledge, exemplified within the area of investment consulting. While a written report is appropriate for a general presentation of results, it neither satisfies the needs of systems development - which of course must build upon the results of knowledge acquisition - nor can it do justice to the requirements of knowledge acquisition itself. On the other hand, epistemologically expressive knowledge representation tools require that conceptual design decisions be made quite early on. The tools LOOPS, OPS5, a Prolog-based shell, and KL-ONE are dealt with.

The following abstracts are from the COLING'86 PROCEEDINGS, copies of which are available only from
IKS e.V.
Poppelsdorfer Allee 47
D-5300 Bonn 1
WEST GERMANY
Telephone: +49/228/735645
EARN/BITNET: UPK000@DBNRHRZ1
INTERNET: UPK000%DBNRHRZ1.BITNET@WISCVM.WISC.EDU
The price is 95 DM within Europe and 110 DM for air delivery to non-European countries. Please pay in advance by check to the address above or by bankers draft to the following account: Bank für Gemeinwirtschaft Bonn, Account no. 11205 163 900, BLZ 380 101 11.

Lexicon-Grammar: The Representation of Compound Words
Maurice Gross
Université Paris 7
Laboratoire Documentaire et Linguistique
2, place Jussieu
F-75221 Paris CEDEX 05
COLING'86, pp. 1-6
The essential feature of a lexicon-grammar is that the elementary unit of computation and storage is the simple sentence: subject-verb-complement(s). This type of representation is obviously needed for verbs: limiting a verb to its shape has no meaning other than typographic, since a verb cannot be separated from its subject and essential complements. We have shown (1975) that, given a verb, or equivalently a simple sentence, the set of syntactic properties that describes its variations is unique: in general, no other verb has an identical syntactic paradigm. As a consequence, the properties of each verbal construction must be represented in a lexicon-grammar. The lexicon has no significance taken as an isolated component, and the grammar component, viewed as independent of the lexicon, will have to be limited to certain complex sentences.

An Empirically Based Approach towards a System of Semantic Features
Cornelia Zelinsky-Wibbelt
IAI-Eurotra-D
Martin-Luther-Straße 14
D-6600 Saarbrücken
COLING'86, pp. 7-12

A major problem in machine translation is the semantic description of lexical units, which should be based on a semantic system that is both coherent and operationalized to the greatest possible degree. This is to guarantee consistency between lexical units coded by lexicographers. This article introduces a generating device for achieving well-formed semantic feature expressions.

Concept and Structure of Semantic Markers for Machine Translation in Mu-Project
Yoshiyuki Sakamoto
Electrotechnical Laboratory, Sakura-mura, Niihari-gun, Ibaraki, Japan
Tetsuya Ishikawa
University of Library & Information Science, Yatabe-machi, Tsukuba-gun, Ibaraki, Japan
Masayuki Satoh
Japan Information Center of Science & Technology, Nagata-cho, Chiyoda-ku, Tokyo, Japan
COLING'86, pp. 13-20

This paper discusses the semantic features of nouns classified into categories in Japanese-to-English translation, and proposes a system of semantic markers. In our system, syntactic analysis is carried out by checking the semantic compatibility between verbs and nouns. The semantic structure of a sentence can be extracted at the same time as its syntactic analysis. We also use semantic markers to select words in the transfer phase for translation into English. The system of semantic markers for nouns consists of 13 conceptual facets, including one facet for 'Others' (discussed later), and is made up of 49 filial slots (semantic markers) as terminals. We have tested about 3,000 sample abstracts in science and technology fields. Our research has revealed that our method is extremely effective in determining the meanings of Wago verbs (basic Japanese verbs), which have broad concepts like the English verbs make, get, take, put, etc.
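As a purely illustrative companion to the Mu-project abstract above, the Python fragment below shows the general idea of checking verb-noun semantic compatibility with markers attached to nouns and case frames attached to verbs. The marker inventory and the case frame for "taberu" ('eat') are invented for the example; they are not the Mu-project's actual 13 facets and 49 markers.

    # Hypothetical noun markers and verb case frames, for illustration only.
    NOUN_MARKERS = {"scientist": {"HUMAN"}, "robot": {"ARTIFACT"}, "apple": {"FOOD"}}

    CASE_FRAMES = {
        "taberu": {"AGENT": {"HUMAN", "ANIMAL"}, "OBJECT": {"FOOD"}},  # 'eat'
    }

    def compatible(verb, case, noun):
        """True if the noun's semantic markers satisfy the verb's case slot."""
        allowed = CASE_FRAMES[verb][case]
        return bool(NOUN_MARKERS.get(noun, set()) & allowed)

    if __name__ == "__main__":
        print(compatible("taberu", "AGENT", "scientist"))  # True: a HUMAN agent
        print(compatible("taberu", "AGENT", "robot"))      # False: reading rejected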
A Theory of Semantic Relations for Large Scale Natural Language Processing
Hanne Ruus
Institut for nordisk filologi & Eurotra-DK
Ebbe Spang-Hanssen
Romansk institut & Eurotra-DK
University of Copenhagen
Njalsgade 80
DK-2300 Copenhagen S
COLING'86, pp. 20-22

Even a superficial meaning representation of a text requires a system of semantic labels that characterize the relations between the predicates in the text and their arguments. The semantic interpretation of syntactic subjects and objects, and of prepositions and subordinate conjunctions, has been treated in numerous books and papers with titles including words like deep case, case roles, semantic roles, and semantic relations. In this paper we concentrate on the semantic relations established by predicates: what are they, what are their characteristics, and how do they group the predicates.

Extending the Expressive Capacity of the Semantic Component of the OPERA System
Celestin Sedogbo
Centre de Recherche Bull
68, Route de Versailles
78430 Louveciennes, France
COLING'86, pp. 23-28

OPERA is a natural language question answering system allowing the interrogation of a data base consisting of an extensive listing of operas. The linguistic front-end of OPERA is a comprehensive grammar of French, and its semantic component translates the syntactic analysis into logical formulas (first-order logic formulas). However, there are quite a few constructions which can be analyzed syntactically in the grammar but for which we are unable to specify translations. Foremost among them are anaphoric and elliptic constructions. Thus this paper describes the extension of OPERA to anaphoric and elliptic constructions on the basis of Discourse Segmentation Theory.

User Models: The Problem of Disparity
Sandra Carberry
Department of Computer & Information Science
University of Delaware
Newark, Delaware 19716
COLING'86, pp. 29-34

A significant component of a user model in an information-seeking dialogue is the task-related plan motivating the information-seeker's queries. A number of researchers have modeled the plan inference process and used these models to design more robust natural language interfaces. However, in each case it has been assumed that the system's context model and the plan under construction by the information-seeker are never at variance. This paper addresses the problem of disparate plans. It presents a four-phase approach and argues that handling disparate plans requires an enriched context model. This model must permit the addition of components suggested by the information-seeker but not fully supported by the system's domain knowledge, and must differentiate among the components according to the kind of support accorded each component as a correct part of the information-seeker's overall plan. It is shown how a component's support should affect the system's hypothesis about the source of error once plan disparity is suggested.

Pragmatic Sensitivity in NL Interfaces and the Structure of Conversation
Tom Wachtel
Scicon Ltd., London, and Research Unit for Information Science & AI, Hamburg University
COLING'86, pp. 35-41
The work reported here is being conducted as part of the LOKI project (ESPRIT Project 107, "A logic oriented approach to knowledge and data bases supporting natural user interaction"). The goal of the NL part of the project is to build a pragmatically sensitive natural language interface to a knowledge base. By "pragmatically sensitive", we mean that the system should not only produce well-formed, coherent, and cohesive language (a minimum requirement of any NL system designed to handle discourse) but should also be sensitive to those aspects of user behaviour that humans are sensitive to over and above simply providing a good response, including producing output that is appropriately decorated with those minor and semantically inconsequential elements of language that make the difference between natural language and natural natural language. This paper concentrates on the representation of the structure of conversation in our system. We first outline the representation we use for dialogue moves, and then outline the nature of the definition of well-formed dialogue that we are operating with. Finally, we note a few extensions to the representation mechanism.

A Two-Level Dialogue Representation
Giacomo Ferrari
Department of Linguistics, University of Pisa
Ronan Reilly
Educational Research Center, St. Patrick's College, Dublin 9
COLING'86, pp. 42-45

In this paper a two-level dialogue representation system is presented. It is intended to recognize the structure of a large range of dialogues, including some nonverbal communicative acts which may be involved in an interaction. It provides a syntactic description of a dialogue which can be expressed in terms of rewriting rules. The semantic level of the proposed representation system is given by the goal and subgoal structure underlying the dialogue's syntactic units. Two types of goals are identified: goals which relate to the content of the dialogue, and those which relate to communicating the content.

INTERFACILE: Linguistic Coverage and Query Reformulation
Yvette Mathieu, Paul Sabatier
CNRS - LADL
Université Paris 7
Tour Centrale 9 E
2 Place Jussieu
75005 Paris
COLING'86, pp. 46-49

The experience we have gained in designing and using natural language interfaces has led us to develop a general language system, INTERFACILE, involving the following principles:
- The linguistic coverage must be elementary but must include phenomena that allow a rapid, concise, and spontaneous interaction, such as anaphora (ellipsis, pronouns, etc.).
- The linguistic competence and limits of the interface must be easily and rapidly perceived by the user.
- The interface must be equipped with strategies and procedures for leading the user to adjust his linguistic competence to the capacities of the system.
We have illustrated these principles in an application: a natural language (French) interface for acquiring the formal commands of some operating system languages. (The examples given here concern DCL of Digital Equipment Company.)

Category Cooccurrence Restrictions and the Elimination of Metarules
James Kilbury
Technical University of Berlin
KIT/NASEV, CIS, Sekr. FRS-8
Franklinstr. 28/29
D-1000 Berlin 10
Germany - West Berlin
COLING'86, pp. 50-55

This paper builds upon and extends certain ideas developed within the framework of Generalized Phrase Structure Grammar (GPSG). A new descriptive device, the Category Cooccurrence Restriction (CCR), is introduced in analogy to existing devices of GPSG in order to express constraints on the cooccurrence of categories within local trees (i.e., trees of depth one), which at present are stated with Immediate Dominance (ID) rules and metarules. In addition to providing a uniform format for the statement of such constraints, CCRs permit generalizations to be expressed which presently cannot be captured in GPSG. Sections 1.1 and 1.2 introduce CCRs and presuppose only a general familiarity with GPSG.
The ideas do not depend on details of GPSG and can be applied to other grammatical formalisms. Sections 1.3-1.5 discuss CCRs in relation to particular principles of GPSG and assume familiarity with Gazdar et al. (1985) (henceforth abbreviated as GKPS). Finally, section 2 contains proposals for using CCRs to avoid the analyses with metarules given for English in GKPS.

Testing the Projectivity Hypothesis
Vladimir Pericliev
Mathematical Linguistics Dept., Institute of Mathematics with Computing Centre, 1113 Sofia, bl. 8, Bulgaria
Ilarion Ilarionov
Mathematics Dept., Higher Institute of Engineering and Building, Sofia, Bulgaria
COLING'86, pp. 56-58

The empirical validity of the projectivity hypothesis for Bulgarian is tested. It is shown that the justification of the hypothesis presented for other languages suffers from serious methodological deficiencies. Our automated testing, designed to evade such deficiencies, yielded results falsifying the hypothesis for Bulgarian: the non-projective constructions studied were in fact grammatical rather than ungrammatical, as implied by the projectivity thesis. Despite this, the projectivity/non-projectivity distinction itself has to be retained in Bulgarian syntax and, with some provisions, in systems for automatic processing as well.

Particle Homonymy and Machine Translation
Károly Fábricz
JATE University of Szeged
Egyetem u. 2
Hungary H-6722
COLING'86, pp. 59-61

The purpose of this contribution is to formulate ways in which the homonymy of so-called 'modal particles' and their etymons can be handled. Our aim is to show that not only can a strategy for this type of homonymy be worked out, but also that a formalization of information beyond propositional content can be introduced with a view to its MT application.

Plurals, Cardinalities, and Structures of Determination
Christopher U. Habel
Universität Hamburg, Fachbereich Informatik
Schlüterstr. 70
D-2000 Hamburg 13
COLING'86, pp. 62-64

This paper presents an approach for processing incomplete and inconsistent knowledge. The basis for attacking these problems is 'structures of determination', which are extensions of Scott's approximation lattices taking into consideration some requirements from natural language processing and the representation of knowledge. The theory developed is exemplified with the processing of plural noun phrases referring to objects which have to be understood as classes or sets. Referential processes are handled by processes on 'Referential Nets', which are a specific knowledge structure developed for the representation of object-oriented knowledge. Problems of determination with respect to cardinality assumptions are emphasized.

Processing Word Order Variation within a Modified ID/LP Framework
Pradip Dey
University of Alabama at Birmingham
Birmingham, AL 35294
COLING'86, pp. 65-67

From a well represented sample of world languages, Steele (1978) shows that about 78% of languages exhibit significant word order variation. Only recently has this widespread phenomenon been drawing appropriate attention. Perhaps the ID/LP (Immediate Dominance and Linear Precedence) framework is the most debated theory in this area. We point out some difficulties in processing standard ID/LP grammar and present a modified version of the grammar. In the modified version, the right-hand side of phrase structure rules is treated as a set or partially ordered set. An instance of the framework is implemented.
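A minimal sketch of how a local tree can be admitted under a rule whose right-hand side is a set of daughters plus a partial order of precedence constraints, in the spirit of the modified ID/LP framework described above. The toy rule (a clause whose subject must precede its object while the verb is freely ordered) is an assumption made for this illustration and is not Dey's implemented grammar.

    from itertools import permutations

    def admits(daughters, id_daughters, lp_constraints):
        """Check an observed daughter sequence against an ID set and an LP partial order."""
        if sorted(daughters) != sorted(id_daughters):   # ID: same multiset of daughters
            return False
        position = {cat: i for i, cat in enumerate(daughters)}
        return all(position[a] < position[b] for a, b in lp_constraints)  # LP respected

    ID = ["V", "NP-subj", "NP-obj"]
    LP = [("NP-subj", "NP-obj")]     # subject precedes object; V is unordered

    if __name__ == "__main__":
        for order in permutations(ID):
            print(order, admits(list(order), ID, LP))   # 3 of the 6 orders are admitted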
Sentence Adverbials in a System of Question Answering without a Prearranged Data Base
Eva Koktova
Hamburg, West Germany
COLING'86, pp. 68-73

In the present paper we report on a joint approach to the computational treatment of sentence adverbials (such as surprisingly, presumably, or probably) and focussing adverbials (such as only or at least, including negation (not) and some other adverbial expressions, such as for example or inter alia) within a system of question answering without a prearranged data base (TIBAQ). This approach is based on a joint theoretical account of the expressions in question in the framework of a functional description of language; we argue that in the primary case the expressions in question occupy, in the underlying topic-focus articulation of a sentence, the focus-initial position, extending their scope over the focus, or the new information, of a sentence, thus specifying, in a broad sense of the word, how the new information of a sentence holds. On the surface the expressions in question are usually moved to scope-ambiguous positions, which can be analyzed by means of several general strategies.

D-PATR: A Development Environment for Unification-Based Grammars
Lauri Karttunen
Artificial Intelligence Center, SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025
and Center for the Study of Language and Information, Stanford University
COLING'86, pp. 74-80

D-PATR is a development environment for unification-based grammars on Xerox 1100 series work stations. It is based on the PATR formalism developed at SRI International. This formalism is suitable for encoding a wide variety of grammars. At one end of this range are simple phrase-structure grammars with no feature augmentations. The PATR formalism can also be used to encode grammars that are based on a number of current linguistic theories, such as lexical-functional grammar (Bresnan and Kaplan), head-driven phrase structure grammar (Pollard and Sag), and functional unification grammar (Kay). At the other end of the range covered by D-PATR are unification-based categorial grammars (Klein, Steedman, Uszkoreit, Wittenburg) in which all the syntactic information is incorporated in the lexicon and the remaining few combinatorial rules that build phrases are function application and composition. Definite-clause grammars (Pereira and Warren) can also be encoded in the PATR formalism.

Structural Correspondence Specification Environment
Yongfeng Yan
Groupe d'Etudes pour la Traduction Automatique (GETA)
B.P. 68
University of Grenoble
38402 Saint Martin d'Hères, France
COLING'86, pp. 81-84

This article presents the Structural Correspondence Specification Environment (SCSE) being implemented at GETA. The SCSE is designed to help linguists develop, consult, and verify the SCS grammars (SCSG) which specify linguistic models. It integrates the techniques of data bases, structure editors, and language interpreters. We argue that formalisms and tools of specification are as important as the specification itself.
Conditioned Unification for Natural Language Processing
Kôiti Hasida
Electrotechnical Laboratory
Umezono 1-1-4, Sakura-mura, Niihari-gun
Ibaraki, 305 Japan
COLING'86, pp. 85-87

This paper presents what we call conditioned unification, a new method of unification for processing natural languages. The key idea is to annotate the patterns with a certain sort of conditions, so that they carry abundant information. This method transmits information from one pattern to another more efficiently than procedure attachment, in which information contained in the procedure is embedded in the program rather than directly attached to patterns. Coupled with techniques from formal linguistics, moreover, conditioned unification serves most types of operations for natural language processing.

Methodology and Verifiability in Montague Grammar
Seiki Akama
Fujitsu Ltd.
2-4-19 Shin-Yokohama
Yokohama, 222, Japan
COLING'86, pp. 88-90

Methodological problems in Montague Grammar are discussed. Our observations show that a model-theoretic approach to natural language semantics is inadequate with respect to its verifiability from a logical point of view. But the formal attitudes seem to be of use for developments in computational linguistics.

Towards a Dedicated Database Management System for Dictionaries
Marc Domenig, Patrick Shann
Institut Dalle Molle pour les Etudes Semantiques et Cognitives (ISSCO)
Route des Acacias 54
1227 Geneva, Switzerland
COLING'86, pp. 91-96

This paper argues that a lexical data base should be implemented with a special kind of database management system (DBMS) and outlines the design of such a system. The major difference between this proposal and a general purpose DBMS is that its data definition language (DDL) allows the specification of the entire morphology, which turns the lexical data base from a mere collection of 'static' data into a real-time word-analyzer. Moreover, the dedication of the system conduces to the feasibility of user interfaces with very comfortable monitor and manipulation functions.

The Transfer Phase of the Mu Machine Translation System
Makoto Nagao, Jun-ichi Tsujii
Department of Electrical Engineering
Kyoto University
Kyoto, Japan 606
COLING'86, pp. 97-103

The interlingual approach to MT has been repeatedly advocated by researchers originally interested in natural language understanding who take machine translation to be one possible application. However, not only the ambiguity but also the vagueness which every natural language inevitably has leads this approach into essential difficulties. In contrast, our project, the Mu-project, adopts the transfer approach as the basic framework of MT. This paper describes the detailed construction of the transfer phase of our system from Japanese to English, and gives some examples of problems which seem difficult to treat in the interlingual approach.
Some of the design principles relevant to the topic of this paper are:
• Multiple Layers of Grammars
• Multiple Layer Presentation
• Lexicon Driven Processing
• Form-Oriented Dictionary Description
This paper also shows how these principles are realized in the current system.

Lexical Transfer: A Missing Element in Linguistics Theories
Alan K. Melby
Brigham Young University
Department of Linguistics
Provo, Utah 84602
COLING'86, pp. 104-106

One of the necessary tasks of a machine translation system is lexical transfer. In some cases there is a one-to-one mapping from source language word to target language word. What theoretical model is followed when there is a one-to-many mapping? Unfortunately, none of the linguistic models that have been used in machine translation includes a lexical transfer component. In the absence of a theoretical model, this paper suggests a new way to test lexical transfer systems. This test is being applied to an MT system under development. One possible conclusion may be that further effort should be expended on developing models of lexical transfer.

Idiosyncratic Gap: A Tough Problem to Structure-Based Machine Translation
Yoshihiko Nitta
Advanced Research Laboratory
Hitachi Ltd.
Kokubunji, Tokyo 185 Japan
COLING'86, pp. 107-111

Current practical machine translation systems, which are designed to deal with a huge amount of documents, are generally structure-based. That is, the translation process is based on the analysis and transformation of the structure of the source sentence, not on the understanding and paraphrasing of the meaning of that sentence. But each language has its own syntactic and semantic idiosyncrasies, and on this account, without understanding the total meaning of the source sentence, it is often difficult for MT to bridge properly the idiosyncratic gap between source and target language. A somewhat new method called the "Cross Translation Test" is presented that reveals the details of the idiosyncratic gap together with the so-so satisfiable possibility of MT. The usefulness of the sublanguage approach in reducing the idiosyncratic gap between source and target languages is also mentioned.

Lexical-Functional Transfer: A Transfer Framework in a Machine-Translation System Based on LFG
Ikuo Kudo
CSK Research Institute
3-22-17 Higashi-Ikebukuro, Toshima-ku
Tokyo, 170, Japan
Hirosato Nomura
NTT Basic Research Laboratories
Musashino-shi, Tokyo, 180, Japan
COLING'86, pp. 112-114

This paper presents a transfer framework called LFT (Lexical-Functional Transfer) for a machine translation system based on LFG (Lexical-Functional Grammar). The translation process consists of subprocesses of analysis, transfer, and generation. We adopt the so-called f-structures of LFG as the intermediate representations or interfaces between those subprocesses; thus the transfer process converts a source f-structure into a target f-structure. Since LFG is a grammatical framework for sentence structure analysis of a single language, we propose for this purpose a new framework for specifying transfer rules with LFG schemata, which incorporates corresponding lexical functions of two different languages into an equational representation. The transfer process, therefore, is to solve equations called target f-descriptions, derived from the transfer rules applied to the source f-structure, and then to produce a target f-structure.
Transfer and MT Modularity
Pierre Isabelle, Elliott Macklovitch
Canadian Workplace Automation Research Center
1575 Chomedey Boulevard
Laval, Quebec, Canada H7V 2X2
COLING'86, pp. 115-117

The transfer components of typical second generation (G2) MT systems do not fully conform to the principles of G2 modularity, incorporating extensive target language information while failing to separate translation facts from linguistic theory. The exclusion from transfer of all non-contrastive information leads us to a system design in which the three major components operate in parallel rather than in sequence. We also propose that MT systems be designed to allow translators to express their knowledge in natural metalanguage statements.

The Need for MT-Oriented Versions of Case and Valency in MT
Harold L. Somers
Centre for Computational Linguistics
University of Manchester Institute of Science and Technology
COLING'86, pp. 118-123

This paper looks at the use in machine translation systems of the linguistic models of Case and Valency. It is argued that neither of these models was originally developed with this use in mind, and both must be adapted somewhat to meet this purpose. In particular, the traditional Valency distinction of complements and adjuncts leads to conflicts when valency frames in different languages are compared: a finer but more flexible distinction is required. Also, these concepts must be extended beyond the verb, to include the noun and adjective as valency bearers. As far as Case is concerned, too narrow an approach has traditionally been taken: work in this field has been concerned only with cases for arguments in verb frames; case label systems for non-valency-bound elements and also for elements in nominal groups must be elaborated. The paper suggests an integrated approach specifically oriented towards the particular problems found in MT.

A Parametric NL Translator
Randall Sharp
Dept. of Computer Science
University of British Columbia
Vancouver, Canada
COLING'86, pp. 124-126

This report outlines a machine translation system whose linguistic component is based on principles of Government and Binding. A "universal grammar" is defined, together with parameters of variation for specific languages. The system, written in Prolog, parses, generates, and translates between English and Spanish (in both directions).

Lexicase Parsing: A Lexicon-Driven Approach to Syntactic Analysis
Stanley Starosta
University of Hawaii
Social Science Research Institute and Pacific International Center for High Technology Research
Honolulu, Hawaii 96822
Hirosato Nomura
NTT Basic Research Laboratories
Musashino-shi, Tokyo, 180, Japan
COLING'86, pp. 127-132

This paper presents a lexicon-based approach to syntactic analysis, Lexicase, and applies it to a lexicon-driven computational parsing system. The basic descriptive mechanism in a Lexicase grammar is lexical features. The properties of lexical items are represented by contextual and non-contextual features, and generalizations are expressed as relationships among sets of these features and among sets of lexical entries. Syntactic tree structures are represented as networks of pairwise dependency relationships among the words in a sentence. Possible dependencies are marked as contextual features on individual lexical items, and Lexicase parsing is a process of picking out words in a string and attaching dependents to them in accordance with their contextual features.
Lexicase is an appropriate vehicle for parsing because Lexicase analyses are monostratal, flat, and relatively non-abstract, and it is well suited to machine translation because grammatical representations for corresponding sentences in two languages will be very similar to each other in structure and inter-constituent relations, and thus far easier to interconvert.

Solutions for Problems of MT Parser: Methods Used in the Mu-Machine Translation Project
Jun-ichi Nakamura, Jun-ichi Tsujii, Makoto Nagao
Dept. of Electrical Engineering
Kyoto University
Sakyo, Kyoto 606, Japan
COLING'86, pp. 133-135

A parser is a key component of a machine translation system. If it fails in parsing an input sentence, the MT system cannot output a complete translation. A parser of a practical MT system must solve many problems caused by the varied characteristics of natural languages. Some problems are caused by the incompleteness of grammatical rules and dictionary information, and some by the ambiguity of natural languages. Others are caused by various types of sentence constructions, such as itemization, insertion by parentheses, and other typographical conventions that cannot be naturally captured by ordinary linguistic rules. The authors of this paper have been developing MT systems between Japanese and English (in both directions) under the Mu-machine translation project. In the system's development, several methods have been implemented with the grammar writing language GRADE to solve the problems of the MT parser. In this paper, first the characteristics of GRADE and the Mu-MT parser are briefly described. Then, methods to solve the MT parsing problems that are caused by the varieties of sentence constructions and the ambiguities of natural languages are discussed from the viewpoint of efficiency and maintainability.

Strategies and Heuristics in the Analysis of a Natural Language in Machine Translation
Zaharin Yusoff
Groupe d'Etudes pour la Traduction Automatique
BP no. 68
Université de Grenoble
38402 Saint-Martin-d'Hères, France
COLING'86, pp. 136-139

The analysis phase in an indirect, transfer, and global approach to machine translation is studied. The analysis conducted can be described as exhaustive (meaning with backtracking), depth-first, and strategically and heuristically driven, while the grammar used is an augmented context-free grammar. The problem areas, being pattern matching, ambiguities, forward propagation, checking for correctness, and backtracking, are highlighted. Established results found in the literature are employed whenever adaptable, while suggestions are given otherwise.

Parsing in Parallel
Xiuming Huang, Louise Guthrie
Computing Research Laboratory
New Mexico State University
Las Cruces, NM 88003
COLING'86, pp. 140-145

The paper is a description of a parallel model for natural language parsing, and a design for its implementation on the Hypercube multiprocessor. The parallel model is based on the Semantic Definite Clause Grammar formalism and integrates syntax and semantics through the communication of processes. The main processes, of which there are six, contain either purely syntactic or purely semantic information, giving the advantage of simple, transparent algorithms dedicated to only one aspect of parsing. Communication between processes is used to impose semantic constraints on the syntactic processes.
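The fragment below is a deliberately simplified, two-component sketch of the communication idea in the Huang and Guthrie abstract above: a "syntactic" thread proposes attachment hypotheses and a "semantic" thread vets them, so that semantic constraints prune syntactic hypotheses as they are produced. The real design involves six processes on a Hypercube and a Semantic Definite Clause Grammar; the thread-and-queue setup, the names, and the toy animacy constraint here are assumptions made only for illustration.

    from queue import Queue
    from threading import Thread

    proposals, verdicts = Queue(), Queue()
    ANIMATE = {"man", "dog"}

    def syntactic_process(hypotheses):
        """Propose attachment hypotheses, e.g. ('sleeps', 'SUBJ', 'table')."""
        for hypothesis in hypotheses:
            proposals.put(hypothesis)
        proposals.put(None)                      # sentinel: no more hypotheses

    def semantic_process():
        """Impose a (toy) selectional constraint on each syntactic proposal."""
        while (hypothesis := proposals.get()) is not None:
            verb, role, head = hypothesis
            ok = not (verb == "sleeps" and role == "SUBJ" and head not in ANIMATE)
            verdicts.put((hypothesis, ok))
        verdicts.put(None)

    if __name__ == "__main__":
        hypotheses = [("sleeps", "SUBJ", "man"), ("sleeps", "SUBJ", "table")]
        Thread(target=syntactic_process, args=(hypotheses,)).start()
        Thread(target=semantic_process).start()
        while (item := verdicts.get()) is not None:
            print(item)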
Computational Comparative Studies on Romance Languages: A Linguistic Comparison of Lexicon-Grammars
Annibale Elia
Istituto di Linguistica, Università di Salerno
Yvette Mathieu
Laboratoire d'Automatique Documentaire et Linguistique, C.N.R.S. - Université de Paris 7
COLING'86, pp. 146-150

What we present here is an application based on the Italian and French linguistic data banks assembled by the Istituto di Linguistica of Salerno University (Italy) and the Laboratoire d'Automatique Documentaire et Linguistique (C.N.R.S., France). These two research centers have been working for years on the construction of formalized grammars of their respective languages. The composition of lexicon-grammars is the first stage of this project.

A Stochastic Approach to Parsing
Geoffrey Sampson
Department of Linguistics and Phonetics
University of Leeds
COLING'86, pp. 151-155

Simulated annealing is a stochastic computational technique for finding optimal solutions to combinatorial problems for which the combinatorial explosion phenomenon rules out the possibility of systematically examining each alternative. It is currently being applied to the practical problem of optimizing the physical design of computer circuitry, and to the theoretical problems of resolving patterns of auditory and visual stimulation into meaningful arrangements of phonemes and three-dimensional objects. Grammatical parsing - resolving unanalyzed linear sequences of words into meaningful grammatical structures - can be regarded as a perception problem logically analogous to those just cited, and simulated annealing holds great promise as a parsing technique.
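Since the abstract above proposes simulated annealing as a parsing technique, the fragment below gives a minimal, generic annealing loop of the kind it refers to: a candidate analysis is perturbed at random, improvements are always accepted, and worse candidates are accepted with a probability that falls as the temperature drops. The toy objective (placing split points so that the segments of a sequence are of equal size) merely stands in for a grammatical evaluation function; it and all parameter values are assumptions of this sketch, not Sampson's procedure.

    import math
    import random

    def anneal(state, neighbour, cost, t0=10.0, cooling=0.95, steps=2000):
        """Generic simulated annealing: return the final state and its cost."""
        current, current_cost = state, cost(state)
        temperature = t0
        for _ in range(steps):
            candidate = neighbour(current)
            delta = cost(candidate) - current_cost
            if delta <= 0 or random.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, current_cost + delta
            temperature *= cooling
        return current, current_cost

    SEQ = list(range(12))                        # the "sentence" to be segmented

    def cost(splits):
        """Imbalance between segment sizes; 0 means perfectly even segments."""
        bounds = [0] + sorted(splits) + [len(SEQ)]
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
        return max(sizes) - min(sizes)

    def neighbour(splits):
        """Move one split point to a random new position."""
        splits = list(splits)
        splits[random.randrange(len(splits))] = random.randint(1, len(SEQ) - 1)
        return sorted(set(splits))

    if __name__ == "__main__":
        start = sorted(random.sample(range(1, len(SEQ)), 3))
        print(anneal(start, neighbour, cost))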
Parsing Without (Much) Phrase Structure
Michael B. Kac
Department of Linguistics
University of Minnesota
Minneapolis, MN 55455
Alexis Manaster-Ramer
Program in Linguistics
University of Michigan
Ann Arbor, MI 48109
COLING'86, pp. 156-158

Approaches to NL syntax conform in varying degrees to the older relational/dependency model (essentially that assumed in traditional grammar), which treats a sentence as a group of words united by various relations, and the newer constituent model. ... In computational linguistics there is a strong (if not universal) reliance on phrase structure as the medium via which to represent syntactic structure; call this the consensus view. ... In its strongest form, the consensus view says that the recovery of a fully specified parse tree is an essential step in computational language processing, and would, if correct, provide important support for the constituent model. In this paper, we shall critically examine the rationale for this view, and will sketch (informally) an alternative view which we find more defensible. The actual position we shall take for this discussion, however, is conservative in that we will not argue that there is no place whatever for constituent analysis in parsing or in syntactic analysis generally. What we argue is that phrase structure is at least partly redundant in that a direct leap to the composition of some semantic units is possible from a relatively underspecified syntactic representation (as opposed to a complete parse tree).

Reconnaissance-Attack Parsing
Michael B. Kac, Tom Rindflesch
Department of Linguistics
University of Minnesota
Minneapolis, MN 55455
Karen L. Ryan
Computer Sciences Center
Honeywell, Inc.
Minneapolis, MN 55427
COLING'86, pp. 159-160

In this paper we will describe an approach to parsing, one major component of which is a strategy called RECONNAISSANCE-ATTACK. Under this strategy, no structure building is attempted until after completion of a preliminary phase designed to exploit low-level information to the fullest possible extent. This first pass then defines a set of constraints that restrict the set of available options when structure building proper begins. R-A parsing is in principle compatible with a variety of different views regarding the nature of syntactic representation, though it fits more comfortably with some than with others.

Panel: Natural Language Interfaces - Ready for Commercial Success?
Wolfgang Wahlster (Chair)
Department of Computer Science
University of Saarbrücken
D-6600 Saarbrücken 11
Fed. Rep. of Germany
COLING'86, p. 161

STATEMENT BY THE CHAIR (abridged): The goal of this panel is to evaluate three natural language interfaces which were introduced to the commercial market in 1985 (cf. Carnegie Group 1985, Kamins 1985, Texas Instruments 1985) and to relate them to current research in computational linguistics. Each of the commercial systems selected as a starting point for the discussion (see Wahlster 1986 for a functional comparison) was developed by a well-known scientist with considerable research experience in NL processing: LanguageCraft(1) by Carnegie Group (designed under the direction of J. Carbonell), NLMenu by Texas Instruments (designed under the direction of H. Tennant), and Q&A(2) by Symantec (designed under the direction of G. Hendrix).
(1) Trademark of Carnegie Group, Inc. (2) Trademark of Symantec Corporation.

Requirements for Robust Natural Language Interfaces: The LanguageCraft and XCALIBUR Experiences
Jaime G. Carbonell
Carnegie-Mellon University and Carnegie Group, Inc.
Pittsburgh, PA 15213
COLING'86, pp. 162-163

PANELIST STATEMENT (abridged): Natural language interfaces to data bases and expert systems require the investigation of several crucial capabilities in order to be judged habitable by their end users and productive by the developers of applications. User habitability is measured in terms of linguistic coverage, robustness of behavior, and speed of response, whereas implementer productivity is measured by the amount of effort required to connect the interface to a new application, to develop its syntactic and semantic grammar, and to test and debug the resultant system while assuring a certain level of performance. These latter criteria have not been addressed directly by natural language researchers in pure laboratory settings, with the exception of user-defined extensions to an existing interface (e.g., NanoKLAUS, VOX). But, in order to amortize the cost of developing practical, robust, and efficient interfaces over multiple applications, the implementer productivity requirements are as important as user habitability. We treat each set of criteria in turn, drawing from our experience with XCALIBUR and with LanguageCraft, a commercially available environment and run-time module for rapid development of domain-oriented natural language interfaces. In our discussion we distill the general lessons accrued from several years of experience using these systems and conducting several small-scale user studies.

Q&A: Already a Success?
(Responses to the moderator's question, based on Q&A.)
Gary G. Hendrix
Symantec Corporation
Cupertino, CA 95014
COLING'86, pp. 164-166

The Commercial Application of Natural Language Interfaces
Harry Tennant
Computer Science Center
Texas Instruments
Dallas, Texas
COLING'86, p. 167

PANELIST STATEMENT (abridged): I don't think that natural language interfaces are a very good idea.
By that I mean conventional natural language interfaces - the kind where the user types in a question and the system tries to understand it. Oh sure, when (if?) computers have world knowledge that is comparable to what humans need to communicate with each other, natural language interfaces will be easy to build and, depending on what else is available, might be a good way to communicate with computers. But today we are soooo far away from having that much knowledge in a system that conventional natural language interfaces don't make sense. There is something different that makes more sense - NLMenu. It is a combination of menu technology with natural language understanding technology, and it eliminates many of the deficiencies one finds with conventional natural language interfaces while retaining the important benefits. ...end of panel.

The Role of Inversion and PP-Fronting in Relating Discourse Elements
Mark Vincent LaPolla
The Artificial Intelligence Laboratory and The Department of Linguistics
University of Texas at Austin
Austin, Texas
COLING'86, pp. 168-173

This paper will explore and discuss the less obvious ways syntactic structure is used to convey information and how this information could be used by a natural language database system as a heuristic to organize and search a discourse space. The primary concern of this paper will be to present a general theory of processing which capitalizes on the information provided by such non-SVO word orders as inversion, (wh-)clefting, and prepositional phrase (PP) fronting.

Situational Investigation of Presupposition
Seiki Akama
Fujitsu Ltd.
2-4-19 Shin-Yokohama
Yokohama, Japan
Masahito Kawamori
Sophia University
7 Kioicho, Chiyodaku
Tokyo, Japan
COLING'86, pp. 174-176

This paper gives a formal theory of presupposition using the situation semantics developed by Barwise and Perry. We will slightly modify Barwise and Perry's original theory of situation semantics so that we can deal with nonmonotonic reasoning, which is very important for the formalization of presupposition in natural language. This aspect is closely related to the formulation of incomplete knowledge in artificial intelligence.

Linking Propositions
D.S. Brée, R.A. Smit
Rotterdam School of Management
Erasmus University
P.O.B. 1738
NL-3000 DR Rotterdam, The Netherlands
COLING'86, pp. 177-180

The function words of a language provide explicit information about how propositions are to be related. We have examined a subset of these function words, namely the subordinating conjunctions which link propositions within a sentence, using sentences taken from corpora stored on magnetic tape. On the basis of this analysis, a computer program for Dutch language generation and comprehension has been extended to deal with the subordinating conjunctions. We present an overview of the underlying dimensions that were used in describing the semantics and pragmatics of the Dutch subordinating conjunctions. We propose a Universal set of Linking Dimensions (ULD), sufficient to specify the subordinating conjunctions in any language. This ULD is a first proposal for the representation required for a computer program to understand or translate the subordinating conjunctions of any natural language.

Discourse and Cohesion in Expository Text
Allen B. Tucker, Sergei Nirenburg
Department of Computer Science
Colgate University
Victor Raskin
Department of English
Purdue University
COLING'86, pp. 181-183

This paper discusses the role of discourse in expository text, text which typically comprises published scholarly papers, textbooks, proceedings of conferences, and other highly stylized documents. Our purpose is to examine the extent to which those discourse-related phenomena that generally assist the analysis of dialogue text - where speaker, hearer, and speech-act information are more actively involved in the identification of plans and goals - can be used to help with the analysis of expository text. In particular, we make the optimistic assumption that expository text is strongly connected, i.e., that all adjacent pairs of clauses in such a text are connected by "cohesion markers", both explicit and implicit. We investigate the impact that this assumption may have on the depth of understanding that can be achieved, the underlying semantic structures, and the supporting knowledge base for the analysis. An application of this work in designing the AI-based machine translation model TRANSLATOR is discussed in Nirenburg et al. (page 627 of these Proceedings).

Degrees of Understanding
Eva Hajičová, Petr Sgall
Faculty of Mathematics and Physics
Charles University
Malostranské n. 25
Prague 1, Czechoslovakia
COLING'86, pp. 184-186

Along with "static" or "declarative" descriptions of the language system, models of language use (the regularities of communicative competence) are constructed. One of the outstanding aspects of this transfer of attention consists in the efforts devoted to automatic comprehension of natural language which, since Winograd's SHRDLU, have been presented in many different contexts. One speaks about understanding, or comprehension, although it may be noticed that the term is used in different, and often unclear, meanings. In machine translation systems, as the late B. Vauquois pointed out (see now Vauquois and Boitet, 1985), a flexible system combining different levels of automatic analysis is necessary (i.e., the transfer component should be able to operate at different levels). The human factor cannot be completely dispensed with; it seems inevitable to include post-editing, or such a division of labor as that known from the system METEO. Not only should the semantico-pragmatic items present in the source language structure be reflected, but also certain aspects of factual knowledge (see Slocum 1985: 16). It was pointed out by Kirschner (1982: 18) that, to a certain degree, this requirement can be met by means of a system of semantic features. For NL comprehension systems the automatic formulation of a partial image of the world often belongs to the core of the system; such a task certainly goes far beyond pure linguistic analysis and description. Winograd (1976: 269, 275) claims that a linguistic description should handle "the entire complex of the goals of the speaker". It is then possible to ask what are the main features relevant for the patterning of this complex and what are the relationships between understanding all the goals of the speaker and having internalized the system of a natural language. It seems to be worthwhile to reexamine the different kinds and degrees of understanding.

Categorial Unification Grammars
Hans Uszkoreit
Artificial Intelligence Center, SRI International
and Center for the Study of Language and Information, Stanford University
COLING'86, pp. 187-194
Categorial unification grammars (CUGs) embody the essential properties of both unification and categorial grammar formalisms. Their efficient and uniform way of encoding linguistic knowledge in well-understood and widely-used representations makes them attractive for computational applications and for linguistic research. In this paper, the basic concepts of CUGs and simple examples of their application will be presented. It will be argued that the strategies and potentials of CUGs justify their further exploration in the wider context of research on unification grammars. Approaches to selected linguistic phenomena such as long-distance dependencies, adjuncts, word order, and extraposition are discussed. Dependency Unification Grammar Peter Hellwig University of Heidelberg D-6900 Heidelberg, West Germany COLING'86, pp. 195-198 The Weak Generative Capacity of Parenthesis-Free Categorial Grammars Joyce Friedman, Dawei Dai, Weiguo Wang Computer Science Department Boston University 111 Cummington Street Boston, MA 02215 COLING'86, pp. 199-201 Tree Adjoining and Head Wrapping K. Vijay-Shanker, David J. Weir, Aravind K. Joshi Department of Computer and Information Science University of Pennsylvania Philadelphia, PA 19104 This paper describes the analysis component of the language processing system PLAIN from the viewpoint of unification grammars. The principles of Dependency Unification Grammar (DUG) are discussed. The computer language DRL (Dependency Representation Language) is introduced in which DUGs can be formulated. A unification-based parsing procedure is part of the formalism. PLAIN is implemented at the universities of Heidelberg, Bonn, Flensburg, Kiel, Zurich, and Cambridge, U.K. We study the weak generative capacity of a class of parenthesis-free categorial grammars derived from those of Ades and Steedman by varying the set of reduction rules. With forward cancellation as the only rule, the grammars are weakly equivalent to context-free grammars. When a backward combination rule is added, it is no longer possible to obtain all the context-free languages. With suitable restriction of the forward partial rule, the languages are still context-free and a push-down automaton can be used for recognition. Using the unrestricted rule of forward partial combination, a context-sensitive language is obtained. In this paper we discuss the formal relationship between the classes of languages generated by Tree Adjoining Grammars and Head Grammars. In particular, we show that Head Languages are included in Tree Adjoining Languages and that Tree Adjoining Grammars are equivalent to a modification of Head Grammars called Modified Head Grammars. The inclusion of MHL in HL, and thus the equivalence of HGs and TAGs, in the most general case remains to be established. COLING'86, pp. 202-207 Categorial Grammars for Strata of Non-CF Languages and their Parsers Michal P. Chytil Charles University Malostranské nám. 25 118 00 Praha 1, Czechoslovakia We introduce a generalization of categorial grammar extending its descriptive power, and a simple model of a categorial grammar parser. Both tools can be adjusted to particular strata of languages via restricting grammatical or computational complexity. Hans Karlgren KVAL Södermalmstorg 8 116 45 Stockholm, Sweden COLING'86, pp. 208-210 A Simple Reconstruction of GPSG Stuart M.
Shieber Artificial Intelligence Center, SRI International and Center for the Study of Language and Information, Stanford University COLING'86, pp. 211-215 Kind Types in Knowledge Representation K. Dahlgren IBM Los Angeles Scientific Center 11601 Wilshire Blvd. Los Angeles, CA 90025 J. McDowell Department of Linguistics University of Southern California Los Angeles, CA 90089 Like most linguistic theories, the theory of generalized phrase structure grammar (GPSG) has described language axiomatically, that is, as a set of universal and language-specific constraints on the well-formedness of linguistic elements of some sort. The coverage and detailed analysis of English grammar in the ambitious recent volume by Gazdar, Klein, Pullum, and Sag entitled Generalized Phrase Structure Grammar are impressive, in part because of the complexity of the axiomatic system developed by the authors. In this paper, we examine the possibility that simpler descriptions of the same theory can be achieved through a slightly different, albeit still axiomatic, method. Rather than characterize the well-formed trees directly, we progress in two stages by procedurally characterizing the well-formedness axioms themselves, which in turn characterize the trees. This paper describes Kind Types (KT), a system which uses commonsense knowledge to reason about natural language text. KT encodes some of the knowledge underlying natural language understanding, including category distinctions and descriptions differentiating real-world objects, states, and events. It embeds an ontology reflecting the ordinary person's top-level cognitive model of real-world distinctions and a data base of prototype descriptions of real-world entities. KT is transportable, empirically based, and constrained for efficient reasoning in ways similar to human reasoning processes. COLING'86, pp. 216-221 DCKR - Knowledge Representation in Prolog and Its Application to Natural Language Processing Hozumi Tanaka Tokyo Institute of Technology Department of Computer Science O-okayama, 2-12-1, Meguro-ku Tokyo, Japan COLING'86, pp. 222-225 Conceptual Lexicon Using an Object-Oriented Language Shoichi Yokoyama Electrotechnical Laboratory Tsukuba, Ibaraki, Japan Kenji Hanakata Universität Stuttgart Stuttgart, F.R. Germany COLING'86, pp. 226-228 Elementary Contracts as a Pragmatic Basis of Language Interaction E.L. Pershina AI Laboratory, Computer Center Siberian Division of the USSR Ac. Sci. Novosibirsk 630090, USSR COLING'86, pp. 229-231 Communicative Triad as a Structural Element of Language Interaction F.G. Dinenberg AI Laboratory, Computer Center Siberian Division of the USSR Ac. Sci. Novosibirsk 630090, USSR COLING'86, pp. 232-234 TBMS: Domain Specific Text Management and Lexicon Development Semantic processing is one of the important tasks for natural language processing. Basic to semantic processing is the description of lexical items. The most frequently used form of description of lexical items is probably Frames or Objects. Therefore, the form in which Frames or Objects are expressed is a key issue for natural language processing. A method of Object representation in Prolog called DCKR will be introduced. It will be seen that if part of general knowledge and a dictionary are described in DCKR, part of context processing and the greater part of semantic processing can be left to the functions built into Prolog.
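The DCKR abstract above leaves the mechanics implicit: lexical items are encoded as frame-like objects, and much of the semantic work is delegated to the host language's built-in machinery. The following is a minimal sketch of that general idea only, in Python rather than Prolog, and is not DCKR itself; the frame names and slots are invented for illustration. In DCKR proper, as the abstract indicates, the corresponding work is left to Prolog's built-in mechanisms.

```python
# Illustrative sketch only (not DCKR itself): frame-style lexical entries
# encoded as plain data, with slot lookup that follows isa links, so that
# most "semantic processing" reduces to the host language's own lookup and
# recursion. All frame names and slots below are invented examples.

FRAMES = {
    "bird":    {"isa": "animal", "can_fly": True},
    "penguin": {"isa": "bird", "can_fly": False},
    "animal":  {"isa": None, "animate": True},
}

def slot(frame, name):
    """Look up a slot value, inheriting along the isa chain."""
    while frame is not None:
        entry = FRAMES[frame]
        if name in entry:
            return entry[name]
        frame = entry["isa"]
    return None

print(slot("penguin", "can_fly"))   # False (local value overrides)
print(slot("penguin", "animate"))   # True  (inherited from "animal")
```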
This paper describes the construction of a lexicon representing abstract concepts. This lexicon is written in an object-oriented language, CTALK, and forms a dynamic network system controlled by object-oriented mechanisms. The content of the lexicon is constructed using a Japanese dictionary. First, entry words and their definition parts are derived from the dictionary. Second, syntactic and semantic information is analyzed from these parts. Finally, superconcepts are assigned in the superconcept part of an object, static parts in the slot values, and dynamic operations to the message parts, respectively. One word corresponds to one object in a world, but through the superconcept part and the slot part it connects to the subconcepts of other words and worlds. When related concepts are accumulated, the result will be a model of human thought with conscious and unconscious parts. Language interaction (LI) as a part of interpersonal communication is considerably influenced by the psychological and social roles of the partners and their pragmatic goals. These aspects of communication should be accounted for when elaborating advanced user-computer dialogue systems and developing formal models of LI. We propose here a formal description of the communicative context of an LI-situation, namely, a system of indices of LI agents' interest in achieving various pragmatic purposes and a system of contracts which reflect social and psychological roles of the LI agents and conventionalize their "rights" and "duties" in the LI-process. Different values of these parameters of communication allow us to state the possibility and/or necessity of certain types of speech acts under certain conditions of the LI-situation. Research on dialogue natural-language interaction with intelligent "human-computer" systems is based on models of language "human-to-human" interaction, these models representing descriptions of communication laws. An aspect of developing language interaction models is an investigation of dialogue structure. In the paper a notion of elementary communicative triad (SR-triad) is introduced to model the "stimulus-reaction" relation between utterances in the dialogue. The use of the SR-triad apparatus allows us to represent a scheme of any dialogue as a triad structure. The SR-triad structure being inherent in both natural and programming language dialogues, an SR-system is claimed to be necessary when developing dialogue processors. S. Goeser, E. Mergenthaler University of Ulm Federal Republic of Germany The definition of a Text Base Management System is introduced in terms of software engineering. That gives a basis for discussing practical text administration, including questions on corpus properties and appropriate retrieval criteria. Finally, strategies for the derivation of a word data base from an actual TBMS will be discussed. COLING'86, pp. 235-240 Text Analysis and Knowledge Extraction Fujio Nishida, Shinobu Takamatsu, Tadaaki Tani, Hiroji Kusaka Department of Electrical Engineering Faculty of Engineering University of Osaka Prefecture Sakai, Osaka, 591 Japan COLING'86, pp. 241-243 Context Analysis System for Japanese Text Hitoshi Isahara, Shun Ishizaki Electrotechnical Laboratory 1-1-4, Umezono, Sakura-mura, Niihari-gun Ibaraki, Japan 305 COLING'86, pp. 244-246 Disambiguation and Language Acquisition through the Phrasal Lexicon Uri Zernik, Michael G.
Dyer Artificial Intelligence Laboratory Computer Science Department 3531 Boelter Hall University of California Los Angeles, CA 90024 COLING'86, pp. 247-252 Linguistic Knowledge Extraction from Real Language Behavior K. Shirai, T. Hamada Department of Electrical Engineering Waseda University 3-4-1 Ohkubo Shinjuku-ku, Tokyo, Japan COLING'86, pp. 253-255 The study of text understanding and knowledge extraction has been actively pursued by many researchers. The authors also studied a method of structured information extraction from texts without a global text analysis. The method is applicable to comparatively short texts such as a patent claim clause or the abstract of a technical paper. This paper describes the outline of a method of knowledge extraction from longer texts which need a global text analysis. The texts in question are expository or explanatory texts. Expository texts described here mean those which have various hierarchical headings such as a title, a heading for each section, and sometimes an abstract. In this definition, most texts, including technical papers, reports, and newspapers, are expository. Texts of this kind disclose the main knowledge in a top-down manner and show not only the location of an attribute value in a text but also several key points of the content. This property of expository texts contrasts with that of novels and stories in which an unexpected development of the plot is preferred. This paper pays attention to such characteristics of expository texts and describes a method of analyzing texts by referring to information contained in the intersentential relations and the headings of texts and then extracting requested knowledge such as a summary from texts in an efficient way. A natural language understanding system is described which extracts contextual information from Japanese texts. It integrates syntactic, semantic, and contextual processing serially. The syntactic analyzer obtains rough syntactic structures from the text. The semantic analyzer treats modifying relations inside noun phrases and case relations among verbs and noun phrases. Then, the contextual analyzer obtains contextual information from the semantic structure extracted by the semantic analyzer. Our system understands the context using precoded contextual knowledge on terrorism and plugs the event information in input sentences into the contextual structure. The phrase approach to language processing emphasizes the role of the lexicon as a knowledge source. Rather than maintaining a single generic lexical entry for each word, e.g., take, the lexicon contains many phrases, e.g., take on, take to the streets, take to swimming, take over, etc. Although this approach proves effective in parsing and in generation, there are two acute problems which still require solutions. First, due to the huge size of the phrase lexicon, especially when considering subtle meanings and idiosyncratic behavior of phrases, encoding of lexical entries cannot be done manually. Thus phrase acquisition must be employed to construct the lexicon. Second, when a set of phrases is morpho-syntactically equivalent, disambiguation must be performed by semantic means. These problems are addressed in the program RINA. An approach to extracting linguistic knowledge from real language behavior is described. This method depends on the extraction of word relations, patterns of which are obtained by structuring the dependency relations in sentences (called Kakari-Uke relations in Japanese).
As the first step of this approach, an experiment in word classification utilizing those patterns was made on 4178 sentences of real language data. A system was built to analyze the dependency structure of sentences utilizing the knowledge base obtained through this word classification, and the effectiveness of the knowledge base was evaluated. To develop this approach further, a relation matrix which captures the multiple interactions of words is proposed. Tailoring Importance Evaluation to Reader's Goals: A Contribution to Descriptive Text Summarization Danilo Fum, Giovanni Guida, Carlo Tasso Istituto di Matematica, Informatica e Sistemistica Università di Udine, Italy COLING'86, pp. 256-259 Domain Dependent Natural Language Understanding Klaus Heje Munch Department of Computer Science Technical University of Denmark DK-2800 Lyngby, Denmark COLING'86, pp. 260-262 Morphological Analysis for a German Text-to-Speech System Amanda Pounder, Markus Kommenda Institut für Nachrichtentechnik und Hochfrequenztechnik Technische Universität Wien Gusshausstrasse 25, A-1040 Wien, Austria COLING'86, pp. 263-268 Synergy of Syntax and Morphology in Automatic Parsing of French Language with a Minimum of Data Jacques Vergne, Pascale Pagès Inalco Paris This paper deals with a new approach to importance evaluation of descriptive texts developed in the framework of SUSY, an experimental system in the domain of text summarization. The problem of taking into account the reader's goals in evaluating the importance of different parts of a text is first analyzed. A solution to the design of a goal interpreter capable of computing a quantitative measure of the relevance degree of a piece of text according to a given goal is then proposed, and an example of goal interpreter operation is provided. A natural language understanding system for a restricted domain of discourse - thermodynamic exercises at an introductory level - is presented. The system transforms texts into a formal meaning representation language based on cases. The semantic interpretation of sentences and phrases is controlled by case frames formulated around verbs and surface grammatical roles in noun phrases. During the semantic interpretation of a text, semantic constraints may be imposed on elements of the text. Each sentence is analyzed with respect to context, making the system capable of resolving anaphoric references such as definite descriptions, pronouns, and elliptic constructions. The system has been implemented and successfully tested on a selection of exercises. A central problem in speech synthesis with unrestricted vocabulary is the automatic derivation of correct pronunciation from the graphemic form of a text. The software module GRAPHON was developed to perform this conversion for German and is currently being extended by a morphological analysis component. This analysis is based on a morph lexicon and a set of rules and structural descriptions for German word-forms. It provides each text input item with an individual characterization such that the phonological, syntactic, and prosodic components may operate upon it. This systematic approach thus serves to minimize the number of wrong transcriptions and at the same time lays the foundation for the generation of stress and intonation patterns, yielding more intelligible, natural-sounding, and generally acceptable synthetic speech.
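The GRAPHON abstract describes a morph-lexicon-based decomposition step that feeds the phonological and prosodic rules. As a rough illustration of what such a first step can look like - and only that; this is not the GRAPHON algorithm, and the lexicon entries are invented - a greedy longest-match segmenter with backtracking might be sketched as follows.

```python
# Minimal sketch under stated assumptions (not GRAPHON): decompose a German
# word-form into known morphs by greedy longest-match against a small morph
# lexicon, backtracking when the remainder cannot be covered. A real system
# would add structural descriptions and rules on top of the segmentation.

MORPHS = {"haus": "N-stem", "tür": "N-stem", "schlüssel": "N-stem", "en": "infl"}

def segment(word):
    """Return a list of (morph, class) pairs covering the word, or None."""
    word = word.lower()
    if not word:
        return []
    # try the longest known prefix first, backtracking if the rest fails
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        if prefix in MORPHS:
            rest = segment(word[end:])
            if rest is not None:
                return [(prefix, MORPHS[prefix])] + rest
    return None

print(segment("Haustüren"))  # [('haus', 'N-stem'), ('tür', 'N-stem'), ('en', 'infl')]
```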
We intend to present in this paper a parsing method for the French language whose particularities are: a multi-level approach, with syntax and morphology working simultaneously; the use of string pattern matching; and the absence of a dictionary. We want here to evaluate the feasibility of the method rather than to present an operational system. COLING'86, pp. 269-271 A Morphological Recognizer with Syntactic and Phonological Rules John Bear Artificial Intelligence Center SRI International 333 Ravenswood Avenue Menlo Park, CA 94025 This paper describes a morphological analyzer which, when parsing a word, uses two sets of rules: rules describing the syntax of words, and rules describing facts about orthography. COLING'86, pp. 272-276 A Dictionary and Morphological Analyser for English G.J. Russell, S.G. Pulman Computer Laboratory University of Cambridge G.D. Ritchie, A.W. Black Department of Artificial Intelligence University of Edinburgh COLING'86, pp. 277-279 This paper describes the current state of a three-year project aimed at the development of software for use in handling large quantities of dictionary information within natural language processing systems. The project ... is one of three closely related projects funded under the Alvey IKBS Programme (Natural Language Theme); a parser is under development at Edinburgh by Henry Thompson and John Phillips, and a sentence grammar is being devised by Ted Briscoe and Clare Grover at Lancaster and Bran Boguraev and John Carroll at Cambridge. It is intended that the software and rules produced by all three projects will be directly compatible and capable of functioning in an integrated system. A Kana-Kanji Translation System for Non-Segmented Input Sentences based on Syntactic and Semantic Analysis Masahiro Abe, Yoshimitsu Ooshima, Katsuhiko Yuura, Nobuyuki Takeichi Central Research Laboratory Hitachi, Ltd. Kokubunji, Tokyo, Japan COLING'86, pp. 280-285 A Compression Technique for Arabic Dictionaries: The Affix Analysis Abdelmajid Ben Hamadou Department of Computer Science FSEG Faculty B.P. 69 - Route de l'aéroport Sfax, Tunisia COLING'86, pp. 286-288 Machine Learning of Morphological Rules by Generalization and Analogy Klaus Wothke Arbeitsstelle Linguistische Datenverarbeitung Institut für Deutsche Sprache Mannheim, West Germany COLING'86, pp. 289-293 Linguistic Developments in Eurotra since 1983 Lieven Jaspaert This paper presents a disambiguation approach for translating non-segmented Kana into Kanji. The method consists of two steps. In the first step, an input sentence is analyzed morphologically and ambiguous morphemes are stored in a network form. In the second step, the best path, which is a string of morphemes, is selected by syntactic and semantic analysis based on case grammar. In order to avoid the combinatorial explosion of possible paths, the following heuristic search method is adopted. First, a path that contains the smallest number of weighted morphemes is chosen as the quasi-best path by a best-first-search technique. Next, the restricted range of morphemes near the quasi-best path is extracted from the morpheme network to construct preferential paths. An experimental system incorporating large dictionaries has been developed and evaluated; a translation accuracy of 90.5% was obtained. This can be improved to about 95% by optimizing the dictionaries.
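The heuristic search described in the Hitachi abstract - choosing a quasi-best path through the morpheme network by best-first search over weighted morphemes - can be pictured with a small sketch. The lattice format, the weights, and the function below are illustrative assumptions, not the system's actual data structures; the second stage the abstract mentions would then re-examine morphemes near the returned path.

```python
# Minimal sketch under stated assumptions (not the Hitachi system): best-first
# search for the lowest-cost path through a morpheme lattice, where the cost of
# a path is the sum of its morpheme weights. The toy lattice below is invented.
import heapq

# lattice[pos] = list of (morpheme, weight, next_pos) edges starting at pos
def best_path(lattice, start, goal):
    heap = [(0.0, start, [])]            # (cost so far, position, morphemes)
    seen = {}
    while heap:
        cost, pos, path = heapq.heappop(heap)
        if pos == goal:
            return cost, path
        if pos in seen and seen[pos] <= cost:
            continue
        seen[pos] = cost
        for morpheme, weight, nxt in lattice.get(pos, []):
            heapq.heappush(heap, (cost + weight, nxt, path + [morpheme]))
    return None

# toy lattice over character positions 0..3 of a kana string
lattice = {0: [("きょう", 1.0, 3), ("き", 1.2, 1)], 1: [("ょう", 1.1, 3)]}
print(best_path(lattice, 0, 3))   # (1.0, ['きょう'])
```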
In every application that concerns the automatic processing of natural language, the problem of dictionary size is posed. In this paper we propose a dictionary compression algorithm based on an affix analysis of non-diacritical Arabic. It consists in decomposing a word into its first elements, taking into account the different linguistic transformations that can affect the morphological structures. This work has been achieved as part of a study of the automatic detection and correction of spelling errors in non-diacritical Arabic texts. This paper describes an experimental procedure for the inductive automated learning of morphological rules from examples. At first an outline of the problem is given. Then a formalism for the representation of morphological rules is defined. This formalism is used by the automated procedure, whose anatomy is subsequently presented. Finally, the performance of the system is evaluated and the most important unsolved problems are discussed. I wish to put the theory and metatheory currently adopted in the Eurotra project into a historical perspective, indicating where and why changes to its basic design for a transfer-based MT (TBMT) system have been made. Katholieke Universiteit Leuven Belgium COLING'86, pp. 294-296 The <C,A> Framework in Eurotra: A Theoretically Committed Notation for MT D.J. Arnold University of Essex Colchester, Essex CO4 3SQ, UK S. Krauwer, L. des Tombe University of Utrecht Trans 14, 3512 JK Utrecht, The Netherlands This paper describes a model for MT, developed within the Eurotra MT project, based on the idea of compositional translation, by describing a basic, experimental notation which embodies the idea. The introduction provides background, section 1 introduces the basic ideas and the notation, and section 2 discusses some of the theoretical and practical implications of the model, including some concrete extensions, and some more speculative discussion. M. Rosner ISSCO 54, Route des Acacias 1227 Geneva, Switzerland G.B. Varile Commission of the European Communities L-2928 Luxembourg COLING'86, pp. 297-303 Generating Semantic Structures in Eurotra-D Erich Steiner IAI - Eurotra-D Martin-Luther-Strasse 14 D-6600 Saarbrücken, West Germany COLING'86, pp. 304-306 Valency Theory in a Stratificational MT System Paul Schmidt IAI Eurotra-D Martin-Luther-Strasse 14 D-6600 Saarbrücken, West Germany COLING'86, pp. 307-312 A Compositional Approach to the Translation of Temporal Expressions in the Rosetta System Lisette Appelo Philips Research Laboratories Eindhoven, The Netherlands The following paper is based on work done in the multi-lingual MT project Eurotra, an MT project of the European Community. Analysis and generation of clauses within the Eurotra framework proceeds through the levels of (at least) Eurotra constituent structure (ECS), Eurotra relation structure (ERS), and interface structure (IS). At IS, the labelling of nodes consists of labellings for time, modality, semantic features, semantic relations, and others. In this paper, we shall be concerned exclusively with semantic relations (SRs), to which we shall also refer as "participant roles" (PR). According to current Eurotra legislation, these SRs are assigned to dictionary entries of verbs (and other word classes, which will be disregarded in this paper) by coders, and through these entries to clauses in a pattern matching process.
This approach, while certainly valid in principle, leads to the problem of inter-coder consistency, at least as long as the means for identifying SRs are paraphrase tests for SRs. In Eurotra-D, we have for some time now been experimenting with a set of SRs, or PRs, which are identified with the help of syntactic criteria. This approach will be outlined in this paper. This paper tries to investigate valency theory as a linguistic tool in machine translation. There are three main areas in which major questions arise: (1) Valency theory itself. I sketch a valency theory in linguistic terms which includes a discussion of the nature of dependency representation as an interface for semantic description. (2) The dependency representation in the translation process. I try to sketch the different roles of dependency representation in analysis and generation. (3) The implementation of valency theory in an MT system. I give a few examples of how a valency description could be implemented in the Eurotra formalism. This paper discusses the translation of temporal expressions in the framework of the machine translation system Rosetta. The translation method of Rosetta, the "isomorphic grammar method", is based on Montague's Compositionality Principle. It shows that a compositional approach leads to a transparent account of the complex aspects of time in natural language and can be used for the translation of temporal expressions. COLING'86, pp. 313-318 Idioms in the Rosetta Machine Translation System André Schenk Philips Research Laboratories Eindhoven, The Netherlands COLING'86, pp. 319-324 NARA: A Two-Way Simultaneous Interpretation System between Korean and Japanese - A Methodological Study Hee Sung Chung, Tosiyasu L. Kunii Department of Information Science Faculty of Science, University of Tokyo 7-3-1 Hongo, Bunkyo-ku Tokyo, 113 Japan This paper discusses one of the problems of machine translation, namely the translation of idioms. The paper describes a solution to this problem within the theoretical framework of the Rosetta machine translation system. Rosetta is an experimental translation system which uses an intermediate language and translates between Dutch, English, and, in the future, Spanish. This paper presents a new computing model for constructing a two-way simultaneous interpretation system between Korean and Japanese. We also propose several methodological approaches to the construction of a two-way simultaneous interpretation system, and realize the two-way interpreting process as a model unifying both linguistic competence and linguistic performance. The model is verified theoretically and through actual applications. COLING'86, pp. 325-328 Strategies for Interactive Machine Translation: The Experience and Implications of the UMIST Japanese Project P.J. Whitelock, M. McGee Wood, B.J. Chandler, N. Holden, H.J. Horsfall Centre for Computational Linguistics University of Manchester Institute of Science and Technology PO Box 88, Manchester M60 1QD UK COLING'86, pp. 329-334 Pragmatics in Machine Translation Annely Rothkegel Universität Saarbrücken Sonderforschungsbereich 100 Elektronische Sprachforschung D-6600 Saarbrücken, West Germany COLING'86, pp.
335-337 A Metric for Computational Analysis of Meaning: Toward an Applied Theory of Linguistic Semantics Sergei Nirenburg Department of Computer Science Colgate University Hamilton, NY 13346 SERGEI@COLGATE Victor Raskin Department of English Purdue University West Lafayette, IN 47907 JHP@PURDUE-ASC.CSNET At the Centre for Computational Linguistics, we are designing and implementing an English-to-Japanese interactive machine translation system. The project is funded jointly by the Alvey Directorate and International Computers Limited (ICL). The prototype system runs on the ICL PERQ, though much of the development work has been done on a VAX 11/750. It is implemented in Prolog, in the interests of rapid prototyping, but intended for optimization. The informing principles are those of modern complex-feature-based linguistic theories, in particular Lexical-Functional Grammar (Bresnan (ed.) 1982, Kaplan and Bresnan 1982) and Generalized Phrase Structure Grammar (Gazdar et al. 1985). For development purposes we are using an existing corpus of 10,000 words of continuous prose from the PERQ's graphics documentation; in the long term, the system will be extended for use by technical writers in fields other than software. At the time of writing, we have well-developed system development software, user interface, and grammar and dictionary handling facilities. The English analysis grammar handles most of the syntactic structures of the corpus, and we have a range of formats for output of linguistic representations and Japanese text. A transfer grammar for English-Japanese has been prototyped, but is not yet fully adequate to handle all constructions in the corpus; a facility for dictionary entry in Kanji is incorporated. The aspect of the system we will focus on in the present paper is its interactive nature, discussing the range of different types of interaction which are provided or permitted for different types of users. TEXAN is a system of transfer-oriented text analysis. Its linguistic concept is based on a communicative approach within the framework of speech act theory. In this view texts are considered to be the result of linguistic actions. It is assumed that they control the selection of translation equivalents. The transition of this concept of linguistic actions (text acts) to the model of computer analysis is performed by a context-free illocution grammar processing categories of actions and a propositional structure of states of affairs. The grammar, which is related to a text lexicon, provides the connection between these categories and the linguistic surface units of a single language. A metric for assessing the complexity of semantic (and pragmatic) analysis in natural language processing is proposed as part of a general applied theory of linguistic semantics for NLP. The theory is intended as a complete projection of linguistic semantics onto NLP and is designed as an exhaustive list of possible choices among strategies of semantic analysis at each level, from the word to the entire text. The alternatives are summarized in a chart, which can be completed for each existing or projected NLP system. The remaining components of the applied theory are also outlined. COLING'86, pp.
338-340 Collative Semantics Dan Fass Computing Research Laboratory New Mexico State University Las Cruces, NM 88003 COLING'86, pp. 341-343 This paper introduces Collative Semantics (CS), a new domain-independent semantics for natural language processing (NLP) which addresses the problems of lexical ambiguity, metonymy, various semantic relations (conventional relations, redundant relations, contradictory relations, metaphorical relations, and severely anomalous relations), and the introduction of new information. We explain the two techniques CS uses for matching together knowledge structures (KSs) and why semantic vectors, which record the results of such matches, are informative enough to tell apart semantic relations and be the basis for lexical disambiguation. A Logical Formalism for the Representation of Determiners Barbara Di Eugenio, Leonardo Lesmo, Paolo Pogliano, Pietro Torasso, Francesco Urbano Dipartimento di Informatica Università di Torino Via Valperga Caluso 37 10125 Torino - Italy Determiners play an important role in conveying the meaning of an utterance, but they have often been disregarded, perhaps because it seemed more important to devise methods to grasp the global meaning of a sentence, even if not in a precise way. Another problem with determiners is their inherent ambiguity. In this paper we propose a logical formalism which, among other things, is suitable for representing determiners without forcing a particular interpretation when their meaning is still not clear. COLING'86, pp. 344-346 A Compositional Semantics for Directional Modifiers - Locative Case Reopened Erhard W. Hinrichs Bolt Beranek & Newman Laboratories 10 Moulton Street Cambridge, MA 02238 This paper presents a model-theoretic semantics for directional modifiers in English. The semantic theory presupposed for the analysis is that of Montague Grammar (cf. Montague 1970, 1973), which makes it possible to develop a strongly compositional treatment of directional modifiers. Such a treatment has significant computational advantages over case-based treatments of directional modifiers that are advocated in the AI literature. COLING'86, pp. 347-349 Temporal Relations in Texts and Time Logical Inferences Jürgen Kunze Central Institute of Linguistics Academy of Sciences of the GDR DDR-1100 Berlin COLING'86, pp. 350-352 Linguistic Bases for Machine Translation Christian Rohrer Institut für Linguistik Universität Stuttgart Keplerstraße 17 7000 Stuttgart 1 A calculus is presented which allows an efficient treatment of the following components: tenses, temporal conjunctions, temporal adverbials (of "definite" type), temporal quantifications, and phases. The phases are a means for structuring the set of time-points t where a certain proposition is valid. For one proposition, there may exist several "phase"-perspectives. The calculus has integrative properties, i.e., all five components are represented by the same formal means. This renders possible a rather easy combination of all information and conditions coming from the aforesaid components. My aim in organizing this panel is to stimulate the discussion between researchers working on MT and linguists interested in formal syntax and semantics. I am convinced that a closer cooperation will be fruitful for both sides. I will be talking about experimental MT or MT as a research project and not as a development project.
COLING'86, pp. 353-355 Combining Deictic Gestures and Natural Language for Referent Identification Alfred Kobsa, Jürgen Allgayer, Carola Reddig, Norbert Reithinger, Dagmar Schmauks, Karin Harbusch, Wolfgang Wahlster SFB 314: AI - Knowledge-Based Systems University of Saarbrücken D-6600 Saarbrücken 11, West Germany COLING'86, pp. 356-361 In virtually all current natural-language dialog systems, users can only refer to objects by using linguistic descriptions. However, in human face-to-face conversation, participants frequently use various sorts of deictic gestures as well. In this paper, we will present the referent identification component of XTRA, a system for natural-language access to expert systems. XTRA allows the user to combine NL input with pointing gestures on the terminal screen in order to refer to objects on the display. Information about the location and type of this deictic gesture, as well as about the linguistic description of the referred object, the case frame, and the dialog memory, is utilized for identifying the object. The system is tolerant with respect to imprecision of both the deictic and the natural language input. The user can thereby refer to objects more easily, avoid referential failures, and employ vague everyday terms instead of precise technical notions. An Approach to Non-Singular Terms in Discourse Tomek Strzalkowski School of Computing Science Simon Fraser University Burnaby, B.C. Canada V5A 1S6 COLING'86, pp. 362-364 Processing Clinical Narratives in Hungarian Gábor Prószéky National Education Library and Museum Computer Department Honvéd u. 19 H-1055 Budapest, Hungary COLING'86, pp. 365-367 Definite Noun Phrases and the Semantics of Discourse Manfred Pinkal c/o Fraunhofer-Institute IAO Holzgartenstrasse 17, D-7000 Stuttgart 1 and Institut für Linguistik Universität Stuttgart COLING'86, pp. 368-373 Learning the Space of Word Meanings for Information Retrieval Systems Koichi Hori, Seinosuke Toda, Hisashi Yasunaga National Institute of Japanese Literature 1-16-10 Yutakacho Shinagawa-ku Tokyo 142 Japan COLING'86, pp. 374-379 On the Use of Term Associations in Automatic Information Retrieval Gerard Salton Department of Computer Science Cornell University Ithaca, NY 14853 COLING'86, pp. 380-386 A new Theory of Names and Descriptions that offers a uniform treatment for many types of non-singular concepts found in natural language discourse is presented. We introduce a layered model of the language denotational base (the universe) in which every world object is assigned a layer (level) reflecting its relative singularity with respect to other objects in the universe. We define the notion of relative singularity of world objects as an abstraction class of the layer-membership relation. This paper describes a system that extracts information from Hungarian descriptive texts of the medical domain. Texts of clinical narratives define a sublanguage that uses limited syntax but retains the main characteristics of the language, namely free word order and rich morphology. We offer a fairly general parsing method for free word order languages and show how to use it for parsing Hungarian clinical texts. The system can handle simple cases of ellipsis, anaphora, unknown words, and typical abbreviations of clinical practice.
The system translates texts of anamneses, patient visits, laboratory tests, medical examinations, and discharge summaries into an information format usable for a medical expert system. Like this expert system, the information formatting program has been written in the MPROLOG language, and its experimental version runs on PROPER-16, a Hungarian-made (IBM-XT compatible) microcomputer. In this talk I will first give a short overview of the basic Discourse Representation Theory system (Kamp 1981), and sketch Kamp's proposal for the treatment of definite noun phrases. Then I will indicate how the basic reference-establishing function and the "side-effects" of different types of definite NPs can be described in more detail. In doing this, I will refer to the work on anaphora done in the NLP area (especially by Barbara Grosz, Candy Sidner, and Bonnie Webber), integrating some of their assumptions into the DRT framework, and critically commenting on some others. Several methods to represent meanings of words have been proposed. However, they are not useful for information retrieval systems, because they cannot deal with entities that cannot be universally represented by symbols. In this paper we propose a notion of semantic space. Semantic space is a Euclidean space in which words and entities are placed. A word is one point in the space. The meanings of the word are represented as the space configuration around the word. The entities that cannot be represented by symbols can be identified in the space by the location in which the entity should be placed. We also give a learning mechanism for the space. We prove the effectiveness of the proposed method by an experiment on information retrieval for the study of Japanese literature. It has been recognized that single words extracted from natural language texts are not always useful for the representation of information content. Associated or related terms, and complex content identifiers derived from thesauruses and knowledge bases, or constructed by automatic word grouping techniques, have therefore been proposed for text identification purposes. The area of associative content analysis and information retrieval is reviewed in this study. The available experimental evidence shows that none of the existing or proposed methodologies are guaranteed to improve retrieval performance in a replicable manner for document collections in different subject areas. The associative techniques are most valuable for restricted environments covering narrow subject areas, or in iterative search situations where user inputs are available to refine previously available query formulations and search output. Towards the Automatic Acquisition of Lexical Data H. Trost, E. Buchberger Department of Medical Cybernetics and Artificial Intelligence University of Vienna, Austria COLING'86, pp. 387-389 PeriPhrase: Lingware for Parsing and Structural Transfer Kenneth R. Beesley, David Hefner A.L.P. Systems 190 West 800 North Provo, UT 84604 Creating a knowledge base has always been a bottleneck in the implementation of AI systems. This is also true for Natural Language Understanding systems, particularly for data-driven ones. While a perfect system for automatic acquisition of all sorts of knowledge is still far from being realized, partial solutions are possible. This holds especially for lexical data.
Nevertheless, the task is not trivial, in particular when dealing with languages rich in inflectional forms like German. Our system is to be used by persons with no specific linguistic knowledge, so linguistic expertise has been put into the system to ensure correct classification of words. Classification is done by means of a small rule-based system with lexical knowledge and language-specific heuristics. The key idea is the identification of three sorts of knowledge which are processed distinctly, and the optimal use of knowledge already contained in the existing lexicon. PeriPhrase is a high-level computer language developed by A.L.P. Systems to facilitate parsing and structural transfer. It is designed to speed the development of computer-assisted translation systems and grammar checkers. We describe the syntax and semantics of this tool, its integrated development environment, and some of our experience with it. COLING'86, pp. 390-392 SCSL: A Linguistic Specification Language for MT Rémi Zajac GETA, BP 68 Université de Grenoble 38402 Saint-Martin-d'Hères, France COLING'86, pp. 393-398 A User Friendly ATN Programming Environment (APE) Hans Haugeneder, Manfred Gehrke Siemens AG, ZT ZTI INF West Germany Nowadays, MT systems grow to such a size that a first specification step is necessary if we want to be able to master their development and maintenance, for the software part as well as for the linguistic part ("lingware"). Advocating a clean separation between linguistic tasks and programming tasks, we first introduce a specification/implementation/validation framework for NLP, then SCSL, a language for the specification of analysis and generation modules. APE is a workbench for developing ATN grammars based on an active chart parser. It represents the networks graphically and supports the grammar writer with window- and menu-based debugging techniques. COLING'86, pp. 399-401 A Language for Transcriptions Yves LePage GETA, BP 68 Université Scientifique et Médicale de Grenoble 38402 Saint-Martin-d'Hères, France Dealing with specific alphabets is a necessity in natural language processing. In Grenoble, this problem is solved with the help of transcriptions. Here we present a language (LT) designed for the rapid writing of mappings from one transcription to another (transducers) and give some examples of its use. COLING'86, pp. 402-404 Variables et Catégories Grammaticales dans un Modèle Ariane Jean-Philippe Guilbaud GETA, BP 68 Université Scientifique et Médicale de Grenoble 38402 Saint-Martin-d'Hères, France pp. 405-407 All the grammatical categories used in an Ariane translation model are formalized and coded mnemonically as variables and variable values. The set of variables of a given model constitutes the vocabulary of the metalanguage used to describe the source and target languages of that model. The system's data structure is a tree, each node of which carries a decoration. The decorations contain the variables declared for the system, assigned certain values. The variables also appear in the analysis, transfer, and generation grammars, in the monolingual analysis and generation dictionaries and the bilingual lexical transfer dictionaries, as well as in the linguistic model specifications (static grammars).
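The Ariane abstract above describes trees whose nodes carry decorations, i.e., assignments of values to declared variables. A minimal sketch of such a decorated node, with invented variable names and values and no claim to match the Ariane implementation, could look like this.

```python
# Illustrative sketch only (assumptions, not the Ariane system): a node in a
# decorated tree, where the decoration maps declared variables (grammatical
# categories) to values. Variable names and values below are invented.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    decoration: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

# a tiny source-language fragment: a clause node dominating a verb node
clause = Node({"CAT": "CLAUSE", "TENSE": "PRESENT"},
              [Node({"CAT": "V", "LEX": "parler", "PERSON": "3"})])

def nodes_with(tree, var, value):
    """Collect all nodes whose decoration assigns `value` to `var`."""
    hits = [tree] if tree.decoration.get(var) == value else []
    for child in tree.children:
        hits.extend(nodes_with(child, var, value))
    return hits

print(len(nodes_with(clause, "CAT", "V")))   # 1
```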
Déduction Automatique et Systèmes Transformationnels J. Chauché C.E.L.T.A. 23, Boulevard Albert 1er 54000 Nancy, France Transformational systems use deductive processes whose approach differs from that of the systems used in artificial intelligence. Through a comparison of the Prolog language and the Sygmart language, it is shown how applications using reasoning and knowledge bases can be realized in transformational systems. COLING'86, pp. 408-411 CRITAC - A Japanese Text Proofreading System Koichi Takeda, Tetsunosuke Fujisaki, Emiko Suzuki Japan Science Institute IBM Japan, Ltd. 5-19 Sanban-cho, Chiyoda-ku, Tokyo 102, Japan CRITAC (CRITiquing using ACcumulated knowledge) is an experimental expert system for proofreading Japanese text. It detects mistypes, Kana-to-Kanji misconversions, and stylistic errors. This system combines Prolog-coded heuristic knowledge with conventional Japanese text processing techniques which involve heavy computation and access to large language data bases. COLING'86, pp. 412-417 Storing Text using Integer Codes Raja Noor Ainon Computer Centre University of Malaya 59100 Kuala Lumpur, Malaysia COLING'86, pp. 418-420 BetaText: An Event Driven Text Processing and Text Analyzing System Benny Brodda Department of Linguistics University of Stockholm S-106 91 Stockholm, Sweden COLING'86, pp. 421-422 Toward Integrated Dictionaries for M(a)T Ch. Boitet, N. Nedobejkine GETA, BP 68 Université de Grenoble 38402 Saint-Martin-d'Hères, France COLING'86, pp. 423-428 Indexage Lexical au GETA Jedrzej Bukowski Traditionally, text is stored on computers as a stream of characters. The goal of this research is to store text in a form that facilitates word manipulation whilst reducing storage space. A word list with syntactic linear ordering is stored, and words in a text are given two-byte integer codes that point to their respective positions in this list. The implementation of the encoding scheme is described and the performance statistics of this encoding scheme are presented. BetaText can be described as an event driven production system, in which (combinations of) text events lead to certain actions, such as the printing of sentences that exhibit certain, say, syntactic phenomena. The analysis mechanism used allows for arbitrarily complex parsing, but is particularly suitable for finite state parsing. A careful investigation of what is actually needed in linguistically relevant text processing resulted in a rather small but carefully chosen set of "elementary actions" to be implemented. In the framework of Machine (aided) Translation systems, two types of lexical knowledge are used, "natural" and "formal", in the form of on-line terminological resources for human translators or revisors and of coded dictionaries for Machine Translation proper. A new organization is presented, which allows one to integrate both types in a unique structure, called "fork" integrated dictionary, or FID. A given FID is associated with one natural language and may give access to translations into several other languages. The FIDs associated with languages L1 and L2 contain all information necessary to generate coded dictionaries of M(a)T systems translating from L1 into L2 or vice versa. The skeleton of a FID may be viewed as a classical bilingual dictionary. Each item is a tree structure, constructed by taking the "natural" information (a tree) and "grafting" onto it some "formal" information.
Various aspects of this design are refined and illustrated by detailed examples, several scenarios for the construction of FIDs are presented, and some problems of organization and implementation are discussed. A prototype implementation of the FID structure is underway in Grenoble. GETA, BP 68 Université Scientifique et Médicale de Grenoble 38402 Saint-Martin-d'Hères, France COLING'86, pp. 429-431 The lexicographic aspect of computer-aided translation is presented and illustrated with examples of Russian-to-French translation carried out by GETA in Grenoble. Experiments with an MT-Directed Lexical Knowledge Bank A crucial test for any MT system is its power to solve lexical ambiguities. The size of the lexicon, its structural principles, and the availability of extra-linguistic knowledge are the most important aspects in this respect. This paper outlines the experimental development of the SWESIL system: a structured lexicon-based word expert system designed to play a pivotal role in the process of Distributed Language Translation which is being developed in the Netherlands. It presents SWESIL's organizing principles, gives a short description of the present experimental set-up, and shows how SWESIL is being tested at this moment. B.C. Papegaaij, V. Sadler, A.P.M. Witkam BSO/Research Bureau voor Systeemontwikkeling P.O. Box 8348 3503 RH Utrecht, The Netherlands COLING'86, pp. 432-434 A Word Database for Natural Language Processing Brigitte Barnett, Hubert Lehmann, Magdalena Zoeppritz IBM Scientific Center Tiergartenstraße 15 6900 Heidelberg, Federal Republic of Germany COLING'86, pp. 435-440 Lexical Database Design: The Shakespeare Dictionary Model H. Joachim Neuhaus The paper describes the design of a fair-sized lexical data base that is to be used with a natural language based expert system with German as the language of interaction. Sources for entries and tools for constructing and maintaining the database are discussed, as well as the information needed in the lexicon for the purposes of syntactic and semantic processing. This paper describes the data and presents some preliminary design considerations along with a sample schema. Westfälische Wilhelms-Universität, FB 12 D-4400 Münster, West Germany COLING'86, pp. 441-444 An Attempt to Automatic Thesaurus Construction from an Ordinary Japanese Language Dictionary Hiroaki Tsurumaru Department of Electronics Nagasaki University Nagasaki 852, Japan How to obtain hierarchical relations (e.g., the superordinate-hyponym relation, the synonym relation) is one of the most important problems for thesaurus construction. A pilot system for extracting these relations automatically from an ordinary Japanese language dictionary (Shinmeikai Kokugojiten, published by Sansei-do, in machine readable form) is given. The features of the definition sentences in the dictionary, the mechanical extraction of the hierarchical relations, and the estimation of the results are discussed. Toru Hitaka, Sho Yoshida Department of Electronics Kyushu University 36 Fukuoka 812, Japan COLING'86, pp.
445-447 Acquisition of Knowledge Data by Analyzing Natural Language Yasuhito Tanaka Himeji College 1-1-12 Shinzaike Honmachi Himeji City Hyogoken 670 Japan Automatic identification of homonyms in kana-to-kanji conversion systems and of multivocal words in machine translation systems cannot be sufficiently implemented by the mere combination of grammar and word dictionaries. This calls for a new concept of knowledge data. What the new knowledge data is and how it can be acquired are described in the paper. In natural language research, there has been active discussion within the framework of knowledge and samples of knowledge. Sho Yoshida Kyushu University 6-10-1 Hakozaki Higashiku Fukuoka City Fukuokaken 812 Japan COLING'86, pp. 448-450 Model for Lexical Knowledge Base Michio Isoda, Hideo Aiso Faculty of Science and Technology Keio University Noriyuki Kamibayashi, Yoshifumi Matsunaga System Technology Laboratory Fuji Xerox Co., Ltd. COLING'86, pp. 451-453 User Specification of Syntactic Case Frames in TELI, A Transportable, User-Customized Natural Language Processor Bruce W. Ballard AT&T Bell Laboratories 600 Mountain Avenue Murray Hill, NJ 07974 COLING'86, pp. 454-460 Functional Structures for Parsing Dependency Constraints H. Jäppinen, A. Lehtola, K. Valkonen SITRA Foundation P.O. Box 329 Helsinki, Finland and Helsinki University of Technology This paper describes a model for a lexical knowledge base (LKB). An LKB is a knowledge base management system (KBMS) which stores various kinds of dictionary knowledge in a uniform framework and provides multiple viewpoints on the stored knowledge. KBMSs for natural language knowledge will be fundamental components of knowledgeable environments where non-computer professionals can use various kinds of support tools for document preparation or translation. However, basic models for such KBMSs have not been established yet. Thus, we propose a model for an LKB focusing on dictionary knowledge such as that obtained from machine-readable dictionaries. When an LKB is given a key from a user, it accesses the stored knowledge associated with that key. In addition to conventional direct retrieval, the LKB has a more intelligent access capability to retrieve related knowledge through relationships among knowledge units. To represent complex and irregular relationships, we employ the notion of implicit relationships. In contrast to conventional database models where relationships between data items are statically defined at data generation time, the LKB extracts relationships dynamically by interpreting the contents of stored knowledge at run time. This makes the LKB more flexible; users can add new functions or new knowledge incrementally at any time. The LKB also has the capability to define and construct new virtual dictionaries from existing dictionaries. Thus users can define their own customized dictionaries suitable for their specific purposes. The proposed model provides a logical foundation for building flexible and intelligent LKBs. In this paper, we present methods that allow the users of a natural language processor (NLP) to define, inspect, and modify any case frame information associated with the words and phrases known to the system. An implementation of this work forms a critical part of the Transportable English-Language Interface (TELI) system.
However, our techniques have enabled customization capabilities largely independent of the specific NLP for which information is being acquired. The primary goal of the syntactic acquisitions of TELI is to redress the fact that many NL prototypes have failed (1) to make known to users exactly what inputs are allowed (e.g., what words and phrases are defined) and (2) to meet the needs of a given user or group of users (e.g., appropriate vocabulary, syntax, and semantics). Experience has shown that neither users nor system designers can predict in advance all the words, phrases, and associated meanings that will arise in accessing a given data base (cf. Tennant 1977). Thus, we have chosen to make TELI "transportable" in an extreme sense, where customizations may be performed (1) by end users, as opposed to computer professionals, and (2) at any time during English processing. This paper outlines a high-level language FUNDPL for expressing functional structures for parsing dependency constraints. The goal of the language is to allow a grammar writer to pin down his or her grammar with minimal commitment to control. The FUNDPL interpreter has been implemented on top of a lower-level language, DPL, which we had implemented earlier. COLING'86, pp. 461-463 Controlled Active Procedures as a Tool for Linguistic Engineering Heinz-Dirk Luckhardt, Manfred Thiel Sonderforschungsbereich 100 "Elektronische Sprachforschung" Universität des Saarlandes D-6600 Saarbrücken 11 Bundesrepublik Deutschland COLING'86, pp. 464-469 Controlled active procedures are productions that are grouped under and activated by units called "scouts". Scouts are controlled by units called "missions", which also select relevant sections from the data structure for rule application. Following the problem reduction method, the parsing problem is subdivided into ever smaller subproblems, each one of which is represented by a mission. The elementary problems are represented by scouts. The CAP grammar formalism is based on experience gained with natural language (NL) analysis and translation by computer in the Sonderforschungsbereich at the University of Saarbrücken over the past twelve years and is dictated by the wish to develop an efficient parser for random NL texts on a sound theoretical basis. The idea has ripened in discussions with colleagues from the EUROTRA project and is based on what Heinz-Dieter Maas has developed in the framework of the SUSY-II system. In the present paper, CAP is introduced as a means of linguistic engineering (cf. Simmons 1985), which covers aspects like rule writing, parsing strategies, syntactic and semantic representation of meaning, representation of lexical knowledge, etc. A New Predictive Analyzer of English Hiroyuki Musha Department of Information Science Tokyo Institute of Technology Ohokayama, Meguro-ku, Tokyo 152, Japan COLING'86, pp. 470-472 Generalized Memory Manipulating Actions for Parsing Natural Language Irina Prodanof Istituto di Linguistica Computazionale CNR-Pisa Giacomo Ferrari Aspects of the syntactic predictions made during the recognition of English sentences are investigated. We reinforce Kuno's original predictive analyzer by introducing five types of predictions. For each type of prediction, we discuss and present its necessity, its description method, and recognition mechanisms.
We make use of three kinds of stacks whose behavior is specified by grammar rules in an extended version of Greibach normal form. We also investigate other factors that affect the predictive recognition process, namely preferences among syntactic ambiguities and the necessary amount of lookahead. These factors, as well as the proposed mechanisms for handling predictions, are tested by analyzing two kinds of articles. In our experiment, more than seventy percent of the sentences are recognized, and looking two words ahead appears to be the critical length for predictive recognition.

Current (computational) linguistic theories have developed specific formalisms for representing linguistic phenomena such as unbounded dependencies, relative clauses, etc. In this contribution we present a model of the storage and accessing of linguistic structures which accounts for the same phenomena in a procedural way. The model has been implemented in the framework of an ATN parser. Department of Linguistics, University of Pisa. COLING'86, pp. 473-475

Distributed Memory: A Basis for Chart Parsing. Jon M. Slack, Human Cognition Research Laboratory, Open University, Milton Keynes, MK7 6AA, England. COLING'86, pp. 476-481

The properties of distributed representations and memory systems are explored as a potential basis for non-deterministic parsing mechanisms. The structure of a distributed chart parsing representation is outlined. Such a representation encodes both immediate-dominance and terminal projection information on a single composite memory vector. A parsing architecture is described which uses a permanent store of context-free rule patterns encoded as split composite vectors, and two interacting working memory units. These latter two units encode vectors which correspond to the active and inactive edges of an active chart parsing scheme. This type of virtual parsing mechanism is compatible both with a macro-level implementation based on standard sequential processing and with a micro-level implementation using a massively parallel architecture. The research discussed here differs from previous work in that it explores the properties of distributed representations as a basis for constructing parallel parsing architectures. Rather than being represented by localized networks of processing units, the grammar rules are encoded as patterns which have their effect through simple, yet well-specified, forms of interaction. The aim of the research is to devise a virtual machine for parsing context-free languages based on the mutual interaction of relatively simple memory components.

The Treatment of Movement Rules in a LFG Parser. Hans-Ulrich Block, Hans Haugeneder, Siemens AG, München, ZT ZTI INF, West Germany. COLING'86, pp. 482-486

A Concept of Derivation for LFG. Jürgen Wedekind, Department of Linguistics, University of Stuttgart, West Germany. COLING'86, pp. 487-489

Incremental Construction of C- and F-Structure in a LFG Parser. Hans-Ulrich Block, Rudolf Hunze, ZTI INF 3, Siemens AG, München, West Germany. COLING'86, pp. 490-493

Getting Things Out of Order. Klaus Netter, Department of Linguistics, University of Stuttgart, West Germany. COLING'86, pp. 494-496

In this paper we propose a way of treating long-distance movement phenomena, as exemplified in (1), in the framework of an LFG-based parser.
(1) Who do you think Peter tried to meet ('You think Peter tried to meet who')

We therefore concentrate first on the theoretical status of so-called wh- or long-distance movement in Lexical Functional Grammar (LFG) and in the Theory of Government and Binding (GB), arguing that a general mechanism can be found that is compatible with both the LFG and the GB treatment of long-distance movement. Finally, we present the implementation of such a movement mechanism in an LFG parser.

In this paper a version of LFG is developed which has only one level of representation and is equivalent to the modified version of Kaplan, presented in Bresnan (1982) and Kaplan and Zaenen (1986). The structures of this monostratal version are f-structures, augmented by additional information about the derived symbols and their linear order. For these structures it is possible to define an adequate concept of direct derivability by which the derivation process becomes more efficient, as the f-description solution algorithm is simulated directly during the derivation of these structures instead of being postponed. Apart from this, it follows from this reducibility that LFG as a theory in its present form does not make use of c-structure information beyond the mere linear order of the derived symbols.

In this paper we present a parser for Lexical Functional Grammar (LFG) which is characterized by incrementally constructing the c- and f-structure of a sentence during the parse. We then discuss the possibilities of the earliest check on consistency, coherence, and completeness. Incremental construction of the f-structure leads to early detection and abortion of incorrect paths and so increases parsing efficiency. Furthermore, those semantic interpretation processes that operate on partial structures can be triggered at an earlier stage. This also leads to a considerable improvement in parsing time. LFG seems to be well suited for such an approach because it provides for locality principles by the definition of coherence and completeness.

One of the most characteristic features of German word order seems to be a contrast between fixed ordering rules for the verbal elements and a much more variable ordering of their corresponding nominal arguments. As a consequence, German word order yields a large number of phenomena that may be classified as "unbounded" or "long-distance dependencies" without necessarily involving wh-constituents or "movement" across sentence boundaries. Whereas in traditional LFG long-distance dependencies are treated by means of constituent control, we will follow a recent proposal by Kaplan and Zaenen (1986) to give up the constraint known as "functional locality" and instead allow regular expressions to appear as functional schemata annotated to c-structure rules. Exploiting the principles of completeness and coherence, we will thus be able to cope even with absolutely free word order without generating empty terminal nodes at all. The empirical assumption underlying the proposed analysis in its most radical form is the hypothesis that (with very few exceptions) the nominal arguments have to appear to the left of the verb by which they are assigned case. We will restrict the discussion to sentences with one finite verb and to subcategorized nominal arguments, largely ignoring ADJuncts.
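Two of the abstracts above (the incremental LFG parser and the free-word-order analysis) lean on LFG's completeness and coherence conditions. The following is a minimal sketch of those two checks on toy f-structures; the dict encoding and the set of governable functions are assumptions made for this illustration, not any of the parsers described:

```python
# Minimal sketch of LFG completeness/coherence checks on toy f-structures.
# Illustration only; PRED holds the list of governable grammatical functions
# that the predicate subcategorizes for.

GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}

def check_wellformedness(fstruct):
    """Return (complete, coherent) for a toy f-structure dict."""
    _, subcat = fstruct["PRED"]
    present = {attr for attr in fstruct if attr in GOVERNABLE}
    complete = set(subcat) <= present   # every subcategorized function is filled
    coherent = present <= set(subcat)   # no governable function lacks a licence
    return complete, coherent

# 'Peter tried to meet her': TRY subcategorizes for SUBJ and XCOMP.
f = {
    "PRED": ("try", ["SUBJ", "XCOMP"]),
    "SUBJ": {"PRED": ("Peter", [])},
    "XCOMP": {"PRED": ("meet", ["SUBJ", "OBJ"]),
              "SUBJ": {"PRED": ("Peter", [])},
              "OBJ":  {"PRED": ("pro", [])}},
    "TENSE": "past",
}
print(check_wellformedness(f))   # (True, True)
```

Running the same two tests on partial f-structures as they are built is what lets an incremental parser abandon incoherent analyses early, which is the efficiency point the abstract makes.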
TOPIC Essentials Udo Hahn, Ulrich Reimer Universit~it Konstanz lnformationswissenschaft Postfach 5560 D-7750 Konstanz, F.R.G. COLING'86, pp. 497-503 Towards Discourse-Oriented N o n m o n o tonic System Barbara Dunin-Keplicz, Witold Lukaszewicz Institute of Informatics Warsaw University P.O. Box 1210 00-901 Warszawa, Poland An overview of TOPIC is provided, a knowledge-based text information system for the analysis of German-language texts. TOPIC supplies text condensates (summaries) on variable degrees of generality and makes available facts acquired from the texts. The presentation focuses on the major methodological principles underlying the design of TOPIC: a frame representation model that incorporates various integrity constraints, text parsing with focus on text cohesion and text coherence properties of expository texts, a lexically distributed semantic text grammar in the format of word experts, a model of partial text parsing, and text graphs as appropriate representation structures for text condensates. The purpose of this paper is to analyse the phenomenon of nonmonotonicity in a natural language and to formulate a number of general principles which should be taken into consideration while constructing a discourse oriented nonmonotonic formalism. COLING'86, pp. 504-506 Japanese Honorifics and Situation Semantics R. Sugimura Institute for New Generation Computer Technology (ICOT) Japan COLING'86, pp. 507-510 Two Approaches to Commonsense Inferencing for Discourse Analysis Marc Dymetman Universit6 Seientifique et M6dieale de Grenoble Groupe d'Etudes pour la Traduction Automatique B.P. 68 38042 Saint Martin d'H6res, France COLING'86, pp. 511-514 Speech Acts of Assertion in Cooperative Informational Dialogue I.S. Kononenko AI Laboratory, Computer Center Siberian Division of the USSR Ac. Sci Novosibirsk 630090, USSR COLING'86, pp. 515-519 Pragmatic Considerations in Man-Machine Discourse 122 A model of Japanese honorific expressions in situation semantics is proposed. Situation semantics provides considerable power for analyzing the complicated structure of Japanese honorific expressions. The main 'feature of this model is a set of basic rules for context switching in honorific sentences. Mizutani's theory of Japanese honorifics is presented and incorporated in the model which has been used to develop an experimental system capable of analyzing honorific context. Some features of this system are described. The dominant philosophy regarding the formalization of Commonsense Inferencing in the physical domain consists in the exploitation of the "tarskian" scheme axiomatization < - > interpretation borrowed from mathematical logic. The commonsense postulates constitute the axiomatization, and the real world provides the " m o d e l " for this axiomatization. The observation of the effective activity of linguistic communication and of the commonsense inferencing processes which are involved in it show the unacceptability of this scheme. An alternative is proposed, where the notion of "conceptual category" plays a principal role, and where the principle of logical adequation of an axiomatization to a model is replaced by a notion of "projection" of a conceptual structure onto the observed reality. Dialogue systems should provide a cooperative informational dialogue aimed at knowledge sharing. In the paper speech acts of assertion (SAA) are assumed to be the means of achieving this goal. 
A typology of SAAs is proposed which reflects certain cognitive aspects of communicative situation at different stages of mutual informing process. Information constituents of the type assertions are formally described to represent a current cognitive state of the speaker's knowledge base, each proposition in it being characterized by a subjective verisimilitude evaluation. The general ,scheme of information flow in the cooperative dialogue is considered. With regard to this scheme the dialogue functions of SAAs are discussed. This paper presents nothing that has not been noted previously by research in Artificial Intelligence but seeks to gather together various ideas that Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Walther v. Hahn Research Unit for Information Science and Artificial Intelligence University of Hamburg D-2000 Hamburg 13, West Germany Abstracts of Current Literature have arisen in the literature. It collects those arguments which are in my view crucial for further progress and is intended only as a reminder of insights which might have been forgotten for some time. COLING'86, pp. 520-527 Formal Specification of Natural Language Syntax using Two-Level Grammar Barret R. Bryant, Dale Johnson, Galanjaninath Edupuganty Department of Computer and Information Science The University of Alabama Birmingham, Alabama 35294 COLING '86, pp. 527-533 On Formalizations of Marcus's Parser R. Nozohoor-Farshi The two-level grammar is investigated as a notation for giving formal specification of the context-free and context-sensitive aspects of natural language syntax. In this paper, a large class of English declarative sentences, including post-noun-modification by relative clauses, is formalized using a two-level grammar. The principal advantages of two-level grammar are: 1) it is very easy to understand and may be used to give a formal description using a structured form of natural language; 2) it is formal with many well-known mathematical properties; and 3) it is directly implementable by interpretation. The significance of the latter fact is that once we have written a two-level grammar for natural language syntax, we can derive a parser automatically without writing any additional specialized computer programs. Because of the ease with which two-level grammars may express logic and their Turing computability, we expect that they will also be very suitable for future extensions to semantics and knowledge representation. LR(k,t), BCP(m,n), and LRRL(k) grammars, and their relations to Marcus parsing are discussed. Department of Computing Science University of Alberta Edmonton, Canada T6G 2H1 COLING'86, pp. 533-535 A Grammar Used for Parsing and Generation Jean-Marie Lancel, Nathalie Simonin CAP Sogeti Innovation 129, rue de I'Universit6 75007 Paris, France Fratwpis Roasselot University of Strasbourg II 22, rue Descartes 67084, Strasbourg, France This text presents the outline of a system using the same grammar for parsing and generating sentences in a given language. This system has been devised for a "multilingual document generation" project. The Functional G r a m m a r notation described here allows a full symmetry between parsing and generating. Such a grammar may be read easily from the point of view of the parsing and from the point of view of the generation. This allows one to write only one grammar of a language, which minimizes the linguistics costs in a multilingual scheme. COLING'86, pp. 
536-539 BUILDRS: An Implementation of DR Theory and LFG Hajime Wada Department of Linguistics Nicholas Asher Department of Philosophy, Center for Cognitive Science The University of Texas at Austin CGLING'86, pp. 540-545 A Prolog Implementation of GovernmentBinding Theory Robert J. Kuhns Artificial Intelligence Center This paper examines a particular Prolog implementation of Discourse Representation theory (DR theory) constructed at the University of Texas. The implementation also contains a Lexical Functional G r a m m a r parser that provides f-structures: these f-structures are then translated into the semantic representations posited by DR theory, structures which are known as Discourse Representation Structures (DRSs). Our program handles some linguistically interesting phenomena in English such as (i) scope ambiguities of singular quantifiers, (it) functional control phenomena, and (iii) long distance dependencies. Finally, we have implemented an algorithm for anaphora resolution. Our goal is to use purely linguistically available information in constructing a semantic representation of discourse as far as is feasible and to forego appeals to world knowledge. A parser founded on Chomsky's Government-Binding Theory and implemented in Prolog is described. By focussing on systems of constraints as proposed by this theory, the system is capable of parsing without an elaborate rule set and subcategorization features on lexical items. In addition to Computational Linguistics, Volume 13, Numbers 1-2, January. June 1987 123 The FINITE STRING Newsletter Arthur D. Little, Inc. Cambridge, MA 02140 Abstracts of Current Literature the parse, theta, binding, and control relations are determined simultaneously. COLING'86, pp. 546..550 A Lexical Functional Grammar System in Prolog Andreas Eisele, Jochen DOrre Department of Linguistics University of Stuttgart West Germany COLING'86, pp. 551-553 Knowledge Structures for Natural Language Generation Paul S. Jacobs Knowledge-Based Systems Branch General Electric Corporate Research and Development Schenectady, NY 12301 COLING'86, pp. 554-559 Semantic-Based Generation of Japanese German Translation System K. Hanakata Institut f. Informatik University of Stuttgart Herdweg 51 D-7000 Stuttgart 1, F.R. Germany This paper describes a system in Prolog for the automatic transformation of a grammar, written in LFG formalism, into a DCG-based parser. It demonstrates the main principles of the transformation, the representation of f-structures and constraints, the treatment of long-distance dependencies, and left recursion. Finally some problem areas of the system and possibilities for overcoming them are discussed. The development of natural language interfaces to Artificial Intelligence systems is dependent on the representation of knowledge. A major impediment to building such systems has been the difficulty in adding sufficient linguistic and conceptual knowledge to extend and adapt their capabilities. This difficulty has been apparent in systems which perform the task of language production, i.e., the generation of natural language output to satisfy the communicative requirements of a system. The Ace framework applies knowledge representation fundamentals to the task of encoding knowledge about language. Within this framework, linguistic and conceptual knowledge are organized into hierarchies, and structured associations are used to join knowledge structures that are metaphorically or referentially related. 
These structured associations permit specialized linguistic knowledge to derive partially from more abstract knowledge, facilitating the use of abstractions in generating specialized phrases. This organization, used by a generator called KING (Knowledge INtensive Generator), promotes the extensibility and adaptability of the generation system.

Project SEMSYN*** has reached a state where a prototype system generates German texts on the basis of the semantic representation produced from Japanese texts by ATLAS/II of Fujitsu Laboratory. This paper describes some problems that are specific to our semantic-based approach and some results of the evaluation study carried out by the Germanist group. A. Lesniewski, Standard Elektrik Lorenz AG, Ostendstrasse 3, D-7530 Pforzheim, F.R. Germany; S. Yokoyama, Electrotechnical Laboratory, Umezono, Sakuramura, Nihari, Ibaraki 305, Japan. COLING'86, pp. 560-562

Synthesizing Weather Forecasts from Formatted Data. R. Kittredge, A. Polguère, Département de Linguistique, Université de Montréal; E. Goldberg, Atmosphere Environment Services, Environment Canada, Toronto.

This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears feasible in certain natural sublanguages with stereotyped text structure. RAREAS draws on several kinds of linguistic and non-linguistic knowledge and mirrors a forecaster's apparent tendency to ascribe less precise temporal adverbs to more remote meteorological events. The approach can easily be adapted to synthesize bilingual or multilingual texts. COLING'86, pp. 563-565

From Structure to Process. Michael Zock, Gérard Sabah, LIMSI - Langues Naturelles, B.P. 30, Orsay Cédex, France; Christophe Alviset, INSSEE, 3, av. P. Larousse, 94241 Malakoff, France. COLING'86, pp. 566-569

Generating a Coherent Text Describing a Traffic Scene. Hans-Joachim Novak, Fachbereich Informatik, Universität Hamburg, D-2000 Hamburg 13, West Germany. COLING'86, pp. 570-575

Generating Natural Language Text in a Dialog System. Mare Koit, Department of Programming, 2 Juhan Liivi Street; Madis Saluveer, Artificial Intelligence Laboratory, 78 Tiigi Street; Tartu State University, 202400 Tartu, Estonia, USSR.

This paper describes an implemented tutoring system designed to help students generate clitic constructions in French. While showing various ways of converting a given meaning structure into its corresponding surface expression, the system helps not only to discover what data to process but also how this information processing should take place. In other words, we are concerned with efficiency in verbal planning (performance). Recognizing that the same result can be obtained by various methods, the student should find out which one is best suited to the circumstances (what is known, task demands, etc.). Informational states, and hence the processor's needs, may vary to a great extent, as may his strategies or cognitive styles. Consequently, in order to become an efficient processor, the student has to acquire not only STRUCTURAL or RULE KNOWLEDGE but also PROCEDURAL KNOWLEDGE (skill). With this in mind we have designed three modules in order to foster a reflective, experimental attitude in the learner, helping him to discover insightfully the most efficient strategy.
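The tutoring system just described turns one meaning structure into alternative surface realizations. As a rough illustration of that idea (the encoding below is invented for this sketch and is not the Zock, Sabah, and Alviset system), a small predicate-argument structure can be mapped onto either a full-NP sentence or its cliticized counterpart:

```python
# Toy realization of one meaning structure in two ways: with full noun
# phrases, or with French object clitics in their required order.
# Expository sketch only; not the tutoring system described above.

def realize(meaning, use_clitics=False):
    """meaning: dict with 'pred', 'agent', 'theme', 'beneficiary' strings."""
    subject, verb = meaning["agent"], meaning["pred"]
    if not use_clitics:
        return f"{subject} {verb} {meaning['theme']} {meaning['beneficiary']}."
    # With third-person clitics, the direct object ('le') precedes the
    # indirect object ('lui'), and both precede the finite verb.
    return f"{subject} le lui {verb}."

m = {"pred": "donne", "agent": "Jean",
     "theme": "le livre", "beneficiary": "à Marie"}
print(realize(m))                    # Jean donne le livre à Marie.
print(realize(m, use_clitics=True))  # Jean le lui donne.
```

A tutoring system can then ask the learner which of the two realizations suits a given situation, which is the kind of procedural choice the abstract emphasizes.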
If a system that embodies a reference semantic for motion verbs and prepositions is to generate a coherent text describing the recognized motions, it needs a decision procedure to select the events. In NAOS event selection is done by use of a specialization hierarchy of motion verbs. The strategy of anticipated visualization is used for the selection of optional deep cases. The system exhibits low-level strategies which are based on verb inherent properties that allow the generation of a coherent descriptive text. The paper deals with generation of natural language text in a dialog system. The approach is based on principles underlying the dialog system TARLUS under development at Tartu State University. The main problems concerned are the architecture of a dialog system and its knowledge base. Much attention is devoted to problems which arise in answering the user queries - the problems of planning an answer, the non-linguistic and linguistic phases of generating an answer. COLING '86, pp. 5 76-580 Generating English Paraphrases from Formal Relational Calculus Expressions A.N. De Roeck, B.G.T. Lowden University of Essex Wivenhoe Park Colchester, United Kingdom COLING'86, pp. 581-583 The Computational Complexity of Sentence Derivation in Functional Unification Grammar Graeme Ritchie Department of Artificial Intelligence University of Edingburgh Edinburgh 3H1 1HN This paper discusses a system for producing English descriptions (ox "paraphrases") of the content of formal relational calculus formulae expressing a database query. It explains the underlying design motivations and describes a conceptual model and focus selection mechanism necessary for delivering coherent paraphrases. The general paraphrasing strategy is discussed, as are the notions of "desirable" paraphrase and "paraphrasable query". Two examples are included. The system was developed and implemented in Prolog at the University of Essex under a grant from ICL. Functional unification (FU) grammar is a general linguistic formalism based on the merging of feature-sets. An informal outline is given of how the definition of derivation within FU grammar can be used to represent the satisfiability of an arbitrary logical formula in conjunctive normal form. This suggests that the generation of a structure from an arbitrary FU grammar is NP-hard, which is an undesirably high level of computational complexity. COLING'86, pp. 5 8 4 - 5 8 6 Parsing Spoken Language: a Semantic Caseframe Approach Philip J. Hayes, Alexander G. Hauptmann, Jaime G. Carbonell, Masaru Tomita Computer Science Department Parsing spoken input introduces serious problems not present in parsing typed natural language. In particular, indeterminacies and inaccuracies of acoustic recognition must be handled in an integral manner. Many techniques for parsing typed natural language do not adapt well to these extra demands. This paper describes an extension of semantic caseframe parsing Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 125 The FINITE STRING Newsletter Carnegie-Mellon University Pittsburgh, PA 15213 COLING'86, pp. 58 7-592 Divided and Valency-Oriented Parsing in Speech Understanding Gerh. Th. Niedermair Zt ZTI INF, Siemens AG Otto-Hahn-Ring 6 8000 Mtmchen 83 COLING'86, pp. 593-595 The Role of Semantic Processing in an Automatic Speech Understanding System Astrid Brietzraann, Ute Ehrlich Lehrstuhl fuer Informatik 5 (Mustererkennung) Universitaet Erlangen-Nuernberg Martesstr. 3, 8520 Erlangen, F.R. Germany COLING'86, pp. 
596-598 Synthesis of Spoken Message from Semantic Representations Laurence Danlos, Eric LaPorte Laboratoire d'Automatique Documentaire et Linguistique Universit6 de Paris 7 2, place Jussieu 75251 Paris Cedex 05 Abstracts of CurrentLiterature to restricted-domain spoken input. The semantic caseframe grammar representation is the same as that used for earlier work on robust parsing of typed input. Due to the uncertainty inherent in speech recognition, the caseframe grammar is applied in a quite different way, emphasizing island growing from caseframe headers. This radical change in application is possible due to the high degree of abstraction in the caseframe representation. The approach presented was tested successfully in a preliminary implementation. A parsing scheme for spoken utterances is proposed that deviates from traditional " o n e go" left to right sentence parsing in that it divides the parsing process first into two separate parallel processes. Verbal constituents and nominal phrases (including prepositional phrases) are treated separately and only brought together in an utterance parser. This allows especially the utterance parser to draw on valency information right from beginning when amalgamating the nominal constituents to the verbal core by means of binary sentence rules. The paper also discusses problem of representing the valency information in case-frames arising in a spoken language environment. We present the semantics component of a speech understanding and dialogue system that was developed at our institute. Due to pronunciation variabilities and vagueness of the word recognition process, semantics in a speech understanding system has to resolve additional problems. Its main task is not only to build up a representation structure for the meaning of an utterance, as in a system for written input, semantic knowledge is also employed to decide between alternative word hypotheses, to judge the plausibility of syntactic structures, and to guide the word recognition process by expectations resulting from partial analyses. A semantic-representation-to-speech system communicates orally the information given in a semantic representation. Such a system must integrate a text generation module, a phonetic conversion module, a prosodic module, and a speech synthesizer. We will see how the syntactic information elaborated by the text generation module is used for both phonetic conversion and prosody, so as to produce the data that must be supplied to the speech synthesizer, namely a phonetic chain including prosodic information. Franfoise Emerard Centre National d'Etudes des T616communications 22301 Lannion Cedex COLING'86, pp. 599-604 The Procedure to Construct a Word Predictor in a Speech Understanding System from a Task-Specific Grammar Defined in a CFG or a DCG Yasuhisa Niirai, Shigeru Uzuhara, Yutaka Kobayashi This paper describes a method for converting a task-dependent grammar into a word predictor of a speech understanding system. Since the word prediction is a top-down operation, left recursive rules induce an infinite looping. We have solved this problem by applying an algorithm for bottom-up parsing. Department of Computer Science Kyoto Institute of Technology Matsugasaki, Sakyo-ku, Kyoto 606, Japan COLING'86, pp. 605-607 The Role of Phonology in Speech Processing Richard Wiese Seminar ftir Allgemeine Sprachwissenschaft 126 In this paper, I discuss the role of phonology in the modelling of speech processing. 
It will be argued that recent models of nonlinear representation in phonology should be put to use in speech processing systems (SPS). Models of phonology aim at the reconstruction of the phonological Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Universit~it DUsseldorf D-4000 Dtisseldorf, FRG COLING '86, pp. 608-611 Computational Phonology: Merged, not Mixed Egon Berendsen Department of Phonetics University of Utrecht The Netherlands Simone Langeweg Phonetics Laboratory University of Leyden The Netherlands Abstracts of CurrentLiterature knowledge that speakers possess and utilize in speech processing. The most important function of phonology in SPS is, therefore, to put constraints on what can be expected in the speech stream. A second, more specific function relates to the particular emphasis of the phonological models mentioned above and outlined in section 4: It has been realized that many SPSs do not make sufficient use of the suprasegmental aspects of the speech signal. But it is precisely in the domain of prosody where nonlinear phonology has made important progress in our insight into the phonological component of language. From the phonetic point of view, phonological knowledge is higher level knowledge just as syntactic or semantic information. But since phonological knowledge is in an obvious way closer to the phonetic domain than syntax or semantics, it is even more surprising that phonological knowledge has been rarely applied systematically in SPS. Research into text-to-speech systems has become a rather important topic in the areas of linguistics and phonetics. Particularly for English, several text-to-speech systems have been established (cf., for example, Hertz 1982, Klatt 1976). For Dutch, text-to-speech systems are being developed at the University of Nijmegen (cf. Wester 1984) and at the Universities of Utrecht and Leyden and at the Institute of Perception Research Eindhoven as well. In this paper we will be concerned with the grapheme-to-phoneme conversion component as part of the Dutch text-to-speech system which is being developed in Utrecht, Leyden, and Eindhoven. Hugo van Leeuwen Institute of Perception Research Eindhoven, The Netherlands COLING'86, pp. 612-614 Phonological Pivot Parsing Grzegorz Dogil Universit~it Bielefeld Fakult~tt fur Linguistik und Literaturwissenschaft D-4800 Bielefeld, West Germany COLING'86, pp. 615-617 A Description of the VESPRA Speech Processing System Roll Haberbeck FU Berlin, FB Germanistik D-1000 Berlin 33 TU Berlin, FB Informatik There are two basic mysteries about natural language: the speed and ease with which it is acquired by a child, and the speed and ease with which it is processed. Similarly to language acquisition, language processing faces a strong input-data-deficiency problem. When we speak we alter a great deal in the idealized phonological and phonetic representations. We delete whole phonemes, we radically change allophones, we shift stresses, we break up intonational patterns, we insert pauses at the most unexpected places, etc. If to this crippled "phonological string" we add all the noise from the surroundings which does not help comprehension either, it is bewildering that the parser is supposed to recognize anything at all. However, even in the most difficult circumstances (foreign accent, loud environment, being drunk, etc.), we do comprehend speech quickly and efficiently. 
There must be then some signals in the phonetic string which are particularly easy to grasp and to process. I call these signals pivots and call the parsers working with these signals pivot parsers. I present here an idea of what a fast parser which requires the minimum of phonologically invariant informat;.on might look like. This parser works in a sequentially-looping manner and the decisions it makes are non-deterministic. It is universally applicable, it is faster, and it seems to be no less efficient than other phonological parsers that have been proposed. The VESPRA system is designed for the processing of chains of (not connected utterances of) wordforms. These strings of wordforms correspond to sentences except that they are not realized in connected speech. VESPRA means: Verarbeitung und Erkennung gesprochener Sprache (processing and recognition of speech). VESPRA will be used to control different types of machines by voice input (for instance: non critical Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 127 The FINITESTRING Newsletter D-1000 Berlin 10 COLING'86, pp. 618-620 Translation by Understanding: A Machine Translation System LUTE Hirosato Nomura, Shozo Naito, Yasuhiro Katagiri, Akira Shimazu NTT Basic Research Laboratories Musashino-shi Tokyo, 180, Japan COLING'86, pp. 621-626 On Knowledge-Based Machine Translation Sergei Nirenburg Colgate University Victor Raskin Purdue University Allen Tucker Colgate University COLING'86, pp. 627-632 Another Stride towards Knowledge-Based Machine Translation Masaru Tomita, Jaime Carbonell Computer Science Department Carnegie-Mellon University Pittsburgh, PA 15213 COLING'86, pp. 633-638 English - Malay Translation System: a Laboratory Prototype Loon-Cheong Tong Computer Aided Translation Project School of Mathematical and Computer 128 Abstracts of Current Literature control functions in cars and in trucks, voice box in digital telephone systems, text processing systems, different types of office workstations). This paper presents a linguistic model for language understanding and describes its application to an experimental machine translation system called LUTE. The language understanding model is an interactive model between the memory structure and a text. The memory structure is hierarchical and represented in a frame-network. Linguistic and non-!inguistic knowledge is stored and the result of understanding the text is assimilated into the memory structure. The understanding process is interactive in that the text invokes knowledge and the understanding procedure interprets the text by using that knowledge. A linguistic model, called the Extended Case Structure model, is defined by adopting three kinds of information: structure, relation, and concept. These three are used recursively and iteratively as the basis for memory organization. These principles are applied to the design and implementation of the LUTE which translates Japanese i n t o English and vice versa. This paper describes the design of the knowledge representation medium used for representing concepts and assertions, respectively, in a subworld chosen for a knowledge-based machine translation system. This design is used in the TRANSPORTATION machine translation project. The knowledge representation language, or interlingua, has two components, DIL and TIL. DIL stands for 'dictionary of interlingua' and describes the semantics of a subworld. 
TIL stands for 'text of interlingua' and is responsible for producing an interlingua text, which represents the meaning of an input text in the terms of the interlingua. We maintain that involved analysis of various types of linguistic and encyclopedic meaning is necessary for the task of automatic translation. The mechanisms for extracting and manipulating and reproducing the meaning of texts will be reported in detail elsewhere. The linguistic (including the syntactic) knowledge about source and target languages is used by the mechanisms that translate texts into and from the interlingua. Since interlingua is an artificial language, we can (and do, through TIL) control the syntax and semantics of the allowed interlingua elements. The interlingua suggested for TRANSPORTATION has a broader coverage than other knowledge representation schemata for natural language. It involves the knowledge about discourse, speech acts, focus, time, space, and other facets of the overall meaning of texts. Building on the well-established premise that reliable machine translation requires a significant degree of text comprehension, this paper presents a recent advance in multi-lingual knowledge-based machine translation (KBMT). Unlike previous approaches, the current method provides for separate syntactic and semantic knowledge sources that are integrated dynamically for parsing and generation. Such a separation enables the system to have syntactic grammars, language specific but domain general, and semantic knowledge bases, domain specific but language general. Subsequently, grammars and domain knowledge are precompiled automatically in any desired combination to produce very efficient and very thorough real-time parsers. A pilot implementation of our KBMT architecture using functional grammars and entity-oriented semantics demonstrates the feasibility of the new approach. This paper presents the results obtained by an English to Malay computer translation system at the level of a laboratory prototype. The translation output obtained for a selected text (secondary school chemistry textbook) is evaluated using a grading scheme based on ease of post-editing. The effect of a change in area and typology of text is investigated by comparing Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Science Universiti Sains Malaysia 11800 Penang, Malaysia COLING'86, pp. 639-642 A Prototype Machine Translation based on Extracts from Data Processing E. Luctkens, Ph. Fermont Department of Information Science and Documentation Free University of Brussels Belgium COLING '86, pp. 643-645 A Prototype English-Japanese Machine Translation System for Translating IBM Manuals Taijiro Tsutsumi Natural Language Processing Science Institute, IBM Japan, Ltd. 5-19, Sanban-cho, Chiyoda-ku Tokyo 102, Japan Abstracts of Current Literature with the translation output obtained for a university level computer science text. An analysis of the problems which give rise to incorrect translations is discussed. This paper also provides statistical information on the English to Malay translation system and concludes with an outline of further work being carried out on this system with the aim of attaining an industrial prototype. The following article presents a prototype for the machine translation of English into French . . . . The prototype aims to provide a diagnostic study that lays the foundations for further development rather than immediately producing an accurate but limited realization. 
By way of experiment, the corpus for translation was based on selected extracts from computer systems manuals. After studying the basic material, as well as assessing the various decision criteria, it was decided to construct a prototype made up of three components: analysis, transfer, and generation. Although the prototype was designed with multilingual applications in mind, it appeared preferable at this stage not to set up a system with interlingua since the elaboration of the interlingua alone would have taken up a disproportionate amount of time, thus handicapping the development of the prototype itself. This paper describes a prototype English-Japanese machine-translation (MT) system developed at the Science Institute of IBM Japan, Ltd. This MT system currently aims at the translation of IBM computer manuals. It is based on a transfer approach in which the transfer phase is divided into two sub-phrases. English transformation and English-Japanese conversion. An outline of the system and a detailed description of the EnglishJapanese transfer method are presented. COLING'86, pp. 646-648 Construction of a Modular and Portable Translation System Fujio Nishida, Yoneharu Fujita, Shinohu Takamatsu Department of Electrical Engineering Faculty of Engineering University of Osaka Prefecture Sakai, Osaka, Japan 591 COLING'86, pp. 649-651 When Mariko Talks to Siegfried Dietmar R6sner Projekt SEMSYN, Institut fur Informatik Universit~it Stuttgart, Herdweg 51 D-7000 Stuttgart I West Germany This paper has two purposes. One of them is to show a method of constructing an MT system on a library module basis with the aid of a programming construction system called L-MAPS. The MT system can be written in any programming language designated by a user if an appropriate data base and the appropriate functions are implemented in advance. For example, it can be written in a compiler language like C language, which is preferable for a workstation with a relatively slow running machine speed. The other purpose is to give a brief introduction of a program generating system called Library-Module Aided Program Synthesizing system (abbreviated to L-MAPS) running on a library module basis. L-MAPS permits us to write program specifications in a restricted natural language like Japanese and converts them to formal specifications. It refines the formal specifications using the library modules and generates a readable comment of the refined specification written in the above natural language every refinement in option. The conversion between formal expressions and natural language expressions is performed efficiently on a case grammar basis. In this paper we will report on our experiences from a two and a half year project that designed and implemented a prototypical Japanese to G e r m a n translation system for titles of Japanese papers. COLING'86, pp. 652- 654 Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 129 The FINITE STRING Newsletter Future Directions of Machine Translation Jun-ichi Tsujii Department of Electrical Engineering Kyoto University Sakyo-ku, Kyoto 606, Japan COLING'86, pp. 
655-668 Discourse, Anaphora, and Parsing Mark Johnson Center for the Study of Language and Information and Department of Linguistics Stanford University Ewan Klein Centre for Cognitive Science Edinburgh University Abstracts of Current Literature In this paper, we will discuss several problems concerned with "understanding and translation", especially how we can integrate the two lines of research, with their different histories and different techniques, into unified frameworks, and the difficulties we might encounter in attempting such an integration. The discussion will reveal some of the reasons why MT researchers are so separated from research in the other application fields of NLP. We will also list some of the key problems, both linguistic and computational, which we encountered during the development of our MT systems, and whose resolutions we consider to be of essential importance for future MT research and development. Discourse Representation Theory, as formulated by Hans Kamp and others, provides a model of inter- and intra-sentential anaphoric dependencies in natural language. In this paper, we present a reformulation of the model which, unlike Kamp's is specified declaratively. Moreover, it uses the same rule formalism for building both syntactic and semantic structures. The model has been implemented in an extension of Prolog, and runs on a VAX 1 1 / 7 5 0 computer. COLING'86, pp. 669-675 Selected Dissertation Abstracts Compiled by: Susanne M. Humphrey, National Library of Medicine, Bethesda, MD 20209 Bob Krovetz, University of Massachusetts, Amherst, MA 01002 The following are citations selected by title and abstract as being related to computational linguistics or knowledge representation, resulting from a computer search, using the BRS Information Technologies retrieval service, of the Dissertation Abstracts International (DAI) data base produced by University Microfilms International. Included are the title; author; university, degree, and, if available, number of pages; DAI subject category chosen by the author of the dissertation; an abstract; and the UM order number and year-month of entry into the data base. References are sorted first by DAI subject category and second by author. Citations denoted by an MAI reference do not yet have abstracts in the data base and refer to abstracts in the published Masters Abstracts International. Unless otherwise specified, paper or microform copies of dissertations may be ordered from University Microfilms International Dissertation Copies Post Office Box 1764 Ann Arbor, MI 48106 telephone for U.S. (except Michigan, Hawaii, Alaska): 1-800-521-3042 for Canada: 1-800-268-6090. Price lists and other ordering and shipping information are in the introduction to the published DAI. An alternate source for copies is sometimes provided at the end of the abstract. The dissertation titles and abstracts contained here are published with permission of University Microfilms International, publishers of Dissertation Abstracts International (copyright 1986 by University Microfilms International), and may not be reproduced without their prior permission. The Effect of Knowledge Representation and Psychological Type on Human Understanding in the Human-Computer Interface People need to understand the logic of an information system because they must tell the system what to do, how to do it, and determine what the system did. 
Because of the limits of human memory, the logic of the system must be represented in a form that both people and computers can use. Wallace Irving Castle, The University of Texas at Arlington, Ph.D. 1985, 192 pages. Business Administration, General. University Microfilms International ADG86-07478.

One hundred thirty graduate and undergraduate business students participated in an experiment to evaluate the effect of representation type and psychological type on human understanding. A story represented in English was compared to a story represented in predicate logic. Psychological type was measured with the Myers-Briggs Type Indicator. Human understanding was measured with an inference recognition test. The hypothesis was that sensing psychological types perform better with predicate logic than with English, while intuitive psychological types perform better with English than with predicate logic. The results of the experiment supported the hypothesis that representation type and psychological type interact to affect human understanding; however, both sensing and intuitive psychological types performed better with predicate logic. The interaction occurred because the sensing psychological types performed as well with English as with predicate logic, while the intuitive psychological types performed well with predicate logic but poorly with English. English is not the best representation type for helping the intuitive psychological type to understand the logic of an information system. The results of this experiment show that it is not correct to assume that English is always superior to any other representation regardless of the people using the system. In designing the human-computer interface, the alternative representations should be evaluated using an experimental design to determine their effect on different psychological types of users. The results of such an experiment may show that the representation that is easy for the computer is also the best for the people using the system.

Towards a Representation of Lisp Semantics. Nizar Mohammed Awartani, Lehigh University, Ph.D. 1986, 89 pages. Computer Science. University Microfilms International ADG86-16151.

Debugging Programs in a Distributed System Environment. Peter Charles Bates, University of Massachusetts, Ph.D. 1986, 239 pages. Computer Science. University Microfilms International ADG86-12013.

In this dissertation we chose to examine the semantics of a subset of RLISP and to give its formal specification. In this subset we do not include loops or recursive procedures. We defined the language elements that are fundamental to the statements of the language, such as symbolic expressions, variables, quoted symbolic expressions, and a few others. By this we gain precision and completeness in the RLISP specification at the fundamental level. We described the RLISP syntactic constructs in a consistent way on a single logical level. The dissertation presents a system of formal rules that permits the establishment of rigorous proofs using only the uninterpreted program text. The method we used depends on repeated substitutions for occurrences of expressions in a given RLISP program. To explore the subtleties of RLISP we included an informal description of the rules and provided several examples illustrating them.
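The Awartani dissertation just abstracted builds its proofs from repeated substitutions over uninterpreted program text. A very small sketch of that basic operation follows; the nested-list expression syntax is assumed here for illustration and is not RLISP itself:

```python
# Sketch of substitution over nested symbolic expressions, the elementary
# step in the substitution-based proofs described above.  The expression
# encoding is an assumption for this illustration, not RLISP syntax.

def substitute(expr, var, value):
    """Replace every occurrence of the variable `var` in `expr` by `value`."""
    if expr == var:
        return value
    if isinstance(expr, list):
        return [substitute(sub, var, value) for sub in expr]
    return expr                      # other atoms are left unchanged

# Reasoning about  x := a + b;  y := x * x  by textual substitution alone:
body = ["times", "x", "x"]
print(substitute(body, "x", ["plus", "a", "b"]))
# ['times', ['plus', 'a', 'b'], ['plus', 'a', 'b']]
```

Iterating such substitutions over a whole program is what lets the dissertation's rules establish properties of the program text without interpreting it.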
Debugging is an activity that attempts to locate the sources of errors in the specification and coding of a software system and to suggest possible repairs that might be made to correct the errors. Debugging complex distributed programs is a frustrating and difficult task. This is due primarily to the predominance of a low-level, computation-unit view of systems. This extant perspective is necessarily detail intensive and offers little aid in dealing with the higher level operational characteristics of a system or the complexities inherent in distributed systems. In this dissertation we develop a high-level debugging approach in which debugging is viewed as a process of creating models of actual behavior from the activity of the system and comparing these to models of expected system behavior. The differences between the actual and expected models can be used to characterize errorful behavior. The basis for the approach is viewing the activity of a system as consisting of a stream of significant, distinguishable events that may be abstracted into high-level models of system behavior. An example is presented to demonstrate the use of event based model building to investigate an error in a distributed program. Behavior abstraction and system understanding are characterized as problems in pattern recognition that must operate in a noisy, uncertain environment. Pattern recognition in support of behavioral abstraction is thus shown to be more than a simple parsing exercise. A formal model is developed for event based behavioral abstraction which provides a basis for rigorous discussions of debugging as behavior modelling and forms a Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 131 The FINITE STRING Newsletter Abstracts of Current Literature guide for implementing tools to support debugging in terms of events and higher level abstractions of system behavior. A prototype distributed behavior recognition system which has been constructed to demonstrate and evaluate the feasibility of the EBBA approach is described. The prototype toolset identifies a range of debugging tools useful for distributed systems. Remote debugging, filtered remote debugging with preset actions, simple cooperative debugging, and distributed debugging progressively increase the power of debugging agents at individual nodes by reducing communication requirements, increasing overall transparency of the debugging tools, and distributing debugging tool functionality throughout the system. Semantic Query Optimization in Deductive Data Bases. (Volumes I and II) Upendranath Sharma Chakravarthy University of Maryland Ph.D. 1985, 304 pages Computer Science University Microfilms International A DG86-08788 This thesis addresses the problem of efficient query evaluation over a deductive data base and proposes several methods to optimize the evaluation of a query. The problems addressed in this thesis and the solutions proposed, under the central theme of query optimization, can be discussed under (i) Techniques for interfacing PROLOG with relational data bases, (ii) A formalism for semantic query optimization using integrity constraints, and (iii) Multiple query evaluation in deductive data bases. We propose several ways in which a PROLOG interpreter can be modified so that it can be interfaced effectively with a database system. 
Three solutions, namely, a simple modification to the PROLOG query evaluation strategy to accomplish the complied approach, a meta-level interpreter without any modifications to PROLOG and a set evaluation strategy using tables, are proposed in this thesis. A general framework in which domain specific knowledge - in the form of integrity constraints - is used to transform a query, is proposed in this thesis and is termed semantic query optimization. The process of semantic query optimization is carried out in two phases. Initially, the axioms of a data base are semantically compiled, wherein integrity constraints are integrated into the axioms in a suitable manner. Semantic compilation is performed only once prior to the submission of any query. Subsequently, the compiled axioms are utilized for query transformation at the time of query evaluation. The transformed query has restrictions imposed on it by the integrity constraints and hence it may be evaluated more efficiently over the data base than the original query. Multiple queries arise in several contexts. In the case of deductive data bases, a single query on an intensional predicate may result in several disjunctive queries which may have overlapping computations. We extend the connection graph decomposition algorithm to generate a single plan for a set of disjunctive queries. A multi-query graph is used as a non-procedural representation for a set of queries. The algorithm proposed in this thesis minimizes the number of accesses to the secondary storage where the relations are physically stored as well as the total number of joins. Visual Programming with Icons This dissertation describes the design approach of an iconic system on a modern lisp machine. The proposed system has applications in image information system, visual programming, computer aided design (CAD), and multimedia communications. Potential applications include computer vision systems, visual languages, and iconic expert systems. A generalized definition of an icon is proposed: a visual representation of an object (physical or abstract) which has relational dependencies with other icons. An experimental iconic system has been designed around an Icon Manager and an Icon Editor. The Icon Manager includes facilities for icon creation, icon interpretation, icon exploration, icon saving (on files), and an interactively programmable menu system; it provides basic support to create icons and relate them to other icons. The Icon Editor supports Olivier Bernard Clarisse Illinois Institute of Technology Ph.D. 1985, 256 pages Computer Science University Microfilms International ADG86-06485 132 Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 T h e F I N I T E S T R I N G Newsletter Abstracts of Current Literature several methods to interactively edit an icon from sketch representations or icon structure representations. The overall system allows the creation and organization of flat pictorial objects of any shape in a two and a half dimensional space. The image processing tools required by the iconic system include a general technique for halftone transformation of images, a general region growing technique, and methods for progressive transmission of images. Applications of this iconic programming environment to visual programming, to program design and electronic circuit design (from icon selection and editing), to knowledge systems based on icon graph matching and to multimedia communications are studied in detail. 
Finally, a possible hardware structure to support an icon management system is proposed which is based on a pyramid of microprocessors architecture. Hierarchical Reasoning: Simulating Complex Processes over Multiple Levels of Abstraction Paul Anthony Fishwick University of Pennsylvania Ph.D. 1986, 196 pages Computer Science University Microfilms International ADG86-14793 Linguistic Solid Modeling using Graph Grammars Patrick Arthur Fitzhorn Colorado State University Ph.D. 1985, 105 pages Computer Science University Microfilms International ADG86-07641 This thesis describes a method for simulating processes over multiple levels of abstraction. There has been recent work with respect to data, object, and problem-solving abstraction; however, abstraction in simulation has not been adequately explored. We define a process as a hierarchy of distinct production rule sets that interface to each other so that abstraction levels may be bridged where desired. In this way, the process may be studied at abstraction levels that are appropriate for the specific task: notions of qualitative and quantitative simulation are integrated to form a complete process description. The advantages to such a description are increased control, computational efficiency, and selective reporting of simulation results. Within the framework of hierarchical reasoning, we will concentrate on presenting the primary concept of process abstraction. A C o m m o n Lisp implementation of the hierarchical reasoning theory called HIRES is presented. HIRES allows the user to reason in a hierarchical fashion by relating certain facets of the simulation to levels of abstraction specified in terms of actions, objects, reports, and time. The user is free to reason about a process over multiple levels by weaving through the levels either manually or via automatically controlled specifications. Capabilities exist in HIRES to facilitate the creation of graph-based abstraction levels. For instance, the analyst can create continuous system models (CSMP), petri net models, scripts, or generic graph models that define the process model at a given level. We present a four-level elevator system and a two-level "dining philosophers" simulation to demonstrate the efficacy of process abstraction. The goal of this work is to develop formal relationships between language theory and topologically correct computer representations of objects in Euclidean three-space (E3), that is, physical solids. Thus the concern is to generate grammars whose languages can be interpreted as classes of representations of possibly proper subsets of physical solids. A methodology is then studied for the implementation of the developed grammars. The grammars of interest are variants of graph grammars whose languages are sets of directed graphs with node and edge labels, and whose productions rewrite graphs into other graphs. Graph grammars are of interest here since they generate structures similar to plane models (topological representations of the class of 3D solids). Since it can be shown that plane models are sufficient representations of the topology of E 3 polytopes, a class of graph grammars that generate all such models should be of interest. The strings generated by these grammars will then be representations of physical solids that, although based on formal topological guarantees, can be manipulated with formal language theory. 
Computer implementation of the grammars is considered, and it is shown that a natural method that encompasses storage of the representation, as well as the grammar itself, is one based on the predicate calculus. In this implementation, the vertices and edges of a representation are stored as facts in a logic data base, while the grammar that rewrites subsets of the graph with other graphs becomes a set of relations on graphs. The programming language PROLOG is used for implementation, since it is based closely on the first order predicate calculus. In conclusion, it is shown that the current, major representations of physical solids have analogs in the developed graph grammars to the same level of representation validity. That being the case, graph grammars can replace current, heuristic implementations of physical solid representations with formal methods from language theory.

A Logic Data Model for the Machine Representation of Knowledge
Randolph George Goebel
The University of British Columbia (Canada)
Ph.D. 1985
Computer Science
This item is not available from University Microfilms International. ADG05-58418

A Fully Lazy Higher Order Purely Functional Programming Language with Reduction Semantics
Kevin John Greene
Syracuse University
Ph.D. 1985, 262 pages
Computer Science
University Microfilms International ADG86-03760

DLOG is a logic-based data model developed to show how logic programming can combine contributions of Data Base Management (DBM) and Artificial Intelligence (AI). The DLOG data model is based on a logical formulation that is a superset of the relational data model (Reiter83), and uses Bowen and Kowalski's notion of an amalgamated meta and object language (Bowen82) to describe the relationship between data model objects. The DLOG specification includes a language syntax, a proof (or query evaluation) procedure, a description of the language's semantics, and a specification of the relationships between assertions, queries, and application data bases. DLOG's basic data description language is the Horn clause subset of first order logic (Kowalski79, Kowalski81), together with embedded descriptive terms and non-Horn integrity constraints. The embedded terms are motivated by Artificial Intelligence representation language ideas, specifically, the descriptive terms of the KRL language (Bobrow77). A similar facility based on logical descriptions is provided in DLOG. The DLOG language permits the use of definite and indefinite descriptions of individuals and sets in both queries and assertions. The meaning of DLOG's extended language is specified by writing Horn clauses that describe the relation between the basic language and the extensions. The experimental implementation is the appropriate Prolog program derived from that specification. The DLOG implementation relies on an extension to the standard Prolog proof procedure. This includes a "unification" procedure that matches embedded terms by recursively invoking the DLOG proof procedure (cf. LOGLISP, Robinson82). The experimental system includes logic-based implementations of traditional database facilities (e.g., transactions, integrity constraints, data dictionaries, data manipulation language facilities), and an idea for using logic as the basis for heuristic interpretation of queries.
This heuristic uses a notion of partial match or sub-proof to produce assumptions under which plausible query answers can be derived. The experimental DLOG database (or "knowledge base") management system is exercised by describing an undergraduate degree program. The example application database is a description of the Bachelor of Computer Science degree requirements at The University of British Columbia. This application demonstrates how DLOG's embedded terms provide a concise description of degree program knowledge, and how that knowledge is used to specify student programs and select program options.

In the first third of this thesis, three well-known reduction calculi - A. Church's λ-calculus, M. Schönfinkel's SKI-calculus, and C. P. Wadsworth's graph oriented λ-calculus - are defined. Schönfinkel's classic transformation of λ-calculus well-formed formulas (wffs) into variable-free SKI-calculus wffs is also presented. A new notion, lazy-normal form, a generalization of the SKI-calculus concept of normal form, is then defined and compared with Wadsworth's concept of head-normal form. Head-normal form is a generalized notion of normal form in the λ-calculus. It is demonstrated that a SKI-calculus wff in lazy-normal form is an outline of the wff's normal form (if one exists) - i.e., its normal form will have the same initial atom and the same number of arguments. Other results relating λ-calculus wffs in head-normal form to SKI-calculus wffs in lazy-normal form are stated and proved. The ideas behind M. Schönfinkel's SKI-calculus, C. P. Wadsworth's graph oriented λG-calculus, and D. A. Turner's SASL implementation are combined with the concept of lazy-normal form to produce a new deterministic combinator-based graph and machine oriented reduction calculus: the LFN-calculus. The LFN-calculus is equivalent in power to the λ-calculus et al., but is much more directly and efficiently implementable. This is due primarily to the structure sharing properties of the LFN-calculus wffs. Both garbage nodes and forwarding arcs (indirection pointers), concepts that are usually relegated to a calculus's implementation, are given formal definitions in this calculus. The design and experimental Lisp Machine implementation of LFN, a fully lazy higher order purely functional programming language with reduction semantics, are discussed. The LFN compiler transforms high level expressions into representations of LFN-calculus wffs. LFN's runtime system, a direct realization of the LFN-calculus's "is reducible to" relation, takes as input LFN-calculus wffs and produces irreducible wffs (wffs in lazy-normal form) as its result. The thesis ends with brief discussions of alternate approaches to functional programming language compilation and runtime system organization.

Learning by Understanding Analogies
Russell Greiner
Stanford University
Ph.D. 1985, 417 pages
Computer Science
University Microfilms International ADG86-02479

Pattern-Based and Knowledge-Directed Query Compilation for Recursive Data Bases
Jiawei Han
The University of Wisconsin - Madison
Ph.D. 1985, 216 pages
Computer Science
University Microfilms International ADG86-01539

The phenomenon of learning has intrigued scholars for ages; this fascination is reflected in Artificial Intelligence, which has always considered learning to be one of its major challenges. This dissertation provides a formal account of one mode of learning, learning by analogy.
In particular, it defines the useful analogical inference process (UAI), which uses a given analogical hint of the form "A is like B" and a particular target problem to map known facts about B onto proposed conjectures about A. UAI only considers conjectures which are useful to the target problem; that is, the conjectures must lead to a plausible solution to that problem. To construct a procedure that can effectively find these useful analogies, we use two sets of heuristics to refine the general UAI process. The first set is based on the intuition that useful analogies often correspond to "coherent" clusters of facts. This suggests that UAI seeks only the analogies that correspond to common abstractions, where abstractions are relations that encode solution methods to past problems. The other set of rules embodies the claim that "better analogies impose fewer constraints on the world". Basically, these rules prefer the analogies which require the fewest additional conjectures. This dissertation also describes a running program, NLAG, which implements this model of analogy. It is then used in a battery of tests, designed to empirically validate our claim that UAI is an effective technique for acquiring new facts. This data also demonstrates that the heuristics are effective, and suggests why. In summary, the primary contributions of this research are (1) a formal definition of UAI, described semantically (using a new variant of Tarskian semantics), syntactically, and operationally; (2) a collection of heuristics which efficiently guide this process towards useful analogies; and (3) various empirical results, which illustrate the source of power underlying this approach.

Expert database systems (EDSs) comprise an interesting class of computer systems which represent a confluence of research in artificial intelligence, logic, and database management systems. They involve knowledge-directed processing of large volumes of shared information and constitute a new generation of knowledge management systems. Our research is on the deductive augmentation of relational database systems, especially on the efficient realization of recursion. We study the compilation and processing of recursive rules in relational database systems, investigating two related approaches: pattern-based recursive rule compilation and knowledge-directed recursive rule compilation and planning. Pattern-based recursive rule compilation is a method of compiling and processing recursive rules based on their recursion patterns. We classify recursive rules according to their processing complexity and develop three kinds of algorithms for compiling and processing different classes of recursive rules: transitive closure algorithms, SLSR wavefront algorithms, and stack-directed compilation algorithms. These algorithms, though distinct, are closely related. The more complex algorithms are generalizations of the simpler ones, and all apply the heuristics of performing selection first and utilizing previous processing results (wavefronts) in reducing query processing costs. The algorithms are formally described and verified, and important aspects of their behavior are analyzed and experimentally tested.
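The wavefront heuristic mentioned above - reusing the tuples produced in the previous iteration as the frontier for the next join - is essentially semi-naive evaluation of a recursive rule. The sketch below is not from the thesis; it is a minimal Python illustration, with an invented parent relation and predicate names, of how a wavefront keeps a transitive-closure query from recomputing earlier joins.

    def transitive_closure(parent):
        # Wavefront (semi-naive) evaluation of the recursive rule
        #   ancestor(X, Y) :- parent(X, Y).
        #   ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
        parent = set(parent)
        closure = set(parent)      # every ancestor fact derived so far
        wavefront = set(parent)    # only the facts that are new this iteration
        while wavefront:
            # Join the base relation with the wavefront rather than with the
            # whole closure, so earlier joins are never repeated.
            new = {(x, y) for (x, z) in parent for (w, y) in wavefront if z == w}
            wavefront = new - closure
            closure |= wavefront
        return closure

    parent = [("ann", "bob"), ("bob", "cid"), ("cid", "dee")]   # invented data
    print(sorted(transitive_closure(parent)))                   # six ancestor pairs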
To further improve search efficiency, a knowledge-directed recursive rule compilation and planning technique is introduced. We analyze the issues raised for the compilation of recursive rules and propose to deal with them by incorporating functional definitions, domain-specific knowledge, query constants, and a planning technique. A prototype knowledge-directed relational planner, RELPLAN, which maintains a high level user view and query interface, has been designed and implemented, and experiments with the prototype are reported and illustrated.

A Theory of Scalar Implicature
Julia Bell Hirschberg
University of Pennsylvania
Ph.D. 1985, 230 pages
Computer Science
University Microfilms International ADG86-03648

Speakers may convey many sorts of 'meaning' via an utterance. While each of these contributes to the utterance's overall communicative effect, many are not captured by a truth-functional semantics. One class of non-truth-functional, context-dependent meanings has been identified by Grice (1975) as conversational implicatures. This thesis presents a formal account of one type of conversational implicature, termed here scalar implicature, identified from a study of a large corpus of naturally occurring data collected by the author and others from 1982 through 1985. Scalar implicatures rely for their generation and interpretation upon the assumption that cooperative speakers will say as much as they truthfully can that is relevant to a conversational exchange. For example, B's utterance of (1a):
(1) A: How was the party last night?
 a. B: Some people left early.
 b. Not all people left early.
may convey to A that, as far as B knows, (1b) also holds - even though the truth of (1b) clearly does not follow from the truth of (1a). Scalar implicatures may be distinguished from other conversational implicatures in that their generation and interpretation is dependent upon the identification of some salient relation that orders a concept referred to in an utterance with other concepts. In (1), for example, the salience of an inclusion relation between 'some people' and 'all people' in the discourse is prerequisite to B's implicating that (1b) - and to A's understanding that (1b) has in fact been implicated. To illustrate potential applications of the theory presented, a module of a natural-language interface, QUASI, is described. QUASI calculates scalar implicatures that might be licensed by simple direct responses to yes-no questions. Where licensable implicatures are not consistent with the system's knowledge base, QUASI proposes alternative responses. This system demonstrates how natural language interfaces can use the calculation of implicit meanings to avoid conveying misinformation and to convey desired information more succinctly.

A Knowledge-based Approach to Language Production
Paul Schafran Jacobs
University of California, Berkeley
Ph.D. 1985, 278 pages
Computer Science
University Microfilms International ADG86-10067

A Knowledge-based System for Debugging Concurrent Software
Carol Helfgott LeDoux
University of California, Los Angeles
Ph.D. 1985, 322 pages
Computer Science
University Microfilms International ADG86-03965

The development of natural language interfaces to Artificial Intelligence systems is dependent on the representation of knowledge.
A major impediment to building such systems has been the difficulty in adding sufficient linguistic and conceptual knowledge to extend and adapt their capabilities. This difficulty has been apparent in systems which perform the task o f language production, i. e. the generation of natural language output to satisfy the communicative requirements of a system. The problem of extending and adapting linguistic capabilities is rooted in the problem of integrating abstract and specialized knowledge and applying this knowledge to the language processing task. Three aspects of a knowledge representation system are highlighted by this problem: hierarchy, or the ability to represent relationships between abstract and specific knowledge structures; explicit referential knowledge, or knowledge about relationships among concepts used in referring to concepts; and uniformity, the use of a common framework for linguistic and conceptual knowledge. The knowledge-based approach to language production addresses the language generation task from within the broader context of the representation and application of conceptual and linguistic knowledge. This knowledge-based approach has led to the design and implementation of a knowledge representation framework, called Ace, geared towards facilitating the interaction of linguistic and conceptual knowledge in language processing. Ace is a uniform, hierarchical representation system, which facilitates the use of abstractions in the encoding of specialized knowledge and the representation of the referential and metaphorical relationships among concepts. A general-purpose natural language generator, KING (Knowledge INtensive Generator), has been implemented to apply knowledge in the Ace form. The generator is designed for knowledge-intensivity and incrementality, to exploit the power of the Ace knowledge in generation. The generator works by applying structured associations, or mappings, from conceptual to linguistic structures, and combining these structures into grammatical utterances. This has proven to be a simple but powerful mechanism, easy to adapt and extend, and has provided strong support for the role of conceptual organization in language generation. The recent development of high-level concurrent programming languages has emphasized the problem of limited debugging tools to support the development of applications using these languages. A new approach is necessary to improve the efficacy of debugging tools and to adapt them to the framework of a concurrent software environment. A knowledge-based debugging approach is presented that aids diagnosis of a variety of run-time errors that can occur in concurrent programs written in the Ada 1 programming language. In this approach, an event stream of program activity is captured in an historical database and accessed using Prolog-based queries constrained by temporal-logic primitives. Diagnosis is aided by applying rule-based descriptions of some common classes of software errors and by matching program specifications against the trace data base. This approach was used in building a prototype debugger, called Your Own Debugger for Ada (YODA). The design of YODA is described and analyses of several sample Ada programs are presented to illustrate diagnosis of errors associated with concurrency, including deadness errors and misuse of shared data. 1Ada is a registered trademark of the U.S. Government - Ada Joint Program Office. 
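The trace-query idea in the YODA abstract - record an event stream in a historical database and run temporally constrained queries against it to recognize error patterns - can be illustrated with a small sketch. The Python fragment below is only an illustration of that idea; the event names, the (time, task, action, entry) record format, and the single "unanswered entry call" pattern are invented here and are not YODA's actual Prolog predicates or rule base.

    # Each recorded event is (time, task, action, entry); the names are invented.
    TRACE = [
        (1, "producer", "call",   "buffer.put"),
        (2, "buffer",   "accept", "buffer.put"),
        (3, "consumer", "call",   "buffer.get"),
        # no later accept for buffer.get: the consumer remains blocked
    ]

    def unanswered_calls(trace):
        """Return entry calls never matched by a later accept (a deadness symptom)."""
        suspects = []
        for t, task, action, entry in trace:
            if action != "call":
                continue
            answered = any(t2 > t and a2 == "accept" and e2 == entry
                           for t2, _, a2, e2 in trace)
            if not answered:
                suspects.append((task, entry, t))
        return suspects

    print(unanswered_calls(TRACE))   # [('consumer', 'buffer.get', 3)]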
ComputationalLinguistics, Volume 13, Numbers 1-2, January-June 1987 137 The FINITE STRING Newsletter Plan Recognition and Discourse Analysis: an Integrated Approach for Understanding Dialogues Diane Judith Litman The University of Rochester Ph.D. 1986, 197 pages Computer Science University Microfilms International ADG86-10863 Correcting object-related misconceptions Kathleen Filliben McCoy University of PennsYlvania Ph.D. 1985, 166 pages Computer Science University Microfilms International ADG86-03674 Inferring Domain Plans in QuestionAnswering Martha Elizabeth Pollack 138 Abstracts of CurrentLiterature One promising computational approach to understanding dialogues has involved modeling the goals of the speakers in the domain of discourse. In general, these models work well as long as the topic follows the goal structure closely, but they have difficulty accounting for interrupting subdialogues such as clarifications and corrections. Furthermore, such models are typically unable to use many processing clues provided by the linguistic phenomena of the dialogues. This dissertation presents a computational theory and partial implementation of a discourse level model of dialogue understanding. T h e theory extends and integrates plan-based and linguistic-based approaches to language processing, arguing that such a synthesis is needed to computationally handle many discourse level phenomena present in naturally occurring dialogues. The simple, fairly syntactic results of discourse analysis (for example, explanations of phenomena in terms of very local discourse contexts as well as correlations between syntactic devices and discourse function) will be input to the plan recognition system, while the more complex inferential processes relating utterances have been totally reformulated within a plan-based framework. Such an integration has led to a new model of plan recognition, one that constructs a hierarchy of domain and meta-plans via the process of constraint satisfaction. Furthermore, the processing of the plan recognizer is explicitly coordinated with a set of linguistic clues. The resulting framework handles a wide variety of difficult linguistic phenomena (for example, interruptions, fragmental and elliptical utterances, and presence as well as absence of syntactic discourse clues), while maintaining the computational advantages of the plan-based approach. The implementation of the plan recognition aspects of this framework also addresses two difficult issues of knowledge representation inherent in any plan recognition task. Analysis of a corpus of naturally occurring data shows that users conversing with a database or expert system are likely to reveal misconceptions about the objects modelled by the system. Further analysis reveals that the sort of responses given when such misconceptions are encountered depends greatly on the discourse context. This work develops a contextsensitive method for automatically generating responses to object-related misconceptions with the goal of incorporating a correction module in the front-end Of a database or expert system. The method is demonstrated through the ROMPER system (Responding to Object-related Misconceptions using PERspective), which is able to generate responses to two classes of object-related misconceptions: misclassifications and misattributions. 
The transcript analysis reveals a number of specific strategies used by human experts to correct misconceptions, where each different strategy refutes a different kind of support for the misconception. In this work each strategy is paired with a structural specificati6n of the kind of support it refutes. ROMPER uses this specification, and a model of the user, to determine which kind of support is most likely. The corresponding response strategy is then instantiated. The above process is made context sensitive by a proposed addition to standard knowledge-representation systems termed object perspective. Object perspective is introduced as a method for augmenting a standard knowledge-representation system to reflect the highlighting affects of previous discourse. It is shown how this resulting highlighting can be used to account for the context-sensitive requirements of the correction process. The importance of plan inference in models of conversation has been widely noted in the computational-linguistics literature, and its incorporation in question-answering systems has enabled a range of cooperative Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature University of Pennsylvania Ph.D. 1986, 191 pages Computer Science University Microfilms International ADG86-14850 behaviors. The plan inference process in each of these systems, however, has assumed that the questioner (Q) whose plan is being inferred and the respondent (R) who is drawing the inference have identical beliefs about the actions in the domain. I demonstrate that this assumption is too strong, and often results in failure not only of the plan inference process but also of the communicative process that plan inference is meant to support. In particular, it precludes the principled generation of appropriate responses to queries that arise from invalid plans. I present a model of plan inference in conversation that distinguishes between the beliefs of the questioner and the beliefs of the respondent. This model rests on an account of plans as mental phenomena: "having a plan" is analyzed as having a particular configuration of beliefs and intentions. Judgments that a plan is invalid are associated with particular discrepancies between the beliefs that R ascribes to Q, when R believes Q has some particular plan, and the beliefs R herself holds. I define several types of invalidities from which a plan may suffer, relating each to a particular type of belief discrepancy, and show that the types of any invalidities judged to be present in the plan underlying a query can affect the content of a cooperative response. The plan inference model has been implemented in SPIRIT - a System for Plan Inference that Reasons about Invalidities Too - which reasons about plans underlying queries in the domain of computer mail. Rational Interaction: Cooperation among Intelligent Agents Jeffrey Solomon Rosenschein Stanford University Ph.D. 1986, 145 pages Computer Science University Microfilms International ADG86-08219 The development of intelligent agents presents opportunities to exploit intelligent cooperation. Before this can occur, however, a framework must be built for reasoning about interactions. This dissertation describes such a framework, and explores strategies of interaction among intelligent agents. 
The formalism that has been developed removes some serious restrictions that underlie previous research in distributed artificial intelligence, particularly the assumption that the interacting agents have identical or non-conflicting goals. The formalism allows each agent to make various assumptions about both the goals and the rationality of other agents. A hierarchy of rationality assumptions is presented, along with an analysis of the consequences that result when an agent believes a particular level in the hierarchy describes other agents' rationality. In addition, the formalism presented allows the modeling of restrictions on communication and the modeling of binding promises among agents. Computation on the part of each individual agent can often obviate the need for inter-agent communication. However, when communication and promises are allowed, fewer assumptions need be made about the rationality of other agents when choosing one's own rational course of action. Recursions and Rule Selections on a High Level Relation Processor for KnowledgeBase Machine Dongpil Shin The University of Oklahoma Ph.D. 1986, 132 pages Computer Science University Microfilms International ADG86-13735 For the development of knowledge-based systems, various ways of incorporating a relational database system into a PROLOG-based questionanswering system have been investigated. To improve the performance of a knowledge-based system, a deductive search involving a large set of facts is performed separately by a relational database subsystem. The correctness property of an inference performed by such a system is formally studied. Problems associated with the scheme, such as determining a termination in a recursion, incorporating " c u t " operations, and the capabilities of relation processors of performing these operations, are studied. To solve these problems, first, database queries are classified into six levels based on their operations so that queries involvingrecursions and " c u t " can be identified. Next, relation processors are also classified so that corresponding query expressions can be evaluated. Then, a high-level machine, Data flow Relation Processor (DFRP), is designed so that all the problems defined previously can be solved with this machine. Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 139 The FINITE STRING Newsletter Abstracts of Current Literature The closed-connection graph is introduced so that a recursion could be visualized. The conditions for terminating a recursion are defined in terms of the closed-connection graph. A procedure that synthesizes a level-5 query is developed. Finally, functional definitions of DFRP are studied to evaluate a recursive query, and its simulation model is built. The major results of the simulation are (i) a binary join takes a constant time regardless of the cardinality of relations; (ii) n-ary join takes O(n) time, where n is the number of relations involved; (iii) DFRP is 104 times faster in performing a binary join operation than A which is the intermediate result of Japan's fifth-generation computer project; (iv) a knowledge-based system with DFRP is 10 to 20 times faster than MPDC in performing PROLOG queries of 2 to 30 subgoals, each involving 4096 facts. Controlling Inference David Eugene Smith Stanford University Ph.D. 1985, 237 pages Computer Science University Microfilms International ADG86-02539 Effective control of inference is a fundamental problem in Artificial Intelligence. 
Unguided inference leads to a combinatorial explosion of facts or subgoals for even simple domains. To overcome this problem, expert systems have used powerful domain-dependent control information in conjunction with syntactic domain-independent methods like depth-first backward chaining. While this is possible for some applications, it is not always feasible or appropriate for problem solvers that must solve a wide variety of different problems. In this dissertation I argue that a kind of semi-independent control is essential for problem solvers that must face a wide variety of different problems. Semi-independent control is based on the idea that there is underlying domain-independent rationale behind any good control decision. This rationale takes the form of simple utility theory applied to the expected cost and probability of success of different inference steps and strategies. These basic principles are domain-independent, but their application to any particular problem relies on global information about the nature and extent of facts and rules in the problem solver's data base. This approach to control is used in the solution of four different control problems: halting inference when all answers to a query have been found, halting recursive inference, ordering conjunctive queries when no inference is involved, and choosing the best inference step for problems where only a single answer is required. The first two control problems are cases of recognizing redundant portions of a search space, while the final two cases involve computing the expected cost for alternative strategies to a problem. Several novel theorems about control (for specific situations) are developed in these case studies. The issue of efficiency is also addressed. Semi-independent control often involves considerable computation, and may not be cost-effective for the majority of problems encountered in a particular domain. Interleaving of inference and control is proposed as a means of making this kind of control practical. The Essence of Rum: a Theory of the Intensional and Extensional Aspects of Lisp-type Computation Carolyn L. Talcott R u m is a theory of applicative, side-effect free computations over an algebraic data structure. It goes beyond a theory of functions computed by programs, treating both intensional and extensional aspects of computation. Powerful programming tools such as streams, object-oriented programming, escape mechanisms, and co-routines can be represented. Intensional properties include the number of multiplications executed, the number of the context switches, and the maximum stack depth required in a computation. Extensional properties include notions of equality for streams and co-routines and characterization of functionals implementing strategies for searching tree-structured spaces. Precise definitions of informal concepts such as stream and co-routine are given and their mathematical theory is developed. Operations on programs treated include program transformations which introduce functional and control Stanford University Ph.D. 1985, 248 pages Computer Science University Microfilms International ADG86-02549 140 Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature abstractions; a compiling morphism that provides a representation of control abstractions as functional abstractions; and operations that transform intensional properties to extensional properties. 
The goal is not only to account for programming practice in Lisp, but also to improve practice by providing mathematical tools for developing programs and building programming systems. Rum views computation as a process of generating computation structures - trees for context-independent computations and sequences for context-dependent computations. The recursion theorem gives a fixedpoint function that computes computationally minimal fixed points. The context insensitivity theorem says that context-dependent computations are uniformly parameterized by the calling context and that computations in which context dependence is localized can be treated like context-independent computations. Rum machine structure and morphism are introduced to define and prove properties of compliers. The hierarchy of comparison relations on programs ranges from intensional equality to maximum approximation and equivalence relations that are extensional. The fixed-point function computes the least fixed point with respect to the maximum approximation. Comparison relations, combined with the interpretation of programs using computation structures, provide operations on programs both with meanings to preserve and meanings to transform. INT-AID: The Intelligent Aid for Relational Database Construction Mustafa Mahmoud Lehigh University Ph.D. 1986, 214 pages Computer Science University Microfilms International ADG86-16179 The Interactive Effects of Micro- and Macrostructural Processing during Text Comprehension Nicholas Geleta The Catholic University of America Ph.D. 1986, 170 pages Education, Psychology University Microfilms International ADG86-13460 In this dissertation we developed a system, INT-AID: The INTelligent AID for relational database construction. It is an intelligent interactive system that aids the relational database systems designers in constructing a good design. We defined a workable methodology by integrating a wide variety of algorithms, theories, and techniques in one system. The system uses a set of Functional Dependencies (FDs) to construct relations in Third Normal Form (3NF) following the Synthetic Approach in relational database design. We proposed a novel methodology in deriving and generating a set of functional dependencies. Unlike the standard conventional method which uses structured analysis techniques to build a set of FDs; our approach is an unstructured method that does not depend on structured analysis. This new method suits the evolutionary approach in database design. In our approach we deal with an incoherent body of data, that contains many unrelated and unstructured facts. From this body of data we want to let the natural relationships emerge dynamically, rather than imposing unnatural relationships on the data. As a result of this we might uncover some hidden relationships that were not known before. This new approach is an attempt towards the establishment of causal effects. We developed a formula using mathematical induction, to give the total number of what we called proposed FDs (their validity is yet to be determined). The Synthetic Approach, which we followed in our system to construct 3NF relations, does not always produce relations that are lossless with respect t o the join operation. To overcome this shortcoming we added an algorithm to check whether the synthesized relations are lossless with respect to join or not. 
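Two of the textbook building blocks the INT-AID abstract relies on - the closure of an attribute set under a set of functional dependencies, and the standard test that a binary decomposition is lossless with respect to the join (the common attributes must functionally determine one of the two schemes) - can be sketched in a few lines. The Python example below is a generic illustration over an invented relation scheme; it is not INT-AID's algorithm.

    def closure(attrs, fds):
        """Closure of an attribute set under FDs given as (lhs, rhs) pairs of sets."""
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    def lossless_binary(r1, r2, fds):
        """R1 join R2 is lossless iff the common attributes determine R1 or R2."""
        common = set(r1) & set(r2)
        c = closure(common, fds)
        return set(r1) <= c or set(r2) <= c

    fds = [({"emp"}, {"dept"}), ({"dept"}, {"mgr"})]          # invented FDs
    print(lossless_binary({"emp", "dept"}, {"dept", "mgr"}, fds))   # True
    print(lossless_binary({"emp", "mgr"},  {"dept", "mgr"}, fds))   # False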
This research investigated the text comprehension model first described in Kintsch and van Dijk (1978) and later elaborated by van Dijk and Kintsch (1983). The central research question involved determining if certain micro- and macropropositions were accessible to the reader/comprehender in working-memory during on-going reading of prose passages. Sixty subjects read texts presented one processing cycle at a time on the display screen of an Apple IIe microcomputer. While reading they were interrupted by a probe to a specific Micro- (Mi) or Macroproposition (Ma) that varied in terms of the processing cycle in which it appeared before presentation of the probe (i.e., Next-To-Last Cycle (NTLC) or Prior Cycle (PC)). The response time, measured in milliseconds, of a subject's judgement as to whether the probe's informational content was consistent with that of the passage was recorded. It was hypothesized that NTLC/Mi's, NTLC/Ma's, and PC/Ma's would be responded to more rapidly than PC/Mi's, since the propositions were predicted to be resident in working-memory. The results supported this prediction. It was concluded that good readers construct micro- and macrostructural representations of the text on-line.

Time Course of Activation for High and Low Centrality Nouns in Scripts
Jacqueline Sullivan Gorski
The Catholic University of America
Ph.D. 1986, 206 pages
Education, Psychology
University Microfilms International ADG86-03279

The Effects of Expertise and Sentence Form on Reading Rate and Vocalization Latency
Ann Lamiell Landy
The Pennsylvania State University
Ph.D. 1986, 187 pages
Education, Psychology
University Microfilms International ADG86-15210

This study investigated whether scripts, such as eating in a restaurant, are prestored or consciously constructed long term memory units and whether centrality is an organizing dimension. Five models for the activation of script nouns were proposed. These were differentiated by their predictions for the time course of activation for high and low centrality script nouns at each of three intervals. Sixty subjects generated script nouns. Twenty rated them on centrality. Forty-eight subjects participated in a computerized lexical decision task. Primes were script names and neutral XXXs. Targets were high and low centrality script nouns and nonwords. When the prime was a word and the target was a word, the prime either named the same script from which the word was taken or a different script. The interval between prime and target was varied between 250, 500, and 750 msec. The dependent variable was time to respond word or nonword to the target. Scores indicating whether same or different script primes facilitated or inhibited responding were computed by subtracting response times after script primes from response times after XXX primes. Facilitation was indicated by positive and inhibition by negative values. T-tests were conducted on mean scores for high and low centrality script nouns at each interval to determine the type (automatic or conscious) and extent of activation as indicated by the observed facilitation. Analyses of variance were conducted separately on scores for same and different script primes to identify possible effects due to list, time interval, and centrality. Results supported the Prestorage and Computation model.
Same script primed responses to high centrality nouns were facilitated at all three intervals, while those for low centrality nouns were facilitated only at the longest. This suggested that highly important script concepts form a prestored unit which is automatically activated, while less important concepts must be consciously activated. An associative network theory of script memory representation accounted well for the data (cf. Yekovich & Walker 1985, in press). Suggestions for teachers and computer tutorial designers included cuing learners to consciously activate low level domain knowledge and providing adequate time to do. Two experiments were carried out to test the effects of knowledgeability and technical vocabulary on processing speed for sentences from familiar and unfamiliar technical domains. Using a priming paradigm, reading rate for sentence stems and vocalization latencies for target words that followed the stems were obtained for technically worded and simplified sentences. In the first experiment, biochemists' reading rate and vocalization latencies were compared for familiar (biochemistry) technical and simplified sentences, unfamiliar (psychopathology) technical and simplified sentences, and general expository sentences. In the second experiment, the relationship of distances within semantic networks to processing speed was explored by obtaining vocalization latencies for target words that followed related, neutral, and unrelated sentence stems with familiar (psychopathology) and unfamiliar (biochemistry) content. Clinical psychologists were subjects. Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature For both groups of experts, knowledgeability played a greater role than vocabulary in sentence processing speed. Familiar sentences were processed faster than unfamiliar sentences regardless of the wording. Knowledgeability also interacted with vocabulary. Familiar technical sentences were processed at the same rate as familiar simplified sentences while unfamiliar simplified sentences were processed faster than unfamiliar technical sentences. The results are interpreted as supporting spreading activation models of memory organization and retrieval. Parallel Processing of Combinatorial Search Problems Guo-Jie Li Purdue University Ph.D. 1985, 208 pages Engineering, Electronics and Electrical University Microfilms International ,4DG86-065 75 The search for solutions in a combinatorially large problem space is a major problem in artificial intelligence and operations research. Parallel processing of combinatorial searches has become a key issue in designing new generation computer systems. The research gives a theoretical foundation of parallel processing of various combinatorial searches upon which the architectures are based. In this thesis parallel processing of searching AND trees (graphs), OR trees (graphs), and AND/OR trees (graphs) are investigated, and different functional requirements of the architecture are identified. Some of the difficulties in building parallel computers for searching arise from the inability to predict the performance of the resulting systems. One important issue in implementing AND-tree searches is to determine the granularity of parallelism. In this thesis, the optimal granularity of AND-tree searches is found and analyzed. 
Another important result of this research is in finding the bounds of performance of parallel OR-trees searches and a variety of conditions to cope with anomalies of parallel OR-tree searches that involve approximations and dominance tests. In contrast to previous results, our theoretical analysis and simulations show that a near-linear speedup can be achieved with respect to a large number of processors. Logic programming, one of the foundations of new generation computers, can be represented as searching AND/OR trees. In this research, an optimal search strategy that minimizes the expected overhead of searching AND/OR trees is found. An efficient heuristic search strategy for evaluating logic programs, which can be implemented on a multiprocessor architecture (MANIP-2), is proposed. Dynamic programming problems, a class of problems that can be formulated in multiple ways and solved by different architectures, are used to illustrate the results obtained on graph and tree searches. Dynamic programming formulations are classified into four types and various parallel processing schemes for implementing different formulations of dynamic programming problems are presented. In particular, efficient systolic arrays for solving monadic-serial dynamic programming problems are developed. Boundaries and the Treatment of Control The unifying theme of the dissertation is that properties of lexical argument structure "drive" the syntax in a number of interesting ways. First, lexical argument structure plays an important role in the determination of extraction possibilities in the syntax. Second, lexical properties are important in determining a number of phenomena at Logical Form; in particular, lexical semantics plays an important role in determining the interpretation of structures of "arbitrary" control. Chapters two and three of the dissertation deal with boundaries to extraction, particularly the phenomena subsumed under the Subject Condition and the Constraint on Extraction Domains. Chapter two focuses on a restricted class of nominals in English. The main puzzle addressed is the ability to strand prepositions in these nominals but not in other sorts of nominals. In chapter three, the ability to extract from a constituent is" related to the thematic relations which the constituent in question enters Robin Lee Clark University of California, Los Angeles Ph.D. 1985, 401 pages Language, Linguistics University Microfilms International ADG86-03940 Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 143 The FINITE STRING Newsletter Abstracts of Current Literature into. It is further demonstrated that the Boundary Condition allows us to abstract away from details of tree configuration in providing an account of these island phenomena. Chapters four and five develop an account of control based on recent research on Non-overt Operators. Particular attention is paid to so-called arbitrary control and it shows that arbitrary control differs from obligatory control only insofar as the former is a property of Logical F o r m while the latter is an S Structure property. Particular attention is given to the nature of Logical Form, how implicit arguments are realized at that level and how adverbs of quantification enter into control relations. The treatment of control is shown to bear a strong relationship to such diverse structures as purposive clauses, parasitic gaps, infinitival relatives, " t o u g h " movement constructions and certain sentential predicates. 
The Effect of Kind of Anaphor on the Accessibility of Antecedent Information
Marylene Cloitre
Columbia University
Ph.D. 1985, 114 pages
Language, Linguistics. Psychology, General
University Microfilms International ADG86-04609

These studies compare processing differences during the resolution of anaphoric relationships for two types of anaphors: pronouns and repeated nouns. The initial studies show that subjects responded to antecedent-related information more rapidly following a pronoun than a noun-anaphor. Further investigation, using a levels-of-processing methodology, suggests that the pronominal advantage is largely derived from a difference in the level of representation at which the initial interpretation of antecedent information occurs. Specifically, the data suggest that pronouns directly access the conceptual representation of their antecedent while noun-anaphors initially access a more superficial form of representation. In each of five experiments, subjects were presented with a probe word following a sentence-final anaphor. The probe word was always an adjective which had modified the antecedent noun. The results in both a listening and a reading situation (Experiments 1 and 3) showed that subjects not only recognized the probe adjective faster following the pronoun than the noun-anaphor but also responded differentially to type of adjective. Subjects showed more rapid responses to concrete than abstract adjectives following the pronoun but showed little differential response following the noun-anaphor. The facilitation observed following the noun-anaphor, not associated with a differential response, was hypothesized as sensitivity to a more superficial level of analysis of antecedent information. In a delayed probe study (Experiment 2), the differential responses to abstract/concrete information were observed following the noun-anaphor, providing evidence for the hypothesis that the noun-anaphor eventually shows sensitivity to the conceptual aspects of its antecedent, though it is involved in some preliminary, perhaps 'surface'-level, analysis of antecedent information. A direct investigation of the nature of the initial processing activities for each kind of anaphor was undertaken using the task-oriented methodology of the Levels-of-Processing paradigm. In a Lexical Decision Task, subjects showed greater response facilitation following the noun-anaphor than the pronoun. Subjects' sensitivity to 'surface' information during the lexical decision task following the noun-anaphor was suggested not only by response facilitation when the probe was a real (antecedent-related) word but also by the response inhibition following nonword probes easily confusable with the real word probes. Responses following pronouns did not show either of these effects. In contrast, subjects showed much stronger facilitation following pronouns than noun-anaphors in a Category Decision Task.

Pronoun Resolution in Two-Clause Sentences
Alison Matthews
City University of New York
Ph.D. 1986, 147 pages
Language, Linguistics
University Microfilms International ADG86-14690

This dissertation examines the resolution of anaphoric pronoun references in two-clause sentences with the pronoun in the second clause and potential antecedents in the first. Evidence suggests that pronoun resolution involves a search of short-term memory.
Experiments were performed to evaluate the predictions of linear, hierarchical, and parallel function searches in a word-by-word reading comprehension task. The results of Experiment 1 showed that when gender cues are present, pronoun coreference is resolved more quickly than when the cues are absent, and in their absence there were strong effects of left-right position of the antecedent on comprehension time. Experiment 2 varied the linear position and syntactic level of embedding of the antecedents in order to test the linear and hierarchical search models. Results were most consistent with a left-to-right, top-down, breadth-first search such as that proposed by Hobbs (1978). Main-subordinate clause order had no effect. Experiment 3 tested the predictions of the parallel function model using pronouns that had the same grammatical role as the contextually appropriate antecedent or a different grammatical role. Results indicated no significant effect of parallel function, although the positional differences found in Experiments 1 and 2 were once again obtained. The failure to find an effect of pronoun position suggests that the search may begin at the topmost node of the preceding clause rather than at the pronoun, requiring a modification of Hobbs's model. Experiment 4 examined the psychological mechanism underlying the search for antecedents. Work by Holmes and Forster (1979) and Mehler et al. (1978) indicates that the memory strength of adjectives and adverbs in a sentence may be related to their position and level of embedding. Experiment 4 used a rapid serial visual presentation task to measure memory for nouns as a function of these variables. Results for nouns showed significant effects of position and level of embedding on probability of recall. This suggests that the left-to-right, top-down, breadth-first search order may simply reflect the memory strength of the noun phrases which are potential antecedents for an anaphoric pronoun.

Teaching Discourse Organization to the Deaf: the Use of Case-Role Detection in Text Analysis
Stephanie Ruth Polowe
The University of Rochester
Ed.D. 1985, 214 pages
Language, Linguistics
University Microfilms International ADG86-01324

Charles Fillmore is credited with bringing the concept of "case-roles", with their importance for semantic interpretation, into the field of post-Chomskyan linguistics. What Fillmore and his colleague, Wallace Chafe, have argued is that the notion of "case", as we have it from our study of "case-based" European languages, provides a "grammar" of semantic theory, a map for deep structure, which organizes the concepts used in lexical notation. Case grammar also makes the assumption that transformations do not retain the deep structure; that, while much of the deep structure may remain on the sentential level, more than a stylistic choice is being made as different transformations are chosen. Several theoretical investigations (e.g., Sidner, 1980) have shown that the choice of a transformation is made on the basis of the discourse context of the utterance.
Five case roles which seem critical to meaning interpretation of sentences are the Agent (the actor, the subject of an active sentence), the Neutral or Patient (the direct object of an active sentence), the Experiencer (the psychological recipient of the action of a verb), the Benefactive (the animate recipient of the Neutral object), and the Locative (an element in the determination of source, goal and state). Sidner in her work with natural language computer processing, has made the claim that the discourse "Focus" is placed in the neutral case, if other "focus-specifying" structures are absent. Focus specifiers are present in sentences which originally specify the topic (focus) of a discourse. Where these structures are absent, the neutral argument is the default choice for focus detection. Where the neutral case is absent, a hierarchy of candidate antecedents applies. The neutral argument of a neutral complement is a traditional focus specifier of choice. In this study, a test was developed to assess the subjects' use of functional (case) roles of sentence constituents in language processing. Instructional materials for teaching functional role detection and focus specification were written and used as an experimental curriculum. Analy- Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 145 The FINITE STRING Newsletter Abstracts of Current Literature sis of pre- and post-test data indicates that making functional role detection explicit may contribute to the language proficiency of deaf students. Computer Thought: Propositional Attitudes and Meta-knowledge Eric Stanley Dietrich The University of Arizona Ph.D. 1985, 220 pages Philosophy, Computer Science University Microfilms International ADG86-0333 7 Conventions and Speech Acts Seumas Roderick Macdonald Miller University of Melbourne (Australia) Ph.D. 1985, 401 pages Philosophy University Microfilms International ADG86-08982 Reference and Intentions to Refer: an Analysis of the Role of Intentions to Refer in a Theory of Reference Crolis Gayda Swain Loyola University of Chicago Ph.D. 1986, 275 pages Philosophy University Microfilms International ADG86-05559 146 Though artificial intelligence scientists frequently use words such as "belief" and "desire" when describing the computational capacities of their programs and computers, they have completely ignored the philosophical and psychological theories of belief and desire. Hence, their explanations of computational capacities which use these terms are frequently little better than folk-psychological explanati6ns. Conversely, though philosophers and.psychologists attempt to couch their theories of belief and desire in computational terms, they have consistently misunderstood the notions of computation and computational semantics. Hence, their theories of such attitudes are frequently inadequate. A computational theory of propositional attitudes (belief and desire) is presented here. It is argued that the theory of propositional attitudes put forth by philosophers and psychologists entails that propositional attitudes are a kind of abstract data type. This refined computational view of propositional attitudes bridges the gap between artificial intelligence, philosophy and psychology. Lastly, it is argued that this theory of propositional attitudes has consequences for meta-processing and consciousness in computers. Conventions play a large part in our lives. Our mode of dress, manner of eating, and linguistic performances, for example, are all governed by conventions. 
In Parts A and B of the thesis, a theory of convention is provided. In Part C the primary concern is with the question of the conventionality of speech acts. Part C includes a discussion of the convention to truth-tell, and an attempt to develop a theory of assertion taking H. P. Grice's account of speaker-meaning as a starting point. The theory of convention put forward in Parts A and B arises out of a detailed treatment of David Lewis' book entitled, Convention. Lewis' theory analyses conventions in terms of preferences and expectations. For example, I drive on the left because I prefer to do so, given others do so,and I expect others to do so. In Parts A and B it is argued that: (1) Lewis' preference structures need replacement. (2) The notion of a collective end needs to be introduced. (3) Convention followers' expectations depend on their having acquired "standing procedures" to conform. An important characteristic of such procedures is that if an agent A, has a standing procedure to X, then there is a presumption in favour of A's X-ing. This dissertation challenges the claim that reference is determined by intentions to refer by using a 'divide and conquer' strategy. The claim that reference is determined by intentions to refer is divided into two claims: one is a claim about how reference is disambiguated; the other is about how expressions in a language get their reference potential. By dividing the claims in this way, we can see in what contexts, and to what extent, reference is determined by intentions. The first claim, that reference is disambiguated by what a speaker intends to refer to, is the more plausible one. Part I of the dissertation clarifies and defends this claim. It rules out non-intentionalist accounts, • which try to explain disambiguation in terms of non-intentional contextual factors alone, because the features of the context to which these accounts appeal are themselves ambiguous. Nonetheless, it argues that contextual features are important non-linguistic determinants of reference. Part I concludes that the speaker's intentions do play a role in determining referComputational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature ence, but it also concludes that when linguistic as well as non-linguistic determinants of reference are taken into account, the role that speakers' intentions play in determining reference turns out to be quite small. Part II of the dissertation refutes the claim that the set of possible referents for an expression in a language is determined by what some group of people (either a majority of them or the 'experts') intend to refer to with that expression. Part II argues that such accounts are either circular - they explain semantic reference in terms of speaker's reference, while basing speaker's reference on semantic reference, or they presuppose an untenable view of the way minds are related to the world. Case-Based Reasoning: a Computer Model of Subjective Assessment William Michael Bain Yale University Ph.D. 1986, 324 pages Computer Science DAI V47(08), SecB, pp3427 University Microfilms International ADG86-2 725 7 Consolidation: a Method for Reasoning about the Behavior of Devices Thomas Clare Bylander The Ohio State University Ph.D. 1986, 194 pages Computer Science DAI V47(07), SecB, pp2993 University Microfilms International ADG86-25188 People tend to improve their abilities to reason about situations by amassing experiences in reasoning. 
The more situations a person knows about, the more he can account for feature differences between new data and old knowledge. Resorting to previous instances of similar situations for guidance is known as case-based reasoning. A computer program that can improve its ability to reason must also have access to situations which it has previously reasoned about. Previous experiences thus require some mechanism for orderly storage and retrieval. The inability to save and modify reasoning chains for future use represents a serious shortcoming of most, if not all, rule-based expert systems. This research has involved modelling by computer the behavior of judges who sentence criminals. We have viewed this task as one in which people learn empirically from the process of producing relative assessments of input situations with respect to several concerns. What differentiates this task from many other reasoning tasks is that it provides little external feedback. People can perform such subjective tasks by at least trying to keep their assessments consistent; as a result, they often resort to using case-based reasoning. For assessment tasks, this reasoning style involves comparing a previous similar situation with an input one, and then extracting an assessment for the new input, based on both the assessment previously assigned to the older example, and differences found between them. The JUDGE system is an implementation of a case-based reasoning model for sentencing criminal cases in this manner. The system also stores input items to reflect their relationships to situations already contained in memory.

Research on Naive Physics attempts to answer the questions: How do people reason about physical phenomena? How can computers be endowed with similar facilities? Artificial Intelligence research on Naive Physics concentrates on the second question, and by doing so, also seeks to achieve significant insight on the first. This research addresses one problem of Naive Physics, that of deriving the "potential behavior" of a device given the structure of the device and the potential behavior of its parts. The potential behavior of a physical object describes the object's behavioral characteristics without making assumptions about the behavior of other objects external to that object. The reasoning process that this research proposes is based on two strategies. The consolidation strategy is to select a "composite component" consisting of two components and infer the potential behavior of the composite from the potential behavior of its subcomponents. Successful application of consolidation on increasingly larger composite components results in inferring the potential behavior of the whole device. The other strategy is to represent potential behavior with a small number of "types of behavior" that allow behavioral interactions to be described by rules of composition. A type of behavior is an action on a substance at some location or on some path. The rules of composition, called "causal patterns," describe how one type of behavior can arise from a structural combination of other types of behavior. For example, the "pump move" causal pattern states that a "move" behavior can arise from an "allow" behavior and a "pump" behavior if both behaviors are on the same path and if the path goes from a potential source to a potential sink.
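To picture how such a causal pattern composes behaviors, here is a minimal Python sketch (an illustration only, not Bylander's formalism; the Behavior tuple, the pump_move function, and the tank/drain example are invented for this note):

    from collections import namedtuple

    # A "type of behavior": an action on a substance along some path.
    Behavior = namedtuple("Behavior", "action substance path")

    def pump_move(b1, b2):
        # Toy "pump move" pattern: a "move" behavior arises from an "allow"
        # and a "pump" behavior on the same path (the source-to-sink
        # condition is simply taken for granted here).
        if ({b1.action, b2.action} == {"allow", "pump"}
                and b1.path == b2.path and b1.substance == b2.substance):
            return Behavior("move", b1.substance, b1.path)
        return None

    # Consolidating a composite component from two subcomponents:
    valve = Behavior("allow", "water", ("tank", "drain"))
    pump  = Behavior("pump",  "water", ("tank", "drain"))
    print(pump_move(valve, pump))
    # Behavior(action='move', substance='water', path=('tank', 'drain'))

Repeating such compositions over ever larger composites is the consolidation step described above.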
These two strategies are incorporated into an overall framework for representing simple devices and reasoning about their potential behavior. In addition to introducing the consolidation framework and presenting examples of applying it, this dissertation also discusses the kinds of Artificial Intelligence theories that are appropriate for Naive Physics, carefully compares consolidation with qualitative simulation, and lists several areas for future research, including suggestions on how to overcome shortcomings of the proposed consolidation framework.

Temporal Imagery: an Approach to Reasoning about Time for Planning and Problem Solving. Thomas Linas Dean, Yale University, Ph.D. 1986, 299 pages, Computer Science. DAI V47(08), SecB, pp3428. University Microfilms International ADG86-27245

Refinement of Expert System Knowledge Bases: a Metalinguistic Framework for Heuristic Analysis. Allen Ginsberg, Rutgers University, the State U. of New Jersey (New Brunswick), Ph.D. 1986, 276 pages, Computer Science. DAI V47(06), SecB, pp2509. University Microfilms International ADG86-20034

Reasoning about time typically involves drawing conclusions on the basis of incomplete information. Uncertainty arises in the form of ignorance, indeterminacy, and indecision. Despite the lack of complete information, a problem solver is continually forced to make predictions in order to pursue hypotheses and plan for the future. Such predictions are frequently contravened by subsequent evidence. This dissertation presents a computational approach to temporal reasoning that directly confronts these issues. The approach centers around techniques for managing a data base of assertions corresponding to the occurrence of events and the persistence of their effects over time. The resulting computational framework performs the temporal analog of (static) reason maintenance (Doyle 1979) by keeping track of dependency information involving assumptions about the truth of facts spanning various intervals of time. The system developed in this dissertation extends classical predicate-calculus data bases, such as those used by Prolog (Brown 1981), to deal with time in an efficient and natural manner. The techniques presented here constitute a solution to the problem of updating a representation of the world changing over time as a consequence of various processes, otherwise known as the frame problem (McCarthy 1969). These techniques subsume the functionality of current approaches to dealing with time in planning (e.g., Sacerdoti 1977, Tate 1977, Vere 1983, Allen 1983). Applications in robot problem solving are stressed, but examples drawn from other application areas are used to demonstrate the generality of the techniques. The issues involved in processing temporal queries, propagating metric constraints, noticing the invalidation of default assumptions, and reasoning with incomplete knowledge are discussed in conjunction with the presentation of algorithms.

Knowledge base refinement involves the generation, testing, and possible incorporation of plausible refinements to the rules in a knowledge base with the intention of thereby improving the empirical adequacy of an expert or knowledge-based system, i.e., its ability to correctly diagnose or classify the cases in its domain of expertise.
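The generate-and-test character of such refinement can be pictured with a small Python sketch (an illustration only, not Ginsberg's RM system; the "knowledge base" here is reduced to a single numeric threshold, and propose, apply_refinement, and accuracy are invented placeholders for the domain-specific pieces):

    def refine(rules, cases, propose, apply_refinement, accuracy):
        # Repeatedly propose candidate refinements, score each against the
        # stored case library, and incorporate the first one that improves
        # empirical adequacy; stop when no proposal helps.
        best = accuracy(rules, cases)
        improved = True
        while improved:
            improved = False
            for r in propose(rules, cases):
                candidate = apply_refinement(rules, r)
                score = accuracy(candidate, cases)
                if score > best:
                    rules, best, improved = candidate, score, True
                    break
        return rules, best

    # Toy illustration: the rule is a threshold, cases are (value, label) pairs.
    cases = [(0.2, False), (0.4, False), (0.6, True), (0.9, True)]
    acc = lambda t, cs: sum((v > t) == y for v, y in cs) / len(cs)
    rules, score = refine(0.95, cases,
                          propose=lambda t, cs: [round(t - d, 2) for d in (0.1, 0.2, 0.3)],
                          apply_refinement=lambda t, r: r,
                          accuracy=acc)
    print(rules, score)   # 0.55 1.0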
The research presented in this thesis contributes to the development of useful knowledge base refinement systems both at the concrete level of system design, implementation, and testing, and also at the "meta-level" of development of tools and methodologies for pursuing research in this area. Relative to the former level, the following contributions have been made: (1) the empirically-grounded heuristic approach to refinement generation developed by Politakis and Weiss has been generalized and extended, i.e., the approach has been made applicable to a more powerful rule representation language, and heuristics encompassing a larger class of refinement operations have been incorporated, (2) an automatic refinement system utilizing this approach has been implemented and, based upon preliminary testing, has been shown to be capable of generating effective refinements. Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The F I N I T E S T R I N G Newsletter Abstracts of Current Literature Relative to the level of tools and methodology, a high-level Refinement Metalanguage, RM, allowing for the specification of a wide variety of alternative refinement concepts, heuristics, and strategies, has been designed and implemented. In addition to allowing for the growth of refinement systems by facilitating experimental research, RM also provides a means for refinement system customization and possible enhancement through the incorporation of domain-specific metaknowledge. The incorporation of a formal metalanguage for knowledge base refinement represents an extension of the traditional model of an expert system framework, and is a step in the direction of more powerful, robust, and self-improving expert system technology. Syntactic Extensions in the Programming Language Lisp Eugene Edmund Kohlbecker Jr. Indiana University Ph.D. 1986, 228 pages Computer Science DAI V47(08), SccB, pp3430 University Microfilms International ADG86-27998 A Non-cognitive Formal Approach to Knowledge Representation in Artificial Intelligence Jim A. McMannama Air Force Institute of Technology Ph.D. 1986, 309 pages Computer Science DAI V47(05), SecB, pp2060 University Microfilms International ADG86-17749 The traditional macro processing systems used in Lisp-family languages have a number of shortcomings. We identify five problems with the declaration tools customarily available to programmers. First, the declarations themselves are hard to read and write. Second, the declarations provide little explicit information about the form macro calls are to take. Third, syntactic checking of macro calls is usually ignored. Fourth, the notion of a macro binding for an identifier gives rise to a poor understanding of what macros really should be. Fifth, the unrestricted capabilities of the language used to declare macros cause some to take advantage of macros in ways inconsistent with their role as textual abstractions. Furthermore, the conventional algorithm used for the expansion of macro calls within Lisp often causes the inadvertent capture of an identifier appearing within the macro call by a macro-generated, binding instance of the same identifier. Lisp programmers have developed a few techniques for avoiding this problem, but they all have depended upon the macro writer taking some sort of special preventative action. We examine several existing macro processors, both inside and outside of the Lisp-family. We then enumerate a set of design principles for macro processing systems. 
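The capture problem described above, and the usual gensym-style remedy, can be pictured with a short Python sketch over list-structured expressions (an illustration of the general idea only, not the expansion algorithm developed in the thesis; the expand_swap template is invented):

    import itertools

    _counter = itertools.count()

    def gensym(base):
        # Fresh identifier that cannot collide with any user-written name.
        return f"{base}#{next(_counter)}"

    def expand_swap(a, b):
        # Swap-macro template: (let ((tmp a)) (setq a b) (setq b tmp)).
        # A naive template that literally used the name "tmp" would capture a
        # caller's own variable called tmp; generating a fresh name avoids that.
        tmp = gensym("tmp")
        return ["let", [[tmp, a]], ["setq", a, b], ["setq", b, tmp]]

    print(expand_swap("x", "tmp"))
    # ['let', [['tmp#0', 'x']], ['setq', 'x', 'tmp'], ['setq', 'tmp', 'tmp#0']]

Here the caller's variable happens to be named tmp, yet the expansion still refers to it correctly because the macro's own binding has been renamed.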
These principles are general enough that they apply to the organization of macro processing systems for a large number of high-level languages. Taking our principles as guidelines, we design a new macro processing system for Lisp. The new macro declaration tool addresses each of the five problems from which the traditional tools suffer. A description of the use of our tool and an annotated presentation of its implementation are provided. We also develop a new macro expansion algorithm that eliminates the capturing problem. The macro expander has the responsibility for avoiding the unwanted capture of identifiers appearing within macro calls.

With the entry of Artificial Intelligence (AI) into real-time applications, a rigorous analysis of AI expert systems is required in order to validate them for operational use. To satisfy this requirement for analysis of the associated knowledge representations, the techniques of formal language theory are used. A combination of theorems, proofs and problem-solving techniques from formal language theory is employed to analyze language equivalents of the more commonly used AI knowledge representations of production rules (excluding working memory or situation data) and semantic networks. Using formal language characteristics, it is shown that no single support-tool or automatic-programming tool can ever be constructed that can handle all possible production-rule or semantic-network variations. Additionally, it is shown that the entire set of finite production-rule languages is able to be stored in and retrieved from finite semantic-network languages. In effect, the semantic-network structure is shown to be a viable candidate for a centralized database of knowledge.

Compiling Queries in Indefinite Deductive Databases under the Generalized Closed World Assumption. Hyung-Sik Park, Northwestern University, Ph.D. 1986, 118 pages, Computer Science. DAI V47(08), SecB, pp3433. University Microfilms International ADG86-27386

This research report presents several fundamental results on compiling queries that will correctly answer "true", "indefinite", and "false" in IDDB (Indefinite Deductive Databases) under the GCWA (Generalized Closed World Assumption). IDDB does not allow function symbols, but does allow non-Horn clauses. Further, although the GCWA is used to derive negative assumptions, we do also allow negative clauses to occur explicitly. Our goal is to develop effective techniques for compiling queries in such an IDDB. We show a fundamental relationship between indefiniteness and inference engines in IDDB under the GCWA. We introduce two basic notions of NH (Non-Horn) and PSUB (Potential Subsumption) sets, providing a basis for compilation, and consider three representation alternatives to separate the CDB (Clausal DB) from the RDB (Relational DB). We introduce a saturated resolution method to compile unit queries on the CDB and evaluate them through the RDB in non-recursive IDDB, and develop five primitive NH-reduction rules and two NH-inheritance rules. We also present a basic idea on compiling unit queries in recursive IDDB by the pattern generation method. Finally, we introduce the decomposition and evaluation theorems to evaluate disjunctive and conjunctive queries by decomposing them into their unit subqueries and utilizing the compiled information for such subqueries.
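One very rough way to picture the decomposition of a conjunctive query into unit subqueries, and the three answer values involved, is the following Python sketch (an illustration only, not Park's compilation method; it deliberately ignores interactions between subqueries, which is precisely what the decomposition and evaluation theorems must handle, and the facts and disjunctions shown are invented):

    # Ranking of the three answers; a conjunction is only as strong as its
    # weakest unit subquery under this naive combination rule.
    RANK = {"false": 0, "indefinite": 1, "true": 2}

    def unit_answer(atom, facts, disjunctions):
        if atom in facts:
            return "true"
        if any(atom in d for d in disjunctions):   # appears only in a non-Horn clause
            return "indefinite"
        return "false"                             # assumed false, GCWA-style

    def conjunctive_answer(atoms, facts, disjunctions):
        answers = [unit_answer(a, facts, disjunctions) for a in atoms]
        return min(answers, key=RANK.get)

    facts = {"p(a)"}
    disjunctions = [{"q(a)", "q(b)"}]              # q(a) v q(b) is known, neither alone
    print(conjunctive_answer(["p(a)", "q(a)"], facts, disjunctions))   # indefinite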
Towards a Natural Language Interface for Computer Aided Design Tariq Samad Carnegie-Mellon University Ph.D. 1986, 137 pages Computer Science DAI V47(05), SecB, pp2062 University Microfilms International ADG86-16520 We propose a natural language interface as part of the solution to the problems posed by the continuing increase in the number and sophistication of CAD tools. The advantages of a natural language interface for CAD are numerous, but the complexity and the scope of the CAD domain renders most previous work in natural language interfaces of limited utility; an approach of much greater generality and power is required. We describe a natural language interface (named Cleopatra) that we have developed for the sub-domain of circuit-simulation post-processing. Cleopatra is the first step in a research program the ultimate goal of which is the development of a natural language interface for an integrated design environment. Cleopatra significantly extends what is in essence a lexically-driven case-frame parser by incorporating a couple of novel features: high degrees of flexibility and parallelism. The flexibility of our approach enables the representation of constraints that, for instance, cannot be represented by semantic-grammar-based systems, and it also enables the specification of arbitrary and idiosyncratic actions to guide the parsing process. The parallelism, which is supplemented with a notion of "confidence-levels", enables straightforward treatment of most kinds of ambiguity. Cleopatra can handle simple nominal coordination, substitutional ellipsis, some kinds of subordinate clauses, there-insertion sentences and wh-frontings, and its abilities make it a useful CAD tool in its own right, as well as demonstrating the feasibility of our ultimate goal. Extending Cleopatra's linguistic coverage, as well as extending Cleopatra to other sub-domains of CAD, should be greatly facilitated by the generality and power of our approach. Formalization and Representation of Expert Systems Miriam R. Tausner Stevens Institute of Technology Ph.D. 1986, 182 pages Computer Science DAI V47(07), SecB, pp2825 University Microfilms International ADG86-241 76 Based on a critical analysis of the canonical forms of expert systems, definitions of classical forward-chaining production rule based expert systems and classical backward-chaining production rule based expert systems are isolated as the fundamental basic definitions of an expert system. Two representation theorems are presented for the two fundamental types of expert systems defined. Both the classical forward-chaining production rule based expert systems and classical backward-chaining production rule based expert systems are shown to be representable as type 3 languages (finite state automata). The pragmatic usefulness of the finite state repre- 150 Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of CurrentLiterature sentation of an expert system is established in the design of a multilevel expert system for general systems problems solving. A type 3 language is used to encapsulate the knowledge base and reasoning strategies of the front-end of the expert system, thus representing the front-end as a deterministic finite automaton. This provides a new approach to the problem of interfacing multilevel expert systems. A Structured Memory Access Architecture for Lisp Matthew Jacob Thazhuthaveetil The University of Wisconsin - Madison Ph.D. 1986, 186 pages Computer Science DA! 
V47(07), SecB, pp3004 University Microfilms International ADG86-18296 A Speech Error Correction Algorithm for Natural Language Input Processing Peter James Wetterlind Texas A&M University Ph.D. 1986, 119 pages Computer Science DAI V47(07), SecB, pp3004 University Microfilms International ADG86-25455 Lisp has been a popular programming language for well over 20 years. The power and popularity of Lisp are derived from its extensibility and flexibility. These two features also contribute to the large semantic gap that separates Lisp from the conventional von Neumann machine, typically leading to the inefficient execution of Lisp programs. This dissertation investigates how the semantic gap can be bridged. We identify function calling, environment maintenance, list access, and heap maintenance as the four key run-time demands of Lisp programs, and survey the techniques that have been developed to meet them in current Lisp machines. Previous studies have revealed that Lisp list access streams show spatial locality as well as temporal locality of access. While the presence of temporal locality suggests the use of fast buffer memories, the spatial locality displayed by a Lisp program is implementation dependent and hence difficult for a computer architect to exploit. We introduce the concept of structural locality as a generalization of spatial locality, and describe techniques that were used to analyse the structural locality shown by the list access streams generated from a suite of benchmark Lisp programs. This analysis suggests architectural features for improved Lisp execution. The SMALL Lisp machine architecture incorporates these features. It partitions functionality across two specialised processing elements whose overlapped execution leads to efficient Lisp program evaluation. Tracedriven simulations of the SMALL architecture reveal the advantages of this partition. In addition, SMALL appears to be a suitable basis for the development of a multi-processing Lisp system. Computerized processing of human speech input may be accomplished by (1) recognizing the phoneme sounds in the speech signals, (2) correctly identifying the words in each spoken sentence, (3) interpreting the meaning of the sentence, and (4) generating proper responses for each utterance. Individual speakers talk differently, and even an individual's enunciation patterns change with differing environments and discourse domains. These differences are called speaker idiosyncracies. Regional speech dialects are included as a speaker idiosyncracy. The source of such speaker differences has been identified as an individual's pronunciations of the vowel phonemes. Computerized speech processors treat these speaker idiosyncracies as errors when the input phoneme sounds are unrecognizable. A generalized speech recognition system must accommodate such speech errors and, more particularly, ignore the speaker dependent pronunciations of phonemes. This implies that vowel phoneme pronunciations, the source of speaker idiosyncraeies and speech processing errors, should be overlooked during recognition of vocalized sentences. The research experiment consisted of construction of a system for identifying a natural language sentence using only speaker independent phonemes as the input. The motivating hypothesis for the experiment is that spoken sentences can be recognized from limited phoneme input. The research system accepts only strings of consonant phonemes, which are recognizable in a speaker independent environment. 
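The reconstruction step can be pictured with a small Python sketch that indexes a vocabulary by consonant skeletons, with ordinary letters standing in for phonemes (an illustration only, not Wetterlind's system; the vocabulary is invented):

    VOWELS = set("aeiou")

    def skeleton(word):
        # Consonant skeleton of a word (letters stand in for phonemes here).
        return "".join(c for c in word.lower() if c not in VOWELS)

    def build_index(vocabulary):
        index = {}
        for word in vocabulary:
            index.setdefault(skeleton(word), []).append(word)
        return index

    # Hypothetical general + domain-specific vocabulary.
    index = build_index(["ship", "shop", "sheep", "arrives", "monday", "the"])

    def candidates(consonants):
        # All vocabulary words whose consonant skeleton matches the input.
        return index.get(consonants, [])

    print(candidates("shp"))   # ['ship', 'shop', 'sheep']

The remaining ambiguity among the candidates is exactly what a later, sentence-level stage must resolve.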
The original "spoken" sentence is reproduced from the consonant phonemes and formatted as a word sequence for subsequent transmission to a natural language processing system. The system uses a vocabulary of general words and an expandable dictionary of domain specific words during the sentence reconstruction process. The research conclusions are that such a system can be built, and that the useful vocabulary must be expandable as the recognition system becomes more frequently used. The research system is intended as an interface between existing acoustic phoneme recognizers and existing natural language processors. The system accomplishes word recognition using only the consonant phonemes from continuous speech sentences, and generates word sequences in sentence form for output to an existing natural language processor. The domain specific vocabulary subsets used by the system facilitate its use as a sentence pre-processor especially with natural language understanding systems which rely on scripts, and the associated domain specific vocabularies, for semantic processing of topic oriented sentence groups.

EPILOG: a Parallel Interpreter for Logic Programs. Michael J. Wise, University of New South Wales (Australia), Ph.D. 1985, Computer Science. DAI V47(05), SecB, pp2063. This item is not available from University Microfilms International. ADG05-58882

Knowledge Representation using Linguistic Fuzzy Relations. Wen-Ran Zhang, University of South Carolina, Ph.D. 1986, 118 pages, Computer Science. DAI V47(08), SecB, pp3437. University Microfilms International ADG86-26303

Through combining the logic programming language Prolog with a data-driven execution mechanism, we may be closer to solving the problems encountered when designing tightly coupled multiprocessors involving more than a trivial number of processor elements. This is the central idea around which the work is constructed. The report begins with a review of current multiprocessors and a description of one of the more attractive alternative models - the data-flow model of computation. Among the early chapters there is also an informal introduction to Prolog seen from the point of view of the unification algorithm. Discussion then moves on to the substantive issues of the thesis - a critique of the problems found in the data-flow model and the properties of Prolog that make it an attractive solution. The synthesis of these two concepts is the EPILOG model which, stated simply, substitutes breadth first execution for Prolog's depth first pattern, and then provides mechanisms for controlling the abundant parallelism that would otherwise lead to combinatorial explosion. The EPILOG model is described in detail. This EPILOG model is, however, just the first, more abstract stage - the so-called "basic" model. The next stage is to fit the basic model onto specific architectures. This is done via a simulation written in Pascal, which is described together with the set of underlying assumptions about the architectures being simulated. Results are then presented for the first set of experiments and some tentative conclusions drawn. Finally, the related work of other authors is reviewed in the light of the earlier discussion of EPILOG.
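The breadth-first execution idea can be pictured with a tiny propositional Horn-clause prover in Python that expands all alternatives of a goal level by level rather than diving depth-first (an illustration of the control strategy only, not the EPILOG model itself; the ancestor/parent rules are invented):

    from collections import deque

    def bfs_prove(goal, rules, max_depth=20):
        # Breadth-first search over conjunctive goal lists for propositional
        # Horn clauses; rules maps a head to a list of alternative bodies.
        frontier = deque([(goal,)])            # each state is a tuple of open goals
        for _ in range(max_depth):
            next_frontier = deque()
            while frontier:
                goals = frontier.popleft()
                if not goals:
                    return True                # all goals discharged
                first, rest = goals[0], goals[1:]
                for body in rules.get(first, []):   # expand every alternative
                    next_frontier.append(tuple(body) + rest)
            frontier = next_frontier
        return False

    rules = {
        "ancestor": [["parent"], ["parent", "ancestor"]],
        "parent":   [[]],                      # a known fact
    }
    print(bfs_prove("ancestor", rules))        # True

The recursive second clause for "ancestor" keeps spawning work at every level; in a real system that abundance of parallelism is exactly what must be controlled.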
This dissertation presents a theoretical framework for semantic representation, linguistic computation, knowledge representation, and approximate reasoning about object relations in knowledge engineering. The notions of term sets are extended; the notions of Linguistic Fuzzy Relation (LFR), Linguistic Fuzzy Similarity Relation (LSR), and Linguistic Transitive Closure (LTC) are proposed based on the theory of numerical fuzzy relation, numerical similarity relation, the extension principle, and the extended term set definitions. Theorems are given that provide conditions for the existence and uniqueness of the LTCs of an LFR under three different operations of extended max-min, extended max-product, and extended max-A; two algorithms for obtaining the LTCs are presented, and some interesting features of different LTCs are identified and illustrated by numerical examples. POOL - a semantic model for approximate reaso.ning - is proposed based on the theory of LFRs. A prototype system has been implemented in Franz Lisp under the UNIX ~ Operating System (Berkley 4.2 bsd) on a SUN 2 / 1 2 0 workstation. Results confirm that the proposed model can Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature provide knowledge-based systems with both representational and inferential power. 1UNIX is a trademark of AT&T Bell Laboratories. Computer Generation of Meta-technicai Utterances in Tutoring Mathematics Ingrid Zukerman University of California, Los Angeles Ph.D. 1986, 162 pages Computer Science DAI V47(06), SecB, pp2516 University Microfilms International ADG86-21162 Instantiating Maps and Text R. Robert Abel Arizona State University Ph.D. 1986, ! 56 pages Education, Psychology DAI V47(05), SecA, pp1651 University Microfilms International ADG86-16447 A technical discussion often contains conversational expressions like: "however", "as I have stated before", "next", etc. These expressions, denoted Meta-Technical Utterances (MTUs), carry important information which the listener uses to speed up the comprehension process. The goal of this research is to understand the semantics of text containing MTUs, the mechanisms by which people generate them, and the processes required for generating them mechanically. To achieve this goal, we model the meaning of MTUs in terms of their anticipated effect on the listener comprehension, and use these predictions to select MTUs and embed them in a computer generated discourse. This paradigm was implemented in a computer system called FIGMENT, which generates commentaries on the solution of algebraic equations. We classify MTUs according to their function, as seen by the speaker, in transmitting the subject matter to the listener, and distinguish among three main types of MTUs: (1) Knowledge Organization, (2) Knowledge Acquisition, and (3) Affect Maintenance. Knowledge-Organization MTUs reflect the organization of the material in the speaker's mind (e.g., "however," "in order to"), Knowledge-Acquisition MTUs provide information that enables the listener to prepare adequate knowledge-assimilating facilities (e.g., "we shall now introduce," "as I have stated before"), and (3) Affect-Maintenance MTUs convey the affective impact of an event (e.g., "fortunately"), and foster a positive attitude in the listener (e.g., "I shall go over this explanation again"). 
This classification governs the generation of MTUs in the following manner: Knowledge-Organization and some Affect-Maintenance MTUs are generated directly from the organization of the system's knowledge of the subject matter; Knowledge-Acquisition MTUs and the majority of Affect-Maintenance MTUs are generated by consulting simplified models of some mental processes which the user presumably activates upon encountering a technical message. For example, determining the context in which a technical message should be processed, building up motivation to attend to the next item of discourse, and so on. The main contribution of this dissertation is the presentation of an explicit model for the generation of MTUs. This model can be incorporated into a text-generation facility to enable the generation of fluent and coherent discourse. The purpose of this research was to investigate the conjoint retention of spatial and linguistic information in an instructional context. In Experiment 1, subjects wrote either physical descriptions, fictional narratives, or personalized narratives while processing a reference map. Results indicated that both types of narrative processing significantly increased overall map recall, supporting the prediction that richer semantic processing leads to more effective spatial learning. HowevEr, the act of personalizing the map space and concomitant semantic processing was, in fact, detrimental to locational memory. Experiment 2 showed maps capable of improving learning of general or abstract text, and lent further substance to the notion that, under correct learning conditions, map features and related text are stored as conjoint units in memory. Subjects producing idiosyncratic map features and placing them on the map as exemplars of text concepts were much more likely to retain both perceptual map information and conceptual text information. Data on order of recall indicated that subjects who viewed intact maps as they processed text were able to rely on both the map structure and the serial order of the original passage as a Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 153 The FINITESTRING Newsletter Abstracts of CurrentLiterature guide for free recall. However, subjects viewing lists of identical map features displayed only the use of passage structure for text recall. This difference appears attributable to the availability of map images for subjects viewing maps at encoding, and the lack of imaginal support for the list group. Finally, generating features and their spatial locations led to superior locational accuracy at recall. Both studies support the conjoint retention hypothesis which states that probability of recalling information from maps or related text is predictable directly from the trace strength of jointly encoded imaginal and verbal memory representations. Assessing Lexical Knowledge Merlynn Rosell Bergen Stanford University Ph.D. 1986, 294 pages Education, Psychology DAI V47(06), SecA, pp2082 University Microfilms International ADG86-19716 A Conceptual Database Design and Analysis Methodology (Volumes I and II). Rob H. Rucker Arizona State University Ph.D. 1986, 583 pages Engineering, Industrial DAI V47(06), SecB, pp2574 University Microfilms International ADG86-2203 7 154 The words we use are hypothesized to lie within a richly interlinked semantic network. To answer the question "What does a student know when he or she knows the meaning of a word?" 
the present study used a structured interview to provide a frame within which 48 4th- and 6th-grade students gave detailed descriptions of their semantic networks for 32 words from their basal readers. A child learns the meanings of most words from daily experiences. Only later, as a result of schooling, does he or she learn the more formal uses of language. A lexical model that predicts separability of natural and formal lexical knowledge was introduced and the two aspects differentially measured. The purposes were: (a) to assess the lexical knowledge of students who varied in vocabulary ability, grade, and gender; (b) to determine performance differences when difficulty of the task was manipulated through the use of word (form class, difficulty, word origin), presentation (modality, context), and task factors; and (c) to test the Lexical Model. The results were: (a) students' formal lexical knowledge was below mastery for these known words, but student protocols indicated a rich web of natural lexical knowledge (high ability students reliably outperformed average ability students, grade differences were significant only for the formal word knowledge measures, and gender was never a significant source of variance); (b) factors designed to vary the difficulty of the task did not yield consistently significant performance differences; and (c) factor analyses of the dependent measures indicated separability of natural and formal word knowledge as had been predicted by the Lexical Model. Measuring lexical knowledge in the detailed fashion of the present study has been neglected in earlier research. The hypothesis that natural word knowledge skills are separable from formal has not previously been tested in this way. As all of the students provided evidence of a rich web of natural word knowledge, deliberately building formal lexical skills from within this semantically-interlinked base, rather than separately from it, would appear to be a useful pedagogical strategy. This dissertation presents a methodology, and describes an environment, useful for conceptual data base design at the requirements analysis level. The methodology is called DREAMERS. This is an acronym for Domain, Relation, Entity, Attribute, Mathematical Modeling with Expert System Support. The methodology allows for the rapid prototyping of proposed skeleton conceptual design via an implementation within a selected relational data base management system. The methodology includes design, implementation and analysis components. The sequence of steps in the methodology proceeds from system specifications, to high level abstract data/transactions models, to an entity-relation type digraph model, to a relational conceptual design, to a relational implementation and thence to analysis of the operational skeleton prototype data base. Underlying the analysis portion of the methodology is the use of the mathematics of categories, digraphs, lattices, simplicial complexes and predicate logic as well as a large scale relational database to aid in the Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 The FINITE STRING Newsletter Abstracts of Current Literature development process itself. The categorial portion of the mathematics is used to first classify the models used in the methodology, and then, the graphical theories are used to derive canonical structures for analysis and implementation. 
Considering the transactions and their associated data structures, doubly ordered lattices have been constructed (Transaction/Data Maps) that provide additional graphical insight to the analyst. Other graphical techniques have been employed that create binary matrices from relational tables or skeleton digraphs via an operation called "complexing". From a binary matrix, at least three graphical representations - digraphs, simplexes and lattices - may be derived. The environment for the database development discussed here has been constructed by the author by using the SQL/DS database management system product together with an interactive interface based on the language APL2. These two major products are supplemented with the expert system shell ESE/VM together with additional support from the packages GRAPHPAK and REXX. This developmental environment aids the designer by providing dialog management, graphics, analysis, expert system, communications, and database management system services.

Parallel Processing of Natural Language. Hui Olivia Chang, Northwestern University, Ph.D. 1986, 208 pages, Language, Linguistics. DAI V47(08), SecA, pp3020. University Microfilms International ADG86-27337

Linguistics and Translation: Some Semantic Problems in Arabic-English Translation. Ahmed Mouakket, Georgetown University, Ph.D. 1986, 250 pages, Language, Linguistics. DAI V47(07), SecA, pp2565. University Microfilms International ADG86-22332

Two types of parallel natural language processing are studied in this work: (1) the parallelism between syntactic and non-syntactic processing, and (2) the parallelism within syntactic processing. It is recognized that a syntactic category can potentially be attached to more than one node in the syntactic tree of a sentence. Even if all the attachments are syntactically well-formed, non-syntactic factors such as semantic and pragmatic considerations may require one particular attachment. Syntactic processing must synchronize and communicate with non-syntactic processing. Two syntactic processing algorithms are proposed for use in a parallel environment: Earley's algorithm and the LR(k) algorithm. Conditions are identified to detect the syntactic ambiguity and the algorithms are augmented accordingly. It is shown that by using non-syntactic information during syntactic processing, backtracking can be reduced, and the performance of the syntactic processor is improved. For the second type of parallelism, it is recognized that one portion of a grammar can be isolated from the rest of the grammar and be processed by a separate processor. A partial grammar of a larger grammar is defined. Parallel syntactic processing is achieved by using two processors concurrently: the main processor (mp) and the auxiliary processor (ap). The auxiliary processor processes/accepts a substring in the input that is generated by the partial grammar. The main processor is responsible for processing the rest of the input and for interprocessor communication. The LR(k) algorithm is augmented to the effect that the main processor can take advantage of the processing result of the auxiliary processor. It is shown that the performance of the proposed parallel processing is supported by many of the syntactic constraints in natural languages. In addition, by recognizing the divisibility of the grammar, parallel parsing supports partial semantic interpretation during the course of the processing and is useful for constructing fault-tolerant NLP.
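The division of work between a main and an auxiliary processor can be pictured with a small Python sketch in which the auxiliary recognizer accepts substrings of a partial grammar (here, runs of numerals) and hands the main processor a reduced token stream (an illustration only, not Chang's augmented LR(k) scheme; in the dissertation the two processors run concurrently and communicate, whereas this sketch simply chains them):

    def aux_processor(tokens):
        # Auxiliary processor: accepts substrings generated by a partial
        # grammar (maximal runs of digit tokens) and returns composite tokens.
        out, i = [], 0
        while i < len(tokens):
            if tokens[i].isdigit():
                j = i
                while j < len(tokens) and tokens[j].isdigit():
                    j += 1
                out.append(("NUM", " ".join(tokens[i:j])))
                i = j
            else:
                out.append(("WORD", tokens[i]))
                i += 1
        return out

    def main_processor(reduced):
        # Main processor works on the reduced stream; here it only reports shape.
        return [tag for tag, _ in reduced]

    tokens = "the order costs 3 5 0 dollars".split()
    print(main_processor(aux_processor(tokens)))
    # ['WORD', 'WORD', 'WORD', 'NUM', 'WORD']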
Translating from Arabic into English involves certain morphological, syntactic and semantic problems. To understand these problems, one has to return to the cultural and social backgrounds of the Arabic language and try to discover how these may affect the process of translating into English. It is also essential to note that Arabic is a VSO, non-Indo-European language whose speakers differ in cultural and social behavior from those of the Western languages. The problem, therefore, will be threefold: (a) to look into the cultural and social backgrounds of Arabic and discover the basic elements which affect the process of translation, (b) to account for the peculiarities of Arabic lexicon and structure by examining some Arabic texts that have been translated into the English language by native speakers of English, and (c) to relate the above descriptions and findings to a theory of translation in the light of what Nida termed "dynamic equivalence." It is on the areas of cross-cultural communication, connotative meanings, intersentential levels and textual levels that this study will focus most. Furthermore, recent developments in the fields of theoretical linguistics and the research done in the fields of Case Grammar and semantics have paved the way toward a deeper understanding of underlying structures. In this study, the works of Fillmore (1968, 1971), Chafe (1970), and Cook's (1979) Matrix Model will furnish some basic concepts of semantic representations in order to account for the analysis of certain data. The study shows the pragmatic aspects of translating certain Arabic texts into English. It also gives a short account of the use and application of translation in some educational areas. Finally, it provides implications for an interpretable theory which considers translation both an art and a science.

Computer Assisted Dialect Adaptation: the Tucanoan Experiment. Robert Bruce Reed, The University of Texas at Arlington, Ph.D. 1986, 272 pages, Language, Linguistics. DAI V47(06), SecA, pp2146. University Microfilms International ADG86-21742

This dissertation provides the theoretical basis for a computer program that adapts textual material from one language of the Tucanoan family to another. Tucanoan languages are spoken by small groups living in southeastern Colombia, northwestern Brazil, northern Peru, and northern Ecuador. This work represents the first attempt to apply principles of machine translation and computational linguistics to indigenous languages of Colombia. It discusses aspects of translation theory relevant to machine translation. Some features of the Tucanoan languages relevant to the adaptation process are discussed in depth, including differences in suffix systems marking case, noun classifiers, and the evidential systems of the various languages. Of particular interest for automated parsing is the problem of null allomorphs of certain morphemes.

The Semantics of Anaphora in Discourse. Rebecca Louis Root, The University of Texas at Austin, Ph.D. 1986, 175 pages, Language, Linguistics. DAI V47(05), SecA, pp1716. University Microfilms International ADG86-18574

The syntactic variety found in anaphoric relationships in discourses of more than one sentence is quite great. This is particularly true when plural anaphors are involved.
In addition to the expected forms, there are links from anaphors to discontinuous antecedents, as in "John found a piano teacher. They are both fond of Mozart", and links from plural anaphors to singular antecedents in distributive contexts, as in "Every girl brought a cake. Coincidentally, they were all chocolate." Despite the surface variation, it is argued here that a uniform account of these constructions is possible. An analysis of the semantic properties of discourse anaphora is presented here which offers a unified explanation of both the truth conditions and the acceptability conditions of this phenomenon. The analysis is framed within the context of the Discourse Representation Theory of Hans Kamp. In this theory, each sentence contributes to the construction of a representation of the meaning of the discourse. This is done in part through the introduction of "reference markers" for each noun phrase. Truth is defined for such a representation in terms of an embedding of the representation in a model. In this account, an anaphoric link is truthful if the set of individuals to which the anaphor's reference marker is mapped is the same as the set of individuals to which the reference markers of the antecedent(s) are mapped in the course of finding an embedding for the representation. This definition, once formalized, together with the assumptions that the discourse is true and that there is a principle of semantic number agreement, constitutes the analysis presented here. In addition to the formal explanations, it is argued that this approach is attractive from a computational point of view, and an implementation of the leading ideas is presented.

A Prolegomenon to Theory of Translation. Robert Darrell Firmage, The University of Utah, Ph.D. 1986, 361 pages, Philosophy. DAI V47(07), SecA, pp2612. University Microfilms International ADG86-24441

As its title indicates, this dissertation seeks both to determine the scope and nature of theory of translation, and to provide a basis for future attempts to produce such a theory. Its general strategy is to clear the ground for an analysis of the nature of the equivalence obtaining between a translation and its original, by grounding it in an adequate theory of language. Thus, its primary focus is on existing theories of meaning and of truth, particularly as enunciated by contemporary representatives of the analytic tradition of philosophy. It is divided into three chapters. Chapter I centers about the problem of indeterminacy of translation, as introduced by Quine, and serves mainly as a prospectus of current philosophic discussion involving the notion of translation. It attempts both to enunciate the major problems and to review and criticize various significant viewpoints concerning them. Since theory of translation is shown to involve theory of meaning, Chapter II attempts to adumbrate a theory of meaning adequate to the needs and practices of translation. Such a theory, in turn, is shown to involve the notion of truth in relation to human practice, and hence Chapter III is devoted to theory of truth. In a short Postscript, the results of these discussions are refocused on the problem of translational equivalence, in the endeavor to provide an heuristic for subsequent analysis.
Although it cannot presume to have provided an adequate theory of translation, this dissertation claims to have sketched the basis for such a theory, by virtue of having provided an account of the workings of interlinguistic exchanges, and language in general, from the perspective of the actual practice of translation, rather than from the typical "armchair linguistics" of most philosophic theory. In the process, it has provided a perspective on the ancient problems of meaning and truth, which, if correct, would necessitate a thorough revamping of most of the traditional approaches to those problems. Although centered about the notion of translation, owing to the ramifications of this notion, it could perhaps as easily be seen as a prolegomenon to theory of knowledge. A Theory of Events Kathleen Gill Indiana University Ph.D, 270 pages Philosophy DAI V47(05). SecA, pp1750 An account of events is developed in which events are characterized as a series of momentary states of affairs. This characterization is motivated by a study of the structural features required to capture our notion of an event. Events have structure in the sense that they involve objects and properties, and, since they necessarily occur over an interval of time, events have a transtemporal structure. This latter feature is used to account for a variety of relationships between events, as well as distinctions between states, processes, and completal events. Special attention is given to the problem of event identity. Some progress is made on this issue by laying the groundwork which is necessary for its resolution. This consists, first of all, in sorting out various cases of identity, e.g. distinguishing between the problems of adverbial modification and property identity, and, secondly, in providing a metaphysical framework within which to interpret the problem of event identity. Natural Language Semantics and Guise Theory Frant~sco Orilia Indiana University Ph.D. 1986, 263 pages Philosophy DAI V47(08), SecA, pp3069 University Microfilms International ADG86-28010 I assume that the task of natural language semantics is to provide an unambiguous logical language into which natural language can be translated in such a way that the translating expressions display a structure which is isomorphic to the meaning of the translated expressions. Since language is a means of thinking and communicating mental contents, the meanings o f singular terms cannot be the individuals of the substratist tradition, because such individuals are not cognizable entities. Thus I propose that the logical language be based on Castaneda's guise theory, according to Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987 157 The FINITE STRING Newsletter Abstracts of Current Literature which singular terms always denote guises, i.e., roughly, (finite) bundles of properties. This, I argue, would result in a semantics which is in accordance with projects such as Lakoff's natural logic or Fodor's methodological solipsism. I first propose a formal system, GCC, which tries to be as faithful as possible to Castaneda's informal presentation of guise theory. It is therefore characterized by different forms of predication and a distinction between a level of property composition and a level of proposition composition. Such a distinction is dropped in a second system, GF, which presents a more traditional Fregean representation of predication. 
Yet, GF endorses essential assumptions of guise theory such as the existence of different sameness relations that can provide various interpretations for the English "is". I claim that GF provides more theoretical simplicity than GCC. Finally, I show the fruitfulness of the present approach by applying GF to a vast collection of linguistico-philosophical puzzles which includes but is not restricted to those that guise theory was originally designed to address: various versions of Frege's paradox, the paradox of analysis, Quine's puzzle on the number of planets, issues of reidentification and intentional identity, the anaphoric "it" of sentences such as "the lizard's tail fell off but then it grew back," problems connected with the use of "knowing-who (-which)," proper names, indexicals and demonstratives.

The Influence of Domain-Specific Knowledge on Processing Resources During the Comprehension of Domain-Related Information. Timothy Andrew Post, University of Pittsburgh, Ph.D. 1986, 86 pages, Psychology, Developmental. DAI V47(06), SecB, pp2646. University Microfilms International ADG86-20226

Three experiments were designed to test assumptions regarding domain-specific knowledge and the allocation of processing load. A general model of comprehension was used to frame the current effort. This model specifies that comprehenders are able to instantiate stored memories of events, referred to as substructures, that are related to contextual situations. Once instantiated, a substructure does not require processing load and facilitates the integration of domain-specific materials. Four general findings are described: Sequences of statements about an event in a domain are processed with a large initial reading time, followed by a decline; unusual events can cause changes in processing load allocation, but only to an initial degree and not as a function of the degree of unusualness; reading goals, i.e., the intent that one has in comprehending a portion of text, may be changed as a function of importance and unusualness at the sentence level; and domain-specific knowledge may be viewed as a cognitive adaptation that primarily reduces the processing demands of text comprehension. A model is sketched that describes how these four phenomena may occur during reading.