“Identifying Phrasal Connectives in Italian Using Quantitative Methods”

In Stefania Nuccorini (ed.), Phrases and Phraseology – Data and Descriptions. Peter Lang Verlag (2002)
Abstract
In recent decades, the analysis of phraseology has made use of the exploration of large corpora as a source of quantitative information about language. This paper intends to present the main lines of work in progress based on this empirical approach to linguistic analysis. In particular, we focus our attention on some problems relating to the morpho-syntactic annotation of corpora. The CORIS/CODIS corpus of contemporary written Italian, developed at CILTA – University of Bologna (Rossini Favretti 2000; Rossini Favretti, Tamburini, De Santis in press), is a synchronic 100-million-word corpus and is being lemmatised and annotated with part-of-speech (POS) tags, in order to increase the quantity of information and improve data retrieval procedures (Tamburini 2000). The aim of POS tagging is to assign each lexical unit to the appropriate word class. Usually the set of tags is pre-established by the linguist, who uses his/her competence to identify the different word classes. The very first experiments we made revealed how the traditional part-of-speech distinctions in Italian (generally based on morphological and semantic criteria) are often inadequate to represent the syntactic features of words in context. It is worth noting that the uncertainties in categorisation contained in Italian grammars and dictionaries reflect a growing difficulty as they move from fundamental linguistic classes, such as nouns and verbs, to more complex classes, such as adverbs, pronouns, prepositions and conjunctions. This latter class, that groups together elements traditionally used to express connections between sentences, appears inadequate when describing cohesive relations in Italian. This phenomenon actually seems to involve other elements traditionally assigned to different classes, such as adverbs, pronouns and interjections. Recent studies proposed the class of ‘connectives’, grouping all words that, apart from their traditional word class, have the function of connecting phrases and contributing to textual cohesion. From this point of view, conjunctions can be considered as part of phrasal connectives, that can in turn be included in the wider category of textual connectives. The aim of this study is to identify elements that can be included in the class of phrasal connectives, using quantitative methods. According to Shannon and Weaver’s (1949) observation that words are linked by dependent probabilities, corroborated by Halliday’s (1991) argument that the grammatical “system” (in Firth’s sense of the term) is essentially probabilistic, quantitative data are introduced in order to provide evidence of relative frequencies. Section 2 presents a description of word-class categorisation from the point of view of grammars and dictionaries arguing that the traditional category of conjunctions is inadequate for capturing the notion of phrasal connective. Section 3 examines the notion of ‘connective’ and suggests a truth-function interpretation of connective behaviour. Section 4 describes the quantitative methods proposed for analysing the distributional properties of lexical units, and section 5 comments on the results obtained by applying such methods drawing some provisional conclusions.
Keywords No keywords specified (fix it)
Categories (categorize this paper)
Options
 Save to my reading list
Follow the author(s)
My bibliography
Export citation
Find it on Scholar
Edit this record
Mark as duplicate
Revision history Request removal from index Translate to english
 
Download options
PhilPapers Archive
External links This entry has no external links. Add one.
Through your library Configure
References found in this work BETA

No references found.

Citations of this work BETA

No citations found.

Similar books and articles
Rui P. Chaves (2008). Linearization-Based Word-Part Ellipsis. Linguistics and Philosophy 31 (3):261-307.
Analytics

Monthly downloads

Added to index

2010-12-18

Total downloads

52 ( #25,829 of 1,088,905 )

Recent downloads (6 months)

9 ( #12,169 of 1,088,905 )

How can I increase my downloads?

My notes
Sign in to use this feature


Discussion
Start a new thread
Order:
There  are no threads in this forum
Nothing in this forum yet.