Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?

Abstract
I examine what would be necessary to move part-of-speech tagging performance from its current level of about 97.3% token accuracy (56% sentence accuracy) to close to 100% accuracy. I suggest that it must still be possible to greatly increase tagging performance and examine some useful improvements that have recently been made to the Stanford Part-of-Speech Tagger. However, an error analysis of some of the remaining errors suggests that there is limited further mileage to be had either from better machine learning or better features in a discriminative sequence classifier. The prospects for further gains from semisupervised learning also seem quite limited. Rather, I suggest and begin to demonstrate that the largest opportunity for further progress comes from improving the taxonomic basis of the linguistic resources from which taggers are trained. That is, from improved descriptive linguistics. However, I conclude by suggesting that there are also limits to this process. The status of some words may not be able to be adequately captured by assigning them to one of a small number of categories. While conventions can be used in such cases to improve tagging consistency, they lack a strong linguistic basis.
Keywords No keywords specified (fix it)
Categories (categorize this paper)
Options
 Save to my reading list
Follow the author(s)
My bibliography
Export citation
Find it on Scholar
Edit this record
Mark as duplicate
Revision history Request removal from index Translate to english
 
Download options
PhilPapers Archive


Upload a copy of this paper     Check publisher's policy on self-archival     Papers currently archived: 10,350
External links
Setup an account with your affiliations in order to access resources via your University's proxy server
Configure custom proxy (use this if your affiliation does not provide a proxy)
Through your library
References found in this work BETA

No references found.

Citations of this work BETA

No citations found.

Similar books and articles
Analytics

Monthly downloads

Added to index

2011-02-10

Total downloads

19 ( #85,584 of 1,096,754 )

Recent downloads (6 months)

1 ( #273,068 of 1,096,754 )

How can I increase my downloads?

My notes
Sign in to use this feature


Discussion
Start a new thread
Order:
There  are no threads in this forum
Nothing in this forum yet.