Accurate Unlexicalized Parsing

Abstract We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
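To make the idea of a state split concrete, here is a minimal sketch of one well-known split, parent annotation, in which each phrasal category is suffixed with its parent's category. The abstract does not enumerate the splits, so this is only an illustrative instance; the tuple tree encoding and the names `parent_annotate` and `rules` are assumptions made for the example.

```python
def is_preterminal(node):
    """A preterminal is a (tag, word) pair, e.g. ("DT", "the")."""
    return len(node) == 2 and isinstance(node[1], str)


def parent_annotate(tree, parent=None):
    """Return a copy of `tree` with phrasal labels rewritten as LABEL^PARENT.

    Trees are nested tuples (label, child, ...); words are plain strings.
    Part-of-speech tags and the root are left unsplit in this sketch.
    """
    if isinstance(tree, str):
        return tree
    label, *children = tree
    if parent is None or is_preterminal(tree):
        new_label = label
    else:
        new_label = f"{label}^{parent}"
    return (new_label, *[parent_annotate(c, label) for c in children])


def rules(tree):
    """Collect the (lhs, rhs) rewrite rules used at phrasal nodes."""
    if isinstance(tree, str) or is_preterminal(tree):
        return []
    label, *children = tree
    found = [(label, tuple(child[0] for child in children))]
    for child in children:
        found.extend(rules(child))
    return found


# Hypothetical example tree for "She saw the dog".
tree = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBD", "saw"),
               ("NP", ("DT", "the"), ("NN", "dog"))))

annotated = parent_annotate(tree)
for lhs, rhs in rules(annotated):
    print(lhs, "->", " ".join(rhs))
```

In the unsplit grammar both noun phrases rewrite from the single state NP, so one distribution has to cover subject and object expansions alike. After the split they become NP^S and NP^VP, and a treebank grammar read off such trees estimates their expansion probabilities separately, which removes one false independence assumption.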