Abstract
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
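The "linguistically motivated state splits" mentioned in the abstract can be illustrated with the simplest such split, parent annotation, in which each nonterminal label is enriched with the label of its parent so that, e.g., an NP under S is distinguished from an NP under VP. The sketch below is an assumption for illustration only, not the authors' code: it uses a hypothetical `(label, children)` tuple representation of treebank trees.

```python
# Minimal sketch of parent annotation, one simple state split of the
# kind the abstract describes. Trees are (label, children) tuples;
# this representation and the helper name are assumptions.

def annotate_parents(tree, parent_label="ROOT"):
    """Return a copy of the tree in which every nonterminal label is
    split by its parent's label, e.g. NP becomes NP^S under an S."""
    label, children = tree
    if not children:  # terminal word: leave the leaf unchanged
        return (label, children)
    new_label = f"{label}^{parent_label}"
    return (new_label, [annotate_parents(child, label) for child in children])

# (S (NP (NN dogs)) (VP (VB bark)))
tree = ("S", [("NP", [("NN", [("dogs", [])])]),
              ("VP", [("VB", [("bark", [])])])])
split = annotate_parents(tree)
# split is (S^ROOT (NP^S (NN^NP dogs)) (VP^S (VB^VP bark)))
```

Reading off rules from the annotated trees weakens the false independence assumption that an NP expands the same way regardless of where it sits in the sentence, since subject NPs (NP^S) and object NPs (NP^VP) now have separate rule distributions.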
Similar books and articles
Dan Klein & Christopher D. Manning, Fast Exact Inference with a Factored Model for Natural Language Parsing.
Dan Klein & Christopher D. Manning, Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank.
Christopher D. Manning & Kristina Toutanova, Parse Selection on the Redwoods Corpus: 3rd Growth Results.
Dan Klein & Christopher D. Manning, An O(n³) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars.
Dan Klein & Christopher D. Manning, A Generative Constituent-Context Model for Improved Grammar Induction.
Added to index: 2010-12-22