Abstract
Prepositional phrase (PP) attachment can be addressed by using frequency counts of dependency triples observed in an unannotated corpus. However, not all triples appear even in very large corpora. Several techniques have been used to address this problem. We evaluate two backoff methods for Spanish: one based on WordNet and the other on a distributional (automatically created) thesaurus. The thesaurus is built from the dependency triples found in the same corpus used to count the frequency of unambiguous triples. The training corpus for both methods is an encyclopaedia. The method based on the distributional thesaurus has higher coverage but lower precision than the WordNet-based method.
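As a rough illustration of the general idea (not the authors' actual implementation), the sketch below chooses between verb and noun attachment by comparing triple counts, and backs off to thesaurus neighbours when a triple is unseen. The toy counts, the `thesaurus` dictionary, the scoring rule, and the function names `backoff_count` and `attach` are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical toy counts of unambiguous (head, preposition, noun) triples.
triple_counts = Counter({
    ("comer", "con", "tenedor"): 3,
    ("pasta", "con", "salsa"): 4,
})

# Hypothetical thesaurus mapping a word to related words. In practice this
# could be derived from WordNet or built distributionally from the corpus.
thesaurus = {
    "cuchara": ["tenedor"],
    "queso": ["salsa"],
}


def backoff_count(head, prep, noun):
    """Return the observed count, or back off to the noun's thesaurus neighbours."""
    count = triple_counts[(head, prep, noun)]
    if count > 0:
        return count
    # Unseen triple: sum the counts of triples with a related noun instead.
    return sum(triple_counts[(head, prep, n)] for n in thesaurus.get(noun, []))


def attach(verb, object_noun, prep, pp_noun):
    """Return 'verb', 'noun', or None when the counts give no decision."""
    verb_score = backoff_count(verb, prep, pp_noun)
    noun_score = backoff_count(object_noun, prep, pp_noun)
    if verb_score == noun_score:
        return None  # undecided cases are what keep coverage below 100%
    return "verb" if verb_score > noun_score else "noun"


print(attach("comer", "pasta", "con", "cuchara"))
print(attach("comer", "pasta", "con", "queso"))
print(attach("comer", "pasta", "con", "mesa"))
```

In this toy setting, a richer thesaurus lets more unseen triples receive a score, which is one plausible reading of the coverage versus precision trade-off reported above.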
Original language | English |
---|---|
Pages (from-to) | 177-188 |
Number of pages | 12 |
Publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 3406 |
DOI | |
Status | Published - 2005 |
Event | 6th International Conference, CICLing 2005 - Mexico City, Mexico. Duration: 13 Feb 2005 → 19 Feb 2005 |