Unsupervised learning for syntactic disambiguation

Research output: Contribution to journal › Article › peer-review


Abstract

We present a methodological framework for syntactic disambiguation of natural language texts. The method takes an existing manually compiled, non-probabilistic, non-lexicalized grammar and turns it into a probabilistic lexicalized grammar by automatically learning a kind of subcategorization frame, or selectional preference, for every word observed in the training corpus. The resulting dictionary of subcategorization frames or selectional preferences can then be used for syntactic disambiguation of new, unseen texts. The learning process is unsupervised and requires no manual markup. The proposed learning algorithm can take advantage of any existing disambiguation method, including linguistically motivated methods of filtering or weighting competing parse trees or syntactic relations, and thus allows linguistic knowledge to be integrated with unsupervised machine learning.
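
The abstract does not spell out the learning algorithm, so the following is only a minimal sketch of one way such unsupervised learning could look, not the procedure from the paper. It assumes that a grammar has already produced every candidate parse for each sentence, and that each parse is reduced to a list of (word, frame) pairs. The names learn_frames and disambiguate, the smoothing constant, and the iterative re-weighting scheme are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import prod


def learn_frames(corpus, iterations=10, smoothing=0.1):
    """Illustrative re-weighting loop (an assumption, not the paper's algorithm).

    corpus: list of sentences; each sentence is a list of candidate parses;
    each parse is a list of (word, frame) pairs.
    Returns fractional counts of (word, frame) pairs and per-parse weights.
    """
    # Start by trusting every candidate parse of a sentence equally.
    weights = [[1.0 / len(parses)] * len(parses) for parses in corpus]
    counts = defaultdict(float)
    for _ in range(iterations):
        # Accumulate fractional counts, weighted by current parse weights.
        counts = defaultdict(float)
        for parses, parse_weights in zip(corpus, weights):
            for parse, weight in zip(parses, parse_weights):
                for pair in parse:
                    counts[pair] += weight
        # Re-score each parse by the smoothed counts of its pairs and
        # normalize the scores within each sentence.
        new_weights = []
        for parses in corpus:
            scores = [prod(counts[pair] + smoothing for pair in parse)
                      for parse in parses]
            total = sum(scores)
            new_weights.append([score / total for score in scores])
        weights = new_weights
    return dict(counts), weights


def disambiguate(parses, counts, smoothing=0.1):
    """Return the index of the best-scoring candidate parse of a new sentence."""
    scores = [prod(counts.get(pair, 0.0) + smoothing for pair in parse)
              for parse in parses]
    return max(range(len(parses)), key=scores.__getitem__)


# Toy data: one unambiguous sentence and one with two competing parses.
corpus = [
    [[("see", "obj")]],
    [[("see", "obj")], [("see", "with")]],
]
counts, weights = learn_frames(corpus)
best = disambiguate([[("see", "with")], [("see", "obj")]], counts)
```

In this toy run the unambiguous sentence lends evidence to the ("see", "obj") pair, so the matching parse of the ambiguous sentence gains weight at every iteration. The score is a plain product and is not normalized for parse size, which a real system would need to address. The initial uniform weights are the place where an existing disambiguation method could be plugged in as a prior.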

Original language: English
Pages (from-to): 329-344
Number of pages: 16
Journal: Computacion y Sistemas
Volume: 18
Issue number: 2
State: Published - 2014

Keywords

  • Natural language processing
  • Syntactic disambiguation
  • Syntactic parsing
  • Unsupervised machine learning
