Dependency syntax analysis using grammar induction and a lexical categories precedence system

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The unsupervised approach to syntactic analysis tries to discover the structure of a text using only raw text. In this paper we explore this approach using grammar inference algorithms. Although there is still room for improvement, our approach tries to minimize the effect of the current limitations of some grammar inductors in two ways: by adding morphological information before the grammar induction process, and by introducing a novel system for converting a shallow parse to dependencies, which uses a lexical categories precedence system to reconstruct information about the heads that the inductor fails to discover. The performance of our parser, which needs no syntactically tagged resources or rules and is trained with a small corpus, is 10% below that of commercial semi-supervised dependency analyzers for Spanish, and comparable to the state of the art for English.

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 12th International Conference, CICLing 2011, Proceedings
Pages: 109-120
Number of pages: 12
Edition: PART 1
State: Published - 2011
Event: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011 - Tokyo, Japan
Duration: 20 Feb 2011 to 26 Feb 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6608 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011
Country/Territory: Japan
City: Tokyo
Period: 20/02/11 to 26/02/11
