The unsupervised approach: Grammar induction

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

There are two main approaches to building syntactic dependency analyzers: supervised and unsupervised. The supervised approach aims at the best possible performance for a single language. To that end, it relies on a large collection of resources (manually annotated corpora with part-of-speech, syntactic, and structural tags), which takes a significant amount of work and time to gather. The state of the art in this approach achieves syntactic annotation of about 85% of all full sentences (Rooth in Proceedings of the symposium on representation and acquisition of lexical knowledge. AAAI, 1995 [172]); for English, it exceeds 90%. The unsupervised approach, by contrast, tries to discover the structure of a text from raw text alone, which makes it possible to create a dependency analyzer for virtually any language. Here, we explore this second approach and present a model of an unsupervised dependency analyzer named DILUCT-GI (GI is short for grammar inference).

Original language: English
Title of host publication: Studies in Computational Intelligence
Publisher: Springer Verlag
Pages: 111-124
Number of pages: 14
State: Published - 2018

Publication series

Name: Studies in Computational Intelligence
Volume: 765
ISSN (Print): 1860-949X
