Case-sensitivity of classifiers for WSD: Complex systems disambiguate tough words better

Harri M.T. Saarikoski, Steve Legrand, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Scopus citations

Abstract

We present a novel method for improving disambiguation accuracy by building an optimal ensemble (OE) of systems where we predict the best available system for target word using a priori case factors (e.g. amount of training per sense). We report promising results of a series of best-system prediction tests (best prediction accuracy is 0.92) and show that complex/simple systems disambiguate tough/easy words better. The method provides the following benefits: (1) higher disambiguation accuracy for virtually any base systems (current best OE yields close to 2% accuracy gain over Senseval-3 state of the art) and (2) economical way of building more effective ensembles of all types (e.g. optimal, weighted voting and cross-validation based). The method is also highly scalable in that it utilizes readily available factors available for any ambiguous word in any language for estimating word difficulty and defines classifier complexity using known properties only.

Original languageEnglish
Title of host publicationComputational Linguistics and Intelligent Text Processing - 8th International Conference, CICLing 2007, Proceedings
PublisherSpringer Verlag
Pages253-266
Number of pages14
ISBN (Print)354070938X, 9783540709381
DOIs
StatePublished - 2007
Event8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007 - Mexico City, Mexico
Duration: 18 Feb 200724 Feb 2007

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4394 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007
Country/TerritoryMexico
CityMexico City
Period18/02/0724/02/07

Fingerprint

Dive into the research topics of 'Case-sensitivity of classifiers for WSD: Complex systems disambiguate tough words better'. Together they form a unique fingerprint.

Cite this