On Detection of Malapropisms by Multistage Collocation Testing

Igor A. Bolshakov, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

15 Scopus citations

Abstract

A malapropism is a real-word error in a text: the unintended replacement of one content word by another existing content word that is similar in sound but semantically incompatible with the context, and that thus destroys text cohesion, e.g., "they travel around the word". We present an algorithm for malapropism detection and correction based on evaluating cohesion. As a measure of the semantic compatibility of words, we consider their ability to form syntactically linked and semantically admissible word combinations (collocations), e.g., "travel (around the) world". Text cohesion at a content word is then measured as the number of collocations it forms with the words in its immediate context. We detect malapropisms as words that form no collocations in the context. To test whether two words can form a collocation, we consider two types of resources: a collocation database and an Internet search engine, e.g., Google. We illustrate the proposed method by classifying, tracing, and evaluating several English malapropisms.
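The detection rule described in the abstract (a content word that forms no collocations with its context is suspect) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy collocation set `TOY_COLLOCATIONS`, the vocabulary `VOCABULARY`, the context window size, and above all the use of one-letter edit distance as a crude stand-in for "similar in sound". The sketch also ignores syntactic links and treats collocations as unordered word pairs.

```python
# Hypothetical toy resource: unordered pairs of content words that collocate.
# A real system would query a collocation database or a search engine instead.
TOY_COLLOCATIONS = {
    frozenset(("travel", "world")),
}

# Hypothetical vocabulary from which replacement candidates are drawn.
VOCABULARY = ["world", "ward", "gravel", "travel", "word"]


def is_collocation(a: str, b: str) -> bool:
    """Look the pair up in the toy resource (order-insensitive)."""
    return frozenset((a, b)) in TOY_COLLOCATIONS


def cohesion(words: list[str], i: int, window: int = 2) -> int:
    """Count collocations that words[i] forms with neighbours in the window."""
    lo, hi = max(0, i - window), min(len(words), i + window + 1)
    return sum(is_collocation(words[i], words[j]) for j in range(lo, hi) if j != i)


def detect(words: list[str]) -> list[int]:
    """Indices of content words that form no collocation in their context."""
    return [i for i in range(len(words)) if cohesion(words, i) == 0]


def within_one_edit(a: str, b: str) -> bool:
    """True if a != b and one substitution, insertion, or deletion maps a to b.

    Assumption: spelling distance approximates similarity in sound.
    """
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution
    return a[i:] == b[i + 1:]  # one insertion into a


def suggest(words: list[str], i: int, vocabulary: list[str]) -> list[str]:
    """Similar candidates that restore cohesion when substituted at position i."""
    result = []
    for cand in vocabulary:
        if within_one_edit(words[i], cand):
            trial = words[:i] + [cand] + words[i + 1:]
            if cohesion(trial, i) > 0:
                result.append(cand)
    return result


# Content words of "they travel around the word" (function words dropped).
CONTENT_WORDS = ["travel", "word"]
```

With this toy data, `detect(CONTENT_WORDS)` flags both positions, since neither word collocates with the other, while `suggest(CONTENT_WORDS, 1, VOCABULARY)` proposes only `world` and `suggest(CONTENT_WORDS, 0, VOCABULARY)` proposes nothing. In this sketch, the existence of an admissible replacement is what singles out "word" as the likely error.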

Original language: English
Title of host publication: Natural Language Processing and Information Systems, 8th International Conference on Applications of Natural Language to Information Systems, NLDB 2003
Editors: Antje Düsterhöft, Bernhard Thalheim
Publisher: Gesellschaft für Informatik (GI)
Pages: 28-41
Number of pages: 14
ISBN (Electronic): 388579358X
State: Published - 2003
Event: 8th International Conference on Applications of Natural Language to Information Systems, NLDB 2003 - Burg, Germany
Duration: 23 Jun 2003 – 25 Jun 2003

Publication series

Name: Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)
Volume: P-29
ISSN (Print): 1617-5468

Conference

Conference: 8th International Conference on Applications of Natural Language to Information Systems, NLDB 2003
Country/Territory: Germany
City: Burg
Period: 23/06/03 – 25/06/03
