Text comparison using soft cardinality

Sergio Jimenez, Fabio Gonzalez, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Scopus citations

Abstract

Classical set theory provides a method for comparing objects using cardinality and intersection, in combination with well-known resemblance coefficients such as Dice, Jaccard, and cosine. However, set operations are intrinsically crisp: they do not take into account similarities between elements. We propose a new general-purpose method for comparing objects using a soft cardinality function that accounts for such similarities via an auxiliary affinity (similarity) measure. Our experiments with 12 text matching datasets suggest that the soft cardinality method is superior to known approximate string comparison methods in the text comparison task.
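As a rough illustration of the idea in the abstract, the sketch below computes a soft cardinality and plugs it into a Jaccard-style coefficient. The abstract does not give the formula, so everything here is an assumption: the approximation used (each element contributes the reciprocal of its summed affinity to all elements, with a softness exponent `p`), the derivation of the intersection from the union, and the names `affinity`, `soft_cardinality`, and `soft_jaccard` are all hypothetical. Python's `difflib.SequenceMatcher` stands in for the auxiliary affinity measure and is not necessarily the one used by the authors.

```python
from difflib import SequenceMatcher


def affinity(a: str, b: str) -> float:
    """Stand-in affinity in [0, 1] (assumption: any similarity measure
    with affinity(x, x) == 1 could be substituted here)."""
    return SequenceMatcher(None, a, b).ratio()


def soft_cardinality(elements, sim=affinity, p=1.0):
    """Assumed approximation: sum over elements of 1 / sum_j sim(a_i, a_j)**p.

    Similar elements share their "count", so the result lies between 1 and
    the classical cardinality for a non-empty collection.
    """
    items = list(set(elements))  # duplicates carry no extra weight
    return sum(1.0 / sum(sim(a, b) ** p for b in items) for a in items)


def soft_jaccard(a, b, sim=affinity, p=1.0):
    """Jaccard-style coefficient with soft cardinalities (assumed form).

    The intersection is inferred as |A| + |B| - |A u B|, mirroring the
    classical identity.
    """
    card_a = soft_cardinality(a, sim, p)
    card_b = soft_cardinality(b, sim, p)
    card_union = soft_cardinality(set(a) | set(b), sim, p)
    if card_union == 0:
        return 0.0
    return (card_a + card_b - card_union) / card_union


if __name__ == "__main__":
    left = "text comparison using soft cardinality".split()
    right = "texts compared with soft cardinalities".split()
    print(soft_cardinality(left))
    print(soft_jaccard(left, right))
```

With a crisp affinity (1 for identical elements, 0 otherwise) this sketch reduces to the classical cardinality and the classical Jaccard coefficient, which is a convenient sanity check on the assumed formula.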

Original language: English
Title of host publication: String Processing and Information Retrieval - 17th International Symposium, SPIRE 2010, Proceedings
Pages: 297-302
Number of pages: 6
State: Published - 2010
Event: 17th International Symposium on String Processing and Information Retrieval, SPIRE 2010 - Los Cabos, Mexico
Duration: 11 Oct 2010 to 13 Oct 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6393 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Symposium on String Processing and Information Retrieval, SPIRE 2010
Country/Territory: Mexico
City: Los Cabos
Period: 11/10/10 to 13/10/10
