Mathematical properties of soft cardinality: Enhancing Jaccard, Dice and cosine similarity measures with element-wise distance

Sergio Jimenez, Fabio A. Gonzalez, Alexander Gelbukh

Research output: Contribution to journal › Article › peer-review

39 Scopus citations

Abstract

The soft cardinality function generalizes the counting measure that underlies the classic cardinality of sets. It provides an intuitive measure of the number of elements in a collection (i.e., a set or a bag) by exploiting the similarities among them. Although soft cardinality was first proposed in an ad hoc way, it has been used successfully in various natural language processing tasks. In this paper, a formal definition of soft cardinality is proposed, together with an analysis of its boundaries, its monotonicity property, and a method for constructing similarity functions. Additionally, an empirical evaluation of the model was carried out using synthetic data.
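The abstract does not state the formula itself. As a rough illustration only, the sketch below assumes the formulation usually cited for soft cardinality, in which each element contributes the reciprocal of its summed similarity (raised to a softness exponent p) to every element of the collection, and a Jaccard-style similarity is then built from the soft cardinalities of two sets and of their union. The function names, the parameter p, and the toy similarity are hypothetical and are not taken from the paper.

```python
from typing import Callable, Hashable, Iterable, List

Sim = Callable[[Hashable, Hashable], float]


def soft_cardinality(items: Iterable[Hashable], sim: Sim, p: float = 1.0) -> float:
    """Assumed form: sum over a of 1 / sum over b of sim(a, b) ** p.

    Requires sim(x, x) == 1 and 0 <= sim(x, y) <= 1, so every
    denominator is at least 1.
    """
    elems: List[Hashable] = list(items)
    return sum(1.0 / sum(sim(a, b) ** p for b in elems) for a in elems)


def soft_jaccard(a: Iterable[Hashable], b: Iterable[Hashable],
                 sim: Sim, p: float = 1.0) -> float:
    """Jaccard index with every cardinality replaced by soft cardinality.

    Inputs are treated as sets (an assumption of this sketch): duplicates
    are dropped before the union is formed.
    """
    set_a = list(dict.fromkeys(a))
    set_b = list(dict.fromkeys(b))
    union = list(dict.fromkeys(set_a + set_b))
    card_union = soft_cardinality(union, sim, p)
    # Soft intersection obtained by inclusion-exclusion on soft cardinalities.
    card_inter = (soft_cardinality(set_a, sim, p)
                  + soft_cardinality(set_b, sim, p)
                  - card_union)
    return card_inter / card_union if card_union else 0.0


def crisp_sim(x: Hashable, y: Hashable) -> float:
    """Similarity that only recognises exact matches."""
    return 1.0 if x == y else 0.0


def toy_sim(x: str, y: str) -> float:
    """Hypothetical similarity: 1 if equal, 0.5 if same first letter, else 0."""
    if x == y:
        return 1.0
    return 0.5 if x[:1] == y[:1] else 0.0
```

With `crisp_sim`, `soft_cardinality` reduces to the ordinary count of distinct elements and `soft_jaccard` to the classic Jaccard index. With a graded similarity such as `toy_sim`, similar elements count for less than two fully distinct ones, which is the intuition the abstract describes.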

Original language: English
Pages (from-to): 373-389
Number of pages: 17
Journal: Information Sciences
Volume: 367-368
State: Published - 1 Nov 2016

Keywords

  • Cardinality-based similarity measures
  • Cosine similarity
  • Dice's index
  • Diversity-based similarity functions
  • Jaccard's index
  • Soft cardinality
