TY - GEN
T1 - Generalized Mongue-Elkan method for approximate text string comparison
AU - Jimenez, Sergio
AU - Becerra, Claudia
AU - Gelbukh, Alexander
AU - Gonzalez, Fabio
PY - 2009
Y1 - 2009
N2 - The Monge-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token-level (i.e. word-level) similarity measure. We propose a generalization of this method based on the notion of the generalized arithmetic mean instead of the simple average used in the expression to calculate the Monge-Elkan method. The experiments carried out with 12 well-known name-matching data sets show that the proposed approach outperforms the original Monge-Elkan method when character-based measures are used to compare tokens.
AB - The Monge-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token-level (i.e. word-level) similarity measure. We propose a generalization of this method based on the notion of the generalized arithmetic mean instead of the simple average used in the expression to calculate the Monge-Elkan method. The experiments carried out with 12 well-known name-matching data sets show that the proposed approach outperforms the original Monge-Elkan method when character-based measures are used to compare tokens.
UR - http://www.scopus.com/inward/record.url?scp=67650550255&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-00382-0_45
DO - 10.1007/978-3-642-00382-0_45
M3 - Conference contribution
SN - 3642003818
SN - 9783642003813
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 559
EP - 570
BT - Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings
T2 - 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009
Y2 - 1 March 2009 through 7 March 2009
ER -