TY - JOUR
T1 - Paraphrase Identification
T2 - 2022 Iberian Languages Evaluation Forum, IberLEF 2022
AU - Rahman, Abu Bakar Siddiqur
AU - Ta, Hoang Thang
AU - Najjar, Lotfollah
AU - Gelbukh, Alexander
N1 - Publisher Copyright: © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2022
Y1 - 2022
N2 - In this paper, we address Paraphrase Identification in Mexican Spanish (PAR-MEX) at the sentence level. We introduce two lightweight methods, linear regression and a multilayer perceptron, both trained on features extracted from pre-trained models. A rule of thumb based on pair similarity is used to filter noise from the positive examples. Our best F1 score is 88.67%, which indicates the effectiveness of traditional methods when supported by pre-trained models. In the challenge, our result ranked fourth in the organizers' results table.
KW - IberLEF
KW - Linear Regression
KW - MultiLayer Perceptron
KW - PAR-MEX
KW - Paraphrase Identification
KW - Text Classification
UR - http://www.scopus.com/inward/record.url?scp=85137375834&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85137375834
SN - 1613-0073
VL - 3202
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 20 September 2022
ER -