TY - GEN
T1 - Authorship Link Retrieval Between Documents
AU - Calvo, Hiram
AU - García-Mendoza, Consuelo Varinia
AU - Ruiz-Chávez, Esteban Andrés
AU - Juárez Gambino, Omar
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - In this paper we propose a method for automatic author clustering called Document Authoring Link Retriever (DALIR). Documents are represented using Doc2Vec under several parameter settings; the resulting vectors are then clustered (or linked together) using K-means and Hierarchical Agglomerative Clustering. We experimented with different vector representation sizes, different fixed numbers of clusters, and different clustering methods. We evaluated our method on the author clustering task of PAN @ CLEF 2017 using the task's BCubed F-score evaluation scheme, and it surpassed some of the results reported by the top-ranked systems in this challenge, although our method requires the number of clusters to be set manually a priori.
KW - Author profiling
KW - Clustering
KW - Computational linguistics
KW - Style analysis
UR - http://www.scopus.com/inward/record.url?scp=85092934829&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-60887-3_27
DO - 10.1007/978-3-030-60887-3_27
M3 - Conference contribution
AN - SCOPUS:85092934829
SN - 9783030608866
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 297
EP - 305
BT - Advances in Computational Intelligence - 19th Mexican International Conference on Artificial Intelligence, MICAI 2020, Proceedings
A2 - Martínez-Villaseñor, Lourdes
A2 - Ponce, Hiram
A2 - Herrera-Alcántara, Oscar
A2 - Castro-Espinoza, Félix A.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 19th Mexican International Conference on Artificial Intelligence, MICAI 2020
Y2 - 12 October 2020 through 17 October 2020
ER -