TY - JOUR
T1 - Assessing Wordplay-Pun classification from JOKER dataset with pretrained BERT humorous models
AU - Palma Preciado, Victor Manuel
AU - Sidorov, Grigori
AU - Preciado, Carolina Palma
N1 - Publisher Copyright: © 2022 Copyright for this paper by its authors.
PY - 2022
Y1 - 2022
AB - Humor is one of the most subjective aspects of human behavior, since it involves a wide range of variables: sentiment, wordplay, and structural or phonetic double meanings, all within the construction of written humor. It is important to assess humor from a different point of view, since this variability tends to provide insight into the true structure, or core, of the humor problem; the range of humor is so diverse that even the simplest tasks are highly demanding. Pre-trained base BERT and DistilBERT models, trained on a dataset of humorous one-liners, were used. These trained models were tested on a dataset merged from the data of JOKER tasks 1 and 3; duplicate records and special characters were removed from the collected data, yielding a final dataset of 3,601 humorous sentences. With this experiment we examine whether our models can detect a type of humor different from the one on which they were trained, and we find that both models successfully classify this other type of humor. Although the pre-trained models were expected to classify at least a portion of the humor in the dataset, the results were much better than anticipated: 95.64% for BERT and 92.58% for DistilBERT, indicating that the models were indeed able to identify humor. An analysis of the best and worst cases was also carried out.
KW - Classifiers
KW - Humor identification
KW - Humourism
KW - Transformers
UR - http://www.scopus.com/inward/record.url?scp=85136920312&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85136920312
SN - 1613-0073
VL - 3180
SP - 1828
EP - 1833
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2022 Conference and Labs of the Evaluation Forum, CLEF 2022
Y2 - 5 September 2022 through 8 September 2022
ER -