TY - JOUR
T1 - Extending Persian sentiment lexicon with idiomatic expressions for sentiment analysis
AU - Dashtipour, Kia
AU - Gogate, Mandar
AU - Gelbukh, Alexander
AU - Hussain, Amir
N1 - Publisher Copyright: © 2021, The Author(s).
PY - 2022/12
Y1 - 2022/12
N2 - Nowadays, it is important for buyers to know other customers' opinions in order to make informed decisions when buying a product or service. In addition, companies and organizations can exploit customer opinions to improve their products and services. However, the quintillions of bytes of opinions generated every day cannot be read and summarized manually. Sentiment analysis and opinion mining techniques offer a way to classify and summarize user opinions automatically. However, current sentiment analysis research focuses mostly on English, with far fewer resources available for other languages such as Persian. In our previous work, we developed PerSent, a publicly available sentiment lexicon that facilitates lexicon-based sentiment analysis of Persian text. However, the PerSent-based sentiment analysis approach fails to classify real-world sentences that contain idiomatic expressions. Therefore, in this paper, we describe an extension of the PerSent lexicon with more than 1000 idiomatic expressions, along with their polarity, and propose an algorithm to accurately classify Persian text. Comparative experimental results demonstrate the usefulness of the extended lexicon for sentiment analysis compared with PerSent lexicon-based sentiment analysis and Persian-to-English translation-based approaches. The extended version of the lexicon will be made publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85120674609&partnerID=8YFLogxK
U2 - 10.1007/s13278-021-00840-1
DO - 10.1007/s13278-021-00840-1
M3 - Article
AN - SCOPUS:85120674609
SN - 1869-5450
VL - 12
JO - Social Network Analysis and Mining
JF - Social Network Analysis and Mining
IS - 1
M1 - 9
ER -