Extending Persian sentiment lexicon with idiomatic expressions for sentiment analysis

Kia Dashtipour, Mandar Gogate, Alexander Gelbukh, Amir Hussain

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Nowadays, it is important for buyers to know other customers' opinions in order to make informed decisions about buying a product or service. In addition, companies and organizations can exploit customer opinions to improve their products and services. However, the quintillions of bytes of opinions generated every day cannot be read and summarized manually. Sentiment analysis and opinion mining techniques offer a way to classify and summarize user opinions automatically. Current sentiment analysis research is mostly focused on English, with far fewer resources available for other languages such as Persian. In our previous work, we developed PerSent, a publicly available sentiment lexicon to facilitate lexicon-based sentiment analysis of texts in the Persian language. However, the PerSent-based sentiment analysis approach fails to classify real-world sentences that contain idiomatic expressions. Therefore, in this paper, we describe an extension of the PerSent lexicon with more than 1000 idiomatic expressions, along with their polarity, and propose an algorithm to accurately classify Persian text. Comparative experimental results show the usefulness of the extended lexicon for sentiment analysis compared with both PerSent lexicon-based sentiment analysis and Persian-to-English translation-based approaches. The extended version of the lexicon will be made publicly available.
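
The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of the general idea of idiom-aware lexicon lookup, not the authors' method. It assumes a word-level lexicon and a separate idiom lexicon keyed by token sequences, and it matches the longest idiom first before falling back to single words. All names (`WORD_LEXICON`, `IDIOM_LEXICON`, `score`, `classify`) and all entries are hypothetical; English stand-in entries are used purely for illustration and are not taken from PerSent or its extension.

```python
from typing import Dict, List, Tuple

# Hypothetical toy entries for illustration only; NOT from PerSent.
WORD_LEXICON: Dict[str, float] = {
    "good": 1.0,
    "great": 1.0,
    "bad": -1.0,
    "break": -0.5,
}

# Idioms are stored as token tuples with a polarity for the whole expression.
IDIOM_LEXICON: Dict[Tuple[str, ...], float] = {
    ("break", "a", "leg"): 1.0,
    ("cost", "an", "arm", "and", "a", "leg"): -1.0,
}

MAX_IDIOM_LEN = max(len(key) for key in IDIOM_LEXICON)


def score(tokens: List[str]) -> float:
    """Sum polarities, preferring the longest idiom match at each position."""
    total = 0.0
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, down to two tokens.
        for n in range(min(MAX_IDIOM_LEN, len(tokens) - i), 1, -1):
            key = tuple(tokens[i:i + n])
            if key in IDIOM_LEXICON:
                total += IDIOM_LEXICON[key]
                i += n  # skip the tokens consumed by the idiom
                matched = True
                break
        if not matched:
            # Fall back to the word-level lexicon; unknown words score 0.
            total += WORD_LEXICON.get(tokens[i], 0.0)
            i += 1
    return total


def classify(text: str) -> str:
    """Map the summed score to a polarity label (whitespace tokenization)."""
    s = score(text.lower().split())
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

In this sketch, a word-only lookup would score "break a leg" as negative because of "break", whereas the idiom entry overrides it with a positive polarity. Real Persian text would additionally need proper tokenization and normalization, which are outside the scope of this illustration.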

Original language: English
Article number: 9
Journal: Social Network Analysis and Mining
Volume: 12
Issue number: 1
State: Published - Dec 2022
