TY - GEN
T1 - Empirical study of machine learning based approach for opinion mining in tweets
AU - Sidorov, Grigori
AU - Miranda-Jiménez, Sabino
AU - Viveros-Jiménez, Francisco
AU - Gelbukh, Alexander
AU - Castro-Sánchez, Noé
AU - Velásquez, Francisco
AU - Díaz-Rangel, Ismael
AU - Suárez-Guerra, Sergio
AU - Treviño, Alejandro
AU - Gordon, Juan
PY - 2013
Y1 - 2013
N2 - Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (Probability Factor of Affective use, PFA).
AB - Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (Probability Factor of Affective use, PFA).
KW - Opinion mining
KW - Spanish Emotion Lexicon
KW - Spanish Twitter corpus
KW - sentiment analysis
KW - sentiment classification
UR - http://www.scopus.com/inward/record.url?scp=84875822753&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37807-2_1
DO - 10.1007/978-3-642-37807-2_1
M3 - Conference contribution
SN - 9783642378065
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 14
BT - Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers
T2 - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012
Y2 - 27 October 2012 through 4 November 2012
ER -