Empirical study of machine learning based approach for opinion mining in tweets

Grigori Sidorov, Sabino Miranda-Jiménez, Francisco Viveros-Jiménez, Alexander Gelbukh, Noé Castro-Sánchez, Francisco Velásquez, Ismael Díaz-Rangel, Sergio Suárez-Guerra, Alejandro Treviño, Juan Gordon

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

92 Scopus citations

Abstract

Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest both in academia and in industry due to its useful potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers work while doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for the Spanish language. The paper presents the best settings of parameters for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words marked with probabilities of expressing one of the six basic emotions (Probability Factor of Affective use, PFA).

Original language: English
Title of host publication: Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers
Pages: 1-14
Number of pages: 14
Edition: PART 1
State: Published - 2013
Event: 11th Mexican International Conference on Artificial Intelligence, MICAI 2012 - San Luis Potosi, Mexico
Duration: 27 Oct 2012 - 4 Nov 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7629 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th Mexican International Conference on Artificial Intelligence, MICAI 2012
Country/Territory: Mexico
City: San Luis Potosi
Period: 27/10/12 - 4/11/12

Keywords

  • Opinion mining
  • Spanish Emotion Lexicon
  • Spanish Twitter corpus
  • sentiment analysis
  • sentiment classification
