Empirical study of machine learning based approach for opinion mining in tweets

Grigori Sidorov; Sabino Miranda-Jiménez; Francisco Viveros-Jiménez; Alexander Gelbukh; Noé Castro-Sánchez; Francisco Velásquez; Ismael Díaz-Rangel; Sergio Suárez-Guerra; Alejandro Treviño; Juan Gordon

doi:10.1007/978-3-642-37807-2_1

Centro de Investigación en Computación (CIC)

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

92 Citations (Scopus)

Abstract

Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (the Probability Factor of Affective use, PFA).
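The setup the abstract describes (n-gram features fed to a classifier such as Naïve Bayes) can be illustrated with a minimal sketch. This record does not give the paper's actual preprocessing rules, feature weighting, or data, so everything below is an assumption made for illustration: the normalization rules in `preprocess`, the add-one smoothing, the class name `NaiveBayes`, and the four invented training tweets.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(tweet):
    """Normalize a tweet. These rules are illustrative assumptions,
    not the preprocessing described in the paper."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " URL ", text)  # mask links
    text = re.sub(r"@\w+", " USER ", text)         # mask user mentions
    text = re.sub(r"(\w)\1{2,}", r"\1", text)      # "holaaaa" -> "hola"
    return re.findall(r"\w+", text)


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def __init__(self, n=1):
        self.n = n
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, tweets, labels):
        for tweet, label in zip(tweets, labels):
            feats = ngrams(preprocess(tweet), self.n)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, tweet):
        feats = ngrams(preprocess(tweet), self.n)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus smoothed log likelihood of each n-gram
            score = math.log(self.class_counts[label] / total_docs)
            for feat in feats:
                score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data, only to show the calls.
train = [
    ("me encanta este teléfono", "positive"),
    ("qué buen servicio, excelente", "positive"),
    ("odio este teléfono, pésimo", "negative"),
    ("muy mal servicio", "negative"),
]
model = NaiveBayes(n=1).fit([t for t, _ in train], [y for _, y in train])
prediction = model.predict("excelente teléfono, me encanta")
```

Changing `n` in `NaiveBayes(n=...)` switches between unigrams, bigrams, and so on, which is one way to vary the n-gram size setting the abstract lists; the other settings (corpus size, number of classes, class balance, domain) would be varied through the training data itself.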

Original language	English
Title of host publication	Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers
Pages	1-14
Number of pages	14
Edition	PART 1
DOI	https://doi.org/10.1007/978-3-642-37807-2_1
Status	Published - 2013
Event	11th Mexican International Conference on Artificial Intelligence, MICAI 2012 - San Luis Potosi, Mexico. Duration: 27 Oct 2012 → 4 Nov 2012

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number	PART 1
Volume	7629 LNAI
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Conference

Conference	11th Mexican International Conference on Artificial Intelligence, MICAI 2012
Country/Territory	Mexico
City	San Luis Potosi
Period	27/10/12 → 4/11/12

Access to document

10.1007/978-3-642-37807-2_1

Other files and links

Link to publication in Scopus

Cite this

Sidorov, G., Miranda-Jiménez, S., Viveros-Jiménez, F., Gelbukh, A., Castro-Sánchez, N., Velásquez, F., Díaz-Rangel, I., Suárez-Guerra, S., Treviño, A., & Gordon, J. (2013). Empirical study of machine learning based approach for opinion mining in tweets. In Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers (PART 1 ed., pp. 1-14). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7629 LNAI, No. PART 1). https://doi.org/10.1007/978-3-642-37807-2_1

Sidorov, Grigori ; Miranda-Jiménez, Sabino ; Viveros-Jiménez, Francisco et al. / Empirical study of machine learning based approach for opinion mining in tweets. Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers. PART 1. ed. 2013. pp. 1-14 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 1).

@inproceedings{1fced9714a84404bbdcbb43f64c487f0,

title = "Empirical study of machine learning based approach for opinion mining in tweets",

abstract = "Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Na{\"i}ve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (the Probability Factor of Affective use, PFA).",

keywords = "Opinion mining, Spanish Emotion Lexicon, Spanish Twitter corpus, sentiment analysis, sentiment classification",

author = "Grigori Sidorov and Sabino Miranda-Jim{\'e}nez and Francisco Viveros-Jim{\'e}nez and Alexander Gelbukh and No{\'e} Castro-S{\'a}nchez and Francisco Vel{\'a}squez and Ismael D{\'i}az-Rangel and Sergio Su{\'a}rez-Guerra and Alejandro Trevi{\~n}o and Juan Gordon",

year = "2013",

doi = "10.1007/978-3-642-37807-2_1",

language = "English",

isbn = "9783642378065",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

number = "PART 1",

pages = "1--14",

booktitle = "Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers",

edition = "PART 1",

note = "11th Mexican International Conference on Artificial Intelligence, MICAI 2012 ; Conference date: 27-10-2012 Through 04-11-2012",

}

Sidorov, G, Miranda-Jiménez, S, Viveros-Jiménez, F, Gelbukh, A, Castro-Sánchez, N, Velásquez, F, Díaz-Rangel, I, Suárez-Guerra, S, Treviño, A & Gordon, J 2013, Empirical study of machine learning based approach for opinion mining in tweets. In Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers. PART 1 ed., Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 1, vol. 7629 LNAI, pp. 1-14, 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, San Luis Potosi, Mexico, 27/10/12. https://doi.org/10.1007/978-3-642-37807-2_1

Empirical study of machine learning based approach for opinion mining in tweets. / Sidorov, Grigori; Miranda-Jiménez, Sabino; Viveros-Jiménez, Francisco et al.
Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers. PART 1. ed. 2013. p. 1-14 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7629 LNAI, No. PART 1).

TY - GEN

T1 - Empirical study of machine learning based approach for opinion mining in tweets

AU - Sidorov, Grigori

AU - Miranda-Jiménez, Sabino

AU - Viveros-Jiménez, Francisco

AU - Gelbukh, Alexander

AU - Castro-Sánchez, Noé

AU - Velásquez, Francisco

AU - Díaz-Rangel, Ismael

AU - Suárez-Guerra, Sergio

AU - Treviño, Alejandro

AU - Gordon, Juan

PY - 2013

Y1 - 2013

AB - Opinion mining deals with determining the sentiment orientation (positive, negative, or neutral) of a (short) text. Recently, it has attracted great interest in both academia and industry because of its potential applications. One of the most promising applications is the analysis of opinions in social networks. In this paper, we examine how classifiers perform when doing opinion mining over Spanish Twitter data. We explore how different settings (n-gram size, corpus size, number of sentiment classes, balanced vs. unbalanced corpus, various domains) affect the precision of the machine learning algorithms. We experimented with Naïve Bayes, Decision Tree, and Support Vector Machines. We also describe language-specific preprocessing of tweets, in our case for Spanish. The paper presents the best parameter settings for practical applications of opinion mining in Spanish Twitter. We also present a novel resource for the analysis of emotions in texts: the Spanish Emotion Lexicon, a dictionary of 2,036 words, each marked with its probability of expressing one of the six basic emotions (the Probability Factor of Affective use, PFA).

KW - Opinion mining

KW - Spanish Emotion Lexicon

KW - Spanish Twitter corpus

KW - sentiment analysis

KW - sentiment classification

UR - http://www.scopus.com/inward/record.url?scp=84875822753&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-37807-2_1

DO - 10.1007/978-3-642-37807-2_1

M3 - Conference contribution

SN - 9783642378065

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 1

EP - 14

BT - Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers

T2 - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012

Y2 - 27 October 2012 through 4 November 2012

ER -

Sidorov G, Miranda-Jiménez S, Viveros-Jiménez F, Gelbukh A, Castro-Sánchez N, Velásquez F et al. Empirical study of machine learning based approach for opinion mining in tweets. In Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers. PART 1 ed. 2013. p. 1-14. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 1). doi: 10.1007/978-3-642-37807-2_1
