A tweets classifier based on cosine similarity

Carolina Focil-Arias, Jorge Zúñiga, Grigori Sidorov, Ildar Batyrshin, Alexander Gelbukh

Research output: Contribution to journal, Conference article, peer-review

Abstract

The 2017 Microblog Cultural Contextualization task consists of three challenges: (1) Content Analysis, (2) Microblog Search, and (3) Timeline Illustration. This paper describes the use of cosine similarity, which measures the similarity between two vectors of an inner product space as the cosine of the angle between them. Two approaches, (1) word2vec and (2) Bag-of-Words (BoW), were used to extract all tweets relevant to each event related to the four festivals: Charrues, Transmusicales, Avignon, and Edinburgh.
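
To make the Bag-of-Words variant concrete, the following is a minimal Python sketch of cosine similarity over term-count vectors. It is not the authors' implementation: the function names, the whitespace tokenizer, and the 0.3 threshold are all assumptions chosen for illustration. A word2vec variant would presumably replace the count vectors with dense embeddings, which the sketch does not cover.

```python
import math
from collections import Counter


def bow_vector(text):
    """Build a simple Bag-of-Words vector: lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def relevant_tweets(event, tweets, threshold=0.3):
    """Keep tweets whose similarity to the event description meets the threshold.

    The threshold value is hypothetical, not taken from the paper.
    """
    event_vec = bow_vector(event)
    return [t for t in tweets if cosine_similarity(event_vec, bow_vector(t)) >= threshold]
```

For example, `relevant_tweets("avignon theatre festival", ["theatre festival in avignon tonight", "rainy day in dublin"])` keeps only the first tweet, since the second shares no terms with the event description.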

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1866
State: Published - 2017
Event: 18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017 - Dublin, Ireland
Duration: 11 Sep 2017 - 14 Sep 2017

Keywords

  • Bag-of-Words
  • Cosine similarity
  • Information retrieval
  • Natural language processing
  • Opinion mining
  • Word2vec
