JUNITMZ at SemEval-2016 Task 1: Identifying semantic similarity using Levenshtein ratio

Sandip Sarkar, Partha Pakray, Dipankar Das, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

11 Scopus citations

Abstract

In this paper we describe the JUNITMZ system, which was developed for participation in SemEval 2016 Task 1: Semantic Textual Similarity (STS). Methods for measuring textual similarity are useful in a broad range of applications, including text mining, information retrieval, dialogue systems, machine translation and text summarization. However, many systems developed specifically for STS are complex, making them hard to incorporate as a module within a larger applied system. We present an STS system based on three simple and robust similarity features that can be easily incorporated into more complex applied systems. The shared task results show that, on most of the shared task's evaluation sets, these signals achieve a strong (>0.70) level of correlation with human judgements. Our system's three features are: unigram overlap count, length-normalized edit distance, and the score computed by the METEOR machine translation metric. The features are combined to produce a similarity prediction using both a feedforward and a recurrent neural network.
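
To make two of the three features named in the abstract concrete, here is a minimal Python sketch of a unigram overlap count and a length-normalized edit distance. It is an illustration only, not the authors' implementation: the function names, whitespace tokenization, lowercasing, character-level Levenshtein distance, and normalization by the longer string's length are all assumptions, and the paper's exact definitions may differ. The METEOR score and the neural network combiner are omitted.

```python
def unigram_overlap(a: str, b: str) -> int:
    """Count distinct words shared by two sentences.

    Assumption: whitespace tokenization and case-insensitive matching.
    """
    return len(set(a.lower().split()) & set(b.lower().split()))


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1], where 1.0 means identical strings.

    Assumption: normalize by the length of the longer string.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

For example, `levenshtein("kitten", "sitting")` is 3, so `normalized_edit_similarity("kitten", "sitting")` is 1 - 3/7. A downstream model would take such scores, together with a METEOR score, as its input features.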

Original language: English
Title of host publication: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 702-705
Number of pages: 4
ISBN (Electronic): 9781941643952
DOIs
State: Published - 2016
Event: 10th International Workshop on Semantic Evaluation, SemEval 2016 - San Diego, United States
Duration: 16 Jun 2016 - 17 Jun 2016

Publication series

Name: SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings

Conference

Conference: 10th International Workshop on Semantic Evaluation, SemEval 2016
Country/Territory: United States
City: San Diego
Period: 16/06/16 - 17/06/16
