NLP-NITMZ@DPIL-FIRE2016: Language independent paraphrases detection

Sandip Sarkar, Saurav Saha, Jereemi Bentham, Partha Pakray, Dipankar Das, Alexander Gelbukh

Research output: Contribution to journal, Conference article, peer-reviewed

4 Citations (Scopus)

Abstract

This paper describes the NLP-NITMZ system submitted to the DPIL shared task at the Forum for Information Retrieval Evaluation (FIRE 2016). The aim of the DPIL shared task is to detect paraphrases in Indian languages. Paraphrase detection is an important component of Information Retrieval, Document Summarization, Question Answering, Plagiarism Detection, and related fields. Our approach uses a language-independent feature set, based mainly on lexical similarity, to detect paraphrases in Indian languages. The system uses three features: Jaccard Similarity, length-normalized Edit Distance, and Cosine Similarity. A Probabilistic Neural Network (PNN) is then trained on this feature set to detect paraphrases. With this feature set, we achieved 88.13% average accuracy in Sub-Task 1 and 71.98% average accuracy in Sub-Task 2.
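The three lexical features and the PNN step named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how sentences are tokenized, whether edit distance is computed over characters or tokens, how it is normalized, or which PNN settings were used. Here whitespace tokenization, character-level Levenshtein distance divided by the longer string's length, term-frequency cosine similarity, and a Gaussian kernel with a hypothetical smoothing parameter `sigma` are all assumed. All function names are illustrative.

```python
import math
from collections import Counter


def tokens(sentence):
    # Assumption: plain whitespace tokenization, which needs no language-specific tools.
    return sentence.split()


def jaccard_similarity(a, b):
    # Size of the shared vocabulary divided by the size of the combined vocabulary.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def normalized_edit_distance(a, b):
    # Character-level Levenshtein distance, divided by the longer length
    # (the normalization scheme is an assumption).
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / max(len(a), len(b))


def cosine_similarity(a, b):
    # Cosine of the angle between the two term-frequency vectors.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def features(a, b):
    # Three-dimensional feature vector for one sentence pair.
    return [jaccard_similarity(a, b),
            normalized_edit_distance(a, b),
            cosine_similarity(a, b)]


def pnn_predict(train_x, train_y, x, sigma=0.1):
    # Probabilistic neural network: one Gaussian kernel per training example,
    # averaged within each class; the class with the highest average wins.
    # sigma is a hypothetical smoothing value, not taken from the paper.
    counts = Counter(train_y)
    scores = {}
    for xi, yi in zip(train_x, train_y):
        d2 = sum((p - q) ** 2 for p, q in zip(xi, x))
        scores[yi] = scores.get(yi, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
    return max(scores, key=lambda c: scores[c] / counts[c])
```

In use, `features(s1, s2)` would be computed for every labeled sentence pair in the training data, and `pnn_predict(train_x, train_y, features(q1, q2))` would then return the label of a new pair. Because none of the features depends on a stemmer, tagger, or lexicon, the same code applies unchanged across languages.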

Original language: English
Pages (from-to): 256-259
Number of pages: 4
Publication: CEUR Workshop Proceedings
Volume: 1737
Status: Published - 2016
Event: 2016 Forum for Information Retrieval Evaluation, FIRE 2016 - Kolkata, India
Duration: 7 Dec 2016 to 10 Dec 2016
