JU-CSE-TE: System description QA@CLEF 2010 - ResPubliQA

Partha Pakray, Pinaki Bhaskar, Santanu Pal, Dipankar Das, Sivaji Bandyopadhyay, Alexander Gelbukh

Research output: Contribution to journal, Conference article, peer-reviewed

7 Citations (Scopus)

Abstract

This article presents the experiments carried out for our participation in the Paragraph Selection (PS) Task and the Answer Selection (AS) Task of QA@CLEF 2010 - ResPubliQA. Our system uses Apache Lucene for document retrieval, and all test documents are indexed with it. Stop words are removed from each question, and the remaining query words are used to retrieve the most relevant documents with Lucene. Relevant paragraphs are selected from the retrieved documents based on the TF-IDF of the matching query words together with the n-gram overlap between the paragraph and the original question. Chunk boundaries are detected in the original question, and key chunks are identified. Chunk boundaries are also detected in each sentence of a paragraph. The key chunks are matched against each sentence of a paragraph, and relevant sentences are identified by their key chunk matching score. Each question is analyzed to identify its possible answer type. The SRL Tool (Assert Tool Kit) [1] is applied to each sentence of a paragraph to assign a semantic role to each chunk. The Answer Extraction module then selects, as the exact answer, the chunk in a sentence whose semantic role matches the possible answer type of the question. The tasks were carried out for English. On the test data, the Paragraph Selection task achieved an overall accuracy of 0.37 and a c@1 measure of 0.50. The Answer Extraction task performed poorly, with an overall accuracy of 0.16 and a c@1 measure of 0.26.
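The paragraph selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how TF-IDF and n-gram overlap are combined, so the linear mix, the `weight` parameter, the smoothed IDF, the tiny stop-word list, and all function names below are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list only; the stop-word list used by the system is not given.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "what", "which", "to", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(question, paragraph, max_n=3):
    """Average, over n = 1..max_n, the fraction of question n-grams found in the paragraph."""
    q_tokens, p_tokens = tokenize(question), tokenize(paragraph)
    score = 0.0
    for n in range(1, max_n + 1):
        q_grams = ngrams(q_tokens, n)
        if q_grams:  # questions shorter than n tokens contribute nothing
            score += len(q_grams & ngrams(p_tokens, n)) / len(q_grams)
    return score / max_n


def tfidf_score(query_terms, paragraph, paragraphs):
    """Sum TF-IDF weights of the query terms that occur in the paragraph (assumed smoothed IDF)."""
    tokens = tokenize(paragraph)
    counts = Counter(tokens)
    total = 0.0
    for term in query_terms:
        df = sum(1 for other in paragraphs if term in tokenize(other))
        if df and counts[term]:
            total += (counts[term] / len(tokens)) * math.log(1 + len(paragraphs) / df)
    return total


def rank_paragraphs(question, paragraphs, weight=0.5):
    """Rank paragraphs by an assumed linear mix of TF-IDF and n-gram overlap."""
    query_terms = [t for t in tokenize(question) if t not in STOP_WORDS]
    scored = [
        (weight * tfidf_score(query_terms, p, paragraphs)
         + (1 - weight) * ngram_overlap(question, p), p)
        for p in paragraphs
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_paragraphs(question, candidate_paragraphs)` returns `(score, paragraph)` pairs, best first. In a fuller pipeline the candidates would come from the Lucene retrieval step rather than from an in-memory list.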

Original language: English
Publication: CEUR Workshop Proceedings
Volume: 1176
Status: Published - 2010
Event: 2010 Cross Language Evaluation Forum Conference, CLEF 2010 - Padua, Italy
Duration: 22 Sep 2010 to 23 Sep 2010
