A hybrid question answering system based on information retrieval and answer validation

Partha Pakray; Pinaki Bhaskar; Somnath Banerjee; Bidhan Chandra Pal; Sivaji Bandyopadhyay; Alexander Gelbukh

Centro de Investigación en Computación (CIC)

Research output: Contribution to journal › Conference article › peer-review

12 Citations (Scopus)

Abstract

The article presents the experiments carried out as part of our participation in the main task of QA4MRE@CLEF 2011. We submitted a total of five unique runs in the main task: two runs from systems based on Answer Validation (AV) machine reading techniques, one run from a system based on Question Answering (QA) techniques, and two runs from hybrid systems in which the decision is based on the outputs of the AV- and QA-based systems. In the AV system, we first combine the question and each answer option to form the Hypothesis (H). Stop words are removed from each H, and query words are identified to retrieve the most relevant sentences from the associated document using Lucene. Relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query words along with the n-gram overlap of the sentence with the H. Each retrieved sentence defines the Text (T). Each T-H pair is assigned a ranking score by the AV system, which works on the textual entailment principle. The answer option for which the T-H pair gets the maximum score is selected as the possible answer. The two AV runs differ in the way in which the relevant sentences are retrieved from the associated document. The second system is based on a Question Answering (QA) technique. Each question, along with each answer option, generates the possible answer patterns. Each sentence in the associated document is assigned an inference score with respect to each answer pattern. The sentence that receives the highest inference score corresponding to the answer patterns is identified as the relevant sentence in the document, and the corresponding answer option is selected as the answer to the given question.
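The AV selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it does not use Lucene, and it replaces the textual entailment ranking with a purely lexical score (TF-IDF of matching words plus n-gram overlap), so retrieval and ranking are collapsed into one step. The stop-word list, the smoothed IDF formula, the bigram limit, and the function names `tokenize`, `score_sentence`, and `select_answer` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the abstract does not give one).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "in", "on",
              "to", "and", "what", "which", "who", "did", "does", "do", "by"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def score_sentence(hyp_tokens, sent_tokens, idf, max_n=2):
    """TF-IDF weight of the matching query words plus n-gram overlap with H."""
    tf = Counter(sent_tokens)
    tfidf = sum(tf[w] * idf.get(w, 0.0) for w in set(hyp_tokens))
    overlap = sum(len(ngrams(hyp_tokens, n) & ngrams(sent_tokens, n))
                  for n in range(1, max_n + 1))
    return tfidf + overlap


def select_answer(question, options, sentences):
    """Return (best_option, best_sentence, score) over all T-H pairs."""
    sent_tokens = [tokenize(s) for s in sentences]
    # Smoothed IDF computed over the sentences of the associated document.
    n_sents = len(sentences)
    df = Counter(w for toks in sent_tokens for w in set(toks))
    idf = {w: math.log((1 + n_sents) / (1 + c)) + 1.0 for w, c in df.items()}

    best = (None, None, float("-inf"))
    for option in options:
        hyp_tokens = tokenize(f"{question} {option}")  # Hypothesis H
        for sentence, toks in zip(sentences, sent_tokens):  # each sentence is a Text T
            score = score_sentence(hyp_tokens, toks, idf)
            if score > best[2]:
                best = (option, sentence, score)
    return best


# Made-up example document, question, and answer options.
document = [
    "Marie Curie was born in Warsaw.",
    "She won the Nobel Prize in Physics in 1903.",
    "Her research focused on radioactivity.",
]
option, sentence, score = select_answer(
    "Where was Marie Curie born?", ["Paris", "Warsaw", "Vienna"], document)
print(option, "|", sentence)
```

In this toy example the hypothesis built from the option "Warsaw" shares the most words and bigrams with the first sentence, so that option is selected. A system following the abstract more closely would pass each retrieved T-H pair to a textual entailment component instead of reusing the lexical score for ranking.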

Original language	English
Journal	CEUR Workshop Proceedings
Volume	1177
Publication status	Published - 2011
Event	2011 Cross Language Evaluation Forum Conference, CLEF 2011 - Amsterdam, Netherlands Duration: 19 Sep 2011 → 22 Sep 2011

Cite this

@article{ce709b7c13cc4a1a903bc56137c7ed3e,

title = "A hybrid question answering system based on information retrieval and answer validation",

abstract = "The article presents the experiments carried out as part of our participation in the main task of QA4MRE@CLEF 2011. We submitted a total of five unique runs in the main task: two runs from systems based on Answer Validation (AV) machine reading techniques, one run from a system based on Question Answering (QA) techniques, and two runs from hybrid systems in which the decision is based on the outputs of the AV- and QA-based systems. In the AV system, we first combine the question and each answer option to form the Hypothesis (H). Stop words are removed from each H, and query words are identified to retrieve the most relevant sentences from the associated document using Lucene. Relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query words along with the n-gram overlap of the sentence with the H. Each retrieved sentence defines the Text (T). Each T-H pair is assigned a ranking score by the AV system, which works on the textual entailment principle. The answer option for which the T-H pair gets the maximum score is selected as the possible answer. The two AV runs differ in the way in which the relevant sentences are retrieved from the associated document. The second system is based on a Question Answering (QA) technique. Each question, along with each answer option, generates the possible answer patterns. Each sentence in the associated document is assigned an inference score with respect to each answer pattern. The sentence that receives the highest inference score corresponding to the answer patterns is identified as the relevant sentence in the document, and the corresponding answer option is selected as the answer to the given question.",

keywords = "Named entity, QA4MRE data sets, Question answering technique, Textual entailment",

author = "Partha Pakray and Pinaki Bhaskar and Somnath Banerjee and Pal, {Bidhan Chandra} and Sivaji Bandyopadhyay and Alexander Gelbukh",

year = "2011",

language = "English",

volume = "1177",

journal = "CEUR Workshop Proceedings",

issn = "1613-0073",

publisher = "CEUR-WS",

note = "2011 Cross Language Evaluation Forum Conference, CLEF 2011 ; Conference date: 19-09-2011 Through 22-09-2011",

}

TY - JOUR

T1 - A hybrid question answering system based on information retrieval and answer validation

AU - Pakray, Partha

AU - Bhaskar, Pinaki

AU - Banerjee, Somnath

AU - Pal, Bidhan Chandra

AU - Bandyopadhyay, Sivaji

AU - Gelbukh, Alexander

PY - 2011

Y1 - 2011

AB - The article presents the experiments carried out as part of our participation in the main task of QA4MRE@CLEF 2011. We submitted a total of five unique runs in the main task: two runs from systems based on Answer Validation (AV) machine reading techniques, one run from a system based on Question Answering (QA) techniques, and two runs from hybrid systems in which the decision is based on the outputs of the AV- and QA-based systems. In the AV system, we first combine the question and each answer option to form the Hypothesis (H). Stop words are removed from each H, and query words are identified to retrieve the most relevant sentences from the associated document using Lucene. Relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query words along with the n-gram overlap of the sentence with the H. Each retrieved sentence defines the Text (T). Each T-H pair is assigned a ranking score by the AV system, which works on the textual entailment principle. The answer option for which the T-H pair gets the maximum score is selected as the possible answer. The two AV runs differ in the way in which the relevant sentences are retrieved from the associated document. The second system is based on a Question Answering (QA) technique. Each question, along with each answer option, generates the possible answer patterns. Each sentence in the associated document is assigned an inference score with respect to each answer pattern. The sentence that receives the highest inference score corresponding to the answer patterns is identified as the relevant sentence in the document, and the corresponding answer option is selected as the answer to the given question.

KW - Named entity

KW - QA4MRE data sets

KW - Question answering technique

KW - Textual entailment

UR - http://www.scopus.com/inward/record.url?scp=84922032478&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84922032478

SN - 1613-0073

VL - 1177

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

T2 - 2011 Cross Language Evaluation Forum Conference, CLEF 2011

Y2 - 19 September 2011 through 22 September 2011

ER -
