TY - JOUR
T1 - A hybrid question answering system based on information retrieval and answer validation
AU - Pakray, Partha
AU - Bhaskar, Pinaki
AU - Banerjee, Somnath
AU - Pal, Bidhan Chandra
AU - Bandyopadhyay, Sivaji
AU - Gelbukh, Alexander
PY - 2011
Y1 - 2011
AB - This article presents the experiments carried out as part of our participation in the main task of QA4MRE@CLEF 2011. We submitted a total of five unique runs in the main task: two runs from systems based on Answer Validation (AV) machine reading techniques, one run from a system based on Question Answering (QA) techniques, and two runs from hybrid systems in which the decision is based on the outputs of the AV-based and QA-based systems. In the AV system, we first combine the question and each answer option to form the Hypothesis (H). Stop words are removed from each H, and query words are identified to retrieve the most relevant sentences from the associated document using Lucene. Relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query words along with the n-gram overlap of the sentence with the H. Each retrieved sentence defines the Text (T). Each T-H pair is assigned a ranking score by the AV system, which works on the textual entailment principle. The answer option for which the T-H pair receives the maximum score is selected as the possible answer. The two unique AV runs differ in the way the relevant sentences are retrieved from the associated document. The second system is based on the Question Answering (QA) technique. Each question, together with each answer option, generates the possible answer patterns. Each sentence in the associated document is assigned an inference score with respect to each answer pattern. The sentence that receives the highest inference score for the answer patterns is identified as the relevant sentence in the document, and the corresponding answer option is selected as the answer to the given question.
KW - Named entity
KW - QA4MRE data sets
KW - Question answering technique
KW - Textual entailment
UR - http://www.scopus.com/inward/record.url?scp=84922032478&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84922032478
SN - 1613-0073
VL - 1177
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2011 Cross Language Evaluation Forum Conference, CLEF 2011
Y2 - 19 September 2011 through 22 September 2011
ER -