NLP for shallow question answering of legal documents using graphs

Alfredo Monroy; Hiramand Calvo; Alexander Gelbukh

doi:10.1007/978-3-642-00382-0_40

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Citations (Scopus)

Abstract

Previous work has shown that modeling relationships between articles of a regulation as a graph network works twice as well as traditional information retrieval systems at returning the articles relevant to a question. In this work we experiment with natural language processing techniques, such as lemmatization and manual and automatic thesauri, to improve question-based document retrieval. To construct the graph, we represent the set of all articles as a graph; the question is split into two parts, and each part is added to the graph. Several paths are then constructed from part A of the question to part B, so that the shortest path contains the articles relevant to the question. We evaluate our method by comparing its answers with those of a traditional information retrieval system (a vector space model adjusted for article retrieval instead of document retrieval) and with the answers to 21 questions given manually by the general lawyer of the National Polytechnic Institute, based on 25 different regulations (academy regulation, scholarships regulation, postgraduate studies regulation, etc.); our system uses the same set of regulations. We found that lemmatization increases performance by around 10%, while the use of thesauri has little impact.
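The path-based retrieval described in the abstract can be illustrated with a minimal sketch. The abstract does not say how edges are created or weighted, so everything here is an assumption made for illustration: nodes are linked when they share a term, the edge weight is `1 - cosine similarity` over bag-of-words vectors, and the shortest path is found with Dijkstra's algorithm. The function names, the `Q:A`/`Q:B` node labels, and the toy articles are hypothetical and do not come from the paper.

```python
import heapq
import math
from collections import Counter


def bag(text):
    """Lowercased bag-of-words vector for a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(texts):
    """Link every pair of nodes that share a term.

    Assumption: edge weight = 1 - cosine similarity (not taken from the paper).
    """
    vecs = {name: bag(text) for name, text in texts.items()}
    graph = {name: {} for name in texts}
    names = list(texts)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            sim = cosine(vecs[u], vecs[v])
            if sim > 0:
                weight = max(0.0, 1.0 - sim)  # guard against float rounding
                graph[u][v] = graph[v][u] = weight
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list from source to target, or []."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical toy regulation articles and a question split into two parts.
texts = {
    "art1": "scholarship holders must enroll every semester",
    "art2": "enroll requires payment of fees",
    "art3": "library opening hours",
    "Q:A": "scholarship",  # part A of the question
    "Q:B": "fees",         # part B of the question
}
path = shortest_path(build_graph(texts), "Q:A", "Q:B")
relevant_articles = path[1:-1]  # articles lying between the two question parts
```

In this toy setup the two question parts share no terms with each other, so any path between them has to pass through articles; those intermediate nodes are what the sketch returns as the answer.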

Original language	English
Title of host publication	Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings
Pages	498-508
Number of pages	11
DOI	https://doi.org/10.1007/978-3-642-00382-0_40
Status	Published - 2009
Event	10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 - Mexico City, Mexico. Duration: 1 Mar 2009 → 7 Mar 2009

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5449 LNCS
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Conference

Conference	10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009
Country/Territory	Mexico
City	Mexico City
Period	1/03/09 → 7/03/09

Access to document

10.1007/978-3-642-00382-0_40

Other files and links

Link to publication in Scopus

Cite this

Monroy, A., Calvo, H., & Gelbukh, A. (2009). NLP for shallow question answering of legal documents using graphs. In Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings (pp. 498-508). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5449 LNCS). https://doi.org/10.1007/978-3-642-00382-0_40

Monroy, Alfredo ; Calvo, Hiramand ; Gelbukh, Alexander. / NLP for shallow question answering of legal documents using graphs. Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings. 2009. pp. 498-508 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{4f1ccde516234806aeb523c5072094a9,
title = "NLP for shallow question answering of legal documents using graphs",
abstract = "Previous work has shown that modeling relationships between articles of a regulation as a graph network works twice as well as traditional information retrieval systems at returning the articles relevant to a question. In this work we experiment with natural language processing techniques, such as lemmatization and manual and automatic thesauri, to improve question-based document retrieval. To construct the graph, we represent the set of all articles as a graph; the question is split into two parts, and each part is added to the graph. Several paths are then constructed from part A of the question to part B, so that the shortest path contains the articles relevant to the question. We evaluate our method by comparing its answers with those of a traditional information retrieval system (a vector space model adjusted for article retrieval instead of document retrieval) and with the answers to 21 questions given manually by the general lawyer of the National Polytechnic Institute, based on 25 different regulations (academy regulation, scholarships regulation, postgraduate studies regulation, etc.); our system uses the same set of regulations. We found that lemmatization increases performance by around 10%, while the use of thesauri has little impact.",
author = "Alfredo Monroy and Hiramand Calvo and Alexander Gelbukh",
note = "Funding Information: We thank the support of Mexican Government (SNI, SIP-IPN, COFAA-IPN, and PIFI-IPN) and Japanese Government. Second author is a JSPS fellow.; 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 ; Conference date: 01-03-2009 Through 07-03-2009",
year = "2009",
doi = "10.1007/978-3-642-00382-0_40",
language = "English",
isbn = "3642003818",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "498--508",
booktitle = "Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings",
}

Monroy, A, Calvo, H & Gelbukh, A 2009, NLP for shallow question answering of legal documents using graphs. In Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5449 LNCS, pp. 498-508, 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009, Mexico City, Mexico, 1/03/09. https://doi.org/10.1007/978-3-642-00382-0_40

NLP for shallow question answering of legal documents using graphs. / Monroy, Alfredo; Calvo, Hiramand ; Gelbukh, Alexander.
Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings. 2009. p. 498-508 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5449 LNCS).

TY - GEN
T1 - NLP for shallow question answering of legal documents using graphs
AU - Monroy, Alfredo
AU - Calvo, Hiramand
AU - Gelbukh, Alexander
N1 - Funding Information: We thank the support of Mexican Government (SNI, SIP-IPN, COFAA-IPN, and PIFI-IPN) and Japanese Government. Second author is a JSPS fellow.
PY - 2009
Y1 - 2009
AB - Previous work has shown that modeling relationships between articles of a regulation as a graph network works twice as well as traditional information retrieval systems at returning the articles relevant to a question. In this work we experiment with natural language processing techniques, such as lemmatization and manual and automatic thesauri, to improve question-based document retrieval. To construct the graph, we represent the set of all articles as a graph; the question is split into two parts, and each part is added to the graph. Several paths are then constructed from part A of the question to part B, so that the shortest path contains the articles relevant to the question. We evaluate our method by comparing its answers with those of a traditional information retrieval system (a vector space model adjusted for article retrieval instead of document retrieval) and with the answers to 21 questions given manually by the general lawyer of the National Polytechnic Institute, based on 25 different regulations (academy regulation, scholarships regulation, postgraduate studies regulation, etc.); our system uses the same set of regulations. We found that lemmatization increases performance by around 10%, while the use of thesauri has little impact.
UR - http://www.scopus.com/inward/record.url?scp=67650513845&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-00382-0_40
DO - 10.1007/978-3-642-00382-0_40
M3 - Conference contribution
SN - 3642003818
SN - 9783642003813
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 498
EP - 508
BT - Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings
T2 - 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009
Y2 - 1 March 2009 through 7 March 2009
ER -

Monroy A, Calvo H, Gelbukh A. NLP for shallow question answering of legal documents using graphs. In Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings. 2009. p. 498-508. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-642-00382-0_40
