Formal grammar for Hispanic named entities analysis

Grettel Barceló, Eduardo Cendejas, Grigori Sidorov, Igor A. Bolshakov

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

3 Citations (Scopus)

Abstract

Named Entity Recognition (NER) is a widely studied task in natural language processing. Many approaches have been developed to identify and classify named entity strings in specific and open domains. Nevertheless, many NER systems must incorporate external modules to solve the interpretation problems derived from proper nouns. This article focuses on the study of ambiguity in Hispanic nominal sequences, whose constitution poses three main problems: (1) the association of given names and/or surnames; (2) the composition of such elements by means of a connector; and (3) the given name/surname duality. To analyze the magnitude of the problem, two gazetteers were built, one with 93998 given names and the other with 13779 surnames. The gazetteer entries were used as terminal symbols of the proposed grammar to determine the valid interpretations of the nominal sequences; this is done by automatically labeling all the elements that make up each nominal sequence.

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 10th International Conference, CICLing 2009, Proceedings
Pages: 183-194
Number of pages: 12
Status: Published - 2009
Event: 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 - Mexico City, Mexico
Duration: 1 Mar 2009 to 7 Mar 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5449 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009
Country/Territory: Mexico
City: Mexico City
Period: 1/03/09 to 7/03/09
