Web-based sources for an annotated corpus building and composite proper name identification

Sofía N. Galicia-Haro, Alexander Gelbukh, Igor A. Bolshakov

Research output: Chapter in book/report/conference proceedings; Conference contribution; peer-reviewed

1 Citation (Scopus)

Abstract

Collections of texts annotated on several levels are useful resources, but developing them requires a huge effort for languages like Spanish. In this work, we present the initial step, lexical-level annotation, in the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification that such annotation requires. We focus on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). Preliminary results are presented.

Original language: English
Title of host publication: Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings
Editors: Jesus Favela, Ernestina Menasalvas, Edgar Chavez
Publisher: Springer Verlag
Pages: 115-124
Number of pages: 10
ISBN (print): 9783540246817
DOI
Status: Published - 2004
Event: 2nd International Atlantic Web Intelligence Conference, AWIC 2004 - Cancun, Mexico
Duration: 16 May 2004 - 19 May 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3034
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 2nd International Atlantic Web Intelligence Conference, AWIC 2004
Country/Territory: Mexico
City: Cancun
Period: 16/05/04 - 19/05/04
