Web-based sources for an annotated corpus building and composite proper name identification

Sofía N. Galicia-Haro, Alexander Gelbukh, Igor A. Bolshakov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Nowadays, collections of texts with annotations on several levels are useful resources. Developing such a resource for languages like Spanish requires a huge effort. In this work, we present the initial step, lexical-level annotation, in compiling an annotated Mexican corpus from Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focus on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). Preliminary results are presented.

Original language: English
Title of host publication: Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings
Editors: Jesus Favela, Ernestina Menasalvas, Edgar Chavez
Publisher: Springer Verlag
Pages: 115-124
Number of pages: 10
ISBN (Print): 9783540246817
State: Published - 2004
Event: 2nd International Atlantic Web Intelligence Conference, AWIC 2004 - Cancun, Mexico
Duration: 16 May 2004 to 19 May 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3034
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Atlantic Web Intelligence Conference, AWIC 2004
Country/Territory: Mexico
City: Cancun
Period: 16/05/04 to 19/05/04
