Web-based sources for an annotated corpus building and composite proper name identification

Sofía N. Galicia-Haro; Alexander Gelbukh; Igor A. Bolshakov

doi:10.1007/978-3-540-24681-7_14

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Nowadays, collections of texts with annotations on several levels are useful resources. Huge efforts are required to develop this resource for languages like Spanish. In this work, we present the initial step, lexical level annotation, for the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focused our work on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). The preliminary obtained results are presented.

Original language	English
Title of host publication	Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings
Editors	Jesus Favela, Ernestina Menasalvas, Edgar Chavez
Publisher	Springer Verlag
Pages	115-124
Number of pages	10
ISBN (Print)	9783540246817
DOIs	https://doi.org/10.1007/978-3-540-24681-7_14
State	Published - 2004
Event	2nd International Atlantic Web Intelligence Conference, AWIC 2004 - Cancun, Mexico. Duration: 16 May 2004 → 19 May 2004

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	3034
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	2nd International Atlantic Web Intelligence Conference, AWIC 2004
Country/Territory	Mexico
City	Cancun
Period	16/05/04 → 19/05/04

Cite this

Galicia-Haro, S. N., Gelbukh, A., & Bolshakov, I. A. (2004). Web-based sources for an annotated corpus building and composite proper name identification. In J. Favela, E. Menasalvas, & E. Chavez (Eds.), Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings (pp. 115-124). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3034). Springer Verlag. https://doi.org/10.1007/978-3-540-24681-7_14

@inproceedings{ba30ce9c2fde461b8db71a242e871ae0,

title = "Web-based sources for an annotated corpus building and composite proper name identification",

abstract = "Nowadays, collections of texts with annotations on several levels are useful resources. Huge efforts are required to develop this resource for languages like Spanish. In this work, we present the initial step, lexical level annotation, for the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focused our work on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). The preliminary obtained results are presented.",

author = "Galicia-Haro, {Sof{\'i}a N.} and Alexander Gelbukh and Bolshakov, {Igor A.}",

note = "Publisher Copyright: {\textcopyright} Springer-Verlag Berlin Heidelberg 2004.; 2nd International Atlantic Web Intelligence Conference, AWIC 2004 ; Conference date: 16-05-2004 Through 19-05-2004",

year = "2004",

doi = "10.1007/978-3-540-24681-7_14",

language = "English",

isbn = "9783540246817",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "115--124",

editor = "Jesus Favela and Ernestina Menasalvas and Edgar Chavez",

booktitle = "Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings",

address = "Germany",

}

TY - GEN

T1 - Web-based sources for an annotated corpus building and composite proper name identification

AU - Galicia-Haro, Sofía N.

AU - Gelbukh, Alexander

AU - Bolshakov, Igor A.

N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2004.

PY - 2004

Y1 - 2004

N2 - Nowadays, collections of texts with annotations on several levels are useful resources. Huge efforts are required to develop this resource for languages like Spanish. In this work, we present the initial step, lexical level annotation, for the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focused our work on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). The preliminary obtained results are presented.

AB - Nowadays, collections of texts with annotations on several levels are useful resources. Huge efforts are required to develop this resource for languages like Spanish. In this work, we present the initial step, lexical level annotation, for the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focused our work on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). The preliminary obtained results are presented.

UR - http://www.scopus.com/inward/record.url?scp=7444222478&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-24681-7_14

DO - 10.1007/978-3-540-24681-7_14

M3 - Conference contribution

SN - 9783540246817

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 115

EP - 124

BT - Advances in Web Intelligence - 2nd International Atlantic Web Intelligence Conference, AWIC 2004, Proceedings

A2 - Favela, Jesus

A2 - Menasalvas, Ernestina

A2 - Chavez, Edgar

PB - Springer Verlag

T2 - 2nd International Atlantic Web Intelligence Conference, AWIC 2004

Y2 - 16 May 2004 through 19 May 2004

ER -
