TY - JOUR
T1 - Complex named entities in Spanish texts
T2 - Structures and properties
AU - Galicia-Haro, Sofía N.
AU - Gelbukh, Alexander
PY - 2007
Y1 - 2007
N2 - We present a linguistic analysis of named entities in Spanish texts. Our work focuses on determining the structure of complex proper names: names with coordinated constituents, names with prepositional phrases, and names formed by several content words beginning with a capital letter. We present an analysis of circa 49,000 examples obtained from Mexican newspapers. We detail their structure and give some observations about the context surrounding them. Since named entities belong to an open class of words, new ones are created daily, so the challenge for a named entity recognizer is to determine precisely the boundaries of new entity names in any text and to analyze their components thoroughly for deep semantic analysis. Knowing their general structural classes, it should be possible to derive useful heuristics or a specific grammar for natural language processing applications.
AB - We present a linguistic analysis of named entities in Spanish texts. Our work focuses on determining the structure of complex proper names: names with coordinated constituents, names with prepositional phrases, and names formed by several content words beginning with a capital letter. We present an analysis of circa 49,000 examples obtained from Mexican newspapers. We detail their structure and give some observations about the context surrounding them. Since named entities belong to an open class of words, new ones are created daily, so the challenge for a named entity recognizer is to determine precisely the boundaries of new entity names in any text and to analyze their components thoroughly for deep semantic analysis. Knowing their general structural classes, it should be possible to derive useful heuristics or a specific grammar for natural language processing applications.
KW - Conjunctions
KW - Corpus linguistics
KW - Discourse structure
KW - Named entity recognition
KW - Natural language processing
KW - Prepositions
UR - http://www.scopus.com/inward/record.url?scp=51249146812&partnerID=8YFLogxK
U2 - 10.1075/li.30.1.06gal
DO - 10.1075/li.30.1.06gal
M3 - Article
SN - 0378-4169
VL - 30
SP - 69
EP - 94
JO - Lingvisticae Investigationes
JF - Lingvisticae Investigationes
IS - 1
ER -