A simple Spanish part of speech tagger for detection and correction of accentuation error

S. N. Galicia-Haro; I. A. Bolshakov; A. F. Gelbukh

doi:10.1007/3-540-48239-3_40

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

One of the most frequent kinds of typographic error specific to Spanish concerns accentuation: the omission of an obligatory stress mark or the insertion of a superfluous one. If such an error turns one word into another existing word, ordinary spell-checkers cannot detect it, since some context analysis is necessary. A simple procedure is proposed for this task. It relies on (1) simple heuristics that determine the linear context and (2) a small list of word pairs that differ only in the accent mark. The idea is applied to the numerous nouns and adjectives, such as número, that turn into quasi-homonymous personal verb forms when they lose their stress marks.
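The two ingredients named in the abstract, a small list of accent pairs and heuristics over the linear context, can be illustrated with a minimal Python sketch. The abstract does not give the actual word list or rules, so everything here is an assumption made for illustration: the names `ACCENT_PAIRS`, `NOMINAL_CUES`, `VERBAL_CUES` and `check_accents` are hypothetical, the three word pairs are just common examples, and the only context inspected is the immediately preceding word.

```python
import re
import unicodedata

# Hypothetical pair list (not from the paper): accented noun/adjective
# mapped to the unaccented first-person verb form.
ACCENT_PAIRS = {
    "número": "numero",    # "number" vs. "I number"
    "término": "termino",  # "term" vs. "I finish"
    "público": "publico",  # "public" vs. "I publish"
}
UNACCENTED_TO_ACCENTED = {verb: noun for noun, verb in ACCENT_PAIRS.items()}

# Assumed context cues: a preceding determiner suggests a noun or adjective,
# a preceding subject pronoun or adverb suggests a verb.
NOMINAL_CUES = {"el", "un", "este", "ese", "su", "del", "al", "cada"}
VERBAL_CUES = {"yo", "no", "ya"}


def check_accents(sentence):
    """Return (token index, word found, suggested word) for suspicious words."""
    # Compose accents into single code points so that \w+ keeps words whole.
    text = unicodedata.normalize("NFC", sentence).lower()
    tokens = re.findall(r"\w+", text)
    suggestions = []
    for i, word in enumerate(tokens):
        previous = tokens[i - 1] if i > 0 else ""
        if word in UNACCENTED_TO_ACCENTED and previous in NOMINAL_CUES:
            # Verb form in a nominal context: the stress mark was probably omitted.
            suggestions.append((i, word, UNACCENTED_TO_ACCENTED[word]))
        elif word in ACCENT_PAIRS and previous in VERBAL_CUES:
            # Noun form in a verbal context: the stress mark is probably superfluous.
            suggestions.append((i, word, ACCENT_PAIRS[word]))
    return suggestions


print(check_accents("El numero de casos crece"))  # [(1, 'numero', 'número')]
print(check_accents("Yo número las páginas"))     # [(1, 'número', 'numero')]
```

A one-word look-back is the crudest possible notion of linear context; it is used here only to show how a pair list and a context test combine to catch errors that a dictionary lookup alone would accept.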

Original language	English
Title of host publication	Text, Speech and Dialogue - 2nd International Workshop, TSD 1999, Proceedings
Editors	Václav Matousek, Pavel Mautner, Jana Oceláková, Petr Sojka
Publisher	Springer Verlag
Pages	219-222
Number of pages	4
ISBN (Print)	3540664947, 9783540664949
DOI	https://doi.org/10.1007/3-540-48239-3_40
Publication status	Published - 1999
Event	2nd International Workshop on Text, Speech and Dialogue, TSD 1999 - Plzen, Czech Republic. Duration: 13 Sep 1999 → 17 Sep 1999

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	1692
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	2nd International Workshop on Text, Speech and Dialogue, TSD 1999
Country/Territory	Czech Republic
City	Plzen
Period	13/09/99 → 17/09/99

Cite this

Galicia-Haro, S. N., Bolshakov, I. A., & Gelbukh, A. F. (1999). A simple Spanish part of speech tagger for detection and correction of accentuation error. In V. Matousek, P. Mautner, J. Oceláková, & P. Sojka (Eds.), Text, Speech and Dialogue - 2nd International Workshop, TSD 1999, Proceedings (pp. 219-222). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1692). Springer Verlag. https://doi.org/10.1007/3-540-48239-3_40

@inproceedings{e5889023f4b0463c848f78a9c48e093c,
title = "A simple {Spanish} part of speech tagger for detection and correction of accentuation error",
abstract = "One of the most frequent kinds of typographic error specific to Spanish concerns accentuation: the omission of an obligatory stress mark or the insertion of a superfluous one. If such an error turns one word into another existing word, ordinary spell-checkers cannot detect it, since some context analysis is necessary. A simple procedure is proposed for this task. It relies on (1) simple heuristics that determine the linear context and (2) a small list of word pairs that differ only in the accent mark. The idea is applied to the numerous nouns and adjectives, such as n{\'u}mero, that turn into quasi-homonymous personal verb forms when they lose their stress marks.",
author = "Galicia-Haro, {S. N.} and Bolshakov, {I. A.} and Gelbukh, {A. F.}",
note = "Publisher Copyright: {\textcopyright} Springer-Verlag Berlin Heidelberg 1999.; 2nd International Workshop on Text, Speech and Dialogue, TSD 1999 ; Conference date: 13-09-1999 Through 17-09-1999",
year = "1999",
doi = "10.1007/3-540-48239-3_40",
language = "English",
isbn = "3540664947",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "219--222",
editor = "V{\'a}clav Matousek and Pavel Mautner and Jana Ocel{\'a}kov{\'a} and Petr Sojka",
booktitle = "Text, Speech and Dialogue - 2nd International Workshop, TSD 1999, Proceedings",
address = "Germany",
}

TY - GEN
T1 - A simple Spanish part of speech tagger for detection and correction of accentuation error
AU - Galicia-Haro, S. N.
AU - Bolshakov, I. A.
AU - Gelbukh, A. F.
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 1999.
PY - 1999
Y1 - 1999
AB - One of the most frequent kinds of typographic error specific to Spanish concerns accentuation: the omission of an obligatory stress mark or the insertion of a superfluous one. If such an error turns one word into another existing word, ordinary spell-checkers cannot detect it, since some context analysis is necessary. A simple procedure is proposed for this task. It relies on (1) simple heuristics that determine the linear context and (2) a small list of word pairs that differ only in the accent mark. The idea is applied to the numerous nouns and adjectives, such as número, that turn into quasi-homonymous personal verb forms when they lose their stress marks.
UR - http://www.scopus.com/inward/record.url?scp=84957803877&partnerID=8YFLogxK
U2 - 10.1007/3-540-48239-3_40
DO - 10.1007/3-540-48239-3_40
M3 - Conference contribution
SN - 3540664947
SN - 9783540664949
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 219
EP - 222
BT - Text, Speech and Dialogue - 2nd International Workshop, TSD 1999, Proceedings
A2 - Matousek, Václav
A2 - Mautner, Pavel
A2 - Oceláková, Jana
A2 - Sojka, Petr
PB - Springer Verlag
T2 - 2nd International Workshop on Text, Speech and Dialogue, TSD 1999
Y2 - 13 September 1999 through 17 September 1999
ER -
