Example of application of n-grams: Authorship attribution using syllables

Research output: Chapter in book/report/conference proceeding, Chapter, peer-reviewed

1 Citation (Scopus)

Abstract

As described in the previous chapters, the mainstream of modern computational linguistics is based on the application of machine learning methods. We cast the task as a classification problem, represent the objects formally through features and their values (constructing a vector space model), and then apply well-known classification algorithms. In this pipeline, the crucial question is how to select the features. For example, we can use words, word n-grams (sequences of words), or character n-grams (sequences of characters) as features. An interesting question arises: can we use syllables as features? This is very rarely done in computational linguistics, even though syllables reflect a certain linguistic reality. This chapter explores this possibility for the authorship attribution task; it follows our research paper [99]. Note that syllables are somewhat similar to character n-grams in that they consist of several characters and are not too long.
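The feature choices named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's method: the function names (`ngrams`, `naive_syllables`, `feature_counts`) are hypothetical, and the syllable splitter is a crude vowel-group heuristic assumed here only for illustration, where a real system would use a proper language-specific syllabifier. The sketch shows only how word, character, and syllable n-grams can each be counted into a sparse feature vector for one text.

```python
import re
from collections import Counter


def ngrams(items, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def naive_syllables(word):
    """Rough syllable split (assumption, not a linguistic syllabifier).

    Each chunk is an optional consonant run followed by a vowel group;
    any trailing consonants are attached to the last chunk.
    """
    word = word.lower()
    chunks = re.findall(r"[^aeiouy]*[aeiouy]+", word)
    if not chunks:
        return [word]  # no vowels at all: keep the word as one unit
    consumed = sum(len(chunk) for chunk in chunks)
    chunks[-1] += word[consumed:]
    return chunks


def feature_counts(text, unit="word", n=2):
    """Count n-grams of the chosen unit: 'word', 'char', or 'syllable'."""
    words = re.findall(r"[a-z]+", text.lower())
    if unit == "word":
        sequence = words
    elif unit == "char":
        sequence = list(" ".join(words))
    elif unit == "syllable":
        sequence = [s for w in words for s in naive_syllables(w)]
    else:
        raise ValueError(f"unknown unit: {unit}")
    return Counter(ngrams(sequence, n))


if __name__ == "__main__":
    sample = "Can we use syllables as features"
    for unit in ("word", "char", "syllable"):
        print(unit, feature_counts(sample, unit=unit, n=2).most_common(3))
```

Counting the same text under each unit gives three different vector space representations, any of which could then be passed to a standard classifier trained on texts of known authorship.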

Original language: English
Title of host publication: SpringerBriefs in Computer Science
Publisher: Springer
Pages: 27-39
Number of pages: 13
Status: Published - 2019

Publication series

Name: SpringerBriefs in Computer Science
ISSN (print): 2191-5768
ISSN (electronic): 2191-5776
