Prior latent distribution comparison for the RNN Variational Autoencoder in low-resource language modeling

Yevhen Kostiuk, Mykola Lukashchuk, Alexander Gelbukh, Grigori Sidorov

Research output: Contribution to journal › Article › peer-review

Abstract

Probabilistic Bayesian methods are widely used in machine learning. The Variational Autoencoder (VAE) is a common architecture for solving the language modeling task in a self-supervised way. A VAE is built around latent variables: random variables whose distribution is fit to the data. To date, in the majority of cases, the latent variables are assumed to be normally distributed. The normal distribution is well understood and easy to include in any pipeline; moreover, it is a good choice when the Central Limit Theorem (CLT) holds, which makes it effective when working with i.i.d. (independent and identically distributed) random variables. However, the CLT conditions are hard to verify in Natural Language Processing, so the choice of distribution family in this domain is unclear. This paper studies the impact of prior selection among continuous distributions in the low-resource language modeling task with a VAE. Our experiments show a statistically significant difference between different priors in the encoder-decoder architecture. We show that the distribution family is an important hyperparameter in the low-resource language modeling task and should be considered when training the model.
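To illustrate how the prior family enters the training objective as a swappable hyperparameter, the sketch below (not from the paper; a minimal PyTorch example) estimates the KL term of the VAE objective by Monte Carlo, which works for any prior with a tractable log-density. The encoder layout, the candidate priors, and all names and sizes (RNNEncoder, kl_monte_carlo, latent_dim, the GRU/embedding widths) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
from torch.distributions import Laplace, Normal


class RNNEncoder(nn.Module):
    # GRU encoder mapping a token sequence to posterior parameters (mu, log-scale).
    # Layer sizes are illustrative, not taken from the paper.
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logscale = nn.Linear(hidden_dim, latent_dim)

    def forward(self, tokens):
        _, h = self.rnn(self.embed(tokens))  # h: (1, batch, hidden_dim)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logscale(h)


def kl_monte_carlo(mu, logscale, prior, n_samples=8):
    # Monte Carlo estimate of KL(q(z|x) || p(z)). It only needs the prior's
    # log_prob, so the prior family remains a one-line hyperparameter choice.
    q = Normal(mu, logscale.exp())
    z = q.rsample((n_samples,))  # reparameterized samples keep gradients flowing
    return (q.log_prob(z) - prior.log_prob(z)).mean(0).sum(-1)


latent_dim = 16
priors = {  # candidate prior families to compare
    "normal": Normal(torch.zeros(latent_dim), torch.ones(latent_dim)),
    "laplace": Laplace(torch.zeros(latent_dim), torch.ones(latent_dim)),
}

encoder = RNNEncoder(vocab_size=1000, latent_dim=latent_dim)
tokens = torch.randint(0, 1000, (4, 12))  # toy batch: 4 sequences of 12 token ids
mu, logscale = encoder(tokens)
for name, prior in priors.items():
    print(name, kl_monte_carlo(mu, logscale, prior).mean().item())

Estimating the KL term by sampling rather than in closed form is what keeps the prior interchangeable: only the Normal prior paired with a Gaussian posterior has the familiar analytic KL expression.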

Original language: English
Pages (from-to): 4541-4549
Number of pages: 9
Journal: Journal of Intelligent and Fuzzy Systems
Volume: 42
Issue number: 5
DOI
Status: Published - 2022
