Abstract
Machine learning gives systems the ability to learn from experience. This is achieved through the generation of machine learning models. One of the most widely used approaches is supervised learning, which employs classification models that allow a computer program to learn from input data in order to produce classifications. Input and output data are labelled for classification, providing a learning base for processing future data. The C4.5 algorithm builds classification models, called decision trees, from a database. This algorithm uses the entropy defined by Shannon to calculate the gain ratio. In this study, the Tsallis and Rényi entropies are used instead of Shannon's to construct a decision tree; in previous works, these entropies have shown better results than Shannon's. Both include an additional parameter q that modifies the weight given to the probability distribution. This research focuses on developing a method that obtains the value of q to be applied when computing the information gain ratio in the C4.5 algorithm using the Tsallis and Rényi entropies. The method obtains a network representation of the database; then, the box-covering algorithm is applied to obtain the minimum number of boxes needed to cover the network. The calculation of the parameter q depends on this minimum network coverage.
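The three entropies compared in the abstract can be sketched as follows. This is a minimal illustration of the formulas (in nats, i.e. natural log, with function names of our own choosing), not the authors' implementation: Shannon entropy is $-\sum_i p_i \ln p_i$, Tsallis entropy is $(1 - \sum_i p_i^q)/(q-1)$, and Rényi entropy is $\ln(\sum_i p_i^q)/(1-q)$; both generalized entropies recover Shannon's in the limit q → 1.

```python
import math

def shannon(p):
    """Shannon entropy in nats: -sum(p * ln p), ignoring zero-probability terms."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def tsallis(p, q):
    """Tsallis entropy with entropic index q (q != 1); q -> 1 recovers Shannon."""
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

def renyi(p, q):
    """Rényi entropy with entropic index q (q != 1); q -> 1 recovers Shannon."""
    return math.log(sum(pi ** q for pi in p)) / (1.0 - q)

# Example: for a uniform distribution over 4 outcomes, Shannon and Rényi
# (any q) coincide at ln(4), while Tsallis with q = 2 gives 1 - 1/4 = 0.75.
uniform = [0.25, 0.25, 0.25, 0.25]
print(shannon(uniform), renyi(uniform, 2.0), tsallis(uniform, 2.0))
```

In a C4.5-style tree, one of these entropy functions would be evaluated on the class-label distribution of each candidate split to compute the information gain ratio; the study's contribution is choosing q from the network's minimum box coverage rather than by manual tuning.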
| Field | Value |
|---|---|
| Translated title of the contribution | Comparative Analysis of Entropic Index in Databases Using Tsallis and Renyi Entropy in C4.5 Classification Trees |
| Original language | Spanish |
| Publication | Revista Internacional de Investigación e Innovación Tecnológica |
| Volume | 10 |
| No. | 59 |
| Status | Published - 1 Nov 2022 |