TY - JOUR
T1 - ANCONE
T2 - An interactive system for mining and visualization of students’ information in the context of PLANEA 2015
AU - Márquez, Arturo Heredia
AU - Poot, Angel Chi
AU - Arenas, Adolfo Guzmán
AU - Luna, Gilberto Martínez
N1 - Publisher Copyright: © 2020 Instituto Politecnico Nacional. All rights reserved.
PY - 2020
Y1 - 2020
AB - Data mining has been widely used in many areas of knowledge, and education is no exception. Data mining uses computer models to analyze data and answer research questions in support of decision making. This article uses data from the PLANEA 2015 Mathematics test for the last year of middle school, which measures academic achievement and provides personal, family and school context, in order to find the characteristics that are related to the academic level of the tested students. An interactive visualization system was developed that allows interesting patterns and association rules to be observed by combining relevant attributes (variables) and the States. To reduce the analysis space, the Correlation-Based Feature Selection method was used to reduce the number of categorical and numerical attributes. The results show a significant reduction (93%) in the number of attributes, with very little loss of information, when certain attributes are eliminated. In particular, the 232 categorical attributes obtained for each student are reduced to only 18 attributes, which are correlated with the students' results in the PLANEA test. In addition, it was found empirically that choosing the mode of the plausible-value labels as the target class increases the accuracy of the classifiers used to show the goodness of the reduction obtained. Some of the relevant attributes are "AcademicAspiration", "FamilyResources", "MotherStudies" and "FatherStudies". Of the 30 States with information, only 8 are at the Basic level; the other States are at the Below Basic level.
KW - ANCONE
KW - Data mining
KW - Dimensionality reduction
KW - Education
KW - INEE
KW - PLANEA
UR - http://www.scopus.com/inward/record.url?scp=85086671802&partnerID=8YFLogxK
U2 - 10.13053/CyS-24-1-3113
DO - 10.13053/CyS-24-1-3113
M3 - Article
AN - SCOPUS:85086671802
SN - 1405-5546
VL - 24
SP - 151
EP - 176
JO - Computacion y Sistemas
JF - Computacion y Sistemas
IS - 1
ER -