TY - JOUR
T1 - Climate patterns of political division units obtained using automatic classification trees
AU - Coria, Sergio R.
AU - Gay-García, Carlos
AU - Villers-Ruiz, Lourdes
AU - Guzmán-Arenas, Adolfo
AU - Sánchez-Meneses, Oscar
AU - Ávila-Barrón, Oswaldo R.
AU - Pérez-Meza, Mónica
AU - Cruz-Núñez, Xóchitl
AU - Martínez-Luna, Gilberto Lorenzo
N1 - Publisher Copyright: © 2016 Universidad Nacional Autónoma de México
PY - 2016/10/1
Y1 - 2016/10/1
N2 - This article proposes a methodology to discover patterns in observed climatological data, particularly temperature and rainfall, in subnational political division units using an automatic classification algorithm (a decision tree produced by the C4.5 algorithm). The patterns are thus represented as classification trees, assuming that: (1) every political division unit contains at least one climatological station, and (2) the recording periods of the stations are relatively similar in duration and in their initial and ending years. A series of classification models is produced using different subsets of an experimental dataset. This dataset contains information from 3606 climatological stations in Mexico whose recording periods are diverse in duration and in initial and ending years. The target (dependent) variable in all these models is the name of the political unit (i.e., the state). The predictors are 36 monthly features per climatological station: 12 for minimum temperature, 12 for maximum temperature, and 12 for cumulative rainfall. Altitude is also used as a predictor in addition to the other 36, but only to quantify its additional contribution to the modelling. The results show that classification trees are effective models for describing and representing non-trivial patterns that characterize the political division units based on their monthly temperatures and rainfall. One remarkable finding is that the cumulative rainfall of May is the feature with the highest discrimination capability for the characterization task, which is consistent with the theoretical background on Mexican climatology. In addition, classification trees offer higher expressivity to non-experts in machine learning.
KW - C4.5 algorithm
KW - Climate patterns
KW - Mexico climate
KW - classification algorithms
KW - classification trees
KW - data mining
KW - data science
KW - political division
UR - http://www.scopus.com/inward/record.url?scp=85015357578&partnerID=8YFLogxK
U2 - 10.20937/ATM.2016.29.04.06
DO - 10.20937/ATM.2016.29.04.06
M3 - Article
SN - 0187-6236
VL - 29
SP - 359
EP - 377
JO - Atmosfera
JF - Atmosfera
IS - 4
ER -