TY - GEN
T1 - Hot Spots & Hot Regions Detection Using Classification Algorithms in BMPs Complexes at the Protein-Protein Interface with the Ground-State Energy Feature
AU - Chaparro-Amaro, O.
AU - Martínez-Felipe, M.
AU - Martínez-Castro, J.
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - We present the results of applying several machine learning classification algorithms to predict hot-spot and hot-region residues at the protein-protein interface between the polypeptide chains of protein complexes. The dataset consisted of twenty-nine bone morphogenetic proteins (BMPs) obtained from the Protein Data Bank (PDB). The training features were selected from biochemical and biophysical properties of each amino acid at these interfaces, such as B-factor, hydrophobicity index, prevalence score, accessible surface area (ASA), conservation score, and ground-state energy computed with Density Functional Theory (DFT). We also used parallel CPU/GPU hardware acceleration during preprocessing to speed up the ASA and DFT calculations. We evaluated the performance of the classifiers with several metrics. The random forest classifier performed best, with true negative and true positive rates both averaging 90%.
AB - We present the results of applying several machine learning classification algorithms to predict hot-spot and hot-region residues at the protein-protein interface between the polypeptide chains of protein complexes. The dataset consisted of twenty-nine bone morphogenetic proteins (BMPs) obtained from the Protein Data Bank (PDB). The training features were selected from biochemical and biophysical properties of each amino acid at these interfaces, such as B-factor, hydrophobicity index, prevalence score, accessible surface area (ASA), conservation score, and ground-state energy computed with Density Functional Theory (DFT). We also used parallel CPU/GPU hardware acceleration during preprocessing to speed up the ASA and DFT calculations. We evaluated the performance of the classifiers with several metrics. The random forest classifier performed best, with true negative and true positive rates both averaging 90%.
KW - BMPs
KW - DFT
KW - Hot regions
KW - Hot spots
UR - http://www.scopus.com/inward/record.url?scp=85132999260&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-07750-0_1
DO - 10.1007/978-3-031-07750-0_1
M3 - Conference contribution
AN - SCOPUS:85132999260
SN - 9783031077494
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 14
BT - Pattern Recognition - 14th Mexican Conference, MCPR 2022, Proceedings
A2 - Vergara-Villegas, Osslan Osiris
A2 - Cruz-Sánchez, Vianey Guadalupe
A2 - Sossa-Azuela, Juan Humberto
A2 - Carrasco-Ochoa, Jesús Ariel
A2 - Martínez-Trinidad, José Francisco
A2 - Olvera-López, José Arturo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 14th Mexican Conference on Pattern Recognition, MCPR 2022
Y2 - 22 June 2022 through 25 June 2022
ER -