TY - JOUR
T1 - ReinforSec
T2 - An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning
AU - Hernandez-Suarez, Aldo
AU - Sanchez-Perez, Gabriel
AU - Toscano-Medina, Linda K.
AU - Perez-Meana, Hector
AU - Olivares-Mercado, Jesus
AU - Portillo-Portillo, Jose
AU - Benitez-Garcia, Gibran
AU - Sandoval Orozco, Ana Lucila
AU - García Villalba, Luis Javier
N1 - Publisher Copyright:
© 2023 by the authors.
PY - 2023/2
Y1 - 2023/2
N2 - In recent years, cybersecurity has been strengthened through the adoption of processes, mechanisms and rapid sources of indicators of compromise in critical areas. Among the most latent challenges are the detection, classification and eradication of malware and Denial-of-Service (DoS) cyber-attacks. The literature has presented different ways to obtain and evaluate malware- and DoS-cyber-attack-related instances, either from a technical point of view or by offering ready-to-use datasets. However, acquiring fresh, up-to-date samples requires an arduous process of exploration, sandbox configuration and mass storage, which may ultimately result in an unbalanced or under-represented set. Synthetic sample generation has been shown to reduce the cost of setting up controlled environments and the time spent on sample evaluation. Nevertheless, this process is performed on observations that already belong to a characterized set, totally detached from a real environment. To address these limitations, this work proposes a methodology for the generation of synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks. The task is performed via a Reinforcement Learning engine, which learns from a baseline of different malware families and DoS cyber-attack network properties, resulting in new, mutated and highly functional samples. Experimental results demonstrate the high adaptability of the outputs as new input datasets for different Machine Learning algorithms.
KW - artificial intelligence
KW - cybersecurity
KW - cybersecurity datasets
KW - denial-of-service
KW - machine learning
KW - malware
KW - q-learning
KW - reinforcement learning
KW - synthetic sampling
UR - http://www.scopus.com/inward/record.url?scp=85147854683&partnerID=8YFLogxK
U2 - 10.3390/s23031231
DO - 10.3390/s23031231
M3 - Article
C2 - 36772270
AN - SCOPUS:85147854683
SN - 1424-8220
VL - 23
JO - Sensors
JF - Sensors
IS - 3
M1 - 1231
ER -