ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning

Aldo Hernandez-Suarez; Gabriel Sanchez-Perez; Linda K. Toscano-Medina; Hector Perez-Meana; Jesus Olivares-Mercado; Jose Portillo-Portillo; Gibran Benitez-Garcia; Ana Lucila Sandoval Orozco; Luis Javier García Villalba

doi:10.3390/s23031231

Escuela Superior de Ingeniería Mecánica y Eléctrica (ESIME), Unidad Culhuacán

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

In recent years, cybersecurity has been strengthened through the adoption of processes, mechanisms and rapid sources of indicators of compromise in critical areas. Among the most persistent challenges are the detection, classification and eradication of malware and Denial-of-Service (DoS) cyber-attacks. The literature has presented different ways to obtain and evaluate malware- and DoS-cyber-attack-related instances, either from a technical point of view or by offering ready-to-use datasets. However, acquiring fresh, up-to-date samples requires an arduous process of exploration, sandbox configuration and mass storage, which may ultimately result in an unbalanced or under-represented set. Synthetic sample generation has been shown to reduce the cost of setting up controlled environments and the time spent on sample evaluation. Nevertheless, it is typically applied only to observations that already belong to a characterized set, totally detached from a real environment. To address this limitation, this work proposes a methodology for generating synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks. The task is performed by a Reinforcement Learning engine, which learns from a baseline of different malware families and DoS cyber-attack network properties and produces new, mutated and highly functional samples. Experimental results demonstrate the high adaptability of the outputs as new input datasets for different Machine Learning algorithms.

Original language	English
Article number	1231
Journal	Sensors
Volume	23
Issue number	3
DOIs	https://doi.org/10.3390/s23031231
State	Published - Feb 2023

Keywords

artificial intelligence
cybersecurity
cybersecurity datasets
denial-of-service
machine learning
malware
q-learning
reinforcement learning
synthetic sampling

Access to Document

10.3390/s23031231

Cite this

Hernandez-Suarez, A., Sanchez-Perez, G., Toscano-Medina, L. K., Perez-Meana, H., Olivares-Mercado, J., Portillo-Portillo, J., Benitez-Garcia, G., Sandoval Orozco, A. L., & García Villalba, L. J. (2023). ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning. Sensors, 23(3), Article 1231. https://doi.org/10.3390/s23031231

@article{71dba8aead27452a8e844cc0166fc6df,

title = "ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning",

abstract = "In recent years, cybersecurity has been strengthened through the adoption of processes, mechanisms and rapid sources of indicators of compromise in critical areas. Among the most latent challenges are the detection, classification and eradication of malware and Denial of Service Cyber-Attacks (DoS). The literature has presented different ways to obtain and evaluate malware- and DoS-cyber-attack-related instances, either from a technical point of view or by offering ready-to-use datasets. However, acquiring fresh, up-to-date samples requires an arduous process of exploration, sandbox configuration and mass storage, which may ultimately result in an unbalanced or under-represented set. Synthetic sample generation has shown that the cost associated with setting up controlled environments and time spent on sample evaluation can be reduced. Nevertheless, the process is performed when the observations already belong to a characterized set, totally detached from a real environment. In order to solve the aforementioned, this work proposes a methodology for the generation of synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks. The task is performed via a Reinforcement Learning engine, which learns from a baseline of different malware families and DoS cyber-attack network properties, resulting in new, mutated and highly functional samples. Experimental results demonstrate the high adaptability of the outputs as new input datasets for different Machine Learning algorithms.",

keywords = "artificial intelligence, cybersecurity, cybersecurity datasets, denial-of-service, machine learning, malware, q-learning, reinforcement learning, synthetic sampling",

author = "Aldo Hernandez-Suarez and Gabriel Sanchez-Perez and Toscano-Medina, {Linda K.} and Hector Perez-Meana and Jesus Olivares-Mercado and Jose Portillo-Portillo and Gibran Benitez-Garcia and {Sandoval Orozco}, {Ana Lucila} and {Garc{\'i}a Villalba}, {Luis Javier}",

note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",

year = "2023",

month = feb,

doi = "10.3390/s23031231",

language = "English",

volume = "23",

journal = "Sensors",

issn = "1424-8220",

number = "3",

}

Hernandez-Suarez, A, Sanchez-Perez, G, Toscano-Medina, LK, Perez-Meana, H, Olivares-Mercado, J, Portillo-Portillo, J, Benitez-Garcia, G, Sandoval Orozco, AL & García Villalba, LJ 2023, 'ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning', Sensors, vol. 23, no. 3, 1231. https://doi.org/10.3390/s23031231

TY - JOUR

T1 - ReinforSec

T2 - An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning

AU - Hernandez-Suarez, Aldo

AU - Sanchez-Perez, Gabriel

AU - Toscano-Medina, Linda K.

AU - Perez-Meana, Hector

AU - Olivares-Mercado, Jesus

AU - Portillo-Portillo, Jose

AU - Benitez-Garcia, Gibran

AU - Sandoval Orozco, Ana Lucila

AU - García Villalba, Luis Javier

PY - 2023/2

Y1 - 2023/2

AB - In recent years, cybersecurity has been strengthened through the adoption of processes, mechanisms and rapid sources of indicators of compromise in critical areas. Among the most latent challenges are the detection, classification and eradication of malware and Denial of Service Cyber-Attacks (DoS). The literature has presented different ways to obtain and evaluate malware- and DoS-cyber-attack-related instances, either from a technical point of view or by offering ready-to-use datasets. However, acquiring fresh, up-to-date samples requires an arduous process of exploration, sandbox configuration and mass storage, which may ultimately result in an unbalanced or under-represented set. Synthetic sample generation has shown that the cost associated with setting up controlled environments and time spent on sample evaluation can be reduced. Nevertheless, the process is performed when the observations already belong to a characterized set, totally detached from a real environment. In order to solve the aforementioned, this work proposes a methodology for the generation of synthetic samples of malicious Portable Executable binaries and DoS cyber-attacks. The task is performed via a Reinforcement Learning engine, which learns from a baseline of different malware families and DoS cyber-attack network properties, resulting in new, mutated and highly functional samples. Experimental results demonstrate the high adaptability of the outputs as new input datasets for different Machine Learning algorithms.

KW - artificial intelligence

KW - cybersecurity

KW - cybersecurity datasets

KW - denial-of-service

KW - machine learning

KW - malware

KW - q-learning

KW - reinforcement learning

KW - synthetic sampling

UR - http://www.scopus.com/inward/record.url?scp=85147854683&partnerID=8YFLogxK

U2 - 10.3390/s23031231

DO - 10.3390/s23031231

M3 - Article

C2 - 36772270

AN - SCOPUS:85147854683

SN - 1424-8220

VL - 23

JO - Sensors

JF - Sensors

IS - 3

M1 - 1231

ER -
