A Semi-supervised learning methodology for malware categorization using weighted word embeddings

Hugo Leonardo Duarte-Garcia, Carlos Domenick Morales-Medina, Aldo Hernandez-Suarez, Gabriel Sanchez-Perez, Karina Toscano-Medina, Hector Perez-Meana, Victor Sanchez, Ana Lucila Sandoval Orozco

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

9 Citations (Scopus)

Abstract

Owing to the rapid growth in the number of malicious actors, malware is now crafted, distributed and propagated around the world using new and sophisticated techniques. Classical malware detection procedures, mostly based on signatures and heuristic searches, are being replaced by machine learning (ML)-based solutions. However, some challenges remain. First, supervised approaches rely on anti-virus tags to build hand-crafted datasets, which results in the lack of a consistent taxonomy and in uncertainty about whether a given observation carries the proper label. Second, off-line and feed-forward approaches may require complex and time-consuming feature extraction. In this work, we propose a novel method that reinforces malware characterization by capturing rich relevance and contextual patterns in an n-dimensional weighted word embedding vector (WEV) space. Results show that, by clustering similar WEVs via unsupervised learning, malware can be categorized into four major families, improving detection with fewer resources.
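The general idea of clustering weighted word embedding vectors can be illustrated with a minimal sketch. The abstract does not name the weighting scheme or the clustering algorithm, so the choices here are assumptions made only for illustration: a smoothed TF-IDF weight per token, a weighted average of token embeddings per sample, and a small k-means with farthest-point initialization. The token sequences, the two-dimensional embeddings and all function names are hypothetical toy values, not data or code from the paper.

```python
import math
from collections import Counter

# Hypothetical toy data: one token sequence (e.g. API calls) per sample.
SAMPLES = {
    "s1": ["open", "read", "send", "send"],
    "s2": ["open", "read", "send"],
    "s3": ["encrypt", "delete", "encrypt"],
    "s4": ["encrypt", "delete", "open"],
}

# Hypothetical 2-dimensional embeddings; real ones would be learned from data.
EMBEDDINGS = {
    "open": (0.5, 0.5),
    "read": (0.9, 0.1),
    "send": (1.0, 0.0),
    "encrypt": (0.0, 1.0),
    "delete": (0.1, 0.9),
}


def idf_weights(samples):
    """Smoothed inverse document frequency per token (assumed weighting)."""
    n = len(samples)
    df = Counter(tok for toks in samples.values() for tok in set(toks))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def weighted_embedding(tokens, idf):
    """TF-IDF-weighted average of the token embeddings of one sample."""
    dim = len(next(iter(EMBEDDINGS.values())))
    acc = [0.0] * dim
    total = 0.0
    for tok, count in Counter(tokens).items():
        w = count * idf[tok]
        total += w
        for i, x in enumerate(EMBEDDINGS[tok]):
            acc[i] += w * x
    return tuple(a / total for a in acc)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


idf = idf_weights(SAMPLES)
vectors = [weighted_embedding(toks, idf) for toks in SAMPLES.values()]
labels = kmeans(vectors, k=2)  # the paper reports four families; the toy data has two groups
print(dict(zip(SAMPLES, labels)))
```

In this toy setting the two network-oriented samples and the two encryption-oriented samples should fall into separate clusters; with real data, `k` would be set to the number of families of interest.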

Original language: English
Title of host publication: Proceedings - 4th IEEE European Symposium on Security and Privacy Workshops, EUROS and PW 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 238-246
Number of pages: 9
ISBN (electronic): 9781728130262
DOI
Publication status: Published - Jun 2019
Event: 4th IEEE European Symposium on Security and Privacy Workshops, EUROS and PW 2019 - Stockholm, Sweden
Duration: 17 Jun 2019 to 19 Jun 2019

Publication series

Name: Proceedings - 4th IEEE European Symposium on Security and Privacy Workshops, EUROS and PW 2019

Conference

Conference: 4th IEEE European Symposium on Security and Privacy Workshops, EUROS and PW 2019
Country/Territory: Sweden
City: Stockholm
Period: 17/06/19 to 19/06/19
