Dense Captioning of Natural Scenes in Spanish

Alejandro Gomez-Garay, Bogdan Raducanu, Joaquín Salas

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The inclusion of visually impaired people into daily life is a challenging and active area of research. This work studies how to deliver information about the surroundings as verbal descriptions in Spanish using wearable devices. We use a neural network (DenseCap) both to identify objects and to generate phrases about them. DenseCap runs on a server to describe an image fed from a smartphone application, and its output is the text that the smartphone verbalizes. Our implementation achieves a mean Average Precision (mAP) of 5.0 in object recognition and caption quality, and takes an average of 7.5 s from the moment one grabs a picture until one receives the verbalization in Spanish.
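The abstract describes a client-server pipeline: the smartphone sends a picture to a server running DenseCap and verbalizes the text that comes back. A minimal client-side sketch of that flow is below; the JSON schema (`captions` entries with `text` and `score` fields) and the idea of ranking regions by score are assumptions for illustration, since the paper does not specify its API.

```python
import json

# Hypothetical response format: the paper does not document the server's
# API, so the field names here ("captions", "text", "score") are
# illustrative only. In the described system, the smartphone would POST
# the JPEG to the DenseCap server and pass the response body to
# parse_captions, then feed the composed utterance to a Spanish TTS engine.

def parse_captions(response_body: str, top_k: int = 3) -> list[str]:
    """Pick the top-k region captions from a (hypothetical) DenseCap
    JSON response of the form {"captions": [{"text": ..., "score": ...}]}."""
    data = json.loads(response_body)
    ranked = sorted(data["captions"], key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in ranked[:top_k]]

def compose_utterance(captions: list[str]) -> str:
    """Join the selected captions into one string for the phone's TTS."""
    return ". ".join(captions) + "."
```

For example, a response whose highest-scoring regions are "a red car" and "a dog" would be verbalized as "a red car. a dog." (before translation to Spanish, a step this sketch omits).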

Original language: English
Host publication title: Pattern Recognition - 10th Mexican Conference, MCPR 2018, Proceedings
Editors: Jose Francisco Martinez-Trinidad, Jesus Ariel Carrasco-Ochoa, Jose Arturo Olvera-Lopez, Sudeep Sarkar
Publisher: Springer Verlag
Pages: 145-154
Number of pages: 10
ISBN (print): 9783319921976
DOI
State: Published - 2018
Event: 10th Mexican Conference on Pattern Recognition, MCPR 2018 - Puebla, Mexico
Duration: 27 Jun 2018 - 30 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10880 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 10th Mexican Conference on Pattern Recognition, MCPR 2018
Country/Territory: Mexico
City: Puebla
Period: 27/06/18 - 30/06/18
