Dense Captioning of Natural Scenes in Spanish

Alejandro Gomez-Garay, Bogdan Raducanu, Joaquín Salas

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The inclusion of visually impaired people into daily life is a challenging and active area of research. This work studies how to deliver information about the surroundings as verbal descriptions in Spanish using wearable devices. We use a neural network (DenseCap) both to identify objects and to generate phrases about them. DenseCap runs on a server to describe an image fed from a smartphone application, and its output is the text that the smartphone verbalizes. Our implementation achieves a mean Average Precision (mAP) of 5.0 in object recognition and caption quality, and takes an average of 7.5 s from the moment one grabs a picture until one receives the verbalization in Spanish.
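The abstract describes a client-server pipeline: the smartphone sends a picture to a server running DenseCap and verbalizes the text that comes back. A minimal client-side sketch of that flow is below; the JSON schema (`captions` entries with `text` and `score` fields) and the idea of ranking regions by score are assumptions for illustration, since the paper does not specify its API.

```python
import json

# Hypothetical response format: the paper does not document the server's
# API, so the field names here ("captions", "text", "score") are
# illustrative only. In the described system, the smartphone would POST
# the JPEG to the DenseCap server and pass the response body to
# parse_captions, then feed the composed utterance to a Spanish TTS engine.

def parse_captions(response_body: str, top_k: int = 3) -> list[str]:
    """Pick the top-k region captions from a (hypothetical) DenseCap
    JSON response of the form {"captions": [{"text": ..., "score": ...}]}."""
    data = json.loads(response_body)
    ranked = sorted(data["captions"], key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in ranked[:top_k]]

def compose_utterance(captions: list[str]) -> str:
    """Join the selected captions into one string for the phone's TTS."""
    return ". ".join(captions) + "."
```

For example, a response whose highest-scoring regions are "a red car" and "a dog" would be verbalized as "a red car. a dog." (before translation to Spanish, a step this sketch omits).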

Original language: English
Host publication title: Pattern Recognition - 10th Mexican Conference, MCPR 2018, Proceedings
Editors: Jose Francisco Martinez-Trinidad, Jesus Ariel Carrasco-Ochoa, Jose Arturo Olvera-Lopez, Sudeep Sarkar
Publisher: Springer Verlag
Pages: 145-154
Number of pages: 10
ISBN (print): 9783319921976
DOI
State: Published - 2018
Event: 10th Mexican Conference on Pattern Recognition, MCPR 2018 - Puebla, Mexico
Duration: 27 Jun 2018 - 30 Jun 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10880 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 10th Mexican Conference on Pattern Recognition, MCPR 2018
Country/Territory: Mexico
City: Puebla
Period: 27/06/18 - 30/06/18
