Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living

José A. Zamorano Raya; Mireya S. García Vázquez; Juan C. Jaimes Méndez; Abraham Montoya Obeso; Jorge L. Compean Aguirre; Alejandro A. Ramírez Acosta

doi:10.1117/12.2529834

Centro de Investigación y Desarrollo de Tecnología Digital (CITEDI)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Video analysis for recognizing Instrumental Activities of Daily Living (IADL) through object detection and context analysis, applied to evaluating the capacities of patients with Alzheimer's disease and age-related dementia, has recently gained considerable interest. Incorporating human perception into recognition, search, detection, and visual content understanding tasks has become one of the main tools for developing systems and technologies that support people in their daily life activities. In this paper we propose a model, based on fully convolutional networks (FCN), for automatic segmentation of the salient region in egocentric video where the objects of interest are found. The segmentation uses information about human perception, which yields a better segmentation at the pixel level. It covers both the objects of interest and the salient region in egocentric videos, providing precise information to systems for object detection and automatic indexing in video, which have improved their performance in IADL recognition. To measure the model's segmentation performance on the salient region, we benchmark on two databases: first, the Georgia-Tech-Egocentric-Activity database, and second, our own database.

Original language	English
Title of host publication	Applications of Machine Learning
Editors	Michael E. Zelinski, Tarek M. Taha, Jonathan Howe, Abdul A. S. Awwal, Khan M. Iftekharuddin
Publisher	SPIE
ISBN (Electronic)	9781510629714
DOI	https://doi.org/10.1117/12.2529834
Publication status	Published - 2019
Event	Applications of Machine Learning 2019 - San Diego, United States. Duration: 13 Aug 2019 → 14 Aug 2019

Publication series

Name	Proceedings of SPIE - The International Society for Optical Engineering
Volume	11139
ISSN (Print)	0277-786X
ISSN (Electronic)	1996-756X

Conference

Conference	Applications of Machine Learning 2019
Country/Territory	United States
City	San Diego
Period	13/08/19 → 14/08/19

Cite this

Zamorano Raya, J. A., García Vázquez, M. S., Jaimes Méndez, J. C., Montoya Obeso, A., Compean Aguirre, J. L., & Ramírez Acosta, A. A. (2019). Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living. In M. E. Zelinski, T. M. Taha, J. Howe, A. A. S. Awwal, & K. M. Iftekharuddin (Eds.), Applications of Machine Learning Article 1113909 (Proceedings of SPIE - The International Society for Optical Engineering; Vol. 11139). SPIE. https://doi.org/10.1117/12.2529834

Zamorano Raya, José A. ; García Vázquez, Mireya S. ; Jaimes Méndez, Juan C. et al. / Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living. Applications of Machine Learning. editor / Michael E. Zelinski ; Tarek M. Taha ; Jonathan Howe ; Abdul A. S. Awwal ; Khan M. Iftekharuddin. SPIE, 2019. (Proceedings of SPIE - The International Society for Optical Engineering).

@inproceedings{4086e313192a4de3a33d11c4828d2fbc,

title = "Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living",

abstract = "Video analysis for recognizing Instrumental Activities of Daily Living (IADL) through object detection and context analysis, applied to evaluating the capacities of patients with Alzheimer's disease and age-related dementia, has recently gained considerable interest. Incorporating human perception into recognition, search, detection, and visual content understanding tasks has become one of the main tools for developing systems and technologies that support people in their daily life activities. In this paper we propose a model, based on fully convolutional networks (FCN), for automatic segmentation of the salient region in egocentric video where the objects of interest are found. The segmentation uses information about human perception, which yields a better segmentation at the pixel level. It covers both the objects of interest and the salient region in egocentric videos, providing precise information to systems for object detection and automatic indexing in video, which have improved their performance in IADL recognition. To measure the model's segmentation performance on the salient region, we benchmark on two databases: first, the Georgia-Tech-Egocentric-Activity database, and second, our own database.",

keywords = "Deep CNN, Dementia diseases, Egocentric video, FCN, Instrumental activities of daily living, Object recognition, Saliency, Semantic segmentation",

author = "{Zamorano Raya}, {Jos{\'e} A.} and {Garc{\'i}a V{\'a}zquez}, {Mireya S.} and {Jaimes M{\'e}ndez}, {Juan C.} and {Montoya Obeso}, Abraham and {Compean Aguirre}, {Jorge L.} and {Ram{\'i}rez Acosta}, {Alejandro A.}",

note = "Publisher Copyright: {\textcopyright} COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.; Applications of Machine Learning 2019 ; Conference date: 13-08-2019 Through 14-08-2019",

year = "2019",

doi = "10.1117/12.2529834",

language = "English",

series = "Proceedings of SPIE - The International Society for Optical Engineering",

publisher = "SPIE",

editor = "Zelinski, {Michael E.} and Taha, {Tarek M.} and Jonathan Howe and Awwal, {Abdul A. S.} and Iftekharuddin, {Khan M.}",

booktitle = "Applications of Machine Learning",

address = "United States",

}

Zamorano Raya, JA, García Vázquez, MS, Jaimes Méndez, JC, Montoya Obeso, A, Compean Aguirre, JL & Ramírez Acosta, AA 2019, Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living. in ME Zelinski, TM Taha, J Howe, AAS Awwal & KM Iftekharuddin (eds.), Applications of Machine Learning., 1113909, Proceedings of SPIE - The International Society for Optical Engineering, vol. 11139, SPIE, Applications of Machine Learning 2019, San Diego, United States, 13/08/19. https://doi.org/10.1117/12.2529834

Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living. / Zamorano Raya, José A.; García Vázquez, Mireya S.; Jaimes Méndez, Juan C. et al.
Applications of Machine Learning. ed. / Michael E. Zelinski; Tarek M. Taha; Jonathan Howe; Abdul A. S. Awwal; Khan M. Iftekharuddin. SPIE, 2019. 1113909 (Proceedings of SPIE - The International Society for Optical Engineering; Vol. 11139).

TY - GEN

T1 - Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living

AU - Zamorano Raya, José A.

AU - García Vázquez, Mireya S.

AU - Jaimes Méndez, Juan C.

AU - Montoya Obeso, Abraham

AU - Compean Aguirre, Jorge L.

AU - Ramírez Acosta, Alejandro A.

N1 - Publisher Copyright: © COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.

PY - 2019

Y1 - 2019

N2 - Video analysis for recognizing Instrumental Activities of Daily Living (IADL) through object detection and context analysis, applied to evaluating the capacities of patients with Alzheimer's disease and age-related dementia, has recently gained considerable interest. Incorporating human perception into recognition, search, detection, and visual content understanding tasks has become one of the main tools for developing systems and technologies that support people in their daily life activities. In this paper we propose a model, based on fully convolutional networks (FCN), for automatic segmentation of the salient region in egocentric video where the objects of interest are found. The segmentation uses information about human perception, which yields a better segmentation at the pixel level. It covers both the objects of interest and the salient region in egocentric videos, providing precise information to systems for object detection and automatic indexing in video, which have improved their performance in IADL recognition. To measure the model's segmentation performance on the salient region, we benchmark on two databases: first, the Georgia-Tech-Egocentric-Activity database, and second, our own database.

KW - Deep CNN

KW - Dementia diseases

KW - Egocentric video

KW - FCN

KW - Instrumental activities of daily living

KW - Object recognition

KW - Saliency

KW - Semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85075751605&partnerID=8YFLogxK

U2 - 10.1117/12.2529834

DO - 10.1117/12.2529834

M3 - Conference contribution

AN - SCOPUS:85075751605

T3 - Proceedings of SPIE - The International Society for Optical Engineering

BT - Applications of Machine Learning

A2 - Zelinski, Michael E.

A2 - Taha, Tarek M.

A2 - Howe, Jonathan

A2 - Awwal, Abdul A. S.

A2 - Iftekharuddin, Khan M.

PB - SPIE

T2 - Applications of Machine Learning 2019

Y2 - 13 August 2019 through 14 August 2019

ER -

Zamorano Raya JA, García Vázquez MS, Jaimes Méndez JC, Montoya Obeso A, Compean Aguirre JL, Ramírez Acosta AA. Semantic segmentation in egocentric video frames with deep learning for recognition of activities of daily living. In Zelinski ME, Taha TM, Howe J, Awwal AAS, Iftekharuddin KM, editors, Applications of Machine Learning. SPIE. 2019. 1113909. (Proceedings of SPIE - The International Society for Optical Engineering). doi: 10.1117/12.2529834
