Improving Depth Estimation by Embedding Semantic Segmentation: A Hybrid CNN Model

José E. Valdez-Rodríguez; Hiram Calvo; Edgardo Felipe-Riverón; Marco A. Moreno-Armendáriz

doi:10.3390/s22041669

Centro de Investigación en Computación (CIC)

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

Single image depth estimation works fail to separate foreground elements because they can easily be confounded with the background. To alleviate this problem, we propose the use of a semantic segmentation procedure that adds information to a depth estimator, in this case, a 3D Convolutional Neural Network (CNN)—segmentation is coded as one-hot planes representing categories of objects. We explore 2D and 3D models. Particularly, we propose a hybrid 2D–3D CNN architecture capable of obtaining semantic segmentation and depth estimation at the same time. We tested our procedure on the SYNTHIA-AL dataset and obtained σ₃ = 0.95, which is an improvement of 0.14 points (compared with the state of the art of σ₃ = 0.81) by using manual segmentation, and σ₃ = 0.89 using automatic semantic segmentation, proving that depth estimation is improved when the shape and position of objects in a scene are known.
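
As an illustration of the abstract's phrase "segmentation is coded as one-hot planes", the following minimal Python sketch turns a per-pixel label map into one binary plane per class. The function name `one_hot_planes`, the class IDs, and the tiny label map are hypothetical and are not taken from the paper or from SYNTHIA-AL; this record does not specify how the authors feed such planes to their network.

```python
from typing import List

LabelMap = List[List[int]]


def one_hot_planes(labels: LabelMap, num_classes: int) -> List[LabelMap]:
    """Return one binary plane per class.

    planes[c][y][x] is 1 where labels[y][x] == c, and 0 elsewhere.
    """
    # Reject labels that do not correspond to any plane.
    for row in labels:
        for value in row:
            if not 0 <= value < num_classes:
                raise ValueError(f"label {value} outside [0, {num_classes})")
    return [
        [[1 if value == c else 0 for value in row] for row in labels]
        for c in range(num_classes)
    ]


# Hypothetical 2x3 label map with three illustrative classes
# (0 = background, 1 = car, 2 = pedestrian); assumed values only.
labels = [
    [0, 1, 1],
    [2, 0, 1],
]
planes = one_hot_planes(labels, num_classes=3)
```

Each pixel is set in exactly one plane, so the planes can be stacked as extra input channels alongside the image without implying any ordering between object categories.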

Original language	English
Article number	1669
Journal	Sensors
Volume	22
Issue number	4
DOI	https://doi.org/10.3390/s22041669
Publication status	Published - 1 Feb 2022

Cite this

@article{61efc3c6345641f3b00a382d9a9402a9,

title = "Improving Depth Estimation by Embedding Semantic Segmentation: A Hybrid CNN Model",

abstract = "Single image depth estimation works fail to separate foreground elements because they can easily be confounded with the background. To alleviate this problem, we propose the use of a semantic segmentation procedure that adds information to a depth estimator, in this case, a 3D Convolutional Neural Network (CNN)---segmentation is coded as one-hot planes representing categories of objects. We explore 2D and 3D models. Particularly, we propose a hybrid 2D--3D CNN architecture capable of obtaining semantic segmentation and depth estimation at the same time. We tested our procedure on the SYNTHIA-AL dataset and obtained $\sigma_3$ = 0.95, which is an improvement of 0.14 points (compared with the state of the art of $\sigma_3$ = 0.81) by using manual segmentation, and $\sigma_3$ = 0.89 using automatic semantic segmentation, proving that depth estimation is improved when the shape and position of objects in a scene are known.",

keywords = "3D CNN, Depth estimation, Hybrid convolutional neural networks, Semantic segmentation",

author = "Valdez-Rodr{\'i}guez, Jos{\'e} E. and Calvo, Hiram and Felipe-River{\'o}n, Edgardo and Moreno-Armend{\'a}riz, Marco A.",

note = "Publisher Copyright: {\textcopyright} 2022 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2022",

month = feb,

day = "1",

doi = "10.3390/s22041669",

language = "English",

volume = "22",

journal = "Sensors",

issn = "1424-8220",

number = "4",

}

TY - JOUR

T1 - Improving Depth Estimation by Embedding Semantic Segmentation

T2 - A Hybrid CNN Model

AU - Valdez-Rodríguez, José E.

AU - Calvo, Hiram

AU - Felipe-Riverón, Edgardo

AU - Moreno-Armendáriz, Marco A.

PY - 2022/2/1

Y1 - 2022/2/1

N2 - Single image depth estimation works fail to separate foreground elements because they can easily be confounded with the background. To alleviate this problem, we propose the use of a semantic segmentation procedure that adds information to a depth estimator, in this case, a 3D Convolutional Neural Network (CNN)—segmentation is coded as one-hot planes representing categories of objects. We explore 2D and 3D models. Particularly, we propose a hybrid 2D–3D CNN architecture capable of obtaining semantic segmentation and depth estimation at the same time. We tested our procedure on the SYNTHIA-AL dataset and obtained σ3 = 0.95, which is an improvement of 0.14 points (compared with the state of the art of σ3 = 0.81) by using manual segmentation, and σ3 = 0.89 using automatic semantic segmentation, proving that depth estimation is improved when the shape and position of objects in a scene are known.

AB - Single image depth estimation works fail to separate foreground elements because they can easily be confounded with the background. To alleviate this problem, we propose the use of a semantic segmentation procedure that adds information to a depth estimator, in this case, a 3D Convolutional Neural Network (CNN)—segmentation is coded as one-hot planes representing categories of objects. We explore 2D and 3D models. Particularly, we propose a hybrid 2D–3D CNN architecture capable of obtaining semantic segmentation and depth estimation at the same time. We tested our procedure on the SYNTHIA-AL dataset and obtained σ3 = 0.95, which is an improvement of 0.14 points (compared with the state of the art of σ3 = 0.81) by using manual segmentation, and σ3 = 0.89 using automatic semantic segmentation, proving that depth estimation is improved when the shape and position of objects in a scene are known.

KW - 3D CNN

KW - Depth estimation

KW - Hybrid convolutional neural networks

KW - Semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85125012408&partnerID=8YFLogxK

U2 - 10.3390/s22041669

DO - 10.3390/s22041669

M3 - Article

C2 - 35214571

AN - SCOPUS:85125012408

SN - 1424-8220

VL - 22

JO - Sensors

JF - Sensors

IS - 4

M1 - 1669

ER -
