TY - JOUR
T1 - FASSD-Net: Fast and Accurate Real-Time Semantic Segmentation for Embedded Systems
AU - Rosas-Arias, Leonel
AU - Benitez-Garcia, Gibran
AU - Portillo-Portillo, Jose
AU - Olivares-Mercado, Jesus
AU - Sanchez-Perez, Gabriel
AU - Yanai, Keiji
N1 - Publisher Copyright: IEEE
PY - 2021
Y1 - 2021
AB - Recent works on real-time semantic segmentation remove decoders from dense deep neural networks, or use light ones, to achieve fast inference speed. This strategy helps to achieve real-time performance; however, accuracy is significantly compromised in comparison to non-real-time methods. In this paper, we introduce two key modules for designing a high-performance decoder for real-time semantic segmentation, which also reduces the accuracy gap between real-time and non-real-time networks. The first module, Dilated Asymmetric Pyramidal Fusion (DAPF), is designed to increase the receptive field on top of the last stage of the encoder, obtaining richer contextual features. The second, the Multi-resolution Dilated Asymmetric (MDA) module, fuses and refines detail and contextual information from multi-scale feature maps coming from early and deeper stages of the network. Both modules are designed to keep computational complexity low by using asymmetric convolutions. With these modules, we propose a network named "FASSD-Net", which is based on a light-weight CNN backbone. Running on a single Nvidia GTX 1080Ti, our model reaches 77.5% and 69.3% mIoU, at 41 and 80 FPS, on the Cityscapes and CamVid datasets, respectively. We present an extensive analysis of the accuracy-speed tradeoffs of three FASSD-Net variations on different embedded systems, demonstrating that a light version of our network can run on the low-power Jetson Xavier NX at 32 FPS, reaching 74% mIoU at full resolution (1024×2048). The source code and pre-trained models are available at github.com/GibranBenitez/FASSD-Net.
KW - Convolutional codes
KW - Decoding
KW - Embedded systems
KW - HarDNet
KW - Image segmentation
KW - Jetson Xavier NX
KW - Real-time systems
KW - Semantic segmentation
KW - Semantics
KW - Task analysis
KW - fully convolutional networks
KW - spatial pyramid pooling
UR - http://www.scopus.com/inward/record.url?scp=85107475654&partnerID=8YFLogxK
U2 - 10.1109/TITS.2021.3127553
DO - 10.1109/TITS.2021.3127553
M3 - Article
AN - SCOPUS:85107475654
SN - 1524-9050
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
ER -