TY - GEN
T1 - Comparison of Real-time CNN-based Methods for Finger-level Hand Segmentation
AU - Benitez-Garcia, Gibran
AU - Takayama, Natsuki
AU - Olivares-Mercado, Jesus
AU - Sanchez-Perez, Gabriel
AU - Takahashi, Hiroki
N1 - Publisher Copyright: © 2022 SPIE.
PY - 2022
Y1 - 2022
AB - Hand segmentation is usually treated as a pixel-wise binary classification problem, in which the foreground hand is to be recognized in an input image. However, we argue that finger-level hand segmentation is more useful for applications such as hand gesture and sign language recognition. Therefore, in this paper, we compare five state-of-the-art (SOTA) real-time semantic segmentation methods on the task of finger-level hand segmentation. To do so, we introduce two subsets totaling 1,000 images, manually annotated pixel-wise and selected from newly proposed datasets for hand gesture and word-level sign language recognition. With these subsets, we evaluate the accuracy of the recent SOTA methods DABNet, FastSCNN, FC-HardNet, FASSDNet, and DDRNet. Since each subset contains relatively few images (500), we introduce a simple yet effective loss function for training with synthetic data that includes the same annotations. Finally, we present a real-time performance evaluation of the five algorithms on the NVIDIA Jetson family of GPU-powered embedded systems, including Jetson Xavier NX, Jetson TX2, and Jetson Nano.
KW - Hand segmentation
KW - finger segmentation
KW - real-time CNN
UR - http://www.scopus.com/inward/record.url?scp=85131790686&partnerID=8YFLogxK
U2 - 10.1117/12.2626091
DO - 10.1117/12.2626091
M3 - Conference contribution
AN - SCOPUS:85131790686
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - International Workshop on Advanced Imaging Technology, IWAIT 2022
A2 - Nakajima, Masayuki
A2 - Muramatsu, Shogo
A2 - Kim, Jae-Gon
A2 - Guo, Jing-Ming
A2 - Kemao, Qian
PB - SPIE
T2 - 2022 International Workshop on Advanced Imaging Technology, IWAIT 2022
Y2 - 4 January 2022 through 6 January 2022
ER -