TY - JOUR
T1 - Topic-Based Image Caption Generation
AU - Dash, Sandeep Kumar
AU - Acharya, Shantanu
AU - Pakray, Partha
AU - Das, Ranjita
AU - Gelbukh, Alexander
N1 - Publisher Copyright: © 2019, King Fahd University of Petroleum & Minerals.
PY - 2020/4/1
Y1 - 2020/4/1
N2 - Image captioning is to generate captions for a given image based on the content of the image. To describe an image efficiently, it requires extracting as much information from it as possible. Apart from detecting the presence of objects and their relative orientation, the respective purpose intending the topic of the image is another vital information which can be incorporated with the model to improve the efficiency of the caption generation system. The sole aim is to put extra thrust on the context of the image imitating human approach, as the mere presence of objects which may not be related to the context representing the image should not be a part of the generated caption. In this work, the focus is on detecting the topic concerning the image so as to guide a novel deep learning-based encoder–decoder framework to generate captions for the image. The method is compared with some of the earlier state-of-the-art models based on the result obtained from MSCOCO 2017 training data set. BLEU, CIDEr, ROUGE-L, and METEOR scores are used to measure the efficacy of the model, and they show improvement in the performance of the caption generation process.
AB - Image captioning is to generate captions for a given image based on the content of the image. To describe an image efficiently, it requires extracting as much information from it as possible. Apart from detecting the presence of objects and their relative orientation, the respective purpose intending the topic of the image is another vital information which can be incorporated with the model to improve the efficiency of the caption generation system. The sole aim is to put extra thrust on the context of the image imitating human approach, as the mere presence of objects which may not be related to the context representing the image should not be a part of the generated caption. In this work, the focus is on detecting the topic concerning the image so as to guide a novel deep learning-based encoder–decoder framework to generate captions for the image. The method is compared with some of the earlier state-of-the-art models based on the result obtained from MSCOCO 2017 training data set. BLEU, CIDEr, ROUGE-L, and METEOR scores are used to measure the efficacy of the model, and they show improvement in the performance of the caption generation process.
KW - Deep learning
KW - Image caption generation
KW - Topic modelling
UR - http://www.scopus.com/inward/record.url?scp=85075899332&partnerID=8YFLogxK
U2 - 10.1007/s13369-019-04262-2
DO - 10.1007/s13369-019-04262-2
M3 - Article
AN - SCOPUS:85075899332
SN - 2193-567X
VL - 45
SP - 3025
EP - 3034
JO - Arabian Journal for Science and Engineering
JF - Arabian Journal for Science and Engineering
IS - 4
ER -