Saliency-based selection of visual content for deep convolutional neural networks: Application to architectural style classification

A. Montoya Obeso, J. Benois-Pineau, M. S. García Vázquez, A. A. Ramírez Acosta

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

The automatic description of digital multimedia content has mainly been developed for classification tasks, retrieval systems and the massive ordering of data. Preservation of cultural heritage is a field where the application of these methods is of high importance. We address a classification problem in cultural heritage: the classification of architectural styles in digital photographs of Mexican cultural heritage. In general, selecting relevant content in the scene for training classification models makes the models more efficient in terms of accuracy and training time. Here we use a saliency-driven approach to predict visual attention in images and use it to train a deep convolutional neural network. We also present an analysis of the behavior of models trained with state-of-the-art image cropping and with saliency maps. To train models that are invariant to rotations, data augmentation of the training set is required, which poses the problem of filling and normalizing crops; we study different padding techniques and find an optimal solution. The results are compared with the state of the art in terms of accuracy and training time. Furthermore, we study saliency-based cropping in training and its generalization to another classical task, weak labeling of massive collections of images containing objects of interest; these experiments are conducted on a large subset of the ImageNet database. This work is an extension of preliminary research in terms of image padding methods and generalization to a large-scale generic database.
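A minimal sketch of the idea described in the abstract, not the authors' exact pipeline: given an image and a precomputed saliency map, extract a fixed-size crop centred on the most salient region, then pad it to the network input size. The crop size of 227 pixels (AlexNet-style input), the saliency threshold, and the padding modes compared are illustrative assumptions.

```python
import numpy as np

def salient_crop(image, saliency, crop_size=227, threshold=0.7):
    """Crop a crop_size window centred on the most salient region.

    `image` is an H x W x 3 array, `saliency` an H x W map in [0, 1].
    Parameter values are assumptions for illustration only.
    """
    ys, xs = np.where(saliency >= threshold * saliency.max())
    if len(ys) == 0:                       # no salient pixels: fall back to the centre
        cy, cx = image.shape[0] // 2, image.shape[1] // 2
    else:
        cy, cx = int(ys.mean()), int(xs.mean())
    half = crop_size // 2
    y0, x0 = max(cy - half, 0), max(cx - half, 0)
    y1 = min(y0 + crop_size, image.shape[0])
    x1 = min(x0 + crop_size, image.shape[1])
    return image[y0:y1, x0:x1]

def pad_to_square(crop, size=227, mode="reflect"):
    """Pad a (possibly smaller) crop to size x size.

    `mode` follows numpy.pad: 'constant' (zero filling), 'reflect',
    'edge', ...  The set of padding strategies shown here is an
    assumption, not necessarily the one evaluated in the paper.
    """
    pad_y = max(size - crop.shape[0], 0)
    pad_x = max(size - crop.shape[1], 0)
    return np.pad(crop, ((0, pad_y), (0, pad_x), (0, 0)), mode=mode)
```

Usage would be something like `x = pad_to_square(salient_crop(img, sal), mode="constant")`, with each padding mode producing a separate augmented training sample whose effect on accuracy and training time can then be compared.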

Original language: English
Pages (from-to): 9553-9576
Number of pages: 24
Journal: Multimedia Tools and Applications
Volume: 78
Issue number: 8
DOIs
State: Published - 1 Apr 2019

Keywords

  • Cultural heritage
  • Data selection
  • Deep learning
  • Visual attention prediction
