Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis

Soujanya Poria, Erik Cambria, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

395 Scopus citations

Abstract

We present a novel way of extracting features from short texts, based on the activation values of an inner layer of a deep convolutional neural network. We use the extracted features in multimodal sentiment analysis of short video clips, each representing one sentence. We use the combined feature vectors of the textual, visual, and audio modalities to train a classifier based on multiple kernel learning, which is known to perform well on heterogeneous data. We obtain a 14% performance improvement over the state of the art and present a parallelizable decision-level data fusion method, which is much faster, though slightly less accurate.
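A minimal sketch of the textual feature extraction idea described above, assuming a PyTorch implementation. The layer sizes, kernel size, and the choice of a penultimate fully connected layer as the "inner layer" are illustrative assumptions, not the paper's exact architecture; the point is that the activations of an inner layer, rather than the output layer, serve as the utterance's feature vector.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Toy text CNN; hyperparameters are illustrative, not the paper's."""
    def __init__(self, embed_dim=300, n_filters=100, kernel_size=3, hidden=50, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size)
        self.pool = nn.AdaptiveMaxPool1d(1)             # max-pooling over time
        self.fc_hidden = nn.Linear(n_filters, hidden)   # inner layer whose activations we export
        self.fc_out = nn.Linear(hidden, n_classes)

    def forward(self, x):                               # x: (batch, seq_len, embed_dim)
        h = torch.relu(self.conv(x.transpose(1, 2)))    # Conv1d expects (batch, channels, length)
        h = self.pool(h).squeeze(-1)                    # (batch, n_filters)
        feats = torch.relu(self.fc_hidden(h))           # (batch, hidden): the textual feature vector
        return self.fc_out(feats), feats

model = TextCNN()
utterance = torch.randn(1, 20, 300)                     # one 20-word utterance of 300-d embeddings
logits, text_features = model(utterance)                # text_features feeds the multimodal fusion step
```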
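For the multiple-kernel-learning fusion, a simple stand-in sketch: scikit-learn has no MKL solver, so a uniform combination of one kernel per modality is used here in place of learned kernel weights, with an SVM trained on the precomputed combined kernel. The feature arrays are synthetic placeholders for the per-modality feature vectors; MKL proper would additionally learn the combination weights.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 40                                    # toy utterances (synthetic stand-in data)
text  = rng.normal(size=(n, 50))          # e.g. CNN inner-layer textual features
audio = rng.normal(size=(n, 30))          # placeholder audio features
video = rng.normal(size=(n, 20))          # placeholder visual features
y = rng.integers(0, 2, size=n)            # sentiment labels

# One kernel per modality, combined with fixed uniform weights
# (MKL would learn these weights instead).
K = sum(rbf_kernel(X) for X in (text, audio, video)) / 3.0

clf = SVC(kernel="precomputed").fit(K, y)
print(clf.predict(K[:5]))                 # decisions over the combined kernel
```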
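The decision-level alternative mentioned in the abstract can be sketched as follows: one classifier per modality, trained independently (hence parallelizable), with the per-modality class probabilities fused afterwards. Averaging the probabilities is one simple fusion rule used here for illustration; the paper's exact rule is not reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 40                                    # synthetic stand-in data, as above
modalities = {"text": rng.normal(size=(n, 50)),
              "audio": rng.normal(size=(n, 30)),
              "video": rng.normal(size=(n, 20))}
y = rng.integers(0, 2, size=n)

# Independent per-modality classifiers: each fit() can run in parallel.
clfs = {m: LogisticRegression(max_iter=1000).fit(X, y) for m, X in modalities.items()}

# Fuse decisions by averaging class probabilities across modalities.
proba = np.mean([clfs[m].predict_proba(X) for m, X in modalities.items()], axis=0)
fused = proba.argmax(axis=1)              # fused sentiment decision per utterance
```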

Original language: English
Title of host publication: Conference Proceedings - EMNLP 2015
Subtitle of host publication: Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics (ACL)
Pages: 2539-2544
Number of pages: 6
ISBN (Electronic): 9781941643327
DOIs
State: Published - 2015
Event: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal
Duration: 17 Sep 2015 → 21 Sep 2015

Publication series

Name: Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing

Conference

Conference: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Country/Territory: Portugal
City: Lisbon
Period: 17/09/15 → 21/09/15
