Multimodal sentiment analysis using hierarchical fusion with context modeling

N. Majumder, D. Hazarika, A. Gelbukh, E. Cambria, S. Poria

Research output: Contribution to journal › Article › peer-review

256 Scopus citations

Abstract

Multimodal sentiment analysis is an actively growing field of research. A promising area of opportunity in this field is to improve the multimodal fusion mechanism. We present a novel feature fusion strategy that proceeds in a hierarchical fashion, first fusing the modalities in pairs and only then fusing all three modalities. On multimodal sentiment analysis of individual utterances, our strategy outperforms conventional concatenation of features by 1%, which amounts to a 5% reduction in error rate. On utterance-level multimodal sentiment analysis of multi-utterance video clips, for which current state-of-the-art techniques incorporate contextual information from other utterances of the same clip, our hierarchical fusion gives an improvement of up to 2.4% (almost 10% error rate reduction) over the currently used concatenation. The implementation of our method is publicly available as open-source code.
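To make the pairwise-then-trimodal idea concrete, the following is a minimal sketch of hierarchical fusion in PyTorch. The layer sizes, tanh activations, and names (HierarchicalFusion, d_fused, etc.) are illustrative assumptions for this sketch, not the authors' published architecture; the paper's actual implementation is available in its open-source code.

```python
# Minimal sketch of hierarchical fusion: unimodal features are first
# fused in pairs (text+audio, text+video, audio+video), and the three
# bimodal representations are then fused into a single trimodal one.
# All dimensions and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class HierarchicalFusion(nn.Module):
    def __init__(self, d_text, d_audio, d_video, d_fused, n_classes):
        super().__init__()
        # Stage 1: pairwise (bimodal) fusion layers.
        self.fuse_ta = nn.Linear(d_text + d_audio, d_fused)
        self.fuse_tv = nn.Linear(d_text + d_video, d_fused)
        self.fuse_av = nn.Linear(d_audio + d_video, d_fused)
        # Stage 2: trimodal fusion over the three bimodal vectors.
        self.fuse_all = nn.Linear(3 * d_fused, d_fused)
        self.classifier = nn.Linear(d_fused, n_classes)

    def forward(self, t, a, v):
        ta = torch.tanh(self.fuse_ta(torch.cat([t, a], dim=-1)))
        tv = torch.tanh(self.fuse_tv(torch.cat([t, v], dim=-1)))
        av = torch.tanh(self.fuse_av(torch.cat([a, v], dim=-1)))
        fused = torch.tanh(self.fuse_all(torch.cat([ta, tv, av], dim=-1)))
        return self.classifier(fused)

# Example: one utterance with hypothetical 100-d text, 73-d audio,
# and 35-d video feature vectors, classified into 2 sentiment classes.
model = HierarchicalFusion(d_text=100, d_audio=73, d_video=35,
                           d_fused=64, n_classes=2)
logits = model(torch.randn(1, 100), torch.randn(1, 73), torch.randn(1, 35))
```

In contrast, the conventional baseline the abstract compares against would simply concatenate all three unimodal vectors and classify them in one step; the hierarchical variant replaces that single concatenation with the two-stage fusion above.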

Original language: English
Pages (from-to): 124-133
Number of pages: 10
Journal: Knowledge-Based Systems
Volume: 161
DOIs
State: Published - 1 Dec 2018

Keywords

  • Multimodal fusion
  • Sentiment analysis
