TY - GEN
T1 - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
AU - Han, Wei
AU - Chen, Hui
AU - Gelbukh, Alexander
AU - Zadeh, Amir
AU - Morency, Louis-Philippe
AU - Poria, Soujanya
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/10/18
Y1 - 2021/10/18
AB - Multimodal sentiment analysis aims to extract and integrate semantic information from multiple modalities in order to recognize the emotions and sentiment expressed in multimodal data. The central challenge in this research area is designing an effective fusion scheme that can extract and integrate key information from the various modalities. However, previous work is limited because it does not leverage the dynamics of independence and correlation between modalities to reach top performance. To mitigate this, we propose the Bi-Bimodal Fusion Network (BBFN), a novel end-to-end network that performs fusion (relevance increment) and separation (difference increment) on pairwise modality representations. The two parts are trained simultaneously so that the competition between them is simulated. Because of the known information imbalance among modalities, the model takes two bimodal pairs as input. In addition, we leverage a gated control mechanism in the Transformer architecture to further improve the final output. Experimental results on three datasets (CMU-MOSI, CMU-MOSEI, and UR-FUNNY) verify that our model significantly outperforms the state of the art (SOTA). The implementation of this work is available at https://github.com/declare-lab/multimodal-deep-learning and https://github.com/declare-lab/BBFN.
KW - cross-modal processing
KW - multimodal fusion
KW - multimodal representations
UR - http://www.scopus.com/inward/record.url?scp=85119017553&partnerID=8YFLogxK
U2 - 10.1145/3462244.3479919
DO - 10.1145/3462244.3479919
M3 - Conference contribution
AN - SCOPUS:85119017553
T3 - ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction
SP - 6
EP - 15
BT - ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction
PB - Association for Computing Machinery, Inc.
T2 - 23rd ACM International Conference on Multimodal Interaction, ICMI 2021
Y2 - 18 October 2021 through 22 October 2021
ER -