The laughing machine: Predicting humor in video

Abstract

Humor is a very important communication tool, yet understanding it remains an open problem for machines. In this paper, we build a new multimodal dataset for humor prediction that includes subtitles and video frames, as well as humor labels associated with video timestamps. On top of it, we present a model to predict whether a subtitle causes laughter. Our model uses the visual modality through facial expression and character name recognition, together with the verbal modality, to explore how the visual modality helps. In addition, we use an attention mechanism to adjust the weight of each modality to facilitate humor prediction. Interestingly, our experimental results show that performance is boosted by combining different modalities and by the attention mechanism, and that the model relies mostly on the verbal modality.
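
As a rough illustration of what "adjusting the weight of each modality" with attention can look like, the sketch below scores each modality's feature vector against a query vector, normalizes the scores with a softmax, and returns the weighted sum. This is not the architecture from the paper, which the abstract does not specify; the function names, the dot-product scoring, the modality names, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(features, query):
    """Attention-weighted fusion of per-modality feature vectors.

    features: dict mapping a modality name to a feature vector (list of
              floats); all vectors share the length of `query`.
    query:    hypothetical query vector (learned in a real model) used to
              score each modality with a dot product.
    Returns (weights, fused): a dict of attention weights per modality and
    the weighted sum of the feature vectors.
    """
    names = list(features)
    scores = [sum(f * q for f, q in zip(features[n], query)) for n in names]
    weights = softmax(scores)
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), fused


# Toy features (made up): one vector per modality.
toy_features = {
    "verbal": [1.0, 0.0],
    "face": [0.0, 1.0],
    "character": [0.5, 0.5],
}
toy_query = [2.0, 0.0]

weights, fused = fuse_modalities(toy_features, toy_query)
print(weights)
print(fused)
```

With this toy query the verbal modality receives the largest weight, which only mirrors the finding reported above by construction of the example, not by any learned behavior.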

Published in
Proceedings - IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Zekun Yang
Doctoral student
Noa Garcia
Specially Appointed Assistant Professor

Her research interests lie in computer vision and machine learning applied to visual retrieval and joint models of vision and language for high-level understanding tasks.

Chenhui Chu
Guest Associate Professor
Yuta Nakashima (中島悠太)
Associate Professor

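Research in computer vision and pattern recognition. Works primarily on image and video recognition and understanding using deep neural networks and related methods, along with applied research that draws on natural language processing.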

Related items