Speech emotion recognition paper: Emotion Recognition From Speech With Recurrent Neural Networks

Problem: Should there be one emotion label per whole recording or per utterance? If one chooses the utterance-based solution, how should the split be done? Can a single utterance carry multiple emotions?

Method proposed in the paper: the CTC loss function

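What is the CTC loss function? See https://blog.****.net/luodongri/article/details/77005948. My understanding: the frame-level prediction sequence and the label sequence generally differ in length (the predictions are usually much longer), so there is no one-to-one alignment between them. The CTC loss therefore uses the predicted sequence to compute the probability of the true label sequence, summed over every alignment that collapses to it, and the negative log of that probability is the loss.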
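To make this concrete, here is a minimal pure-Python sketch of the standard CTC forward recursion. It is not the paper's implementation. The function names, the choice of index 0 for the blank symbol, and the toy probabilities are all assumptions made for illustration. `ctc_label_probability` sums the probability of every frame-level path that collapses to the label sequence, and `brute_force_probability` checks the result by enumerating all paths, which is only feasible for tiny inputs.

```python
import math
from itertools import product

BLANK = 0  # assumption for this sketch: class index 0 is the CTC blank


def collapse(path):
    """CTC mapping: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def ctc_label_probability(probs, labels):
    """P(labels | probs) via the CTC forward (alpha) recursion.

    probs  -- list of T per-frame distributions, each of length C
    labels -- target label sequence (no blanks), possibly empty
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]
    T, S = len(probs), len(ext)

    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = probs[0][ext[0]]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1][s]              # stay on the same symbol
            if s >= 1:
                total += alpha[t - 1][s - 1]     # advance by one position
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[t - 1][s - 2]
            alpha[t][s] = total * probs[t][ext[s]]

    # Valid paths end on the last label or on the trailing blank.
    p = alpha[T - 1][S - 1]
    if S > 1:
        p += alpha[T - 1][S - 2]
    return p


def ctc_loss(probs, labels):
    """Negative log-likelihood of the label sequence."""
    return -math.log(ctc_label_probability(probs, labels))


def brute_force_probability(probs, labels):
    """Reference value: enumerate all C**T paths (tiny inputs only)."""
    T, C = len(probs), len(probs[0])
    total = 0.0
    for path in product(range(C), repeat=T):
        if collapse(path) == tuple(labels):
            p = 1.0
            for t, c in enumerate(path):
                p *= probs[t][c]
            total += p
    return total
```

For example, with two frames, two classes (blank and one label), and uniform probabilities `[[0.5, 0.5], [0.5, 0.5]]`, the paths (1, 1), (blank, 1) and (1, blank) all collapse to `[1]`, so `ctc_label_probability` returns 0.75.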

 

Model from the paper:

[Figure: model architecture from the paper; image not preserved]

 

Dataset: IEMOCAP

Experimental results:

[Figure: experimental results from the paper; image not preserved]