Speech emotion recognition paper: Emotion Recognition From Speech With Recurrent Neural Networks
Questions: Should there be one emotion per whole recording or one per utterance? If one chooses the utterance-based solution, how should the split be done? Can a single utterance carry multiple emotions?
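Method proposed in the paper: the CTC loss function
What is the CTC loss function? See https://blog.****.net/luodongri/article/details/77005948. My understanding: the frame-level prediction sequence and the label sequence generally differ in length (the prediction sequence is usually the longer one), and no frame-by-frame alignment between them is given. The CTC loss therefore uses the predicted sequence to compute the probability of the true label sequence, summing over every alignment that collapses to that label sequence, and training minimizes the negative log of this probability.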
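To make the idea above concrete, here is a minimal pure-Python sketch of how a CTC probability can be computed. It is an illustration of the general CTC technique, not the paper's code: the function names (`collapse`, `ctc_label_prob`, `brute_force_prob`), the toy `probs` matrix, and the choice of index 0 as the blank symbol are all assumptions made for this example. The forward algorithm is checked against a brute-force sum over every possible frame-level path.

```python
import math
from itertools import product

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def collapse(path):
    """Map a frame-level path to a label sequence:
    merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out


def ctc_label_prob(probs, label):
    """CTC forward algorithm.
    probs[t][k] is the predicted probability of symbol k at frame t.
    Returns P(label | probs), summed over all valid alignments."""
    # Extended label: blanks around and between the label symbols.
    ext = [BLANK]
    for s in label:
        ext += [s, BLANK]
    T, S = len(probs), len(ext)

    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = probs[0][ext[0]]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]              # stay on the same symbol
            if s >= 1:
                a += alpha[t - 1][s - 1]     # advance by one position
            # Skipping a blank is allowed only between different symbols.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * probs[t][ext[s]]

    # A valid path ends on the last symbol or on the trailing blank.
    total = alpha[T - 1][S - 1]
    if S > 1:
        total += alpha[T - 1][S - 2]
    return total


def brute_force_prob(probs, label):
    """Reference check: enumerate every path and sum those that
    collapse to the label. Exponential cost, tiny inputs only."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in product(range(K), repeat=T):
        if collapse(path) == list(label):
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total


# Toy example: 4 frames, symbols {0: blank, 1, 2}; each row sums to 1.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.1, 0.4],
]
label = [1, 2]  # shorter than the 4 predicted frames

p = ctc_label_prob(probs, label)
loss = -math.log(p)  # the CTC loss for this (prediction, label) pair
print(p, brute_force_prob(probs, label), loss)
```

In practice one would work in log space and use a library implementation rather than this sketch; the point here is only that the 4-frame prediction and the 2-symbol label need no explicit alignment.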
Paper model:
Dataset: IEMOCAP
Experimental results: