[1] WU Jun-qing, NI Jian-cheng, WEI Yuan-yuan. CGRU Method for Small Datasets in Speech Emotion Recognition[J]. Computer Technology and Development, 2020, 30(12): 77-82. [doi:10.3969/j.issn.1673-629X.2020.12.014]

CGRU Method for Small Datasets in Speech Emotion Recognition

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
30
Issue:
No. 12, 2020
Pages:
77-82
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2020-12-10

Article Info

Title:
CGRU Method for Small Datasets in Speech Emotion Recognition
Article ID:
1673-629X(2020)12-0077-06
Author(s):
WU Jun-qing (吴俊清), NI Jian-cheng (倪建成), WEI Yuan-yuan (魏媛媛)
School of Software, Qufu Normal University, Qufu 273100, China
Keywords:
speech emotion recognition (SER); convolutional neural network; gated recurrent unit; small datasets; spectrogram features
CLC number:
TP18
DOI:
10.3969/j.issn.1673-629X.2020.12.014
Abstract:
To make human-computer interaction more natural and to improve the accuracy of speech emotion recognition, a CGRU deep learning method for small datasets is proposed. The original audio is first augmented by shifting it up and down. The augmented speech signal is mapped to the Mel scale to generate a Mel power spectrogram, to which image augmentation operations such as rotation, corner cropping, and shifting are then applied. The fusion model CGRU combines the ability of a convolutional neural network (CNN) to capture frequency-domain features with the ability of a gated recurrent unit (GRU) network to capture temporal information, and performs emotion recognition by automatically learning deep spectral features. Experiments compare the recognition performance of spectrogram features and hand-crafted features on Emo-DB, and compare the time performance of CLSTM and CGRU. The results show that CGRU with spectrogram features reaches an emotion recognition accuracy of 98.39%, exceeding that of the traditional hand-crafted feature set eGeMAPS on the same database, so the proposed method is competitive on speech emotion recognition tasks. In addition, under the same training parameters, CGRU has better time performance than CLSTM.
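The Mel power spectrogram step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no frame size, hop length, or number of Mel bands, so `n_fft`, `hop`, and `n_mels` below are assumed values, the function names are hypothetical, and the Hz-to-Mel conversion uses the common 2595·log10(1 + f/700) formula as an assumption.

```python
import numpy as np

def hz_to_mel(f):
    # Common Hz -> Mel conversion (assumed here; the abstract does not name a variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mel_power_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return an (n_mels, n_frames) Mel power spectrogram of a 1-D signal."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    return mel_filterbank(sr, n_fft, n_mels) @ power.T

# Usage: one second of a 440 Hz tone sampled at 16 kHz (synthetic stand-in for speech).
sr = 16000
tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
spec = mel_power_spectrogram(tone, sr)
print(spec.shape)  # (40, 61)
```

The resulting 2-D array can be treated as an image, which is what makes the image augmentation and CNN stages described in the abstract applicable to it.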


Last Update: 2020-12-10