[1] WAN Miao, REN Jie *, MA Miao, et al. Chinese Dialect Classification via Multi-task Learning[J]. Computer Technology and Development, 2022, 32(04): 109-115. [doi:10.3969/j.issn.1673-629X.2022.04.019]

Chinese Dialect Classification via Multi-task Learning (多任务学习在中国方言分类中的应用研究)

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 04
Pages:
109-115
Column:
Application Frontiers and General Topics
Publication date:
2022-04-10

Article Info

Title:
Chinese Dialect Classification via Multi-task Learning
Article ID:
1673-629X(2022)04-0109-07
Author(s):
WAN Miao (万苗)1, REN Jie (任杰)1 *, MA Miao (马苗)1, CAO Rui (曹瑞)2
1. School of Computer Science, Shaanxi Normal University, Xi'an 710119, China;
2. School of Information Science and Technology, Northwest University, Xi'an 710127, China
Keywords:
Chinese dialect recognition; multi-task learning; neural network; MFCC; parameter sharing of neural network
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2022.04.019
Abstract:
In recent years, deep learning has performed strongly in speech recognition, and deep-learning-based speech recognition systems are now widely used in smart homes, intelligent customer service, meeting minutes, real-time captioning, and other scenarios. However, China's many ethnic groups, large differences in language and culture, and diverse, complex dialects pose a major challenge to such systems; in particular, existing Chinese dialect classification systems still perform poorly on short speech segments. We study the mel-frequency cepstral coefficient (MFCC) feature and build deep-learning-based dialect classification models on a data set of ten Chinese dialects. First, we build a single-task learning model based on long short-term memory (LSTM) for three speech features; the MFCC-based model is the most accurate of the three, reaching 79.04%. Second, we exploit the geographical characteristics of dialects and build a multi-task learning model with hard parameter sharing that uses the region of each dialect as an auxiliary task; its classification accuracy reaches up to 79.96%. Finally, because hard parameter sharing cannot effectively learn the correlation between sub-tasks, we propose, for the first time, a multi-task learning model based on sparse parameter sharing. Through joint training, this model automatically learns the correlation between sub-tasks, prunes redundant parts of the network, and shares network parameters. Experimental results show that, compared with other state-of-the-art methods, the MFCC-based multi-task classification model with sparse parameter sharing performs best, with a classification accuracy of up to 83.59%.
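
The difference between hard and sparse parameter sharing described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: to stay dependency-free it replaces the LSTM with a single tanh layer over one mean-pooled MFCC vector, and the names and values `NUM_REGIONS`, `FEATURE_DIM`, `HIDDEN_DIM`, `aux_weight`, and the top-k magnitude rule in `top_k_mask` are all assumptions made for illustration. Each task reads the shared weights through its own binary mask, and the joint loss adds the auxiliary region term to the main dialect term.

```python
import math
import random

NUM_DIALECTS = 10  # from the abstract: ten Chinese dialects
NUM_REGIONS = 3    # assumption: the abstract does not give the number of regions
FEATURE_DIM = 13   # assumption: one mean-pooled MFCC vector per utterance
HIDDEN_DIM = 8     # assumption: width of the shared layer


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def top_k_mask(scores, keep_ratio):
    """Keep the largest-magnitude entries (assumed pruning rule), zero the rest."""
    rows, cols = len(scores), len(scores[0])
    n_keep = max(1, int(rows * cols * keep_ratio))
    ranked = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: abs(scores[rc[0]][rc[1]]),
        reverse=True,
    )
    mask = [[0] * cols for _ in range(rows)]
    for r, c in ranked[:n_keep]:
        mask[r][c] = 1
    return mask


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def task_forward(x, shared_w, mask, head_w):
    """Run one task's subnetwork: masked shared layer, then the task's own head."""
    hidden = [
        math.tanh(sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)))
        for w_row, m_row in zip(shared_w, mask)
    ]
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in head_w])


def joint_loss(x, dialect, region, params, aux_weight=0.5):
    """Cross-entropy of the main task plus a weighted auxiliary-task term."""
    p_dialect = task_forward(x, params["shared"], params["mask_dialect"], params["head_dialect"])
    p_region = task_forward(x, params["shared"], params["mask_region"], params["head_region"])
    return -math.log(p_dialect[dialect]) - aux_weight * math.log(p_region[region])


rng = random.Random(0)
shared = random_matrix(HIDDEN_DIM, FEATURE_DIM, rng)
params = {
    "shared": shared,
    # Stand-in importance scores (assumption): random values replace whatever
    # per-task criterion a real system would use to choose each mask.
    "mask_dialect": top_k_mask(random_matrix(HIDDEN_DIM, FEATURE_DIM, rng), 0.5),
    "mask_region": top_k_mask(random_matrix(HIDDEN_DIM, FEATURE_DIM, rng), 0.5),
    "head_dialect": random_matrix(NUM_DIALECTS, HIDDEN_DIM, rng),
    "head_region": random_matrix(NUM_REGIONS, HIDDEN_DIM, rng),
}
x = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]  # fake MFCC vector
sparse_loss = joint_loss(x, dialect=4, region=1, params=params)

# Hard sharing is the special case in which both masks are all ones.
ones = [[1] * FEATURE_DIM for _ in range(HIDDEN_DIM)]
hard_loss = joint_loss(x, 4, 1, dict(params, mask_dialect=ones, mask_region=ones))

# Weights kept by both masks are the ones the two tasks actually share.
overlap = sum(
    a & b
    for row_a, row_b in zip(params["mask_dialect"], params["mask_region"])
    for a, b in zip(row_a, row_b)
)
print(f"sparse loss={sparse_loss:.3f}, hard loss={hard_loss:.3f}, shared weights={overlap}")
```

In this sketch only the weights selected by both masks receive gradient signal from both tasks during joint training; weights selected by a single mask stay task-specific, and weights selected by neither are effectively pruned.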

Last Update: 2022-04-10