[1] LIN Shu-du, SHAO Xi. Speaker Recognition with i-vector and Deep Learning[J]. Computer Technology and Development, 2017, 27(06): 66-71.

Speaker Recognition with i-vector and Deep Learning

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
27
Issue:
2017, No. 06
Pages:
66-71
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2017-06-10

Article Info

Title:
 Speaker Recognition with i-vector and Deep Learning
Article ID:
1673-629X(2017)06-0066-06
Author(s):
 LIN Shu-du, SHAO Xi
 College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications
Keywords:
 speaker recognition; deep neural network (DNN); i-vector; voiceprint features
CLC number:
TP301
Document code:
A
Abstract:
 To improve the performance of speaker recognition systems, a scheme is proposed that combines a deep neural network (DNN) model with the i-vector model. A suitably constructed neural network accurately extracts the hidden information in a speaker's voice. Although the DNN model helps uncover much of this information, i-vector features do not cover every dimension of the voiceprint. Higher-dimensional i-supervector features are therefore extracted on top of the i-vector features, which avoids unnecessary loss of information. To verify the feasibility of the scheme, training, validation, and testing were carried out on the speech of 630 speakers from TIMIT and other speech databases. The results show that extracting i-supervector features on top of i-vector features reduces the equal error rate of speaker recognition by 30%, which indicates that the method is effective.
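
The headline result in the abstract is stated as a reduction in equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one common way to estimate EER from verification scores and how a relative reduction would be computed. It is not the authors' evaluation code: the function names, the threshold sweep, the example scores, and the reading of "30%" as a relative reduction are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumes a higher score means "more likely the same speaker".
    Returns (eer, threshold) at the point where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # FRR: share of same-speaker trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # FAR: share of different-speaker trials scored at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline


# Hypothetical scores, invented for illustration only.
same_speaker = [0.9, 0.8, 0.7, 0.3]
different_speaker = [0.6, 0.4, 0.2, 0.1]
eer, threshold = equal_error_rate(same_speaker, different_speaker)
print(f"EER = {eer:.2f} at threshold {threshold:.1f}")

# Hypothetical EER values: a drop from 10% to 7% is a 30% relative reduction.
print(f"relative reduction = {relative_reduction(0.10, 0.07):.0%}")
```

With real systems the scores would come from comparing enrolled and test utterance representations (for example i-vector or i-supervector features), and interpolating between thresholds gives a smoother EER estimate than this nearest-point sweep.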

Last Update: 2017-07-26