[1] TENG Xiao-qi, FENG Xiang, ZHANG Yi-fei. A Voice Activity Detection Method Based on Adaptive Modeling[J]. Computer Technology and Development, 2016, 26(09): 26-29.

 A Voice Activity Detection Method Based on Adaptive Modeling

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
26
Issue:
No. 09, 2016
Pages:
26-29
Column:
Application Development Research
Publication date:
2016-09-10

Article Info

Title:
 A Voice Activity Detection Method Based on Adaptive Modeling
Article ID:
1673-629X(2016)09-0026-04
Author(s):
 TENG Xiao-qi[1] FENG Xiang[2] ZHANG Yi-fei[2][3]
1. Beijing Internet Information Office; 2. iFLYTEK Zhiyuan Information Technology Co., Ltd.; 3. School of Mechatronic Engineering and Automation, Shanghai University
Keywords:
 voice activity detection; energy VAD; model VAD; adaptive modeling
CLC number:
TP301
Document code:
A
Abstract:
 Voice activity detection (VAD) is an important stage of speech front-end feature processing, and it directly affects the effectiveness and efficiency of subsequent processing. Mainstream model-based VAD depends too heavily on its training data: a different model has to be retrained for each scenario, which entails an enormous data-labeling workload. A VAD method based on adaptive modeling combines the advantages of energy VAD and model VAD and solves this problem. For each utterance, it trains a speech model and a non-speech model online, labels every frame according to its likelihood score under the two models, and, after smoothing, locates the start and end points of speech. Experimental results show that the method performs well: compared with traditional energy VAD, the F1 score improves by 0.031 and the speaker separation error rate decreases by 0.45%.
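The abstract describes the pipeline only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes log frame energy as the only feature, a midpoint energy threshold for the initial labels, one Gaussian per class as the speech and non-speech models, and a majority-vote filter for smoothing. All function names and parameters (`frame_log_energy`, `adaptive_vad`, `frame_len`, `hop`, `width`, `n_iter`) are hypothetical.

```python
import numpy as np


def frame_log_energy(signal, frame_len=400, hop=160):
    """Cut a 1-D signal into overlapping frames and return log energy per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.log(np.sum(signal[idx] ** 2, axis=1) + 1e-10)


def gaussian_loglik(x, mean, var):
    """Log-likelihood of each value in x under a 1-D Gaussian."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def majority_smooth(labels, width=11):
    """Majority vote over a sliding window of odd width (assumed smoothing step)."""
    pad = width // 2
    padded = np.pad(labels, pad, mode="edge")
    counts = np.convolve(padded, np.ones(width), mode="valid")
    return (counts > pad).astype(int)


def adaptive_vad(signal, n_iter=5):
    """Return one 0/1 label per frame (1 = speech) for a single utterance."""
    feats = frame_log_energy(signal)
    # Energy VAD seed (assumption): threshold halfway between min and max energy.
    labels = (feats > 0.5 * (feats.min() + feats.max())).astype(int)
    for _ in range(n_iter):
        if labels.min() == labels.max():
            break  # only one class left, nothing to model
        speech, nonspeech = feats[labels == 1], feats[labels == 0]
        # Fit the per-utterance models online, then relabel by likelihood score.
        ll_speech = gaussian_loglik(feats, speech.mean(), speech.var() + 1e-6)
        ll_nonspeech = gaussian_loglik(feats, nonspeech.mean(), nonspeech.var() + 1e-6)
        labels = (ll_speech > ll_nonspeech).astype(int)
    return majority_smooth(labels)


def endpoints(labels):
    """First and last speech frame index, or None if no speech was found."""
    idx = np.flatnonzero(labels)
    return (int(idx[0]), int(idx[-1])) if idx.size else None


# Usage on a synthetic utterance: 1 s noise, 1 s tone plus noise, 1 s noise at 16 kHz.
rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(48000)
tone = np.sin(2.0 * np.pi * 200.0 * np.arange(16000) / 16000.0)
utterance = noise.copy()
utterance[16000:32000] += tone
start_frame, end_frame = endpoints(adaptive_vad(utterance))
```

With a 160-sample hop, frame indices convert to seconds as `frame * 160 / 16000`. A real system would likely use richer features and mixture models; the single-Gaussian choice here only keeps the sketch short.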


Last Update: 2016-10-24