[1] LIU Da-peng, Kazuhiko Ozeki, ZHU Qing-sheng. Adding Phoneme Duration Information to Spectral Model: in Speaker Identification [J]. Computer Technology and Development, 2007, (05): 156-159.

Research on Speaker Identification by Adding Phoneme Duration Information to the Spectral Model

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2007, No. 05
Pages:
156-159
Section:
Intelligence, Algorithms, Systems Engineering
Publication date:
1900-01-01

Article Info

Title:
Adding Phoneme Duration Information to Spectral Model: in Speaker Identification
Article ID:
1673-629X(2007)05-0156-04
Author(s):
LIU Da-peng [1,2], Kazuhiko Ozeki [2], ZHU Qing-sheng [1]
[1] Sch. of Computer Science, Chongqing University; [2] Dept. of Information and Communication Engineering, The University of Electro-Communications
Keywords:
speaker identification; Gaussian mixture model (GMM); phoneme duration information
CLC number:
TP391.42 TN912.3
Document code:
A
Abstract:
Conventional speaker recognition systems use short-term spectral information to identify speakers. They perform well under some conditions. However, since part of a speaker's characteristics is hidden in longer speech segments, performance may be further improved by adding this long-term information. In this paper, phoneme duration information is added to the conventional model to improve the speaker identification rate. While spectral information is extracted by short-term analysis, extracting phoneme duration information requires long-term analysis. Phoneme duration analysis therefore usually needs more speech data than spectral analysis does. In the first part of this work, the effectiveness of phoneme duration information is investigated using a large amount of speech data. Two methods are then presented to solve the problems caused by using only a small amount of data. The experimental results show that phoneme duration information is effective in improving speaker identification performance even with a small amount of speech data, provided the speaker models are built appropriately.
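
The idea of adding duration information to a spectral score can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method: the abstract does not say how the two sources are combined, so this sketch uses a simple weighted sum of per-speaker log-likelihoods, models each phoneme's duration with a single Gaussian, and takes the spectral (e.g. GMM) log-likelihoods as precomputed inputs. The names `gaussian_logpdf`, `duration_log_likelihood`, `identify`, and the `weight` parameter are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def duration_log_likelihood(segments, duration_model):
    """Average log-likelihood of observed (phoneme, duration) pairs.

    duration_model maps phoneme -> (mean, variance) in seconds.
    Phonemes missing from the model are skipped (an assumption).
    """
    scores = [
        gaussian_logpdf(dur, *duration_model[ph])
        for ph, dur in segments
        if ph in duration_model
    ]
    return sum(scores) / len(scores) if scores else 0.0


def identify(spectral_scores, segments, duration_models, weight=0.3):
    """Pick the speaker with the highest combined score.

    spectral_scores: speaker -> precomputed spectral log-likelihood.
    duration_models: speaker -> {phoneme: (mean, variance)}.
    weight: hypothetical interpolation weight for the duration score.
    """
    combined = {}
    for speaker, spectral in spectral_scores.items():
        duration = duration_log_likelihood(segments, duration_models[speaker])
        combined[speaker] = (1.0 - weight) * spectral + weight * duration
    best = max(combined, key=combined.get)
    return best, combined


# Two speakers that the spectral score alone cannot separate.
spectral_scores = {"A": -50.0, "B": -50.0}
duration_models = {
    "A": {"a": (0.10, 0.0004)},  # speaker A: short /a/, about 100 ms
    "B": {"a": (0.20, 0.0004)},  # speaker B: long /a/, about 200 ms
}
segments = [("a", 0.11), ("a", 0.09)]  # observed phoneme durations

best, combined = identify(spectral_scores, segments, duration_models)
print(best)
```

In this toy case the spectral scores are tied, so the observed durations decide the outcome. With `weight=0.0` the function falls back to the spectral score alone, which corresponds to the conventional system described in the abstract.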

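Memo

Memo:
LIU Da-peng (19807-), male, born in Laizhou, Shandong, master's student, research interest: speaker recognition; Kazuhiko Ozeki, professor, advisory professor at Chongqing University, research interest: language processing; ZHU Qing-sheng, professor, doctoral supervisor, research interests: image and multimedia technology.
Last Update: 1900-01-01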