[1] Lhamao Kyi, CAI Yu-qing, Renzeng Duojie, et al. Research on Tibetan Dialect Style Conversion Method Based on Knowledge Fusion[J]. Computer Technology and Development, 2025, (06): 137-144. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0019]

Research on Tibetan Dialect Style Conversion Method Based on Knowledge Fusion

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
Issue:
2025, No. 06
Pages:
137-144
Column:
Artificial Intelligence
Publication date:
2025-06-10

Article Information

Title:
Research on Tibetan Dialect Style Conversion Method Based on Knowledge Fusion
Article ID:
1673-629X(2025)06-0137-08
Author(s):
Lhamao Kyi (1,2,3), CAI Yu-qing (1,2,3), Renzeng Duojie (1,2,3), Zom Kyi (1,2,3), Nyima Tashi (1,2,3)*
1. School of Information Science and Technology, Tibet University, Lhasa 850000, China;
2. State Key Laboratory of Tibetan Intelligent Information Processing and Application, Lhasa 850000, China;
3. Engineering Research Center of Tibetan Language Information Technology of Ministry of Education, Tibet University, Lhasa 850000, China
Keywords:
Tibetan language; dialect; knowledge fusion; style conversion; knowledge base
CLC number:
TP391.1;H272
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0019
Abstract:
Tibetan has a rich dialect system, consisting mainly of the Ü-Tsang, Amdo, and Kham dialects, along with many sub-dialects with regional characteristics. Lexical and grammatical differences among the dialects increase the difficulty of natural language processing tasks such as speech recognition, machine translation, and speech data augmentation. To address this, the paper proposes a knowledge-fusion-based method for style conversion from Tibetan dialect text to the written language. The method first reviews the state of research on the grammar and vocabulary of Tibetan dialects and the usage characteristics of Tibetan function words in different dialects, and builds a knowledge base of dialect function words. Second, starting from existing colloquial dictionaries of the three major Tibetan dialects, it filters entries to build a colloquial dictionary that maps dialect content words to their written-language counterparts. Finally, the existing rules, the knowledge base, and the colloquial dictionary are fused to carry out the dialect style conversion. In experiments, the average accuracy was 71.45%, with 76.59% for the Ü-Tsang dialect, 73.16% for the Amdo dialect, and 64.62% for the Kham dialect. These results demonstrate the effectiveness of the method and confirm the practicality and feasibility of the constructed knowledge base.
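
The three-step fusion described in the abstract (rules, a function-word knowledge base, and a content-word dictionary) might be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `FUNCTION_WORD_KB`, `CONTENT_WORD_DICT`, `collapse_repeats`, and `convert` are hypothetical, every entry is a made-up placeholder rather than real Tibetan data, and the order in which rules and lookups are applied is an assumption.

```python
from typing import Callable, Dict, List

Rule = Callable[[List[str]], List[str]]

# Hypothetical placeholder entries; the paper's actual resources are not reproduced here.
# Function-word knowledge base: dialect -> {dialect function word: written form}.
FUNCTION_WORD_KB: Dict[str, Dict[str, str]] = {
    "u-tsang": {"fw_utsang_1": "fw_written_1"},
    "amdo": {"fw_amdo_1": "fw_written_1"},
    "kham": {"fw_kham_1": "fw_written_1"},
}

# Content-word dictionary: dialect -> {dialect content word: written form}.
CONTENT_WORD_DICT: Dict[str, Dict[str, str]] = {
    "u-tsang": {"cw_utsang_1": "cw_written_1"},
    "amdo": {"cw_amdo_1": "cw_written_1"},
    "kham": {"cw_kham_1": "cw_written_1"},
}


def collapse_repeats(tokens: List[str]) -> List[str]:
    """Placeholder rule: drop a token that immediately repeats the previous one."""
    out: List[str] = []
    for tok in tokens:
        if not out or out[-1] != tok:
            out.append(tok)
    return out


# Assumed rule set; real rules would encode dialect-specific grammar.
RULES: List[Rule] = [collapse_repeats]


def convert(tokens: List[str], dialect: str) -> List[str]:
    """Apply the rules, then look each token up in the knowledge base and dictionary."""
    for rule in RULES:
        tokens = rule(tokens)
    fw = FUNCTION_WORD_KB.get(dialect, {})
    cw = CONTENT_WORD_DICT.get(dialect, {})
    # Function words are checked first; unknown tokens pass through unchanged.
    return [fw.get(tok, cw.get(tok, tok)) for tok in tokens]


print(convert(["cw_amdo_1", "fw_amdo_1", "fw_amdo_1", "unk"], "amdo"))
```

Keeping the three resources as separate tables makes it easy to swap in a larger dictionary or extra rules per dialect without touching the conversion loop; a real system would also need Tibetan word segmentation before this step, which the sketch assumes has already been done.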


Last Update: 2025-06-10