[1] ZHOU Wen-zhuo, LIAO Guang-zhong. Chinese Medical Entity Recognition Based on RBIEGP[J]. Computer Technology and Development, 2025, (06): 124-130. [doi:10.20165/j.cnki.ISSN1673-629X.2025.0024]

Chinese Medical Entity Recognition Based on RBIEGP

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 06
Pages:
124-130
Section:
Artificial Intelligence
Publication date:
2025-06-10

Article Info

Title:
Chinese Medical Entity Recognition Based on RBIEGP
Article ID:
1673-629X(2025)06-0124-07
Author(s):
ZHOU Wen-zhuo (周文卓) 1, LIAO Guang-zhong (廖光忠) 2
1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China;
2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan 430065, China
Keywords:
entity recognition; pre-training; global pointer network; attention mechanism; receptive field
Classification number (CLC):
TP391
DOI:
10.20165/j.cnki.ISSN1673-629X.2025.0024
Abstract:
Entity recognition in Chinese medical texts is a key research direction in natural language processing. The inherent complexity of these texts, including ambiguous terminology, hierarchically nested entities, and a heavy dependence on contextual information, can significantly affect entity recognition results. To address these challenges, a Chinese entity recognition method based on the RBIEGP model is proposed. First, the input Chinese medical text is encoded with the RoBERTa-wwm-ext pre-trained model to generate a sequence of word vectors rich in semantic information. Next, these word vector sequences are fed into a BiGRU network and an iterated dilated convolutional neural network integrated with an attention mechanism, which capture the contextual information of the input text and extend the receptive field. Finally, the features that combine syntactic and semantic features, contextual information, and the extended receptive field are fed into the Efficient Global Pointer (EGP) network, which determines the entity categories and outputs a high-accuracy sequence of entity categories. Experimental results show that the RBIEGP model achieves F1 scores of 70.47% and 83.02% on the CMeEE and Yidu-S4k datasets, respectively, improvements of 2.72 and 1.99 percentage points over some existing mainstream models.
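Two ideas in the abstract can be illustrated with a small sketch: how stacking dilated convolutions extends the receptive field, and how a global-pointer-style head turns span scores into (possibly nested) entities. This is not the authors' implementation. The kernel size, the dilation schedule, the label names `symptom` and `body`, the toy tokens, the score values, and the zero threshold are all assumptions chosen for illustration; the abstract specifies none of them.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field (in tokens) of stacked stride-1 dilated 1-D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def decode_spans(scores: Dict[str, Matrix], threshold: float = 0.0) -> List[Tuple[str, int, int]]:
    """Read entities off per-label span-score matrices.

    scores[label][i][j] is the score that tokens i..j (inclusive) form an
    entity of that label. Only spans with start <= end are considered, and
    every span scoring above the threshold is kept, so overlapping and
    nested entities can coexist.
    """
    entities = []
    for label, matrix in scores.items():
        for start, row in enumerate(matrix):
            for end in range(start, len(row)):
                if row[end] > threshold:
                    entities.append((label, start, end))
    return entities


# Hypothetical dilation schedule: one block of dilations 1, 1, 2, repeated 4 times.
one_block = receptive_field(3, [1, 1, 2])
four_blocks = receptive_field(3, [1, 1, 2] * 4)

# Toy example: "upper abdominal pain" contains the nested span "upper abdominal".
tokens = ["upper", "abdominal", "pain"]
scores = {
    "symptom": [
        [-1.0, -1.0, 2.5],   # tokens 0..2 score above the threshold
        [-1.0, -1.0, -1.0],
        [3.0, -1.0, -1.0],   # start > end: ignored even though the score is high
    ],
    "body": [
        [-1.0, 1.2, -1.0],   # tokens 0..1 score above the threshold
        [-1.0, -1.0, -1.0],
        [-1.0, -1.0, -1.0],
    ],
}
entities = decode_spans(scores)

print(one_block, four_blocks)
for label, start, end in entities:
    print(label, " ".join(tokens[start:end + 1]))
```

Because each span is scored independently per label, the nested pair above is recovered without any conflict between tags, which is the property that makes span-based decoding attractive for the hierarchical entities the abstract mentions. In a real model the score matrices would come from the network rather than being written by hand.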


Last Update: 2025-06-10