[1] WANG Meng, LIU Chun-gang, ZHAO Hua. Entity Recognition in Education Domain Based on Character Attention and Dictionary Feature[J]. Computer Technology and Development, 2024, 34(07): 168-174. [doi:10.20165/j.cnki.ISSN1673-629X.2024.0109]

Entity Recognition in Education Domain Based on Character Attention and Dictionary Feature

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
34
Issue:
2024, No. 07
Pages:
168-174
Section:
Artificial Intelligence
Publication date:
2024-07-10

Article Info

Title:
Entity Recognition in Education Domain Based on Character Attention and Dictionary Feature
Article ID:
1673-629X(2024)07-0168-07
Author(s):
WANG Meng (1, 2); LIU Chun-gang (1, 2); ZHAO Hua (1, 2)
1. School of Vocational Technology and Combustion Engineering, Hebei Normal University, Shijiazhuang 050024, China; 2. Hebei Key Laboratory of Information Fusion and Intelligent Control, Shijiazhuang 050024, China
Keywords:
entity recognition; dictionary feature; character attention; IDCNN; conditional random field
Classification number:
TP391
DOI:
10.20165/j.cnki.ISSN1673-629X.2024.0109
Abstract:
Existing entity recognition methods do not account for the influence of education-domain terminology on recognition performance, which leads to poor model performance and blurred boundaries of knowledge entities. To address this, an entity recognition method for the education domain based on character attention and dictionary features is proposed. First, a pre-trained BERT language model generates character vectors from contextual semantic information, and a part-of-speech-based character attention mechanism reassigns the weights of the characters in a sentence. The result is then concatenated with features from a constructed education-domain dictionary and fed into a BiLSTM network and an IDCNN network for feature extraction; an attention mechanism dynamically combines the outputs of the two networks by weighting them, producing a fused feature. Finally, a conditional random field computes the label sequence corresponding to the entities. Compared with existing methods, the proposed method achieves higher accuracy on an education-subject text corpus, with precision, recall, and F1 score of 90.71%, 91.37%, and 91.04%, respectively.
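
The attention-based combination of the BiLSTM and IDCNN outputs mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the scoring function, so this example assumes a dot product against a single query vector followed by a softmax over the two branches; the names `fuse`, `query`, `bilstm_out`, and `idcnn_out` are hypothetical and the numbers are toy values, not the authors' implementation or data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fuse(bilstm_out, idcnn_out, query):
    """Fuse two feature sequences character by character.

    For each position, the two branch features are scored against
    `query` (an assumed scoring scheme), the scores are normalised
    with softmax, and the features are summed with those weights.
    """
    fused = []
    for h_lstm, h_cnn in zip(bilstm_out, idcnn_out):
        w_lstm, w_cnn = softmax([dot(query, h_lstm), dot(query, h_cnn)])
        fused.append([w_lstm * a + w_cnn * b for a, b in zip(h_lstm, h_cnn)])
    return fused


# Toy features for a two-character sentence, two dimensions per character.
bilstm_out = [[1.0, 0.0], [0.5, 0.5]]
idcnn_out = [[0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]  # hypothetical learned parameter

fused = fuse(bilstm_out, idcnn_out, query)
```

In this toy input the first character's BiLSTM feature scores higher against `query`, so it receives the larger weight; the second character has identical features in both branches, so the weights are equal. In a trained model the fused sequence would then be passed to the conditional random field layer.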


Last Update: 2024-07-10