[1] GUI Xiang-quan, GUO Liang, LI Li. Nonferrous Metallurgical Named Entity Recognition Model Based on MRC and ERNIE[J]. Computer Technology and Development, 2023, 33(10): 93-100. [doi:10.3969/j.issn.1673-629X.2023.10.015]

Nonferrous Metallurgical Named Entity Recognition Model Based on MRC and ERNIE

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume: 33
Issue: No. 10, 2023
Pages: 93-100
Column: Artificial Intelligence
Publication date: 2023-10-10

Article Info

Title:
Nonferrous Metallurgical Named Entity Recognition Model Based on MRC and ERNIE
Article ID:
1673-629X(2023)10-0093-08
Author(s):
GUI Xiang-quan, GUO Liang, LI Li
School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
Keywords:
nonferrous metallurgy industry; natural language processing; named entity recognition; machine reading comprehension (MRC); enhanced representation through knowledge integration (ERNIE)
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.10.015
Abstract:
Named entities are an important basis for building industrial enterprise profiles and industrial knowledge graphs. Existing methods for named entity recognition in the nonferrous metallurgy domain cannot fully extract the semantic features of text, do not make full use of the prior knowledge contained in labels, and perform poorly on nested named entities. To address these problems, we propose the MEAB (MRC-ERNIE-Attention-BiLSTM) model structure, based on the Machine Reading Comprehension (MRC) framework and the Enhanced Representation through Knowledge Integration (ERNIE) model. Building on the MRC framework, the model introduces an Attention-based information fusion strategy: two differently structured kinds of data are converted into vectors after feature extraction by the ERNIE pre-trained model, and the vectors are fused in the information fusion layer so that the model can learn the prior knowledge in the labels. A BiLSTM model then extracts features from the semantically informed vectors in both directions and outputs them through a multi-layer nested named entity recognizer, improving recognition accuracy for nested named entities. Experiments on a purpose-built named entity recognition dataset for the nonferrous metallurgy domain show that the MEAB model reaches a precision, recall, and F1 score of 78.77%, 79.76%, and 79.26%, respectively, which demonstrates the effectiveness of the model.
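The abstract gives no implementation details, so the following is only a minimal illustrative sketch of the general ideas it names, not the authors' MEAB implementation. It assumes that each entity type is paired with a natural-language query (the MRC formulation), that label information is fused into the context by simple dot-product attention, and that spans are decoded per entity type so that spans of different types may nest. The query texts, entity labels, and function names are all hypothetical, and ERNIE and BiLSTM are not modeled here.

```python
import math

# Hypothetical entity-type queries; the paper's actual label descriptions
# are not given in the abstract.
QUERIES = {
    "ORG": "Find the organizations mentioned in the text.",
    "FACILITY": "Find the production facilities mentioned in the text.",
}


def build_mrc_inputs(context, queries):
    """Pair every entity-type query with the same context (one MRC instance per type)."""
    return [(label, query, context) for label, query in queries.items()]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(context_vecs, query_vecs):
    """Add to each context vector an attention-weighted sum of the query vectors.

    This is plain dot-product attention, used here as a stand-in for the
    paper's Attention-based information fusion layer (an assumption).
    """
    fused = []
    for c in context_vecs:
        scores = [sum(ci * qi for ci, qi in zip(c, q)) for q in query_vecs]
        weights = softmax(scores)
        summary = [
            sum(w * q[d] for w, q in zip(weights, query_vecs))
            for d in range(len(c))
        ]
        fused.append([ci + si for ci, si in zip(c, summary)])
    return fused


def decode_spans(start_flags, end_flags):
    """Match each predicted start index to the nearest end index at or after it."""
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


# Illustrative (made-up) predictions for the tokens ["Acme", "Copper", "smelter"].
# Decoding each entity type separately lets the ORG span sit inside the FACILITY span.
nested = {
    "ORG": decode_spans([1, 0, 0], [0, 1, 0]),
    "FACILITY": decode_spans([1, 0, 0], [0, 0, 1]),
}
```

Because every entity type is queried and decoded independently, overlapping spans never compete for a single label per token, which is the usual reason an MRC formulation is considered suitable for nested entities.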

Last Update: 2023-10-10