[1] DU Rui-shan, CHEN Si-lu, LIU Wen-hao. Named Entity Recognition Based on Rock Text Information[J]. Computer Technology and Development, 2022, 32(09): 188-192. [doi:10.3969/j.issn.1673-629X.2022.09.029]

Named Entity Recognition Based on Rock Text Information
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
No. 09, 2022
Pages:
188-192
Section:
New Computing Application Systems
Publication Date:
2022-09-10

Article Info

Title:
Named Entity Recognition Based on Rock Text Information
Article ID:
1673-629X(2022)09-0188-05
Author(s):
DU Rui-shan (杜睿山), CHEN Si-lu (陈思路), LIU Wen-hao (刘文豪)
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
Keywords:
named entity recognition; Lexicon; rock; unstructured text; conditional random field; knowledge extraction
CLC Number:
TP391.1
DOI:
10.3969/j.issn.1673-629X.2022.09.029
Abstract:
Named entity recognition (NER) is an important task in natural language processing. However, named entities in rock text suffer from unclear boundaries, difficult word segmentation, error propagation, and low computational efficiency. Knowledge extraction from rock text is of great significance to research in oil and gas exploration. To this end, we first construct a rock text dataset and then propose a Lexicon-BiLSTM-CRF network model for unstructured rock text. The model first uses the Lexicon mechanism to obtain all matching words for each character, which addresses the problems of unclear boundaries and difficult word segmentation and, on this basis, improves computational efficiency. A bidirectional long short-term memory network (BiLSTM) then extracts contextual semantic features, and the resulting semantic vectors are passed to a conditional random field (CRF) layer and decoded with the Viterbi algorithm, which lowers the output probability of incorrect labels and predicts the entity tags, completing the named entity extraction task for rock text. Several groups of comparative experiments on the constructed rock text dataset verify that the proposed method achieves some improvement in accuracy and recall.
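
The two steps the abstract names explicitly, collecting all lexicon words that match each character and decoding tags with the Viterbi algorithm, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `match_lexicon` and `viterbi`, the toy lexicon, the `max_len` cutoff, and all scores are hypothetical, and the BiLSTM is omitted, so the emission scores below merely stand in for what a trained network would produce.

```python
from typing import Dict, List, Set


def match_lexicon(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[List[str]]:
    """For each character, collect every lexicon word (2..max_len chars) that covers it."""
    matches: List[List[str]] = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Candidate words begin at `start` and span at least two characters.
        for end in range(start + 2, min(len(sentence), start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for i in range(start, end):
                    matches[i].append(word)
    return matches


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]      : score of `tag` at position t (assumed BiLSTM output)
    transitions[prev][cur] : score of moving from tag `prev` to tag `cur`
    """
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the predecessor that maximises the running path score.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy data: "石英砂岩" (quartz sandstone) and a three-word lexicon.
toy_lexicon = {"石英", "砂岩", "石英砂岩"}
char_matches = match_lexicon("石英砂岩", toy_lexicon)

# Hypothetical scores: position 0 slightly prefers O, but O -> I is penalised.
toy_tags = ["B", "I", "O"]
toy_emissions = [{"B": 1.0, "I": 0.0, "O": 1.5}, {"B": 0.0, "I": 2.0, "O": 0.5}]
toy_transitions = {p: {c: 0.0 for c in toy_tags} for p in toy_tags}
toy_transitions["O"]["I"] = -10.0
decoded = viterbi(toy_emissions, toy_transitions, toy_tags)
```

In this toy run, picking the best tag per position independently would give the invalid sequence O, I; the transition penalty makes Viterbi prefer B, I instead, which is the sense in which a CRF layer lowers the chance of emitting incorrect labels.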

Last Update: 2022-09-10