[1] SHANG Fu-hua, JIN Quan*, CAO Mao-jun. Research on Logging Named Entity Extraction Method Based on Senna-BiLSTM-CRF[J]. Computer Technology and Development, 2021, 31(12): 180-186. [doi:10.3969/j.issn.1673-629X.2021.12.030]

Research on Logging Named Entity Extraction Method Based on Senna-BiLSTM-CRF
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
31
Issue:
No. 12, 2021
Pages:
180-186
Section:
Application Frontiers and Comprehensive Topics
Publication date:
2021-12-10

Article Info

Title:
Research on Logging Named Entity Extraction Method Based on Senna-BiLSTM-CRF
Article ID:
1673-629X(2021)12-0180-07
Author(s):
SHANG Fu-hua (尚福华), JIN Quan* (金泉), CAO Mao-jun (曹茂俊)
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
Keywords:
entity extraction; knowledge graph; deep learning; word vector; well logging
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2021.12.030
Abstract:
Entity extraction is a critical step in constructing a knowledge graph, and its quality directly determines the quality of the resulting graph. To better construct a knowledge graph for the well-logging domain, we study methods for extracting logging named entities. Because no public dataset is available for building a logging-domain knowledge graph, we collected unstructured text related to well logging, manually annotated the logging entities in it, and built a named entity extraction dataset for the logging-domain knowledge graph. On this dataset, we propose a Senna word vector-BiLSTM-CRF method for extracting named entities from unstructured logging text, which reduces the difficulty of data annotation and improves training efficiency. Experiments show that the method completes the logging entity extraction task fairly effectively: on the constructed dataset it achieves a precision of 84.87%, a recall of 81.62%, and an F1 score of 83.22%, outperforming the BiLSTM-CRF and word vector-BiLSTM-CRF models used for comparison.
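The three reported scores can be cross-checked against each other. The following is a minimal sketch, assuming the reported 84.87% is precision in the usual named entity recognition sense and that F1 is the standard harmonic mean of precision and recall; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_precision = 84.87
reported_recall = 81.62

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 83.21%"
```

The recomputed value differs from the reported 83.22% by 0.01 percentage points, which is what one would expect if the precision and recall were rounded to two decimals before publication.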


Last Update: 2021-12-10