[1] ZHU Zhen-he, WU Hong, GAO Jie, et al. A Generative Entity Relation Extraction Method Based on External Knowledge[J]. Computer Technology and Development, 2023, 33(08): 124-130. [doi:10.3969/j.issn.1673-629X.2023.08.018]

A Generative Entity Relation Extraction Method Based on External Knowledge

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
2023, No. 08
Pages:
124-130
Section:
Artificial Intelligence
Publication date:
2023-08-10

Article Info

Title:
A Generative Entity Relation Extraction Method Based on External Knowledge
Article ID:
1673-629X(2023)08-0124-07
Author(s):
ZHU Zhen-he (1), WU Hong (2), GAO Jie (2), ZHOU Yu (3, 4)
1. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China;
2. National Academy of Innovation Strategy, Beijing 100038, China;
3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;
4. Fanyu AI Research, Beijing Fanyu Technology Ltd., Beijing 100190, China
Keywords:
entity relation extraction; encoder-decoder framework; knowledge fusion; deep learning; attention mechanism
CLC number:
TP391.1
DOI:
10.3969/j.issn.1673-629X.2023.08.018
Abstract:
Joint entity-relation extraction is indispensable for building knowledge graphs across domains and has become a hot topic in information extraction. Existing generative joint entity-relation extraction methods mostly adopt an encoder-decoder framework, using supervised learning to extract features from unstructured text and generate sequences of entities and relations. Such methods are data-driven: output quality is low when annotated data is scarce, and obtaining annotated data is costly. Methods based on distant supervision use an external knowledge base to annotate text automatically, which addresses the lack of large-scale annotated data, but the incorrect labels they introduce also degrade model performance. To address these problems, we propose a generative joint entity-relation extraction method that fuses external knowledge, using multiple encoders and a knowledge attention mechanism to incorporate external knowledge such as structured information and syntactic structure into the model. Specifically, the model is first pre-trained on annotated data to learn entity-relation representations and then trained again with external knowledge to learn syntactic structure and other information. Experimental results show that, by fusing external knowledge, the proposed method improves the accuracy of extracted entity-relation triples, particularly when annotated data is scarce.
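The abstract does not specify the form of the knowledge attention mechanism, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes scaled dot-product attention, one text encoder and one knowledge encoder, and fusion by concatenating the two context vectors. All names (`attend`, `fuse`, `know_weights`) and the toy values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, states):
    """Scaled dot-product attention of one query over a list of encoder states."""
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, state)) / scale for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]
    return context, weights

def fuse(query, text_states, knowledge_states):
    """Attend separately over each encoder's outputs, then concatenate (assumed fusion)."""
    text_ctx, _ = attend(query, text_states)
    know_ctx, know_weights = attend(query, knowledge_states)
    return text_ctx + know_ctx, know_weights

# Toy inputs (hypothetical): a decoder state and 2-dimensional encoder states.
query = [1.0, 0.0]
text_states = [[1.0, 0.0], [0.0, 1.0]]
knowledge_states = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
fused, know_weights = fuse(query, text_states, knowledge_states)
```

In this sketch the decoder would consume `fused` at each generation step; the knowledge state most similar to the query receives the largest attention weight.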

Last Update: 2023-08-10