[1] YAO Fei-yang, LIU Xiao-jing*. Entity and Relation Joint Extraction Method Based on RoBERTa-Effg-Adv[J]. Computer Technology and Development, 2024, 34(03): 147-154. [doi:10.3969/j.issn.1673-629X.2024.03.022]

Entity and Relation Joint Extraction Method Based on RoBERTa-Effg-Adv

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 03
Pages:
147-154
Column:
Artificial Intelligence
Publication date:
2024-03-10

Article Info

Title:
Entity and Relation Joint Extraction Method Based on RoBERTa-Effg-Adv
Article ID:
1673-629X(2024)03-0148-08
Author(s):
YAO Fei-yang, LIU Xiao-jing*
Department of Computer Technology and Application, Qinghai University, Xining 810016, China
Keywords:
RoBERTa-wwm-ext; adversarial training; relation extraction; Efficient GlobalPointer; Chinese entity
CLC number:
TP391.1
DOI:
10.3969/j.issn.1673-629X.2024.03.022
Abstract:
Entity and relation extraction is a key step in constructing a knowledge graph; its purpose is to extract relation triples from text. To address the inability of existing Chinese joint entity and relation extraction models to extract overlapping relation triples effectively, as well as their insufficient extraction performance, this paper proposes a joint entity and relation extraction model based on RoBERTa-Effg-Adv. At the encoder, the RoBERTa-wwm-ext pre-trained model encodes the input data, and the Efficient GlobalPointer model handles both nested and non-nested named entity recognition. Each entity-relation triple is split into a five-tuple for joint extraction. Adversarial training is then added to improve the robustness of the model. To obtain a machine-readable corpus, relevant books were scanned and processed with optical character recognition, and the data were then manually annotated to form the relation extraction dataset required by this study, REDQTTM, which contains 18 entity types and 11 relation types. The experimental results verify the effectiveness of the proposed method for Chinese joint entity and relation extraction in the domain of Qutan Temple murals. On the REDQTTM test set, precision reaches 94.0%, recall reaches 90.7%, and the F1 value reaches 92.3%. Compared with the GPLinker model, precision, recall, and F1 are improved by 2.4, 0.9, and 1.6 percentage points, respectively.
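
As a quick consistency check on the figures in the abstract, the sketch below recomputes F1 as the harmonic mean of the reported precision and recall. The GPLinker scores it uses are not quoted from the paper: they are derived here by subtracting the stated percentage-point gains from the reported scores, and the names `f1_score`, `reported`, `gains`, and `implied_gplinker` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported in the abstract for RoBERTa-Effg-Adv on the REDQTTM test set (percent).
reported = {"precision": 94.0, "recall": 90.7, "f1": 92.3}

# Reported gains over GPLinker, in percentage points.
gains = {"precision": 2.4, "recall": 0.9, "f1": 1.6}

# Baseline scores implied by the gains (an assumption derived here, not quoted from the paper).
implied_gplinker = {key: round(reported[key] - gains[key], 1) for key in reported}

# Recompute F1 from precision and recall for both models.
f1_model = f1_score(reported["precision"], reported["recall"])
f1_baseline = f1_score(implied_gplinker["precision"], implied_gplinker["recall"])

print(f"RoBERTa-Effg-Adv F1 recomputed: {f1_model:.1f} (reported {reported['f1']})")
print(f"GPLinker F1 recomputed: {f1_baseline:.1f} (implied {implied_gplinker['f1']})")
```

Both recomputed values agree with the abstract to one decimal place, which supports reading the stated improvements as percentage points rather than relative percentages.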


Last Update: 2024-03-10