[1] GONG Ru-xin, YU Xiao-sheng. Relation Extraction Method of Medical Texts Based on BERT-BILSTM[J]. Computer Technology and Development, 2022, 32(04): 186-192. [doi:10.3969/j.issn.1673-629X.2022.04.032]

Relation Extraction Method of Medical Texts Based on BERT-BILSTM

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 04
Pages:
186-192
Column:
Application Frontiers and Comprehensive Topics
Publication date:
2022-04-10

Article Info

Title:
Relation Extraction Method of Medical Texts Based on BERT-BILSTM
Article ID:
1673-629X(2022)04-0186-07
Author(s):
GONG Ru-xin, YU Xiao-sheng
School of Computer and Information, Three Gorges University, Yichang, Hubei 443002, China
Keywords:
relation extraction; BILSTM; attention mechanism; health and medical texts; BERT
CLC number:
TP391.1
DOI:
10.3969/j.issn.1673-629X.2022.04.032
Abstract:
Relation extraction from health and medical texts helps make full use of medical resources and lays the foundation for building hospital systems and related knowledge graphs. However, such texts are tightly connected in context and complex in structure, so traditional machine learning methods cannot fully learn and exploit the information they contain. Moreover, because domain-specific medical terms in the texts are not handled, important entities needed for the task are lost, which lowers accuracy. We therefore propose a relation extraction method for health and medical texts that combines BERT and BILSTM. In the preprocessing stage, medical keywords are extracted; the BERT language model then produces word embeddings; BILSTM and an attention mechanism process the features; and finally a Softmax classifier outputs class probabilities that determine the relation category between entities. In experiments on two clinical data sets, compared with unidirectional LSTM, CNN, BIGRU, and other models, the BERT-BILSTM-ATT model performed best, improving precision by more than 3.35%, recall by more than 1.28%, and F1 score by more than 2.58%. The proposed method can accurately and effectively predict the relation categories between entities in health and medical texts.
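
The last two stages named in the abstract, attention over the sequence of hidden states followed by a Softmax classifier, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the attention formula, so a common form (score each step with a learned vector applied to tanh of the hidden state, then take a weighted sum) is assumed here. The names `attention_pool` and `classify`, the toy hidden states standing in for BILSTM outputs, and all weights are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, w):
    """Assumed attention form: alpha = softmax(w . tanh(h_t)), output = sum_t alpha_t * h_t."""
    scores = [
        sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
        for h in hidden_states
    ]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[d] for a, h in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return pooled, alphas


def classify(pooled, weights, biases):
    """Linear layer followed by softmax, giving one probability per relation class."""
    logits = [
        sum(wi * xi for wi, xi in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Toy stand-ins for BILSTM outputs: 3 time steps, hidden size 3 (hypothetical values).
hidden_states = [[0.1, -0.2, 0.3], [0.9, 0.8, -0.1], [0.0, 0.1, 0.2]]
attn_w = [1.0, 1.0, 0.0]                            # hypothetical attention vector
class_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two hypothetical relation classes
class_biases = [0.0, 0.0]

pooled, alphas = attention_pool(hidden_states, attn_w)
probs = classify(pooled, class_weights, class_biases)
predicted = max(range(len(probs)), key=probs.__getitem__)
print("attention weights:", alphas)
print("class probabilities:", probs, "predicted class:", predicted)
```

In a full pipeline the hidden states would come from a BILSTM run over BERT embeddings, and the attention vector and classifier weights would be learned during training rather than fixed by hand.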


Last Update: 2022-04-10