[1] WANG Jun, WANG Xiu-lai*, LUAN Wei-xian, et al. Research on Named Entity Recognition of Scientific Research Talents Field Based on BERT Model[J]. Computer Technology and Development, 2021, 31(11): 21-27. [doi:10.3969/j.issn.1673-629X.2021.11.004]

Research on Named Entity Recognition of Scientific Research Talents Field Based on BERT Model
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
31
Issue:
2021, No. 11
Pages:
21-27
Section:
Big Data Analysis and Mining
Publication date:
2021-11-10

Article Info

Title:
Research on Named Entity Recognition of Scientific Research Talents Field Based on BERT Model
Article ID:
1673-629X(2021)11-0021-07
Author(s):
WANG Jun 1,2; WANG Xiu-lai 1*; LUAN Wei-xian 3; YE Fan 3
1. School of Management Science and Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, China;
2. School of Media Technology, Communication University of China, Nanjing, Nanjing 211172, China;
3. Unit 31102 of PLA, Nanjing 210002, China
Keywords:
BERT; named entity recognition; scientific research talents; machine learning; natural language processing
CLC number:
TP181
DOI:
10.3969/j.issn.1673-629X.2021.11.004
Abstract:
The discovery and mining of scientific research talents not only provides basic information support for tasks such as scholar profiling, popularity analysis of research fields, and prediction of research achievements, but is also a key technology for improving precision services for research talents and for raising the level of intelligence in frontier science and technology. Traditional machine learning algorithms for named entity recognition in the scientific research talent field suffer from low accuracy and efficiency, heavy dependence on corpora, and inaccurate word segmentation. To address these problems, we define the categories and labeling symbols of named entities in this field based on the basic attributes and research attributes of scientific research talents, yielding 7 major categories with 19 subcategories in total. We use the BERT model to generate word vectors and combine BiLSTM's ability to memorize context with CRF's ability to learn labeling rules to construct a BERT-BiLSTM-CRF model for named entity recognition in this field. The model is trained and fine-tuned on a corpus of 6,134 scientific research consulting entries and tested on data about scientific research talents crawled from the Internet. The results show that the model achieves good recognition performance.
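
The abstract credits the CRF layer with learning labeling rules. As a minimal sketch of what that means at decoding time, the pure-Python Viterbi decoder below selects the best tag sequence from per-token emission scores (of the kind a BiLSTM over BERT vectors might output) and tag-to-tag transition scores. The BIO scheme, the tag names (`O`, `B-PER`, `I-PER`), and every numeric score are illustrative assumptions; they are not the paper's 19 subcategories or its trained parameters.

```python
from typing import Dict, List

# Hypothetical BIO-style tag set; the paper defines its own labeling symbols.
TAGS: List[str] = ["O", "B-PER", "I-PER"]

NEG = -1e4  # large penalty standing in for a "forbidden" transition


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            start: Dict[str, float]) -> List[str]:
    """Return the highest-scoring tag path for one sentence."""
    # score[t] = best score of any path that ends in tag t at the current token
    score = {t: start[t] + emissions[0][t] for t in TAGS}
    backptrs: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        ptr: Dict[str, str] = {}
        for cur in TAGS:
            # Best previous tag from which to reach `cur`
            prev = max(TAGS, key=lambda p: score[p] + transitions[p][cur])
            new_score[cur] = score[prev] + transitions[prev][cur] + emit[cur]
            ptr[cur] = prev
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final tag
    path = [max(TAGS, key=lambda t: score[t])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


# Assumed rule: "I-PER" may only follow "B-PER" or "I-PER".
TRANSITIONS = {
    "O":     {"O": 0.0, "B-PER": 0.0, "I-PER": NEG},
    "B-PER": {"O": 0.0, "B-PER": 0.0, "I-PER": 0.5},
    "I-PER": {"O": 0.0, "B-PER": 0.0, "I-PER": 0.5},
}
START = {"O": 0.0, "B-PER": 0.0, "I-PER": NEG}

# Made-up emission scores for three tokens: the first token slightly
# prefers "O", the second strongly prefers "I-PER".
EMISSIONS = [
    {"O": 2.0, "B-PER": 1.5, "I-PER": 0.0},
    {"O": 0.1, "B-PER": 0.2, "I-PER": 2.0},
    {"O": 1.0, "B-PER": 0.0, "I-PER": 0.2},
]

greedy = [max(TAGS, key=lambda t: e[t]) for e in EMISSIONS]
decoded = viterbi(EMISSIONS, TRANSITIONS, START)
print("greedy :", greedy)
print("viterbi:", decoded)
```

With these scores, per-token argmax yields O, I-PER, O, which violates the assumed rule that I-PER cannot follow O, while Viterbi decoding returns B-PER, I-PER, O. In a trained CRF the transition scores are learned from labeled data rather than set by hand.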


Last Update: 2021-11-10