[1] AN Jun-xiu, JIANG Si-chang. Survey of Word Vector Model for Natural Language Processing[J]. Computer Technology and Development, 2023, 33(12): 17-22. [doi:10.3969/j.issn.1673-629X.2023.12.003]

Survey of Word Vector Model for Natural Language Processing

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
No. 12, 2023
Pages:
17-22
Column:
Survey
Publication date:
2023-12-10

Article Info

Title:
Survey of Word Vector Model for Natural Language Processing
Article ID:
1673-629X(2023)12-0017-06
Author(s):
AN Jun-xiu (安俊秀), JIANG Si-chang (蒋思畅)
School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, Sichuan, China
Keywords:
natural language processing; word vector; deep learning; pre-training technique; static model; dynamic model
CLC number:
TP391.1
DOI:
10.3969/j.issn.1673-629X.2023.12.003
Abstract:
Since the 1950s, Natural Language Processing (NLP) has made great progress. Early word vector models demonstrated that research in this field requires mathematical methods rather than human language rules. In the 21st century, static models based on deep learning techniques achieved good performance on many tasks. Dynamic models then incorporated pre-training techniques, making it possible to adjust word vectors according to context, which brought a milestone breakthrough to the field of NLP. Building on this, follow-up research has extended to various fields and has been applied on a large scale in real life. This paper first introduces word vector models and their development history, then analyzes the modern word vector models (NNLM, Word2Vec, FastText, GloVe, ELMo, GPT, BERT). It next describes a variety of extended models based on pre-training techniques and the current application status of natural language processing technology. Finally, it summarizes the main problems at present and offers an outlook on future research.

Last Update: 2023-12-10