[1] HE Kai, GUAN You-qing, GONG Rui. Text Classification Model Based on Deep Learning and Support Vector Machine[J]. Computer Technology and Development, 2022, 32(07): 22-27. [doi:10.3969/j.issn.1673-629X.2022.07.004]

Text Classification Model Based on Deep Learning and Support Vector Machine

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 07
Pages:
22-27
Column:
Big Data Analysis and Mining
Publication date:
2022-07-10

Article Info

Title:
Text Classification Model Based on Deep Learning and Support Vector Machine
Article ID:
1673-629X(2022)07-0022-06
Author(s):
HE Kai (何铠), GUAN You-qing (管有庆), GONG Rui (龚锐)
School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
Keywords:
natural language processing; word frequency algorithm; Chinese text classification; weight preprocessing; word density weight
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2022.07.004
Abstract:
Natural language processing (NLP) is a major research direction in artificial intelligence, and text classification is an important branch of NLP. NLP enables computers, mobile phones, and other electronic devices to recognize and understand human language. Because of the complexity of natural language, many technical difficulties remain unsolved, chiefly the continual emergence of new words, the polysemy of Chinese words, and the flexibility of natural language. Using journal papers as experimental data, this paper studies Chinese text classification. Building on the traditional convolutional neural network model, it proposes CNNSVM (Convolutional Neural Network and Support Vector Machine Classifier), a text classification model that combines a convolutional neural network with a support vector machine. Compared with the traditional method, CNNSVM adds an attention mechanism, simplifies the model parameters, and replaces the softmax layer of the traditional model with a classifier based on a support vector machine. The experimental results show that the model improves the extraction of feature words and effectively solves the problem of the weak generalization ability of the softmax layer.
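The abstract does not say how the SVM-based classifier is attached to the network. One common way to swap a softmax layer for an SVM-style head is to keep the final linear layer and train it with a multi-class hinge loss instead of softmax cross-entropy. The sketch below only contrasts the two losses on a single example; the score values, function names, and the choice of a Weston-Watkins style hinge loss are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax_cross_entropy(scores, label):
    """Cross-entropy of a softmax over raw class scores (the layer being replaced)."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]


def multiclass_hinge(scores, label, margin=1.0):
    """Weston-Watkins style multi-class hinge loss (assumed SVM-style head)."""
    return sum(
        max(0.0, margin + s - scores[label])
        for j, s in enumerate(scores)
        if j != label
    )


def predict(scores):
    """Both heads predict the arg-max class; only the training loss differs."""
    return max(range(len(scores)), key=lambda j: scores[j])


# Hypothetical class scores from a linear layer on top of CNN features.
scores = [2.0, 0.5, -1.0]

if __name__ == "__main__":
    print("predicted class:", predict(scores))
    print("cross-entropy, label 0:", softmax_cross_entropy(scores, 0))
    print("hinge loss, label 0:", multiclass_hinge(scores, 0))
    print("hinge loss, label 1:", multiclass_hinge(scores, 1))
```

With these scores the hinge loss is zero when class 0 is the true label, because class 0 already beats every other class by at least the margin, while the cross-entropy stays positive. That difference in how confident examples are treated is one reason margin-based heads are sometimes tried in place of softmax.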


Last Update: 2022-07-10