[1] LIU Wen, WU Chen. A New Chinese Text Classification Algorithm: One-Class SVM-KNN [J]. Computer Technology and Development, 2012, (05): 83-86.

A New Chinese Text Classification Algorithm: the One-Class SVM-KNN Algorithm

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2012, No. 05
Pages:
83-86
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
1900-01-01

Article Info

Title:
A New Text Classification Algorithm One Class SVM-KNN
Article ID:
1673-629X(2012)05-0083-04
Author(s):
LIU Wen, WU Chen
The Opening Laboratory of Intelligent Computing, Jiangsu University of Science and Technology
Keywords:
Chinese text classification; support vector machine; K-nearest neighbor; One-Class SVM-KNN
Classification number:
TP301.6
Document code:
A
Abstract:
Text classification is widely used in databases and search engines. K-nearest neighbor (KNN) is a common method for Chinese text classification, but it has several defects: it must store all training samples, it builds no classifier until a sample needs to be classified, it suffers from class skew, and its storage and computation costs are high. One-class SVM performs well on problems that involve only one class but is not suited to multi-class problems. To address the defects of KNN while exploiting the characteristics of one-class SVM, the One-Class SVM-KNN algorithm is proposed, and its definition and a detailed analysis are given. Experiments show that the method overcomes the defects of KNN and that its recall and precision are clearly better than those of the KNN algorithm.
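The KNN shortcomings named in the abstract (every training sample is kept, no classifier exists until query time, and the vote leans toward the larger class) can be illustrated with a minimal sketch. This is not the authors' code and does not implement One-Class SVM-KNN; the function name `knn_predict`, the toy 2-D vectors, and the labels "A" and "B" are all made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (vector, label) pairs. Nothing is learned ahead of
    time: the full training set is stored and scanned for every query.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, imbalanced training set: 8 samples of class "A", 2 of class "B".
train = [((x * 0.1, 0.0), "A") for x in range(8)]
train += [((1.0, 1.0), "B"), ((1.1, 1.0), "B")]

query = (1.0, 0.9)  # sits right next to the two "B" samples

small_k = knn_predict(train, query, k=1)  # only the nearest neighbour votes
large_k = knn_predict(train, query, k=7)  # both "B" samples plus five "A" samples vote
print(small_k, large_k)
```

With k=1 the query follows its nearest neighbour, while with k=7 the five votes from the larger class outnumber the two from the smaller one, which is the class-skew effect the abstract refers to. The cost of sorting the whole training set on every call also grows with the number of stored samples.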

Memo

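LIU Wen (born 1985), male, from Linyi, Shandong; master's student; research interest: data mining. WU Chen, professor; research interests: data mining, pattern recognition, etc.
Last Update: 1900-01-01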