
[1] XI Yang, SU Yi-dan, QIN Xi. Research on Detecting Social Spam with KPCA-SVM Method[J]. Computer Technology and Development, 2014, 24(05): 65-69.

Research on Detecting Social Spam with KPCA-SVM Method (用KPCA-SVM的方法检测垃圾标签的研究)


Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 24
Issue:: 2014, No. 05

Pages:: 65-69

Column:: Intelligence, Algorithms, and Systems Engineering

Publication date:: 2014-05-31

Article Info

Title:: Research on Detecting Social Spam with KPCA-SVM Method

Article ID:: 1673-629X(2014)05-0065-05

Author(s):: XI Yang (习扬); SU Yi-dan (苏一丹); QIN Xi (覃希)

Affiliation:: School of Computer, Electronics and Information, Guangxi University

Keywords:: data dimensionality reduction; kernel principal component analysis (KPCA); support vector machine (SVM); social spam (spam tags)

CLC number:: TP301

Document code:: A

Abstract:: When high-dimensional data are processed, the number of samples required grows exponentially with the dimension, while the distances between samples become progressively less informative; together these effects produce the curse of dimensionality. Text tag data are typically very high-dimensional, which hampers the detection of social spam (spam tags). This paper uses the mathematical model of the support vector machine (SVM) to construct a large-scale social spam detection model for folksonomies. To reduce the influence of high dimensionality on detection, and inspired by kernel principal component analysis (KPCA) theory, the idea of dimensionality reduction is introduced into the field of data reduction, and a KPCA-based reduction model for large-scale SVM data sets is proposed. Instantiating this model yields a new social spam detection method: a large-scale social spam detection model based on kernel principal component analysis and the support vector machine (KPCA-SVM). According to the authors, the model shortens testing time in social spam detection without affecting the characteristics of the data, and it achieves good detection performance.
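
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of reducing dimensionality with KPCA before training an SVM. It assumes scikit-learn is available. The synthetic data set, the RBF kernels, and the value of `N_COMPONENTS` are illustrative assumptions, not the authors' data or settings, and the paper's own data set reduction model may differ from this plain pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for high-dimensional tag features (NOT the paper's data).
X, y = make_classification(
    n_samples=600, n_features=300, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical target dimension after reduction (assumption for illustration).
N_COMPONENTS = 20

# Standardize, project onto the leading kernel principal components,
# then classify the reduced vectors with an SVM.
kpca_svm = make_pipeline(
    StandardScaler(),
    KernelPCA(n_components=N_COMPONENTS, kernel="rbf"),
    SVC(kernel="rbf"),
)
kpca_svm.fit(X_train, y_train)

# Output of the scaling + KPCA steps only: one row per test sample,
# N_COMPONENTS columns instead of the original 300.
reduced = kpca_svm[:-1].transform(X_test)

# Fraction of held-out samples classified correctly.
accuracy = kpca_svm.score(X_test, y_test)
print(reduced.shape, accuracy)
```

Because the SVM sees `N_COMPONENTS` features rather than 300, each kernel evaluation at test time touches fewer values. Whether this reproduces the shorter testing time reported in the abstract depends on the data and parameters and is not shown by this sketch.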

