[1] LIN Wei. A Bayesian Spam Filtering Method Based on Words Probability[J]. Computer Technology and Development, 2011, (09): 242-244.

A Bayesian Spam Filtering Method Based on Words Probability

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2011, No. 09
Pages:
242-244
Section:
Security and Protection
Publication date:
1900-01-01

Article Info

Title:
A Bayesian Spam Filtering Method Based on Words Probability
Article ID:
1673-629X(2011)09-0242-03
Author(s):
LIN Wei
Department of Computer Science, Sichuan Police College
Keywords:
spam; words probability; Bayesian method
CLC number:
TP18
Document code:
A
Abstract:
Bayesian classification filters English mail accurately but has long performed poorly on Chinese mail. Feature selection is an important step in spam filtering and can effectively improve filtering results. This paper takes words probability (the probability of forming a word) as the basis of feature selection: a candidate feature set is built by a constructive method, information gain is then used to measure the relationship between each feature and the classes, and the N features with the largest information gain are selected as the final feature vector space. On this basis, mails are classified with the Bayesian method. Experimental results show that the method outperforms the traditional Bayesian method based on mechanical word segmentation in both classification time and classification quality.
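
The selection-then-classification pipeline in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how candidates are constructed from words probability, so the candidate set is taken as given here. The presence/absence (Bernoulli) naive Bayes model, the Laplace smoothing, the toy mails, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, feature):
    """IG = H(class) - H(class | feature present or absent)."""
    n = len(docs)
    base = entropy(list(Counter(labels).values()))
    cond = 0.0
    for present in (True, False):
        subset = [y for d, y in zip(docs, labels) if (feature in d) == present]
        if subset:
            cond += len(subset) / n * entropy(list(Counter(subset).values()))
    return base - cond

def select_features(docs, labels, candidates, n):
    """Keep the n candidates with the largest information gain (ties by name)."""
    ranked = sorted(candidates,
                    key=lambda f: (-information_gain(docs, labels, f), f))
    return ranked[:n]

def train_nb(docs, labels, features):
    """Bernoulli naive Bayes with Laplace smoothing (an assumed model choice)."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    cond = {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        for f in features:
            hits = sum(f in d for d in class_docs)
            cond[(f, c)] = (hits + 1) / (len(class_docs) + 2)
    return classes, prior, cond

def classify(doc, features, model):
    """Return the class with the highest log posterior score."""
    classes, prior, cond = model
    def score(c):
        s = prior[c]
        for f in features:
            p = cond[(f, c)]
            s += math.log(p if f in doc else 1 - p)
        return s
    return max(classes, key=score)

# Toy data (hypothetical): each mail is the set of candidate features it contains.
docs = [{"免费", "发票", "优惠"}, {"发票", "代开"},
        {"会议", "报告"}, {"报告", "项目", "优惠"}]
labels = ["spam", "spam", "ham", "ham"]
candidates = set().union(*docs)

selected = select_features(docs, labels, candidates, n=2)
model = train_nb(docs, labels, selected)
print(selected, classify({"发票", "免费"}, selected, model))
```

In this toy set a feature that appears in every spam mail and no legitimate mail (or the reverse) reaches the maximum gain of 1 bit, while one spread evenly across both classes scores 0 and is dropped.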

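Memo:
Supported by the Sichuan Province Youth Software Innovation Project Fund (2007AA42). LIN Wei (1983-), male, lecturer; research interests: data mining and machine learning, network security.
Last Update: 1900-01-01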