[1] FENG Yu (冯宇), YUAN Yi-wei (苑易伟). An Outlier Detection Algorithm Based on Minimum Hyper Sphere Density [J]. Computer Technology and Development (计算机技术与发展), 2019, 29(06): 32-36. [doi:10.3969/j.issn.1673-629X.2019.06.007]

An Outlier Detection Algorithm Based on Minimum Hyper Sphere Density

Computer Technology and Development (计算机技术与发展) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
29
Issue:
2019, No. 06
Pages:
32-36
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2019-06-10

Article Info

Title:
An Outlier Detection Algorithm Based on Minimum Hyper Sphere Density
Article ID:
1673-629X(2019)06-0032-05
Author(s):
FENG Yu (冯宇)1, YUAN Yi-wei (苑易伟)2
1. School of Electronics and Control Engineering, Chang'an University, Xi'an 710064, China; 2. School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China
Keywords:
outlier detection; minimum hyper sphere; effective neighbor; local density difference; density deviation
CLC number:
TP301
DOI:
10.3969/j.issn.1673-629X.2019.06.007
Abstract:
The concept of minimum hyper sphere density (MHSD) is defined, and an outlier detection algorithm based on it is proposed. The algorithm derives the effective neighbors of each data point from its k-nearest neighbors and reverse k-nearest neighbors. It then uses the minimum hyper sphere density and the effective neighbors to compute the density deviation degree of each point, from which the point's isolation degree is calculated. Points whose isolation degree exceeds a specified threshold are regarded as outliers. The algorithm is evaluated on one two-dimensional artificial data set and two high-dimensional real data sets, and its results are compared with those of three classical algorithms: local outlier factor (LOF), influenced outlierness (INFLO), and density similarity neighbor based outlier factor (DSNOF). The experimental results show that MHSD detects the outliers in the data accurately and outperforms the three classical algorithms.
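
The pipeline outlined in the abstract (neighbors, density, deviation, threshold) can be illustrated with a rough Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: effective neighbors are taken as the union of the k-nearest and reverse k-nearest neighbors, the density is taken as k divided by r^d for the smallest sphere centred on a point that contains its k nearest neighbors, and the isolation score is a simple ratio of the neighbors' mean density to the point's own density. All function names are hypothetical, and the sketch should not be read as the authors' MHSD definition.

```python
import math


def k_nearest(points, k):
    """For each point, return the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def reverse_k_nearest(knn):
    """For each point, return the set of points that list it among their k-NN."""
    reverse = [set() for _ in knn]
    for i, nbrs in enumerate(knn):
        for j in nbrs:
            reverse[j].add(i)
    return reverse


def sphere_density(points, knn):
    """ASSUMPTION: density = k / r**d, with r the radius of the smallest sphere
    centred on the point that contains its k nearest neighbors (the constant
    factor of the sphere volume is dropped)."""
    dim = len(points[0])
    densities = []
    for i, nbrs in enumerate(knn):
        radius = max(math.dist(points[i], points[j]) for j in nbrs)
        if radius == 0:
            raise ValueError("duplicate points are not handled in this sketch")
        densities.append(len(nbrs) / radius ** dim)
    return densities


def isolation_scores(points, k):
    """Score each point; larger values mean the point is sparser than its neighbors."""
    knn = k_nearest(points, k)
    rknn = reverse_k_nearest(knn)
    density = sphere_density(points, knn)
    scores = []
    for i in range(len(points)):
        # ASSUMPTION: effective neighbors = k-NN united with reverse k-NN.
        effective = set(knn[i]) | rknn[i]
        mean_density = sum(density[j] for j in effective) / len(effective)
        # ASSUMPTION: score = neighbors' mean density relative to own density.
        scores.append(mean_density / density[i])
    return scores


def detect_outliers(points, k, threshold):
    """Return the indices of points whose score exceeds the threshold."""
    scores = isolation_scores(points, k)
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data: five clustered points and one distant point (index 5).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
print(detect_outliers(points, k=2, threshold=2.0))
```

On this toy data only the distant point at index 5 exceeds the threshold. The values k=2 and threshold=2.0 are arbitrary illustration choices, not settings taken from the paper, and the sketch is meant for small, low-dimensional inputs only.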

Last Update: 2019-06-10