[1] XU Bi-xiao, CHEN Sheng-bo, HAN Chong-yang, et al. Improved Data Preprocessing Algorithm and Its Application[J]. Computer Technology and Development, 2015, 25(12): 143-146.

 Improved Data Preprocessing Algorithm and Its Application

Computer Technology and Development (《计算机技术与发展》) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
25
Issue:
No. 12, 2015
Pages:
143-146
Column:
Application Development Research
Publication date:
2015-12-10

Article Info

Title:
 Improved Data Preprocessing Algorithm and Its Application
Article ID:
1673-629X(2015)12-0143-04
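Author(s):
 XU Bi-xiao (许必宵), CHEN Sheng-bo (陈升波), HAN Chong-yang (韩重阳), MA Meng-huan (马梦环), GONG Jing (宫婧)
Affiliation:
 School of Science, Nanjing University of Posts and Telecommunications (南京邮电大学 理学院)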
Keywords:
 data preprocessing; SNM algorithm; hierarchical clustering; clustering analysis
CLC number:
TP301.6
Document code:
A
Abstract:
 Clustering analysis is an important topic in data mining, and preprocessing duplicate and isolated data can improve clustering results. For duplicate data, this paper extends the traditional duplicate-detection algorithm SNM (sorted-neighborhood method) with an elastic window and a variable window movement speed, which improves both the accuracy and the efficiency of the search. For isolated data, it proposes a cluster-wise search algorithm based on hierarchical clustering: hierarchical clustering first divides the data into independent clusters, and each cluster is then searched in turn for isolated points, which speeds up the search. A recovery check restores non-isolated points that were removed by mistake, which improves the accuracy of the search. In the experiments, a subset of the data is first used to verify the accuracy of the improved preprocessing algorithm. The algorithm is then applied to mobile-user consumption data, and the preprocessed data are clustered in order to identify customers' home-location information. The experimental results indicate that the proposed preprocessing algorithm achieves high accuracy and efficiency.
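The duplicate-search half of the abstract can be illustrated with a minimal sketch of a sorted-neighborhood scan whose window size changes as it moves. The abstract does not state the rule by which the window stretches or shrinks, so the rule used here (grow by one after a record that had a match, shrink by one otherwise) is purely an assumption, as are the function names `similarity` and `snm_elastic`, the parameters `threshold`, `w_min`, and `w_max`, and the use of a string-similarity ratio as the matching test. The variable movement speed mentioned in the abstract is not modeled; the window always advances one record at a time.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; a stand-in for any record-matching rule."""
    return SequenceMatcher(None, a, b).ratio()


def snm_elastic(records, key=str, threshold=0.85, w_min=2, w_max=6):
    """Sorted-neighborhood duplicate search with a window that resizes.

    Assumed rule (not taken from the paper): the window grows by one after
    a record that matched a neighbor and shrinks by one after a record
    that matched nothing, clamped to [w_min, w_max].
    Returns pairs of original indices judged to be duplicates.
    """
    # Step 1: sort record indices by the chosen key so that likely
    # duplicates end up close to each other.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))

    window = w_min
    pairs = []
    for pos, i in enumerate(order):
        found = False
        # Step 2: compare the current record with the next (window - 1)
        # records in sorted order.
        for j in order[pos + 1 : pos + window]:
            if similarity(key(records[i]), key(records[j])) >= threshold:
                pairs.append((min(i, j), max(i, j)))
                found = True
        # Step 3: resize the window for the next record.
        window = min(w_max, window + 1) if found else max(w_min, window - 1)
    return pairs


records = ["alice", "bob", "alice", "carol", "alice"]
print(snm_elastic(records))  # [(0, 2), (2, 4)]
```

Setting `w_min` equal to `w_max` reduces the sketch to a fixed-window scan, which gives a baseline for comparison. Pairs such as (0, 2) and (2, 4) are usually merged by transitive closure in a later step, which is not shown here.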


Last Update: 2016-01-29