[1] CHEN Yu, LI Ling-juan. A Decision Tree-based Privacy Preserving Classification Mining Algorithm for Data Streams[J]. Computer Technology and Development, 2017, 27(07): 111-114.

A Decision Tree-based Privacy Preserving Classification Mining Algorithm for Data Streams

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
27
Issue:
2017, No. 07
Pages:
111-114
Section:
Security and Prevention
Publication date:
2017-07-10

Article Info

Title:
 A Decision Tree-based Privacy Preserving Classification Mining Algorithm for Data Streams
Article ID:
1673-629X(2017)07-0111-04
Author(s):
 CHEN Yu (陈煜), LI Ling-juan (李玲娟)
 School of Computer Science, Nanjing University of Posts and Telecommunications
Keywords:
 decision tree; privacy preserving; data stream; classification
Classification number:
TP311
Document code:
A
Abstract:
 Privacy-preserving decision tree mining methods are mainly based on either data perturbation or secure multi-party computation. Because data streams are high-speed, continuous, unbounded, and dynamic, these methods fall short when applied to data stream mining. To address privacy leakage in current data stream mining applications, this paper proposes PPFDT, a decision tree-based privacy-preserving classification algorithm for data streams. The algorithm protects data privacy by adding random noise, improves on the classic data stream mining algorithm VFDT, and uses a threshold algorithm to find the best split attribute and the best split point of the perturbed data stream, so that a decision tree is built directly on the perturbed stream. Using this tree to classify both the original and the perturbed data streams yields fairly accurate results. PPFDT was evaluated experimentally on the Waveform dataset from UCI in two respects: its degree of privacy protection and its classification performance on the directly perturbed data stream. The results show that the algorithm classifies data streams quickly and accurately while providing a certain degree of privacy protection.
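
The two ingredients named in the abstract, additive random noise and a VFDT-style split decision, can be illustrated with a minimal Python sketch. It is not the PPFDT implementation: the abstract does not specify the noise distribution or the threshold algorithm, so the Gaussian noise, the function names, and the `tie_threshold` parameter (the standard VFDT tie-breaking threshold, not the paper's threshold algorithm) are assumptions made for illustration. The Hoeffding bound itself is the one used by VFDT.

```python
import math
import random


def perturb(values, sigma, rng):
    """Add independent zero-mean Gaussian noise to each value.

    Assumption: Gaussian noise with standard deviation sigma; the
    abstract only says "random noise".
    """
    return [v + rng.gauss(0.0, sigma) for v in values]


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean
    observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n_classes, delta, n,
                 tie_threshold=0.05):
    """VFDT-style split test on information gain.

    Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon is so small that the two are effectively
    tied. Information gain ranges over [0, log2(n_classes)].
    """
    eps = hoeffding_bound(math.log2(n_classes), delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [0.2, 1.5, -0.7, 3.1]
    print(perturb(stream, sigma=0.5, rng=rng))
    # Waveform has 3 classes; the gains here are made-up illustration values.
    print(should_split(0.5, 0.1, n_classes=3, delta=1e-7, n=1000))
```

In a full stream learner, the gains would be computed from per-leaf statistics collected on the perturbed examples, and the test would be re-evaluated as more examples arrive at each leaf.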

Last Update: 2017-08-22