[1] LI Yu-wei, YANG Geng. An Algorithm for Mining Frequent Sequence under Differential Privacy[J]. Computer Technology and Development, 2022, 32(05): 99-105. [doi:10.3969/j.issn.1673-629X.2022.05.017]

An Algorithm for Mining Frequent Sequence under Differential Privacy

Computer Technology and Development (《计算机技术与发展》) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
32
Issue:
2022, No. 05
Pages:
99-105
Section:
Network and Security
Publication date:
2022-05-10

Article Info

Title:
An Algorithm for Mining Frequent Sequence under Differential Privacy
Article ID:
1673-629X(2022)05-0099-07
Author(s):
LI Yu-wei (李玉伟) 1, YANG Geng (杨庚) 1,2
1. School of Computer, Software and Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
2. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing 210023, China
Keywords:
frequent patterns; sequential data; differential privacy; Laplace noise; sparse vector technique
CLC number:
TP309
DOI:
10.3969/j.issn.1673-629X.2022.05.017
Abstract:
In the era of big data, both the volume and the variety of data are growing rapidly, so data mining has come into wide use across many fields, such as trajectory prediction, advertising delivery, and medical diagnosis. Frequent sequence mining is an important direction in data mining, but both the mining process and the publication of sequence data can leak users' private information and create serious security risks. The differential privacy model proposed by Dwork et al. provides a security guarantee for privacy-preserving data mining. Compared with traditional privacy protection methods (based on k-anonymity and its extended grouping models), it perturbs the data by adding noise, so the privacy guarantee holds even against an attacker with maximal background knowledge. This paper designs a progressive sequence mining algorithm with differential privacy protection. The algorithm uses an improved sparse vector technique (SVT) to add Laplace noise during mining, perturbing both the true support of candidate frequent sequences and the threshold. The algorithm is proved in theory to satisfy differential privacy, and experiments on real data sets show that it offers good utility.
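
To make the mechanism named in the abstract concrete, the following is a minimal Python sketch of the standard (textbook) sparse vector technique applied to candidate sequence supports. It is not the improved SVT or the progressive mining procedure of the paper, whose details are not given here. The function names, the even budget split between threshold and supports, the `max_hits` cutoff, and the toy data are all assumptions made for illustration. The sensitivity is taken as 1 on the assumption that adding or removing one sequence changes any support count by at most 1.

```python
import numpy as np


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def svt_frequent(candidates, database, threshold, epsilon, max_hits, rng):
    """Standard SVT sketch (hypothetical helper, not the paper's algorithm).

    Perturbs the threshold once and each candidate's support with Laplace
    noise, and stops after `max_hits` candidates pass the noisy test.
    """
    eps_threshold = eps_support = epsilon / 2.0  # assumed even budget split
    sensitivity = 1.0  # assumed: one sequence changes a support by at most 1

    noisy_threshold = threshold + rng.laplace(scale=sensitivity / eps_threshold)
    selected = []
    for pattern in candidates:
        noise = rng.laplace(scale=2.0 * max_hits * sensitivity / eps_support)
        if support(pattern, database) + noise >= noisy_threshold:
            selected.append(pattern)
            if len(selected) >= max_hits:
                break
    return selected


# Toy usage with made-up data; the result is random for realistic epsilon.
database = [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b")]
candidates = [("a", "b"), ("a", "c"), ("c", "a")]
rng = np.random.default_rng(seed=0)
print(svt_frequent(candidates, database, threshold=1.5,
                   epsilon=1.0, max_hits=2, rng=rng))
```

Smaller values of `epsilon` give stronger privacy but noisier decisions, which is the utility trade-off the experiments in the paper evaluate.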

Last Update: 2022-05-10