[1] LU Jun-yao, LI Ling-juan. Research on Parallelization of Collaborative Filtering Algorithm Based on Spark[J]. Computer Technology and Development, 2019, 29(01): 85-89. [doi:10.3969/j.issn.1673-629X.2019.01.018]

Research on Parallelization of Collaborative Filtering Algorithm Based on Spark

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
29
Issue:
2019, No. 01
Pages:
85-89
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2019-01-10

Article Info

Title:
Research on Parallelization of Collaborative Filtering Algorithm Based on Spark
Article ID:
1673-629X(2019)01-0085-05
Author(s):
LU Jun-yao, LI Ling-juan
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
Keywords:
collaborative filtering; Spark platform; parallelization; item-based
CLC number:
TP301
DOI:
10.3969/j.issn.1673-629X.2019.01.018
Document code:
A
Abstract:
Collaborative filtering algorithms are widely used in recommender systems. However, with the explosive growth of data volume, the amount of computation that collaborative filtering requires grows as well, and traditional centralized computing on a single machine can no longer meet the real-time and scalability requirements of a recommender system. To address this problem, and drawing on the strengths of the mainstream big data platform Spark in iterative and in-memory computing, a parallelization scheme for the item-based collaborative filtering algorithm on Spark is designed. The scheme exploits the parallel computation of RDDs: RDD operators are designed to parallelize both the item-to-item similarity calculation and the rating calculation, while the RDD cache mechanism and Spark broadcast variables are used to cache and distribute important computing resources, thereby improving computation speed. The performance of the parallel item-based collaborative filtering algorithm on the Spark platform is tested on the public MovieLens dataset. The results show that the parallel algorithm performs well in both accuracy and timeliness.
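
The two stages named in the abstract, item-to-item similarity and rating calculation, can be illustrated with a small single-machine sketch. This is not the authors' implementation and does not use Spark: it is plain Python, it assumes cosine similarity (the abstract does not say which measure is used), and the toy ratings and the function names `item_similarities` and `predict` are hypothetical. The per-user pair emission and per-pair summation are written so that they would map naturally onto RDD operators such as `flatMap` and `reduceByKey`, and the resulting similarity table is the kind of read-only structure that a broadcast variable could distribute.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy ratings as (user, item, rating); not data from the paper.
ratings = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 5.0),
    ("u3", "B", 4.0), ("u3", "C", 4.0),
]


def item_similarities(ratings):
    """Cosine similarity between items, built from per-user co-rated pairs."""
    by_user = defaultdict(dict)
    norm_sq = defaultdict(float)
    for user, item, r in ratings:
        by_user[user][item] = r
        norm_sq[item] += r * r

    # "Map" step: each user emits one partial dot product per item pair.
    # "Reduce" step: partial products are summed per (item, item) key.
    dot = defaultdict(float)
    for items in by_user.values():
        for i, j in combinations(sorted(items), 2):
            dot[(i, j)] += items[i] * items[j]

    sims = defaultdict(dict)
    for (i, j), d in dot.items():
        s = d / (sqrt(norm_sq[i]) * sqrt(norm_sq[j]))
        sims[i][j] = s
        sims[j][i] = s
    return sims


def predict(user_ratings, sims, target):
    """Similarity-weighted average of the user's ratings on neighbouring items."""
    num = den = 0.0
    for item, r in user_ratings.items():
        s = sims.get(target, {}).get(item, 0.0)
        num += s * r
        den += abs(s)
    return num / den if den else None


sims = item_similarities(ratings)
# Predict how user u1, who rated A and B, might rate the unseen item C.
pred = predict({"A": 5.0, "B": 3.0}, sims, "C")
print(round(sims["A"]["C"], 4), pred)
```

Because each user's pair contributions are independent, the similarity stage partitions cleanly by user, which is what makes it a reasonable candidate for parallelization.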


Last Update: 2019-01-10