[1] XU Xin-rui, MENG Cai-xia, ZHOU Wen, et al. A Real-time Collaborative Filtering Recommendation Algorithm Based on Spark[J]. Computer Technology and Development, 2015, 25(06): 48-55.

 A Real-time Collaborative Filtering Recommendation Algorithm Based on Spark

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
25
Issue:
2015, No. 06
Pages:
48-55
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2015-06-10

Article Info

Title:
 A Real-time Collaborative Filtering Recommendation Algorithm Based on Spark
Article ID:
1673-629X(2015)06-0048-08
Author(s):
 XU Xin-rui, MENG Cai-xia, ZHOU Wen, LIU Ying
 School of Computer Science, Xi'an University of Posts and Telecommunications
Keywords:
 online learning; adaptive soft margin; soft confidence weight; second order; collaborative filtering; recommender system; Hadoop; Spark on YARN
CLC number:
TP301.6
Document code:
A
Abstract:
 Traditional model-based collaborative filtering algorithms that rely on batch learning update slowly for new users and items, are costly to retrain, scale poorly, and handle noisy data inadequately. These problems worsen as data volumes grow and timeliness requirements tighten, which makes knowledge mining increasingly difficult. To address them, this paper improves the confidence-weighted online collaborative filtering algorithm. It introduces an adaptive soft margin and uses a second-order online optimization method, yielding a new algorithm named SCWOCF (Soft Confidence Weighted Online Collaborative Filtering). The new algorithm was compared with related algorithms on four real-world datasets under a Spark stream-processing recommendation framework. The experimental results show that the new algorithm handles dynamic changes in users and items promptly, improves the timeliness and accuracy of recommendations, reduces computation cost, and is more robust to noisy data.
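The abstract does not state SCWOCF's update rules, so the sketch below is not the paper's algorithm. It only illustrates the general idea of online collaborative filtering that the abstract contrasts with batch learning: each incoming rating updates the model immediately, and a previously unseen user or item gets a vector on first sight instead of triggering a retrain. It is a plain first-order (SGD) matrix factorization; the adaptive soft margin and the second-order confidence weights are deliberately left out. The class name `OnlineMF` and all parameters (`n_factors`, `lr`, `reg`) are hypothetical choices made for illustration.

```python
import random


class OnlineMF:
    """First-order online matrix factorization (illustrative stand-in, NOT SCWOCF)."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.k = n_factors          # latent dimension (assumed value)
        self.lr = lr                # learning rate (assumed value)
        self.reg = reg              # L2 regularization (assumed value)
        self.rng = random.Random(seed)
        self.user = {}              # user id -> latent vector
        self.item = {}              # item id -> latent vector

    def _vec(self, table, key):
        # A new user or item receives a small random vector on first sight,
        # so no batch retraining is needed when it arrives.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.k)]
        return table[key]

    def predict(self, u, i):
        p, q = self._vec(self.user, u), self._vec(self.item, i)
        return sum(a * b for a, b in zip(p, q))

    def update(self, u, i, rating):
        # One SGD step on the squared error of a single streamed rating.
        p, q = self._vec(self.user, u), self._vec(self.item, i)
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err  # error measured before this step was applied


model = OnlineMF()
# Replay the same (user, item, rating) event to watch the error shrink.
errors = [model.update("u1", "i1", 4.0) for _ in range(200)]
first_error, last_error = errors[0], errors[-1]
```

In a streaming setting, `update` would be called once per arriving rating, and `predict` can be called at any time, including for users or items the model has never seen.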

Last Update: 2015-07-27