[1] YING Yi, HUANG Hui, LIU Ding-yi. Research on Hotspot Detection Hybrid Algorithm Based on PageRank[J]. Computer Technology and Development, 2019, 29(09): 81-85. [doi:10.3969/j.issn.1673-629X.2019.09.016]

Research on Hotspot Detection Hybrid Algorithm Based on PageRank

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
29
Issue:
2019, No. 09
Pages:
81-85
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2019-09-10

Article Info

Title:
Research on Hotspot Detection Hybrid Algorithm Based on PageRank
Article ID:
1673-629X(2019)09-0081-05
Author(s):
YING Yi (应毅), HUANG Hui (黄慧), LIU Ding-yi (刘定一)
School of Computer Science and Technology, Sanjiang University, Nanjing 210012, China
Keywords:
PageRank; user social status; similarity index; heat index
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2019.09.016
Abstract:
Hot topic discovery in social networks is a fundamental research problem in public opinion analysis and prediction. Traditional text analysis methods based on clustering and classification are not well suited to mining online public opinion, and the classical PageRank algorithm considers only the link structure between web pages. To evaluate public opinion hotspots more accurately and comprehensively from multiple angles, this paper takes user social status, post similarity index, and heat index as three key indicators and proposes a hotspot detection hybrid algorithm based on PageRank and similarity calculation (HDH-PRSC). The social status value of a user is obtained by applying PageRank to the link graph formed by microblog users and their followers; the similarity index of a post is computed by combining the TF-IDF algorithm with cosine similarity; and the heat index of a post is derived from its repost, comment, and like counts. The final heat score of a post is the sum of the user social status value, the similarity index, and the heat index. Experiments on Sina Weibo data show that HDH-PRSC finds hot topics more reasonably and effectively.
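
The three-part score described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the damping factor, the tokenizer, which posts a post is compared against for its similarity index, how reposts, comments, and likes are weighted, or how the three scores are scaled before being added. The sketch assumes a damping factor of 0.85, whitespace tokenization (real Weibo text would need Chinese word segmentation), similarity taken as the mean cosine similarity to all other posts, an unweighted sum of the three counts, and division by the maximum to bring each component into [0, 1]. All function and field names are hypothetical.

```python
import math
from collections import Counter


def pagerank(follows, damping=0.85, iterations=50):
    """Power-iteration PageRank. `follows[u]` lists the users that u follows,
    so each edge points from a follower to the followed user."""
    users = set(follows) | {v for vs in follows.values() for v in vs}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Dangling user (follows nobody): spread the rank evenly.
                for v in users:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank


def tfidf_vectors(texts):
    """Sparse TF-IDF vectors with a smoothed IDF (assumed variant)."""
    docs = [t.lower().split() for t in texts]
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def scale(values):
    """Divide by the maximum so each component lies in [0, 1] (assumption)."""
    top = max(values, default=0.0)
    return [v / top if top else 0.0 for v in values]


def hot_scores(posts, follows):
    """Sum of scaled social status, similarity index, and heat index."""
    ranks = pagerank(follows)
    status = scale([ranks.get(p["user"], 0.0) for p in posts])

    vecs = tfidf_vectors([p["text"] for p in posts])
    others = max(len(posts) - 1, 1)
    similarity = scale([
        sum(cosine(v, w) for j, w in enumerate(vecs) if j != i) / others
        for i, v in enumerate(vecs)
    ])

    heat = scale([p["reposts"] + p["comments"] + p["likes"] for p in posts])
    return [s + m + h for s, m, h in zip(status, similarity, heat)]
```

Calling `hot_scores(posts, follows)` with a list of post dictionaries (keys `user`, `text`, `reposts`, `comments`, `likes`) and a follower map returns one score per post, with higher values indicating hotter posts under these assumptions.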

Last Update: 2019-09-10