[1] TANG Ji-tao, LI Fei, GUO Chang-song. Research of New Word Pattern Recognization in Network Monitoring Public Opinion [J]. Computer Technology and Development, 2012, (01): 119-121.

Research of New Word Pattern Recognization in Network Monitoring Public Opinion
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2012, No. 01
Pages:
119-121
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
1900-01-01

Article Info

Title:
Research of New Word Pattern Recognization in Network Monitoring Public Opinion
Article ID:
1673-629X(2012)01-0119-03
Author(s):
TANG Ji-tao [1], LI Fei [2], GUO Chang-song [1]
[1] Department of Computer Science, Chengdu University of Information Technology
[2] Department of Network Engineering, Chengdu University of Information Technology
Keywords:
network public opinion monitoring; new word recognition; word segmentation dictionary
Classification number:
TP31
Document code:
A
Abstract:
In network public opinion monitoring, events break suddenly and internet vocabulary proliferates, so new words and new character strings emerge in large numbers. A finite segmentation dictionary is essentially powerless to recognize these new words, and existing word segmentation systems therefore split the unrecognized strings into scattered fragments. This greatly reduces the accuracy of hot-word and topic-word extraction and becomes a bottleneck for improving the performance of network public opinion monitoring systems. This paper analyzes the strengths and weaknesses of the main current word segmentation techniques. Exploiting the fact that topic words missing from the dictionary occur with high local frequency in monitored text, it computes the cohesion between an abnormal segment and its neighboring segments and uses that value to identify topic words the dictionary does not contain. Experimental results show that the proposed segmentation algorithm can identify topic words not included in the dictionary and that, compared with traditional segmentation algorithms, it is better suited to network public opinion monitoring.
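The abstract does not give the cohesion formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that an "abnormal" segment is any token absent from the dictionary, and that cohesion is the count of an adjacent pair divided by the count of its rarer member. The function name `merge_candidates` and the parameters `min_count` and `min_cohesion` are hypothetical.

```python
from collections import Counter


def merge_candidates(tokens, dictionary, min_count=3, min_cohesion=0.8):
    """Suggest merged strings for adjacent fragments that behave like one word.

    Assumptions (not taken from the paper): a token is "abnormal" when it is
    not in `dictionary`; cohesion = pair count / count of the rarer token.
    """
    unigram = Counter(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))
    found = {}
    for (left, right), pair_count in bigram.items():
        # Only pairs that involve at least one out-of-dictionary fragment.
        if left in dictionary and right in dictionary:
            continue
        # Local high frequency: the pair must recur in the monitored text.
        if pair_count < min_count:
            continue
        cohesion = pair_count / min(unigram[left], unigram[right])
        if cohesion >= min_cohesion:
            found[left + right] = cohesion
    return found


# Toy input: a segmenter without "给力" in its dictionary splits it in two.
tokens = ["这个", "真", "给", "力", "太", "给", "力", "了", "给", "力", "的", "表现"]
dictionary = {"这个", "真", "太", "了", "的", "表现"}
print(merge_candidates(tokens, dictionary))
```

In this toy input the fragments "给" and "力" always appear together, so the pair is proposed as a single candidate word. A real system would need a proper segmenter, a larger window of text, and tuned thresholds.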

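Memo:
Supported by the Sichuan Provincial Education Research Project (Chuan Jiao Han [2011] No. 210). TANG Ji-tao (1986-), male, native of Chengdu, Sichuan, is a master's student whose research focuses on network public opinion monitoring. LI Fei is a professor and master's supervisor whose research focuses on computer applications and information security.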
Last Update: 1900-01-01