[1] LIN Jiang-hao, GU Ye-li, ZHOU Yong-mei, et al. Research on Building Sentiment Lexicon Based on Emoticons[J]. Computer Technology and Development, 2019, 29(06): 181-185. [doi:10.3969/j.issn.1673-629X.2019.06.037]

Research on Building Sentiment Lexicon Based on Emoticons

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
29
Issue:
2019, No. 06
Pages:
181-185
Section:
Application Development Research
Publication date:
2019-06-10

Article Info

Title:
Research on Building Sentiment Lexicon Based on Emoticons
Article ID:
1673-629X(2019)06-0181-05
Author(s):
LIN Jiang-hao (1, 2); GU Ye-li (1); ZHOU Yong-mei (2, 3); YANG Ai-min (2, 3); CHEN Jin (1, 2)
1. Guangdong University of Foreign Studies, Guangzhou 510006, China; 2. Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006, China; 3. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China
Keywords:
sentiment lexicon; sentiment word; sentimental weight; seed emoticons; SO-PMI; TF-IDF
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2019.06.037
Abstract:
A sentiment lexicon is a basic resource for text sentiment analysis. Exploiting the clear emotional expressiveness of emoticons, we propose a method for constructing a sentiment lexicon that combines seed emoticons with the SO-PMI algorithm. First, 44 emoticon words with obvious sentiment and rich content are chosen as the seed set. Candidate sentiment words are then selected from microblog texts using TF-IDF as a measure of word importance. Based on SO-PMI, the sentiment co-occurrence information between the candidate words and the seed emoticons is computed over a large corpus, which determines the sentimental weight and polarity of each candidate word. Computed over five million microblog texts, the resulting sentiment lexicon, SentiNet, contains 13,814 sentiment words: 6,885 positive and 6,929 negative. Finally, SentiNet is applied to polarity classification of microblog texts. The experimental results show that SentiNet can represent the sentiment of sentiment words and can be applied to large-scale microblog sentiment analysis. The proposed method combines the advantage of measuring the importance of sentiment words with the sentiment-expression advantage of the seed emoticon set, and the experiments indicate that the resulting sentimental weights are effective.
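
The SO-PMI step summarized in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which a word's score is the sum of its PMI with the positive seeds minus the sum of its PMI with the negative seeds, with co-occurrence counted at the level of a whole post. The abstract does not state the paper's co-occurrence window or smoothing, so the function name `so_pmi`, the additive smoothing constant `eps`, and the toy emoticon tokens are all hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def so_pmi(docs, candidates, pos_seeds, neg_seeds, eps=0.5):
    """Score candidate words by SO-PMI using post-level co-occurrence.

    docs: list of token lists, one per microblog post.
    eps: additive smoothing (an assumption; it only avoids log(0)).
    Returns {word: score}; the sign is the polarity, the magnitude the weight.
    """
    n = len(docs)
    cand = set(candidates)
    seeds = set(pos_seeds) | set(neg_seeds)
    df = Counter()  # number of posts containing each token
    co = Counter()  # number of posts containing both (candidate, seed)
    for tokens in docs:
        present = set(tokens)
        df.update(present)
        for w in present & cand:
            for s in present & seeds:
                co[(w, s)] += 1

    def pmi(w, s):
        # Smoothed probability estimates from post counts.
        p_ws = (co[(w, s)] + eps) / (n + eps)
        p_w = (df[w] + eps) / (n + eps)
        p_s = (df[s] + eps) / (n + eps)
        return math.log2(p_ws / (p_w * p_s))

    return {
        w: sum(pmi(w, s) for s in pos_seeds) - sum(pmi(w, s) for s in neg_seeds)
        for w in candidates
    }


# Toy corpus: bracketed tokens stand in for seed emoticons (hypothetical).
docs = [
    ["happy", "[smile]"],
    ["happy", "[smile]"],
    ["sad", "[cry]"],
    ["sad", "[cry]"],
]
scores = so_pmi(docs, ["happy", "sad"], ["[smile]"], ["[cry]"])
```

In this toy corpus "happy" receives a positive score and "sad" a negative one. In a pipeline like the one the abstract describes, `candidates` would be the words retained by the TF-IDF filter and the seed lists would hold the 44 seed emoticons.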


Last Update: 2019-06-10