[1] XI Chao-liang (奚超亮), LENG Yong-lin (冷泳林). Knowledge Graph Embedding with Mixed Negative Sampling[J]. Computer Technology and Development (计算机技术与发展), 2023, 33(09): 168-174. [doi:10.3969/j.issn.1673-629X.2023.09.025]

Knowledge Graph Embedding with Mixed Negative Sampling

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
2023, No. 09
Pages:
168-174
Section:
Artificial Intelligence
Publication date:
2023-09-10

Article Info

Title:
Knowledge Graph Embedding with Mixed Negative Sampling
Article ID:
1673-629X(2023)09-0168-07
Author(s):
XI Chao-liang (奚超亮), LENG Yong-lin (冷泳林)
School of Information Science and Technology, Bohai University, Jinzhou 121000, Liaoning, China
Keywords:
translation model; knowledge graph; triple classification; link prediction; DBSCAN (density-based spatial clustering of applications with noise) clustering; negative sampling
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.09.025
Abstract:
Knowledge graph embedding models map entities and relations to low-dimensional vectors that express the semantic associations between them, and they are an important approach to knowledge graph completion. Traditional embedding models construct negative triples by random negative sampling, which tends to produce low-quality negative samples and weakens the feature learning ability of the representation model. Similarity-based negative sampling methods cluster the entity points and thereby improve the quality of the negative samples. For sparse points in the knowledge graph, however, these methods cannot control the number of points in a cluster, which degrades model performance. After studying similarity-based negative sampling and the sparse sample point problem, we adopt the density-based clustering algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and replace the head and tail entities of samples within a cluster. We also adaptively optimize the neighborhood radius in DBSCAN to find suitable cluster centers and reduce the number of outliers. At the same time, outliers that fall outside every cluster are oversampled to construct similar points for them, which solves the negative sampling problem for sparse points. Finally, the negative sampling method is combined with TransE to obtain the mixed negative sampling model TransE-DNS. The results show that TransE-DNS achieves better results on link prediction and triple classification tasks.
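The sampling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dbscan` and `corrupt`, the fixed `eps`, the toy 2-D embeddings, and the Gaussian jitter used to oversample outliers are all assumptions made for illustration, and the adaptive radius optimization from the paper is not reproduced here.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point, -1 marks noise (outliers)."""
    n = len(points)
    # eps-neighborhood of every point (a point counts as its own neighbor).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbors[j])
    return labels


def corrupt(triple, emb, labels, rng, noise=0.05):
    """Build one negative sample (head_vec, relation, tail_vec) for a triple.

    The head or the tail is replaced, chosen with equal probability.
    Clustered entities are swapped for a peer from the same cluster;
    outliers get a synthetic neighbor made by jittering their own
    embedding (an assumed oversampling scheme, not taken from the paper).
    """
    h, r, t = triple
    replace_head = rng.random() < 0.5
    target = h if replace_head else t
    peers = [e for e, lab in enumerate(labels)
             if lab != -1 and lab == labels[target] and e != target]
    if peers:
        vec = emb[rng.choice(peers)]
    else:
        vec = [x + rng.gauss(0.0, noise) for x in emb[target]]
    return (vec, r, emb[t]) if replace_head else (emb[h], r, vec)


# Toy entity embeddings: two dense groups and one isolated point.
emb = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
       [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
       [10.0, 0.0]]
labels = dbscan(emb, eps=0.5, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
print(corrupt((0, 0, 6), emb, labels, random.Random(0)))
```

A practical sampler would also reject corrupted triples that already exist in the knowledge graph, and would feed the resulting vectors into the TransE margin loss. Both steps are omitted here for brevity.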

Last Update: 2023-09-10