[1] MAO Yu-wei, NIU Yun. Weakly Supervised Protein-protein Interaction Identification Based on Distribution Hypothesis[J]. Computer Technology and Development, 2018, 28(09): 34-37. [doi:10.3969/j.issn.1673-629X.2018.09.008]

Weakly Supervised Protein-protein Interaction Identification Based on Distribution Hypothesis

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 09
Pages:
34-37
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2018-09-10

Article Info

Title:
Weakly Supervised Protein-protein Interaction Identification Based on Distribution Hypothesis
Article ID:
1673-629X(2018)09-0034-04
Author(s):
MAO Yu-wei, NIU Yun
School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
Keywords:
protein-protein interaction; distribution hypothesis; weakly-supervised method; relational similarity
CLC number:
TP301
DOI:
10.3969/j.issn.1673-629X.2018.09.008
Document code:
A
Abstract:
Protein-protein interaction (PPI) is an important research topic in biomedicine. The results of PPI experiments are currently stored mainly in the form of published literature, and as the volume of biomedical literature grows rapidly, collecting this information by hand can no longer meet practical needs. To address this, we propose a weakly supervised PPI identification approach based on the distributional hypothesis. First, lexical patterns that express semantic relations are extracted from the contexts describing protein pairs, and a small number of interacting protein pairs are collected as the initial seed set. Based on the distributional hypothesis, a vector space model is constructed from the distribution of the patterns over the seeds. Then, the lexical patterns are clustered by similarity to form clusters of semantically similar patterns, and these clusters are used to find new patterns with similar distributions in the corpus, which are added to the candidate set. Finally, the protein pairs in the candidate set and their patterns are evaluated, and the pairs that satisfy the selection criteria are added to the seed set. The seed set is expanded iteratively until the interacting protein pairs are identified. Unlike existing methods, this approach takes the semantic relatedness of the context into account. Experimental results show that it achieves high precision and recall with a very small seed set.
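
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the clustering algorithm, or the scoring rule, so cosine similarity, greedy single-pass clustering, and a support-count score are assumptions made here. All function names, parameters (`threshold`, `min_score`, `max_iter`), and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt

def pattern_vectors(occurrences, seeds):
    """Represent each lexical pattern by its distribution over the seed pairs."""
    vectors = {}
    for pair, pattern in occurrences:
        if pair in seeds:
            vectors.setdefault(pattern, Counter())[pair] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def cluster_patterns(vectors, threshold=0.5):
    """Greedy single-pass clustering (assumed): join the most similar
    cluster whose centroid similarity is >= threshold, else open a new one."""
    clusters = []  # each entry: (centroid Counter, list of member patterns)
    for pattern in sorted(vectors):
        vec = vectors[pattern]
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(cluster[0], vec)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append((Counter(vec), [pattern]))
        else:
            best[0].update(vec)   # add the pattern's counts to the centroid
            best[1].append(pattern)
    return clusters

def bootstrap(occurrences, seeds, threshold=0.5, min_score=2, max_iter=10):
    """Iteratively expand the seed set with pairs that share clustered patterns."""
    seeds = set(seeds)
    for _ in range(max_iter):
        clusters = cluster_patterns(pattern_vectors(occurrences, seeds), threshold)
        # Support of a pattern = number of distinct seed pairs behind its cluster.
        support = {p: len(centroid) for centroid, members in clusters for p in members}
        scores, seen = Counter(), set()
        for pair, pattern in occurrences:
            if pair not in seeds and (pair, pattern) not in seen:
                seen.add((pair, pattern))
                scores[pair] += support.get(pattern, 0)
        new_pairs = {pair for pair, score in scores.items() if score >= min_score}
        if not new_pairs:
            break
        seeds |= new_pairs
    return seeds

# Toy corpus (hypothetical): (protein pair, lexical pattern) observations.
occurrences = [
    (("A", "B"), "X interacts with Y"),
    (("A", "B"), "X binds to Y"),
    (("C", "D"), "X interacts with Y"),
    (("C", "D"), "X binds to Y"),
    (("E", "F"), "X binds to Y"),
    (("G", "H"), "X and Y"),
]
result = bootstrap(occurrences, {("A", "B"), ("C", "D")})
print(sorted(result))  # [('A', 'B'), ('C', 'D'), ('E', 'F')]
```

In this toy run, the pair (E, F) is promoted because it shares a pattern with a cluster backed by two seed pairs, while (G, H) appears only with a pattern never seen alongside a seed and is left out. A real system would need the paper's actual pattern extraction, clustering, and evaluation criteria in place of these stand-ins.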


Last Update: 2018-09-10