[1] CAI Song-cheng, NIU Yun. Protein-protein Interaction Identification Based on Expectation Maximization Algorithm[J]. Computer Technology and Development, 2018, 28(08): 48-52. [doi:10.3969/j.issn.1673-629X.2018.08.010]

Protein-protein Interaction Identification Based on Expectation Maximization Algorithm

《计算机技术与发展》 (Computer Technology and Development) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 08
Pages:
48-52
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2018-08-10

Article Info

Title:
Protein-protein Interaction Identification Based on Expectation Maximization Algorithm
Article ID:
1673-629X(2018)08-0048-05
Author(s):
CAI Song-cheng (蔡松成), NIU Yun (牛耘)
School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Keywords:
protein-protein interaction; expectation maximization algorithm; multi-instance multi-label; protein entity recognition
Classification number:
TP391
DOI:
10.3969/j.issn.1673-629X.2018.08.010
Document code:
A
Abstract:
To address the noise in training data produced by distant supervision, a multi-instance multi-label (MIML) method based on the expectation maximization (EM) algorithm is adopted to extract protein-protein interactions. First, a signature for each target protein pair is built by automatically searching large-scale biomedical text, and lexical and syntactic features are extracted from the signature to form the pair's vector space model (VSM). Then, latent variables are introduced to model the signatures of protein pairs and their labels jointly as a MIML learning problem, and the EM algorithm is used to remove noise from the training data iteratively. Finally, a supervised method predicts whether unknown protein pairs interact. Because the description of a target protein pair often contains the names of other proteins, which can affect the judgment of the interaction, the feature representation of protein pairs is improved. Experimental results show that the method achieves higher and better-balanced precision and recall than the original EM algorithm.
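
The noise-reduction step in the abstract can be illustrated with a small hard-EM loop. The sketch below is not the authors' implementation: it assumes one binary label per protein pair, treats each pair as a bag of tokenized sentences, and swaps in a smoothed naive Bayes classifier for the M-step. The names `train_nb` and `hard_em`, the `PROT1`/`PROT2` placeholders, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter


def train_nb(instances, labels, vocab_size):
    """M-step (assumed classifier): multinomial naive Bayes with add-one smoothing.

    Returns a function giving the log-odds that a sentence expresses an interaction.
    """
    counts = {0: Counter(), 1: Counter()}
    n = {0: 0, 1: 0}
    for words, y in zip(instances, labels):
        counts[y].update(words)
        n[y] += 1
    total = {y: sum(counts[y].values()) for y in (0, 1)}

    def log_odds(words):
        score = math.log((n[1] + 1) / (n[0] + 1))  # smoothed class prior
        for w in words:
            p1 = (counts[1][w] + 1) / (total[1] + vocab_size)
            p0 = (counts[0][w] + 1) / (total[0] + vocab_size)
            score += math.log(p1 / p0)
        return score

    return log_odds


def hard_em(bags, bag_labels, iterations=5):
    """Iteratively relabel sentences in distantly supervised bags.

    bags: one list of tokenized sentences per protein pair.
    bag_labels: 1 if the pair is listed as interacting, else 0.
    """
    vocab_size = len({w for bag in bags for inst in bag for w in inst})
    # Initialization: every sentence inherits its pair's (noisy) label.
    z = [[y] * len(bag) for bag, y in zip(bags, bag_labels)]
    score = None
    for _ in range(iterations):
        # M-step: refit the classifier on the current latent sentence labels.
        flat_x = [inst for bag in bags for inst in bag]
        flat_z = [zi for zs in z for zi in zs]
        score = train_nb(flat_x, flat_z, vocab_size)
        # E-step (hard): relabel sentences of positive pairs only.
        for i, (bag, y) in enumerate(zip(bags, bag_labels)):
            if y == 0:
                continue  # sentences of negative pairs stay negative
            scores = [score(inst) for inst in bag]
            z[i] = [1 if s > 0 else 0 for s in scores]
            if not any(z[i]):
                # A positive pair keeps at least one positive sentence.
                z[i][scores.index(max(scores))] = 1
    return z, score


# Toy data: the second sentence of the first pair is distant-supervision noise.
bags = [
    [["PROT1", "binds", "PROT2"], ["PROT1", "and", "PROT2", "were", "studied"]],
    [["PROT1", "interacts", "with", "PROT2"], ["PROT1", "binds", "PROT2"]],
    [["PROT1", "and", "PROT2", "were", "studied"],
     ["PROT1", "and", "PROT2", "were", "measured"]],
    [["PROT1", "and", "PROT2", "were", "studied"]],
]
bag_labels = [1, 1, 0, 0]
z, score = hard_em(bags, bag_labels)
print(z)  # [[1, 0], [1, 1], [0, 0], [0]]
```

On this toy input the noisy sentence in the first pair is relabeled as negative after the first iteration, while every pair marked as interacting keeps at least one positive sentence. The returned `score` function can then be applied to sentences of unseen pairs, which loosely corresponds to the final supervised prediction step.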

Last Update: 2018-10-26