
Research on CPI Based on an Improved Attention Mask Encoder-Decoder


Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, No. 02

Pages:: 214-220

Section:: Application Frontiers and General Topics

Publication date:: 2022-02-10

Article Info

Title:: Research on Compound-protein Interaction Classification Based on Improved Attention Mask Encoder-decoder

Article ID:: 1673-629X(2022)02-0214-07

Author(s):: LI Da-zhou (李大舟); CHEN Si-si (陈思思); GAO Wei (高巍); YU Jin-tao (于锦涛); School of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China

Keywords:: deep learning; multi-head self-attention; compound-protein interaction; Item2vec; encoder-decoder

CLC number:: TP31

DOI:: 10.3969/j.issn.1673-629X.2022.02.035

Abstract:: The study of compound-protein interaction (CPI) plays an important role in drug discovery: it provides valuable information for drug target selection, improves the hit rate of lead compounds to some extent, and thereby accelerates the drug discovery process. This paper proposes a CPI classification model based on an improved Attention Mask encoder-decoder. RDKit and Item2vec are used to process the SMILES strings of compounds and the amino acid sequences of proteins, respectively, and the resulting low-dimensional feature vectors of compounds and proteins are fed into the model. A neural network with an attention mechanism computes weights that indicate which protein subsequences matter most to the compound molecule, thereby modeling the interaction between compound and protein. Finally, the task is treated as a binary classification problem, and the model outputs the predicted probability that the compound and the protein interact. Model performance is evaluated using the area under the ROC curve (AUC) and the precision-recall curve (PRC). Experimental results show that the model outperforms the GraphDTA and GCN models, with the AUC increased by about 0.04 and the PRC value increased by about 0.07.
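
The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single attention head, plain scaled dot-product scores, a 0/1 mask that hides padded protein positions, and a hypothetical untrained linear layer (`w`, `b`) for the final probability. All function and variable names are illustrative.

```python
import math

def masked_attention_weights(compound_vec, protein_vecs, mask):
    """Scaled dot-product attention of one compound vector over protein
    subsequence vectors. Positions with mask 0 (padding) get weight 0.
    Assumes at least one position is unmasked."""
    d = len(compound_vec)
    scores = []
    for vec, keep in zip(protein_vecs, mask):
        if keep:
            dot = sum(q * k for q, k in zip(compound_vec, vec))
            scores.append(dot / math.sqrt(d))
        else:
            scores.append(float("-inf"))  # excluded before the softmax
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def interaction_probability(compound_vec, protein_vecs, mask, w, b):
    """Attention-weighted protein context, concatenated with the compound
    vector and passed through a sigmoid to give P(interaction)."""
    weights = masked_attention_weights(compound_vec, protein_vecs, mask)
    d = len(compound_vec)
    context = [sum(a * vec[i] for a, vec in zip(weights, protein_vecs))
               for i in range(d)]
    features = list(compound_vec) + context
    logit = sum(wi * fi for wi, fi in zip(w, features)) + b
    return weights, 1.0 / (1.0 + math.exp(-logit))

# Toy inputs (hypothetical values, not data from the paper).
compound = [1.0, 0.0]
protein = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]  # last row is padding
mask = [1, 1, 0]
weights, prob = interaction_probability(
    compound, protein, mask, w=[0.5, -0.5, 1.0, 1.0], b=0.0)
print(weights, prob)
```

In this toy run the padded position receives zero weight, and the first subsequence, which aligns with the compound vector, receives the larger share of attention. In a real pipeline the input vectors would come from RDKit and Item2vec features and the layer parameters would be learned.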

