[1] WU Chun-yan, LI Li, HUANG Peng-cheng, et al. Study on Machine Reading Comprehension Hybriding Dynamic Convolution Attention[J]. Computer Technology and Development, 2023, 33(07): 160-166. [doi:10.3969/j.issn.1673-629X.2023.07.024]

Study on Machine Reading Comprehension Hybriding Dynamic Convolution Attention

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume: 33
Issue: 2023(07)
Pages: 160-166
Column: Artificial Intelligence
Publication Date: 2023-07-10

Article Info

Title:
Study on Machine Reading Comprehension Hybriding Dynamic Convolution Attention
Article ID:
1673-629X(2023)07-0160-07
Author(s):
WU Chun-yan1, LI Li1, HUANG Peng-cheng1, LIU Zhi-gui1,2, ZHANG Xiao-qian2
1. School of Computer Science and Technology,Southwest University of Science and Technology,Mianyang 621000,China;
2. School of Information Engineering,Southwest University of Science and Technology,Mianyang 621000,China
Keywords:
machine reading comprehension; span-extracting; answer prediction; long short-term memory; dynamic convolution
CLC Number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.07.024
Abstract:
To address insufficient feature extraction and low prediction accuracy when long short-term memory (LSTM) networks and attention mechanisms process text sequences in machine reading comprehension, we propose a span-extraction machine reading comprehension model that fuses dynamic convolution attention. Since the current input and the previous state of an LSTM are independent of each other, which may cause loss of context information, the Mogrifier is adopted as the encoder: it lets the current input interact fully with the previous state several times, enhancing the salient structural features of the context and the question while weakening secondary ones. Second, because static convolution uses a single fixed kernel, it can only extract features over a fixed text length, which may hinder the machine from better understanding the text. By introducing dynamic convolution, one-dimensional convolutions with multiple different kernel sizes capture the local structure of the context and the question, compensating for the attention mechanism's purely global capture ability. Experimental results on the SQuAD dataset show that, compared with other models, the proposed method effectively improves the model's feature extraction and answer prediction.
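The two components named in the abstract can be illustrated in miniature. Below is a minimal numpy sketch of (a) the Mogrifier interaction, which alternately rescales the LSTM input and previous state with gates computed from each other before the cell runs (Melis et al., 2020), and (b) multi-kernel 1-D convolution in the spirit of dynamic convolution, extracting local features at several window widths. Weights are random placeholders, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def mogrifier(x, h, rounds=5):
    """Alternately gate input x and previous state h before the LSTM cell:
    odd rounds rescale x by a gate computed from h, even rounds rescale h
    by a gate computed from x."""
    d = x.shape[0]
    Q = rng.standard_normal((d, d)) * 0.1  # random demo weights
    R = rng.standard_normal((d, d)) * 0.1
    for i in range(1, rounds + 1):
        if i % 2:                        # odd round: update the input
            x = 2 * sigmoid(Q @ h) * x
        else:                            # even round: update the state
            h = 2 * sigmoid(R @ x) * h
    return x, h

def multi_kernel_conv1d(seq, kernel_sizes=(1, 3, 5)):
    """Run several 1-D convolutions with different (odd) kernel widths over
    the token dimension and concatenate the outputs along the feature axis,
    capturing local structure at multiple granularities."""
    T, d = seq.shape
    outs = []
    for k in kernel_sizes:
        W = rng.standard_normal((k, d)) * 0.1
        pad = np.pad(seq, ((k // 2, k // 2), (0, 0)))  # same-length padding
        out = np.array([(pad[t:t + k] * W).sum(axis=0) for t in range(T)])
        outs.append(out)
    return np.concatenate(outs, axis=1)  # shape (T, d * len(kernel_sizes))

x, h = mogrifier(rng.standard_normal(8), rng.standard_normal(8))
feats = multi_kernel_conv1d(rng.standard_normal((10, 8)))
print(x.shape, h.shape, feats.shape)  # (8,) (8,) (10, 24)
```

In the paper's setting, the gated (x, h) pair would feed the LSTM cell at each step, and the concatenated multi-kernel features would complement the attention layer's global view with local structure.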
Last Update: 2023-07-10