[1] LIU Di, XI Xue-feng, CUI Zhi-ming, et al. Review of Research on Extractive-abstractive Automatic Text Summarization Technology[J]. Computer Technology and Development, 2023, 33(05): 1-8. [doi:10.3969/j.issn.1673-629X.2023.05.001]

Review of Research on Extractive-abstractive Automatic Text Summarization Technology

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume: 33
Issue: 2023, No. 05
Pages: 1-8
Column: Review
Publication date: 2023-05-10

Article Information

Title:
Review of Research on Extractive-abstractive Automatic Text Summarization Technology
Article ID:
1673-629X(2023)05-0001-08
Author(s):
LIU Di1, XI Xue-feng2, CUI Zhi-ming3, SHENG Sheng-li4
1. School of Electronic Information and Engineering, Suzhou University of Science and Technology, Suzhou 215000, Jiangsu, China;
2. Suzhou Key Laboratory of Virtual Reality Intelligent Interaction and Application Technology, Suzhou 215000, Jiangsu, China;
3. Suzhou Smart City Research Institute, Suzhou 215000, Jiangsu, China;
4. Texas Tech University, Lubbock, Texas 79401, USA
Keywords:
natural language processing; automatic text summarization; extractive-abstractive; evaluation method; datasets
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.05.001
Abstract:
Automatic text summarization is an information compression technology that uses a computer to automatically convert a text or a collection of texts into a short summary for a given application. With the rapid development of the Internet, a large volume of complex information has emerged, and people can no longer capture the useful information accurately by hand. To collect information more accurately, conveniently, and efficiently, automatic text summarization in natural language processing offers a clear advantage in handling complex texts. As extractive and abstractive summarization technologies have matured, extractive-abstractive summarization technology has gradually emerged. Taking technical analysis as the main line, this paper reviews extractive-abstractive summarization technology. First, the evaluation methods and the commonly used Chinese and English datasets are introduced. Second, six mainstream classes of methods are analyzed through examples and their advantages and disadvantages are compared: methods based on reinforcement learning, information theory, pointer networks, sequence labeling, pre-training, and joint attention. Finally, the challenges facing extractive-abstractive summarization are summarized and its future development directions are discussed.


Last Update: 2023-05-10