[1] YANG Hong-chao, XIAO Ji-yi. Text Information Extraction Research Based on HMM and BP Network Hybrid Model [J]. Computer Technology and Development, 2011, (05): 115-117.

Text Information Extraction Research Based on an HMM/BP Hybrid Model

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2011, No. 05
Pages:
115-117
Column:
Intelligence, Algorithms, Systems Engineering
Publication date:
1900-01-01

Article Info

Title:
Text Information Extraction Research Based on HMM and BP Network Hybrid Model
Article ID:
1673-629X(2011)05-0115-03
Author(s):
YANG Hong-chao, XIAO Ji-yi
School of Computer Science and Technology, University of South China
Keywords:
information extraction; hidden Markov model (HMM); BP network (BPN)
CLC number:
TP391
Document code:
A
Abstract:
As a branch of natural language processing, text information extraction has become an important means of extracting useful information from large volumes of text. This paper introduces two techniques widely used in information extraction, the hidden Markov model (HMM) and the BP network model, analyzes their respective strengths and weaknesses, and on that basis proposes a hybrid model that combines the two. The hybrid model uses the BP network's strong classification and discrimination ability to compensate for the HMM's weakness in classification, and uses the HMM's strong temporal modeling ability to compensate for the BP network's weak modeling ability. The resulting model therefore offers strong modeling capability, good classification performance, and high adaptability. Experiments show that, compared with the traditional HMM and BP network models, the hybrid model improves precision and recall by 10%-15%.
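
The abstract does not say how the two models are coupled. One common way to combine a neural classifier with an HMM, shown here only as an illustrative sketch and not as the authors' method, is to let the network supply per-token state posteriors, divide them by the state priors to obtain scaled emission scores, and let the HMM transition probabilities decide the final label sequence through Viterbi decoding. The function name `viterbi`, the two-state setup, and all probability values below are assumptions made up for the example.

```python
import math


def _log(x):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(x) if x > 0 else float("-inf")


def viterbi(posteriors, priors, trans, init):
    """Return the most likely state sequence (hypothetical helper).

    posteriors[t][s]: classifier estimate of P(state s | token t)
    priors[s]:        P(state s), assumed > 0
    trans[r][s]:      P(next state s | current state r)
    init[s]:          P(first state is s)
    """
    n = len(init)
    # Scaled emission score: log P(s | x_t) - log P(s)
    emit = [[_log(p[s]) - _log(priors[s]) for s in range(n)] for p in posteriors]
    score = [_log(init[s]) + emit[0][s] for s in range(n)]
    back = []  # back[t-1][s] = best predecessor of state s at step t
    for t in range(1, len(posteriors)):
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans[best][s]) + emit[t][s])
        back.append(ptr)
    # Trace the best path backwards from the best final state
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Made-up numbers: state 0 = "other", state 1 = "field of interest".
posteriors = [[0.8, 0.2], [0.45, 0.55], [0.8, 0.2]]
priors = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
init = [0.5, 0.5]
print(viterbi(posteriors, priors, trans, init))  # [0, 0, 0]
```

Taken token by token, the classifier alone would label the middle token as state 1, but the sticky transition matrix overrides that weak, isolated decision. This is the kind of complementarity the abstract describes: the classifier discriminates between states, and the HMM models the sequence.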

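Memo:
Funding: Hunan Provincial Science and Technology Plan Project (2008GK3090). YANG Hong-chao (born 1985), male, from Feicheng, Shandong, master's student; research interests: intelligent information systems and knowledge discovery. XIAO Ji-yi, professor and master's supervisor; research interest: text information extraction.
Last Update: 1900-01-01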