[1] ZHOU Feng-li, LIN Xiao-li. Research and Implementation of Web Search Engine Based on Lucene [J]. Computer Technology and Development, 2012, (01): 140-142.

Research and Implementation of Web Search Engine Based on Lucene

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2012, No. 01
Pages:
140-142
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
1900-01-01

Article Info

Title:
Research and Implementation of Web Search Engine Based on Lucene
Article ID:
1673-629X(2012)01-0140-03
Author(s):
ZHOU Feng-li, LIN Xiao-li
Information Engineering Department, Wuhan University of Science and Technology City Institute
Keywords:
Web spider; application system; search engine; multi-threading
CLC number:
TP31
Document code:
A
Abstract:
The rapid growth of the Internet has driven the continual development of search engines, but as search engines shift toward commercial operation, their technical details are becoming increasingly hidden. This paper studies and analyzes the principles, model, and indexer of the search engine toolkit Lucene and designs a search engine system on that basis. The system crawls the pages of a Web site non-recursively and handles the storage and processing of the URL links found during crawling. It uses multi-threading to manage several crawler threads, so that pages are fetched concurrently and the system runs more efficiently. Finally, a simple news search engine client is built with JSP. The system runs stably, is broadly consistent with the search engine principles explored, and has some practical significance.
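
The abstract describes two crawler ideas: non-recursive traversal with stored URL links, and several crawler threads working concurrently. The following is a minimal Java sketch of how those two ideas can fit together, not the authors' implementation. The class name `CrawlSketch`, the method `crawl`, and the `fetchLinks` function (a stand-in for "download a page and extract its links") are all hypothetical, and an in-memory link map replaces real HTTP requests so the example stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class CrawlSketch {

    /**
     * Crawls from a seed URL using an explicit URL queue (no recursion)
     * and a fixed pool of worker threads.
     * fetchLinks is a hypothetical stand-in for fetching and parsing a page.
     */
    public static Set<String> crawl(String seed,
                                    Function<String, List<String>> fetchLinks,
                                    int threads) {
        Set<String> visited = ConcurrentHashMap.newKeySet();      // URLs already seen
        BlockingQueue<String> frontier = new LinkedBlockingQueue<>(); // URLs waiting to be fetched
        AtomicInteger pending = new AtomicInteger();               // URLs queued or in progress

        visited.add(seed);
        pending.incrementAndGet();
        frontier.add(seed);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    // Workers stop once no URL is queued or being processed.
                    while (pending.get() > 0) {
                        String url = frontier.poll(50, TimeUnit.MILLISECONDS);
                        if (url == null) {
                            continue; // queue briefly empty; re-check pending
                        }
                        try {
                            for (String link : fetchLinks.apply(url)) {
                                // add() returns true only for the first thread to see this URL
                                if (visited.add(link)) {
                                    pending.incrementAndGet();
                                    frontier.add(link);
                                }
                            }
                        } finally {
                            pending.decrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical site: page path -> links found on that page.
        Map<String, List<String>> site = Map.of(
                "/index", List.of("/news", "/about"),
                "/news", List.of("/news/1", "/index"),
                "/about", List.of("/index"),
                "/news/1", List.of());
        Set<String> pages = crawl("/index", url -> site.getOrDefault(url, List.of()), 4);
        System.out.println(pages.size() + " pages reached"); // prints "4 pages reached"
    }
}
```

The explicit queue keeps the traversal depth independent of the call stack, and the shared visited set ensures each URL is queued only once even when several threads discover it at the same time. In a real system, `fetchLinks` would perform the HTTP request and link extraction, and the fetched text would be handed to the indexer.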

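Memo:
Supported by the Hubei Province Education Science "Eleventh Five-Year Plan" project approved in 2009 (20098236). ZHOU Feng-li (1980-), female, from Jingmen, Hubei; lecturer, master's degree; research interest: software engineering.
Last Update: 1900-01-01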