[1] HOU Xiang-ning, XU Cao-cao, YANG Jing-rong. Study of Flower Image Classification Based on Spark[J]. Computer Technology and Development, 2022, 32(07): 70-74. [doi:10.3969/j.issn.1673-629X.2022.07.012]

Study of Flower Image Classification Based on Spark

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 07
Pages:
70-74
Section:
Graphics and Images
Publication date:
2022-07-10

Article Info

Title:
Study of Flower Image Classification Based on Spark
Article ID:
1673-629X(2022)07-0070-05
Author(s):
HOU Xiang-ning (侯向宁), XU Cao-cao (徐草草), YANG Jing-rong (杨井荣)
Department of Electronic Information and Computer Engineering, School of Engineering and Technique, Chengdu University of Technology, Leshan 614000, Sichuan, China
Keywords:
flower classification; Hadoop; Spark; VGG16; TensorFlowOnSpark; SK unit
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2022.07.012
Abstract:
To address the low efficiency of the traditional single-machine mode in classifying massive flower image data and the low accuracy of existing network models on flower classification, a Hadoop and Spark distributed computing framework is first built: HDFS stores the massive flower image data, Spark performs distributed parallel computing, and HBase stores the related cluster parameters and network model parameters. Second, building on a study of the existing VGG16 network model, a selective soft attention mechanism is introduced into VGG16 so that the network can obtain information from different receptive fields and generalizes better. Finally, TensorFlowOnSpark is adopted within the Spark distributed computing framework to parallelize flower image feature extraction, model training, and classification testing, which both reduces model training time and improves flower classification accuracy. Experiments show that, compared with the VGG16 model without the SK (selective kernel) unit, flower classification accuracy improves by nearly 15.3 percentage points. Experiments also show that distributed computing helps load balancing, greatly reduces the time spent on model training and classification testing, and can further improve the efficiency of classifying massive flower data.
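The abstract does not spell out how the SK unit is wired into VGG16, so the following is only a minimal sketch of the general selective soft attention idea: two branches with different receptive fields are summed, pooled into a channel descriptor, and then recombined with per-channel softmax weights. The function name `sk_fuse`, the weight matrices `w_reduce`, `w_a`, `w_b`, and the flat-list data layout are all assumptions made for illustration, not the authors' implementation.

```python
import math


def sk_fuse(u1, u2, w_reduce, w_a, w_b):
    """Fuse two branch feature maps with selective-kernel style soft attention.

    u1, u2   : lists of C channels, each a flat list of spatial activations
               (stand-ins for conv outputs with different kernel sizes).
    w_reduce : d x C matrix producing the compact descriptor (hypothetical).
    w_a, w_b : C x d matrices producing per-channel logits for each branch
               (hypothetical).
    Returns (fused channels, per-channel attention weights for branch 1).
    """
    channels = len(u1)
    # Fuse: element-wise sum of the branches, then global average pooling.
    s = [
        sum(a + b for a, b in zip(u1[c], u2[c])) / len(u1[c])
        for c in range(channels)
    ]
    # Squeeze: compact descriptor z = ReLU(w_reduce @ s).
    z = [max(0.0, sum(w * x for w, x in zip(row, s))) for row in w_reduce]

    fused, alphas = [], []
    for c in range(channels):
        logit_a = sum(w * x for w, x in zip(w_a[c], z))
        logit_b = sum(w * x for w, x in zip(w_b[c], z))
        # Select: numerically stable softmax over the two branches.
        m = max(logit_a, logit_b)
        ea, eb = math.exp(logit_a - m), math.exp(logit_b - m)
        alpha = ea / (ea + eb)
        alphas.append(alpha)
        # Weighted combination of the two branches for this channel.
        fused.append(
            [alpha * a + (1.0 - alpha) * b for a, b in zip(u1[c], u2[c])]
        )
    return fused, alphas
```

Because the two weights for each channel sum to 1, every fused activation lies between the corresponding activations of the two branches; the learned matrices decide which receptive field dominates per channel.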

Last Update: 2022-07-10