[1] MA Bin, LI Yu-tao, XU Qi. Real-time Stream Processing and Storage System of Automatic Weather Station Data Based on Spark Streaming[J]. Computer Technology and Development, 2023, 33(03): 207-214. [doi:10.3969/j.issn.1673-629X.2023.03.031]

Real-time Stream Processing and Storage System of Automatic Weather Station Data Based on Spark Streaming
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
Issue 03, 2023
Pages:
207-214
Section:
New Computing Application Systems
Publication date:
2023-03-10

Article Info

Title:
Real-time Stream Processing and Storage System of Automatic Weather Station Data Based on Spark Streaming
Article ID:
1673-629X(2023)03-0208-08
Author(s):
MA Bin 1, LI Yu-tao 1, XU Qi 2
1. Jiangsu Meteorological Information Center, Nanjing 210005, China;
2. Jiangsu Climate Center, Nanjing 210005, China
Keywords:
automatic weather station data; Spark Streaming; real-time processing; Flume; distributed database
Classification number:
TP399
DOI:
10.3969/j.issn.1673-629X.2023.03.031
Abstract:
As big data technology develops rapidly, the requirements for real-time processing, data quality, data storage, and large-scale querying of meteorological data keep rising. Existing automatic weather station data services involve many operational stages, with tightly coupled task processing but dispersed system deployment. To address these problems, this paper builds on the Spark Streaming stream-computing framework: Flume parses and collects the raw automatic weather station data, quality control algorithms for station data are designed and integrated into Spark Streaming, and the table design of the distributed database storage gives the data efficient, high-quality, and highly reliable application service capabilities. Performance tests show that the system completes the full pipeline, from file collection and decoding through stream processing to database storage, within seconds; queries over TB-scale data respond in milliseconds, and weighted queries in seconds. This fully meets the operational requirements of automatic weather station data and provides basic support for further improving its data quality and service level.

Last Update: 2023-03-10