[1] XIE Yi, WANG Hang, LIU Xin-han, et al. Research on Data Reading Techniques Based on Big Data Environment[J]. Computer Technology and Development, 2015, 25(02): 113-116.

 Research on Data Reading Techniques Based on Big Data Environment

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
25
Issue:
2015, No. 02
Pages:
113-116
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2015-02-10

Article Information

Title:
 Research on Data Reading Techniques Based on Big Data Environment
Article ID:
1673-629X(2015)02-0113-04
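Author(s):
 XIE Yi (谢怡), WANG Hang (王航), LIU Xin-han (刘新瀚), CHEN Zi-yang (陈梓洋), SUN Zhi-xin (孙知信)
 Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications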
Keywords:
 big data; column storage; compression; materialization techniques
CLC number:
TP31
Document code:
A
Abstract:
 Data reading faces major challenges in big data environments. This paper studies the key techniques for reading data in distributed file systems. According to how the data is laid out, it analyzes and compares the advantages, disadvantages, and open challenges of row storage, column storage, and hybrid row-column storage from three perspectives: data loading, query processing, and storage space utilization. It then focuses on the compression and materialization techniques used in column storage, examining the run-length encoding, dictionary encoding, and bit-vector encoding algorithms commonly used for storage compression, and the late materialization technique used in tuple reconstruction. Finally, it analyzes the problems of existing techniques, discusses possible solutions, and outlines directions for future research.
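The column-storage techniques named in the abstract can be illustrated with a minimal Python sketch. The function names, the sample column, and the table layout below are hypothetical and are not taken from the paper; the code shows only the general textbook form of run-length, dictionary, and bit-vector encoding, plus a simple late (delayed) materialization of a filtered query.

```python
from typing import Dict, List, Tuple


def run_length_encode(column: List[str]) -> List[Tuple[str, int]]:
    """Collapse each run of equal adjacent values into (value, run_length)."""
    runs: List[Tuple[str, int]] = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value with an integer code that indexes a dictionary."""
    dictionary: List[str] = []
    code_of: Dict[str, int] = {}
    codes: List[int] = []
    for value in column:
        if value not in code_of:
            code_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(code_of[value])
    return dictionary, codes


def bit_vector_encode(column: List[str]) -> Dict[str, List[int]]:
    """Build one bit vector per distinct value; bit i is 1 if row i holds it."""
    vectors: Dict[str, List[int]] = {}
    for row, value in enumerate(column):
        if value not in vectors:
            vectors[value] = [0] * len(column)
        vectors[value][row] = 1
    return vectors


def select_late(columns: Dict[str, list], filter_col: str, wanted,
                output_cols: List[str]) -> List[tuple]:
    """Late materialization: scan only the filter column, keep the matching
    row positions, and assemble output tuples from other columns at the end."""
    positions = [i for i, v in enumerate(columns[filter_col]) if v == wanted]
    return [tuple(columns[c][i] for c in output_cols) for i in positions]


# Hypothetical sample data, stored column by column.
table = {
    "country": ["CN", "CN", "CN", "US", "US", "CN"],
    "amount": [10, 20, 30, 40, 50, 60],
}

rle = run_length_encode(table["country"])
dictionary, codes = dictionary_encode(table["country"])
bit_vectors = bit_vector_encode(table["country"])
us_amounts = select_late(table, "country", "US", ["amount"])
```

Run-length encoding tends to pay off on sorted or low-cardinality columns with long runs, dictionary encoding on columns with few distinct but long values, and bit vectors on columns with very few distinct values. In `select_late`, the `amount` column is touched only at the positions that survive the filter, which is the point of delaying tuple reconstruction.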

Last Update: 2015-04-28