[1] XU Jiao, TIAN Ping-fang, GU Jin-guang, et al. Cardinality Estimation of Federated Complex Queries Based on Query Feature Representation Learning[J]. Computer Technology and Development, 2024, 34(02): 32-39. [doi:10.3969/j.issn.1673-629X.2024.02.005]

Cardinality Estimation of Federated Complex Queries Based on Query Feature Representation Learning

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 02
Pages:
32-39
Section:
Big Data and Cloud Computing
Publication date:
2024-02-10

Article Info

Title:
Cardinality Estimation of Federated Complex Queries Based on Query Feature Representation Learning
Article ID:
1673-629X(2024)02-0032-08
Author(s):
XU Jiao [1,2,3,4], TIAN Ping-fang [1,2,3,4], GU Jin-guang [1,2,3,4], XU Fang-fang [1,2,3,4]
1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China;
2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, China;
3. Institute of Big Data Science and Engineering Research, Wuhan University of Science and Technology, Wuhan 430065, China;
4. Key Laboratory of Rich Media Digital Publishing Content Organization and Knowledge Service, National Press and Publication Administration, Beijing 100083, China
Keywords:
federated system; query optimization; complex query; deep learning; cardinality estimation
CLC number:
TP319
DOI:
10.3969/j.issn.1673-629X.2024.02.005
Abstract:
Accurate cardinality estimation is a key factor in obtaining the best query plan, and most existing methods rely on deep learning to solve the cardinality estimation problem. However, these RDF graph-pattern-based methods focus on simple queries with specific topological structures, which limits their scope of application, and they lack support for the complex queries frequently used in real-world scenarios. To address these problems, we propose a federated complex query cardinality estimation model based on query feature representation learning. The model mainly handles complex queries containing the FILTER or DISTINCT keywords. A newly proposed FILTER query featurization method represents a SPARQL query as a feature vector, and the model predicts the query cardinality from that vector. The model is also used to predict the ratio of unique rows in DISTINCT queries. Experiments on the LUBM data set show that, compared with state-of-the-art cardinality estimation methods, the model achieves better estimation quality, with an average median estimation error of 1.16, and shows potential and scalability for cardinality estimation of multi-join queries.
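
The abstract does not spell out how the predicted unique-row ratio is combined with a cardinality estimate, nor which error metric is reported. As a minimal sketch, assume the DISTINCT cardinality is the estimated row count scaled by the predicted ratio, and assume the error is the q-error commonly used in cardinality estimation work. The function names and all numbers below are hypothetical and are not taken from the paper.

```python
from statistics import median


def distinct_cardinality(estimated_rows: float, unique_ratio: float) -> float:
    """Scale a row-count estimate by a predicted unique-row ratio in [0, 1].

    Assumption: this multiplicative combination is illustrative only.
    """
    if not 0.0 <= unique_ratio <= 1.0:
        raise ValueError("unique_ratio must lie in [0, 1]")
    return estimated_rows * unique_ratio


def q_error(estimated: float, actual: float) -> float:
    """Q-error: max(est/actual, actual/est), with both values clamped to >= 1."""
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


# Hypothetical (estimated rows, predicted unique ratio) pairs for three queries.
estimates = [
    distinct_cardinality(1000, 0.25),
    distinct_cardinality(80, 0.5),
    distinct_cardinality(10, 1.0),
]
# Hypothetical true DISTINCT cardinalities for the same queries.
actuals = [200, 50, 10]

errors = [q_error(e, a) for e, a in zip(estimates, actuals)]
median_error = median(errors)
print(median_error)
```

A q-error of 1 means the estimate is exact, so a median close to 1 indicates that at least half of the queries are estimated almost exactly.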

Last Update: 2024-02-10