[1] YANG Lin-wei, LI Jing, RAO Han-yu, et al. MicroAFL: Automatic Fault Location for Microservices on Cloud [J]. Computer Technology and Development, 2023, 33(05): 88-95. [doi:10.3969/j.issn.1673-629X.2023.05.014]

MicroAFL: An Automatic Fault Location Method for Microservices on the Cloud

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
No. 05, 2023
Pages:
88-95
Column:
Software Technology and Engineering
Publication date:
2023-05-10

Article Info

Title:
MicroAFL: Automatic Fault Location for Microservices on Cloud
Article ID:
1673-629X(2023)05-0088-08
Author(s):
YANG Lin-wei1, LI Jing1, RAO Han-yu2, GAO Ying3, MAO Dong2, QIAO Yu-jie3
1. School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China;
2. Information and Communication Branch of State Grid Zhejiang Electric Power Company, Hangzhou 310016, China;
3. Information and Communication Branch of State Grid Corporation, Beijing 100761, China
Keywords:
autoencoder; microservice; cloud environment; automatic fault location; service call diagram; fault propagation
CLC number:
TP311
DOI:
10.3969/j.issn.1673-629X.2023.05.014
Abstract:
As cloud-based microservice systems grow in scale, the dependencies between microservices become tighter and more complex. A fault in one microservice may propagate to other microservices through their mutual calls and eventually cause the entire microservice system to behave abnormally. To address complex dependencies and fault propagation, we design MicroAFL, an automatic fault location method for microservices on the cloud. First, MicroAFL monitors and collects the runtime metric data of the microservice system in real time and analyzes it with an autoencoder model to judge whether the system is abnormal. Once an anomaly is detected, MicroAFL obtains the calling relationships between microservices by parsing the communication data between running microservice instances on the cloud, and builds a service call graph that describes the fault propagation paths. Next, the running status of each microservice is correlated with system resource utilization to calculate the anomaly weight of each node in the service call graph, and an improved weighted PageRank algorithm is used to infer and locate the faulty microservice that caused the anomaly. Finally, a microservice system named Sock-shop is built on Huawei Cloud to evaluate the fault location accuracy of MicroAFL. The experimental results show that MicroAFL achieves higher fault location accuracy than the compared methods.
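
The ranking step described in the abstract can be illustrated with a small sketch. The abstract does not give the details of the improved weighted PageRank, so the code below substitutes a plain personalized PageRank in which both the teleport vector and the edge weights are proportional to each node's anomaly weight. The function name `rank_root_causes`, the input format, the call edges, and the anomaly values are all assumptions made for illustration; only the service names are borrowed from the Sock-shop demo, and none of this reproduces the authors' implementation or data.

```python
def rank_root_causes(calls, anomaly, damping=0.85, iterations=100):
    """Rank services by a personalized PageRank over a service call graph.

    calls:   dict mapping caller -> list of callees (hypothetical format)
    anomaly: dict mapping service -> non-negative anomaly weight
    Returns a list of (service, score) pairs, highest score first.
    """
    nodes = sorted(set(calls) | {c for cs in calls.values() for c in cs} | set(anomaly))
    total = sum(anomaly.get(n, 0.0) for n in nodes)
    # Teleport distribution: proportional to anomaly weight, uniform if all are zero.
    if total > 0:
        teleport = {n: anomaly.get(n, 0.0) / total for n in nodes}
    else:
        teleport = {n: 1.0 / len(nodes) for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            callees = calls.get(src, [])
            if not callees:
                # Dangling node (calls nobody): spread its mass via the teleport vector.
                for n in nodes:
                    nxt[n] += damping * rank[src] * teleport[n]
                continue
            # Follow call edges toward callees, favouring the more anomalous ones.
            weights = [anomaly.get(c, 0.0) + 1e-9 for c in callees]
            wsum = sum(weights)
            for callee, w in zip(callees, weights):
                nxt[callee] += damping * rank[src] * w / wsum
        rank = nxt
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative input only: the edges and weights below are made up for this sketch.
calls = {
    "front-end": ["orders", "carts", "catalogue", "user"],
    "orders": ["carts", "user", "payment", "shipping"],
}
anomaly = {
    "front-end": 0.3, "orders": 0.6, "payment": 0.9,
    "carts": 0.1, "user": 0.05, "shipping": 0.05, "catalogue": 0.0,
}
ranking = rank_root_causes(calls, anomaly)
for service, score in ranking:
    print(f"{service:10s} {score:.4f}")
```

In this toy input the walk follows call edges from callers to callees, so score accumulates on downstream services that are themselves anomalous, and `payment` ends up at the top of the ranking. A real deployment would derive both the graph and the weights from monitored data, as the abstract describes.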

Last Update: 2023-05-10