[1] QIAO Xu-kun, LI Shun, LI Jun, et al. Research on Hard Disk Failure Prediction Based on Machine Learning[J]. Computer Technology and Development, 2022, 32(06): 215-220. [doi:10.3969/j.issn.1673-629X.2022.06.036]

Research on Hard Disk Failure Prediction Based on Machine Learning

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 06
Pages:
215-220
Column:
Application Frontiers and Comprehensive Topics
Publication date:
2022-06-10

Article Info

Title:
Research on Hard Disk Failure Prediction Based on Machine Learning
Article ID:
1673-629X(2022)06-0215-06
Author(s):
QIAO Xu-kun, LI Shun, LI Jun, WU Xin, MAO Zhi-hui
Zhejiang Wanli University, Ningbo 315100, China
Keywords:
machine learning; hard disk failure; failure prediction; random forest; logistic regression; gradient boosting decision tree; AdaBoost
CLC number:
TP333
DOI:
10.3969/j.issn.1673-629X.2022.06.036
Abstract:
Data loss and corruption caused by hard disk failures impose significant losses on enterprises and users, so hard disk failure prediction has attracted considerable attention from both academia and industry, and many machine-learning-based prediction methods have emerged. However, because these methods are evaluated on different sample data and with inconsistent performance metrics, their relative merits cannot be assessed fairly. To address this, we establish a machine-learning-based evaluation platform for hard disk failure detection. On this unified experimental platform, we compare the failure prediction performance of eight algorithm models: random forest, logistic regression, multilayer perceptron neural network, decision tree, naive Bayes, extreme gradient boosting, gradient boosting decision tree, and AdaBoost. The comparison uses the same sample set and the same performance metric; in addition, each model's predictions are compared across sample sets of different sizes. The experimental results show that the random forest and gradient boosting decision tree models not only achieve high prediction accuracy but also generalize well across sample sets of different sizes.
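The evaluation protocol described in the abstract (identical sample set, identical metric, repeated over several sample-set sizes) can be sketched roughly as below. This is an illustrative harness only, not the authors' platform: the F1 metric, the synthetic data generator `make_synthetic_disks`, and the two toy models `MajorityClass` and `ThresholdRule` are all assumptions introduced here as stand-ins for the paper's dataset, metric, and eight algorithms. The harness accepts any object exposing `fit(X, y)` and `predict(X)`, which is the interface scikit-learn estimators follow.

```python
import random


def f1_score(y_true, y_pred):
    """F1 on the positive (failed-disk) class; 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


class MajorityClass:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdRule:
    """Hypothetical one-feature rule: flag a disk when feature 0 exceeds a cutoff."""

    def fit(self, X, y):
        pos = [x[0] for x, t in zip(X, y) if t == 1]
        neg = [x[0] for x, t in zip(X, y) if t == 0]
        if not pos or not neg:
            # Degenerate training set with a single class: never flag a disk.
            self.cutoff_ = float("inf")
        else:
            # Cutoff is the midpoint between the two class means.
            self.cutoff_ = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, X):
        return [1 if x[0] > self.cutoff_ else 0 for x in X]


def make_synthetic_disks(n, seed=0):
    """Synthetic stand-in for disk health records (assumption: one feature, ~20% failures)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        failed = 1 if rng.random() < 0.2 else 0
        X.append([rng.gauss(10.0 if failed else 0.0, 1.0)])
        y.append(failed)
    return X, y


def compare_models(models, X, y, sizes, test_fraction=0.3, seed=0):
    """Score every model on the same train/test split, once per sample-set size."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)  # one shuffle, so smaller sets nest in larger ones
    results = {}
    for size in sizes:
        subset = idx[:size]
        n_test = int(size * test_fraction)
        test_idx, train_idx = subset[:n_test], subset[n_test:]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        y_test = [y[i] for i in test_idx]
        for name, model in models.items():
            model.fit(X_train, y_train)
            results[(name, size)] = f1_score(y_test, model.predict(X_test))
    return results


X, y = make_synthetic_disks(1000)
models = {"majority": MajorityClass(), "threshold": ThresholdRule()}
scores = compare_models(models, X, y, sizes=[200, 1000])
for (name, size), score in sorted(scores.items()):
    print(f"{name:>10} n={size:<5} F1={score:.3f}")
```

Because every model sees the same split at each size, differences in the scores reflect the models rather than the data or the metric, which is the point of a unified comparison. Replacing the toy models with real classifiers only requires adding entries to the `models` dictionary.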


Last Update: 2022-06-10