[1] JI Hui-jie, NI Feng, LIU Jiang, et al. Prediction of Telecom Customer Churn Based on XGB-BFS Feature Selection Algorithm[J]. Computer Technology and Development, 2021, 31(05): 21-25. [doi:10.3969/j.issn.1673-629X.2021.05.004]

Prediction of Telecom Customer Churn Based on XGB-BFS Feature Selection Algorithm
Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
31
Issue:
2021, No. 05
Pages:
21-25
Column:
Big Data Analysis and Mining
Publication date:
2021-05-10

Article Information

Title:
Prediction of Telecom Customer Churn Based on XGB-BFS Feature Selection Algorithm
Article number:
1673-629X(2021)05-0020-05
Author(s):
JI Hui-jie, NI Feng, LIU Jiang, LU Qi-ling, ZHANG Xu-yang, QUE Zhong-li
Business School, University of Shanghai for Science & Technology, Shanghai 200093, China
Keywords:
customer churn prediction; feature selection; XGBoost; feature importance; sequential backward selection
CLC number:
TP31
DOI:
10.3969/j.issn.1673-629X.2021.05.004
Abstract:
Customer churn is one of the most difficult problems faced by modern enterprises, and predicting churn is one of the most effective strategies for the telecom industry to retain existing customers. Telecom customer datasets often have high-dimensional features; selecting important features and reducing the number of irrelevant attributes can improve a model's classification performance. To address the high dimensionality of customer churn datasets, a hybrid XGB-BFS feature selection method is proposed. First, features are ranked by importance using the XGBoost F score, which measures the relationship between each feature and the target variable. Then sequential backward selection removes the least important feature in turn, and the AUC on the validation set determines whether that feature is retained. Finally, the selected feature subset is used to build an XGBoost customer churn prediction model. Experiments on a telecom customer churn dataset show that the method selects features of high importance and removes redundant ones. Compared with a Logistic model based on recursive feature elimination and with Embedded-based AdaBoost and random forest models, the proposed method achieves better performance.
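
The selection loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `auc`, `backward_select`, `evaluate`, and the toy data are hypothetical, the rule "drop a feature when validation AUC does not decrease" is an assumed reading of the abstract, and the toy `evaluate` function stands in for training an XGBoost model on a feature subset and scoring it on a validation set.

```python
from typing import Callable, Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked
    correctly; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def backward_select(importance: Dict[str, float],
                    evaluate: Callable[[List[str]], float]) -> List[str]:
    """Sequential backward search over an importance ranking.

    importance: feature name -> importance score (for example an
                XGBoost F score); assumed to be precomputed.
    evaluate:   feature subset -> validation AUC of a model trained
                on that subset (supplied by the caller).
    """
    selected = sorted(importance, key=importance.get, reverse=True)
    best = evaluate(selected)
    # Visit candidates from least to most important.
    for feat in sorted(importance, key=importance.get):
        if len(selected) == 1:
            break  # never remove the last remaining feature
        trial = [f for f in selected if f != feat]
        score = evaluate(trial)
        if score >= best:  # assumed rule: removal does not hurt AUC
            selected, best = trial, score
        # otherwise the feature is retained
    return selected


# Toy validation data (hypothetical): three features, six customers.
y = [1, 1, 1, 0, 0, 0]
X = {
    "calls": [3, 2, 3, 1, 0, 2],
    "plan":  [0, 1, 0, 0, 0, 0],
    "noise": [0, 3, 0, 3, 0, 3],
}
importance = {"calls": 10.0, "plan": 5.0, "noise": 2.0}


def evaluate(features: List[str]) -> float:
    # Stand-in scorer: sum the selected feature values per customer.
    scores = [sum(X[f][i] for f in features) for i in range(len(y))]
    return auc(y, scores)


selected = backward_select(importance, evaluate)
print(selected)
```

In practice, `importance` would come from a fitted XGBoost model and `evaluate` would retrain the model on each candidate subset. Passing the scorer in as a callable keeps the search loop independent of any particular library.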


Last Update: 2020-05-10