[1] WANG Quan-min, ZHANG Shuai-shuai, YANG Jing. An Android Malicious Code Detection Method Based on Cooperative Training[J]. Computer Technology and Development, 2019, 29(01): 135-139. [doi:10.3969/j.issn.1673-629X.2019.01.028]

An Android Malicious Code Detection Method Based on Cooperative Training

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
29
Issue:
No. 01, 2019
Pages:
135-139
Column:
Security and Protection
Publication date:
2019-01-10

Article Info

Title:
An Android Malicious Code Detection Method Based on Cooperative Training
Article ID:
1673-629X(2019)01-0135-05
Author(s):
WANG Quan-min, ZHANG Shuai-shuai, YANG Jing
Department of Informatics, Beijing University of Technology, Beijing 100124, China
Keywords:
machine learning; Co-training; three-view; voting; classifier
CLC number:
TP301
DOI:
10.3969/j.issn.1673-629X.2019.01.028
Abstract:
This paper studies the application of machine learning algorithms to the detection of unknown malware, as an alternative to traditional malware detection methods. Machine learning on a single feature type cannot make full use of the algorithm's data-processing ability, so its detection performance is mediocre. Two-view co-training handles poorly the case where the two classifiers give opposite predictions for an unknown sample. We therefore adopt a three-view co-training algorithm: when the three classifiers disagree on an unknown sample, the label is decided by voting, on the principle that the minority yields to the majority, which gives good results. The method reverse-engineers APK files and extracts features, choosing three non-overlapping sub-views: requested-permission features, API call sequence features, and OpCode features. For each sub-view, the best-performing algorithm is selected to build a classifier. The three classifiers are then trained cooperatively following the Co-training idea, so that the detection performance of all three individual classifiers improves together even when few labeled samples are available. We downloaded 4,600 benign samples of various kinds from the Android Market and 4,360 recent malicious samples from VirusShare, a malware sample sharing site. Ten groups of experiments were run with the number of labeled samples ranging from 30 to 120, and about 1,800 samples were used for classification testing. The experimental results indicate that the proposed method achieves better detection performance.
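
The three-view voting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which algorithm is chosen per view or how unlabeled samples are selected in each round, so the toy nearest-centroid classifier, the function names (`majority_vote`, `co_train`, `predict`), the parameters `rounds` and `per_round`, the fixed-order sample selection, and the tiny feature vectors are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers (3 binary votes never tie)."""
    return Counter(predictions).most_common(1)[0][0]


class CentroidClassifier:
    """Toy nearest-centroid classifier; a stand-in for the per-view algorithm."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def co_train(views_labeled, labels, views_unlabeled, rounds=3, per_round=2):
    """Grow the labeled set with majority-voted pseudo-labels, then retrain.

    views_labeled / views_unlabeled: one feature matrix per view, rows aligned.
    Assumption: samples are taken in order; a real system would likely pick
    the ones the classifiers are most confident about.
    """
    views_labeled = [list(v) for v in views_labeled]
    labels = list(labels)
    pool = list(range(len(views_unlabeled[0])))
    for _ in range(rounds):
        if not pool:
            break
        clfs = [CentroidClassifier().fit(X, labels) for X in views_labeled]
        picked, pool = pool[:per_round], pool[per_round:]
        for i in picked:
            votes = [c.predict_one(v[i]) for c, v in zip(clfs, views_unlabeled)]
            for X, v in zip(views_labeled, views_unlabeled):
                X.append(v[i])
            labels.append(majority_vote(votes))
    # Final retraining on the enlarged labeled set.
    return [CentroidClassifier().fit(X, labels) for X in views_labeled]


def predict(clfs, sample_views):
    """Classify one sample given its feature vector in each of the three views."""
    return majority_vote([c.predict_one(x) for c, x in zip(clfs, sample_views)])


# Made-up features: 0 = benign, 1 = malicious.
perm_l = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]   # permission view
api_l = [[0.0], [0.2], [1.0], [0.8]]                        # API-call view
op_l = [[0.1, 0.1], [0.0, 0.2], [0.9, 0.8], [1.0, 1.0]]     # OpCode view
y = [0, 0, 1, 1]
perm_u = [[0.2, 0.1], [0.8, 0.9]]
api_u = [[0.9], [0.7]]   # first sample: the API view disagrees with the others
op_u = [[0.1, 0.0], [0.9, 1.0]]

clfs = co_train([perm_l, api_l, op_l], y, [perm_u, api_u, op_u])
print(predict(clfs, ([0.1, 0.1], [0.1], [0.0, 0.1])))
print(predict(clfs, ([0.9, 0.9], [0.95], [1.0, 0.9])))
```

With three voters and two classes, a disagreement always resolves to a 2-to-1 decision, which is the case the abstract says two-view co-training handles poorly.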


Last Update: 2019-01-10