[1] ZHANG Hai-xiang, LI Pei-pei, HU Xue-gang. Class Imbalance Multi-label Classification with Common and Label Specific Features[J]. Computer Technology and Development, 2024, 34(02): 46-52. [doi:10.3969/j.issn.1673-629X.2024.02.007]

Class Imbalance Multi-label Classification with Common and Label Specific Features

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 02
Pages:
46-52
Column:
Big Data and Cloud Computing
Publication date:
2024-02-10

Article Info

Title:
Class Imbalance Multi-label Classification with Common and Label Specific Features
Article ID:
1673-629X(2024)02-0046-07
Author(s):
ZHANG Hai-xiang¹ (张海翔), LI Pei-pei² (李培培), HU Xue-gang² (胡学钢)
1. Information Division, The Second People's Hospital of Hefei Affiliated to Bengbu Medical College, Hefei 230012, China;
2. Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, China
Keywords:
multi-label classification; class imbalance; common features; label-specific features; label correlation
CLC number:
TP183
DOI:
10.3969/j.issn.1673-629X.2024.02.007
Abstract:
Multi-label classification deals with problems in which each instance is associated with multiple class labels. Most existing multi-label methods use the same data representation, built from all features, to discriminate every label. However, because each label has its own characteristics, a unified feature set cannot fully distinguish the labels, which harms model training and increases its time cost. Using the most discriminative features for each label to improve classification performance therefore becomes a challenge. In addition, class imbalance in real-world data also degrades the performance of multi-label learning models. Motivated by this, we propose a class-imbalance multi-label classification method with common and label-specific features. First, we find the nearest neighbors of seed instances and use interpolation to obtain the features of synthetic instances, which addresses the class imbalance. Second, to find the most representative features for each label, we introduce l1-norm and l2,1-norm regularizers that constrain the coefficient matrix to extract label-specific features and common features. Finally, we use label correlation so that the model outputs of correlated labels are similar, and instance correlation so that correlated features share the corresponding label distribution information, which improves classification performance. Extensive experiments show that the proposed method performs competitively against other multi-label learning approaches.
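The first two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`synthesize_features`, `l1_norm`, `l21_norm`), the Euclidean distance, the neighbor count `k`, and the row-wise convention for the l2,1 norm are all assumptions made for illustration. The sketch covers only the features of a synthetic instance; how the method assigns labels to it is not stated in the abstract and is left out.

```python
import numpy as np


def synthesize_features(X, seed_idx, k=5, rng=None):
    """Interpolate between a seed instance and one of its k nearest
    neighbors (SMOTE-style; assumed, not taken from the paper)."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    seed = X[seed_idx]
    # Euclidean distance from the seed to every instance.
    dists = np.linalg.norm(X - seed, axis=1)
    dists[seed_idx] = np.inf  # never pick the seed as its own neighbor
    neighbors = np.argsort(dists)[:k]
    ref = X[rng.choice(neighbors)]
    gap = rng.random()  # uniform in [0, 1)
    # The synthetic point lies on the segment from seed to ref.
    return seed + gap * (ref - seed)


def l1_norm(W):
    """Sum of absolute values; penalizing it favors element-wise sparsity."""
    return np.abs(np.asarray(W, dtype=float)).sum()


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (assumed row-wise
    convention); penalizing it favors rows that are entirely zero."""
    return np.linalg.norm(np.asarray(W, dtype=float), axis=1).sum()
```

With a coefficient matrix whose rows index features and whose columns index labels (an assumed layout), an l1 penalty tends to zero out individual entries, which suits per-label feature selection, while an l2,1 penalty tends to keep or drop a feature for all labels at once.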


Last Update: 2024-02-10