Multi-view clustering synthesizes complementary information from different views and often yields better results than clustering on a single view. However, traditional multi-view clustering methods are limited by linear and shallow learning functions, which makes it difficult for them to characterize the deep information of the data. Existing deep learning methods pay insufficient attention to multi-dimensional detailed features when characterizing multi-view data.
To address these problems, an encoder model based on a convolutional attention mechanism (AEMC) is proposed. It integrates a convolutional attention module into the encoder according to the specific representation of each view, so that the key features of each view are learned adaptively. In addition, to optimize the model, positive and negative samples are constructed from the encoder representations through a contrastive learning strategy, so that the similarity between positive samples increases and the similarity between negative samples decreases, which guides the clustering process and makes it more robust. The empirical results show that the model outperforms most current mainstream methods. Its clustering accuracy on the E-MNIST, E-FMNIST, VOC and RGB-D datasets is improved by 10.2%, 8.1%, 7.4% and 4.9%, respectively, compared with the benchmark model. On the E-MNIST and E-FMNIST datasets its clustering accuracy is higher than that of the current best method, contrastive multi-view clustering (CoMVC), by 0.7% and 1.3%, respectively, while on the RGB-D dataset it is slightly lower than that of CoMVC.
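The contrastive objective described above can be illustrated with a minimal sketch. It is not the loss used by AEMC, whose exact form is not given here. It assumes an InfoNCE-style loss over two views, in which the representations of the same sample in both views form a positive pair and the representations of different samples form negative pairs. The names `cosine`, `contrastive_loss`, and `temperature` are hypothetical and chosen for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss over two views (illustrative assumption).

    view_a[i] and view_b[i] are encoder representations of the same
    sample and are treated as a positive pair; view_b[j] with j != i
    are treated as negatives for view_a[i].
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarities of sample i to every sample in view_b.
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the positive pair.
        total += log_denominator - logits[i]
    return total / n


# Toy representations: two samples, two dimensions.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[1.0, 0.0], [0.0, 1.0]]      # positives agree across views
misaligned_b = [[0.0, 1.0], [1.0, 0.0]]   # positives disagree across views

loss_aligned = contrastive_loss(view_a, aligned_b)
loss_misaligned = contrastive_loss(view_a, misaligned_b)
```

In this toy example the loss is lower when the two views agree on each sample than when they disagree. Minimizing it therefore raises the similarity of positive pairs and lowers the similarity of negative pairs.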