CN-115546525-B - Multi-view clustering method and device, electronic equipment and storage medium


Abstract

The application relates to the technical field of multi-view clustering, and in particular to a multi-view clustering method and device, electronic equipment, and a storage medium. The method comprises: obtaining a plurality of views to be clustered; extracting image features of each view; inputting the image features into a clustering model trained in advance; extracting complementary information and structural information between different views by using a preset public hidden space and a preset structured hidden space of the clustering model, respectively; matching the actual category of each view by using the complementary information and the structural information; clustering the views based on the actual categories; and outputting a clustering result of the views. This addresses the problems of the related art, in which a graph is built from an original data matrix or shallow features and multiple graphs are then fused with different fusion strategies: the computational cost is high, and the clustering quality depends heavily on the quality of the originally constructed graph, which limits the multi-view clustering effect.

Inventors

LUO YONGGANG
HU XIAOQIONG
ZHANG QI
LIAO GANG
SHAN YUMEI

Assignees

Chongqing Changan Automobile Co., Ltd. (重庆长安汽车股份有限公司)

Dates

Publication Date: 20260512
Application Date: 20221020

Claims (9)

1. A multi-view clustering method, comprising the steps of: acquiring a plurality of views to be clustered; extracting at least one image feature of each view to obtain a plurality of image features; inputting the image features into a clustering model obtained by training in advance, respectively extracting complementary information and structure information between different views by using a preset public hidden space and a preset structured hidden space of the clustering model, matching actual categories of each view by using the complementary information and the structure information, clustering the views based on the actual categories, and outputting a clustering result of the views; wherein, before inputting the plurality of view samples into the preset public hidden space and the preset structured hidden space respectively and outputting the complementary information and the structure information between the view samples, the method comprises the following steps: constructing an undirected k-nearest-neighbor graph by utilizing the preset public hidden space; inputting the plurality of view samples into the undirected k-nearest-neighbor graph to obtain an actual neighbor relation among the plurality of view samples; and taking the actual neighbor relation and the preset public hidden space representation as input of a preset graph convolution network, outputting an updated representation of the preset public hidden space, carrying out layer-by-layer convolution through the preset graph convolution network to obtain an updated representation of each layer of convolution, and constructing the preset structured hidden space based on the updated representation of each layer of convolution.
2. The method of claim 1, wherein the clustering model is trained based on training data carrying real clustering result labels, the training comprising: acquiring training data carrying a real clustering result label, wherein the training data comprises a plurality of view samples; respectively inputting the plurality of view samples into the preset public hidden space and the preset structured hidden space, and outputting complementary information and structure information between the view samples; calculating actual similarity between a sample point of each view sample and a preset clustering center according to the complementary information and the structure information, determining an actual category of each view sample according to the actual similarity, and clustering the plurality of view samples based on the actual category to obtain a training clustering result; and calculating a training loss value according to the training clustering result and the real clustering result, optimizing the clustering center from the current credibility distribution by utilizing a target distribution determined by the actual similarity when the training loss value is larger than a convergence threshold, and updating the preset structured hidden space until the training loss value is smaller than or equal to the convergence threshold, then stopping iterative training to obtain a trained clustering model.
3. The method of claim 2, wherein said calculating a training loss value from said training clustering result and said real clustering result comprises: calculating a reconstruction loss and a clustering loss according to the training clustering result and the real clustering result; and obtaining a weight parameter of the clustering loss, and calculating the training loss value according to the weight parameter, the clustering loss and the reconstruction loss.
4. The method according to claim 2, further comprising, before respectively inputting the plurality of view samples into the preset public hidden space and the preset structured hidden space and outputting the complementary information and structure information between the view samples: acquiring a hidden space representation of any view in a preset sample; calculating a mapping relation between the hidden space representation and that view by taking minimization of the reconstruction loss of the hidden space representation as the optimization target; and constructing the preset public hidden space based on the mapping relation and a preset multi-layer perceptron.
5. A multi-view clustering method, comprising the steps of: acquiring training data carrying a real clustering result label, wherein the training data comprises a plurality of view samples; respectively inputting the plurality of view samples into a preset public hidden space and a preset structured hidden space, and outputting complementary information and structure information among the view samples; calculating actual similarity between a sample point of each view sample and a preset clustering center according to the complementary information and the structure information, determining an actual category of each view sample according to the actual similarity, and clustering the plurality of view samples based on the actual category to obtain a training clustering result; calculating a training loss value according to the training clustering result and the real clustering result, optimizing the clustering center from the current credibility distribution by utilizing a target distribution determined by the actual similarity when the training loss value is larger than a convergence threshold, updating the preset structured hidden space until the training loss value is smaller than or equal to the convergence threshold, then stopping iterative training to obtain a trained clustering model, and clustering a plurality of views to be clustered by utilizing the clustering model to obtain a clustering result of the views; wherein, before the plurality of view samples are respectively input into the preset public hidden space and the preset structured hidden space and complementary information and structure information between the view samples are output, the method comprises the following steps: constructing an undirected k-nearest-neighbor graph by utilizing the preset public hidden space; inputting the plurality of view samples into the undirected k-nearest-neighbor graph to obtain an actual neighbor relation among the plurality of view samples; and taking the actual neighbor relation and the preset public hidden space representation as input of a preset graph convolution network, outputting an updated representation of the preset public hidden space, carrying out layer-by-layer convolution through the preset graph convolution network to obtain an updated representation of each layer of convolution, and constructing the preset structured hidden space based on the updated representation of each layer of convolution.
6. A multi-view clustering device, characterized in that it implements a multi-view clustering method according to any one of claims 1-4, comprising: a first acquisition module, used for acquiring a plurality of views to be clustered; an extraction module, used for extracting at least one image feature of each view to obtain a plurality of image features; and a first clustering module, used for inputting the image features into a clustering model obtained through training in advance, respectively extracting complementary information and structure information between different views by utilizing a preset public hidden space and a preset structured hidden space of the clustering model, matching actual categories of each view by utilizing the complementary information and the structure information, clustering the views based on the actual categories, and outputting clustering results of the views.
7. A multi-view clustering apparatus, characterized in that it implements the multi-view clustering method according to claim 5, comprising: a second acquisition module, used for acquiring training data carrying real clustering result labels, wherein the training data comprises a plurality of view samples; a processing module, used for respectively inputting the plurality of view samples into a preset public hidden space and a preset structured hidden space and outputting complementary information and structure information among the view samples; a training module, used for calculating the actual similarity between the sample point of each view sample and a preset clustering center according to the complementary information and the structure information, determining the actual category of each view sample according to the actual similarity, and clustering the view samples based on the actual category to obtain a training clustering result; and a second clustering module, used for calculating a training loss value according to the training clustering result and the real clustering result, optimizing the clustering center from the current credibility distribution by utilizing the target distribution determined by the actual similarity when the training loss value is larger than the convergence threshold, updating the preset structured hidden space until the training loss value is smaller than or equal to the convergence threshold, then stopping iterative training to obtain a trained clustering model, and clustering a plurality of views to be clustered by utilizing the clustering model to obtain the clustering result of the plurality of views.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the multi-view clustering method of any one of claims 1-5.
9. A computer readable storage medium having stored thereon a computer program, wherein the program is executed by a processor for implementing a multi-view clustering method according to any one of claims 1-5.
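
Illustrative note: the graph construction and clustering steps recited in claims 1 to 5 can be pictured with the minimal NumPy sketch below. It is not the patented implementation. The function names, the symmetric normalization with self-loops, the ReLU activation, the concatenation of per-layer outputs into a structured representation, the Student's t similarity, the sharpened target distribution, and the additive form of the loss are all assumptions chosen for illustration; the claims only state that a neighbor graph, layer-by-layer graph convolution, a similarity to cluster centers, a target distribution, and a weighted loss are used.

```python
import numpy as np


def knn_adjacency(h, k):
    """Undirected k-nearest-neighbor adjacency over the rows of h (hypothetical helper)."""
    sq = np.sum(h ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * h @ h.T  # pairwise squared distances
    np.fill_diagonal(dist, np.inf)                    # a point is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]             # k closest points per row
    adj = np.zeros(dist.shape)
    rows = np.repeat(np.arange(h.shape[0]), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)                     # keep an edge if either end chose it


def gcn_layers(h, adj, weights):
    """Layer-by-layer graph convolution; returns the updated representation of each layer."""
    a_hat = adj + np.eye(adj.shape[0])                # self-loops (assumption)
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    outputs = []
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)           # ReLU activation (assumption)
        outputs.append(h)
    return outputs


def soft_assign(z, centers):
    """Similarity of each sample to each cluster center (Student's t kernel, assumption)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened target distribution derived from the similarities (assumption)."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def training_loss(recon_loss, q, gamma):
    """Reconstruction loss plus weighted clustering loss (additive form is an assumption)."""
    p = target_distribution(q)
    clustering_loss = np.sum(p * np.log(p / q))       # KL(P || Q)
    return recon_loss + gamma * clustering_loss


# Toy usage: 8 samples with a 4-dimensional common latent representation.
rng = np.random.default_rng(0)
h = rng.normal(size=(8, 4))
adj = knn_adjacency(h, k=3)
layers = gcn_layers(h, adj, [rng.normal(size=(4, 4)), rng.normal(size=(4, 2))])
z = np.concatenate(layers, axis=1)                    # structured representation (assumption)
q = soft_assign(z, z[:2])                             # first two samples as initial centers
labels = q.argmax(axis=1)                             # category with the highest similarity
print(labels, training_loss(0.5, q, gamma=0.1))
```

In this sketch the weight `gamma` plays the role of the weight parameter of the clustering loss in claim 3; in an actual training loop the cluster centers and network weights would be updated by gradient descent until the loss falls to the convergence threshold.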

Description

Multi-view clustering method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of multi-view clustering technologies, and in particular, to a multi-view clustering method, a device, an electronic apparatus, and a storage medium.
Background
Clustering is an unsupervised learning task. The basic idea of a clustering algorithm is to partition a group of data objects according to some criterion, placing similar objects in the same class and dissimilar objects in different classes. In the information age, data volume grows exponentially, so it is increasingly necessary to partition and organize similar objects, which helps researchers discern the internal logic of the data more easily and reduces the cost of data labeling. At present, clustering algorithms are widely applied in fields such as computer vision, data mining, and pattern recognition.
With the rapid development of information technology, data acquisition has become increasingly diverse, generating massive amounts of multi-view data. For example, the same news may be reported in different languages, or as text, images, and video; an image may be represented by different features, such as a 100-dimensional LBP (Local Binary Patterns) feature or a 256-dimensional HOG (Histogram of Oriented Gradients) feature. Although the final representations differ, they all describe the same thing. In general, multi-view data contains more information than a single view, and rationally integrating information from different perspectives enables a more comprehensive analysis of a problem. However, most conventional approaches focus primarily on a single view and cannot be used directly to process multi-view data.
In recent years, therefore, a number of multi-view learning methods have been proposed for different tasks, such as classification and clustering. In the field of multi-view learning, multi-view clustering (MVC) aims to rationally fuse the information of different views and adaptively assign data to their respective categories without any label information. By considering the consistency and complementarity between different views, multi-view clustering achieves better clustering results than single-view clustering. With the development of multimedia, more and more data is presented in multiple modalities or features; such data is called multi-view data. Because multi-view clustering reduces the time and labor costs of annotating massive multi-view data, it is receiving more and more attention.
The related art mostly fuses the information of multiple views by exploiting their consistency and complementarity, typically constructing a graph from the original data matrix or shallow features and then fusing the multiple views with different fusion strategies. However, such graph-learning-based multi-view clustering methods are computationally intensive, and their clustering quality depends heavily on the quality of the originally constructed graph, which limits the effect of multi-view clustering.
Disclosure of Invention
The application provides a multi-view clustering method, a multi-view clustering device, electronic equipment and a storage medium, which are used for solving the problems of the related art, in which a graph is built from an original data matrix or shallow features: the computational cost is high, the clustering quality depends heavily on the quality of the originally constructed graph, and the multi-view clustering effect is limited.
An embodiment of the first aspect of the application provides a multi-view clustering method, comprising: obtaining a plurality of views to be clustered; extracting at least one image feature of each view to obtain a plurality of image features; inputting the plurality of image features into a clustering model obtained through training in advance; respectively extracting complementary information and structure information between different views by utilizing a preset public hidden space and a preset structured hidden space of the clustering model; matching actual categories of each view by utilizing the complementary information and the structure information; clustering the plurality of views based on the actual categories; and outputting clustering results of the plurality of views.
According to the technical means, the embodiment of the application can extract the image features of a plurality of views to be clustered, input the image features into a clustering model obtained by training in advance, flexibly mine the complementary information of different views through the public hidden space representation, and obtain the structure information among the views by uti