CN-122020512-A - Method, system, device, and medium for missing-value filling of multi-source heterogeneous information oriented to mixed feature data

Abstract

The invention belongs to the field of missing-value filling for multi-source heterogeneous information, and discloses a method, system, device, and medium for missing-value filling of multi-source heterogeneous information oriented to mixed feature data. The method comprises: acquiring multi-source heterogeneous information and performing independent feature mapping; splicing the mapped independent features and performing interaction processing with a multimodal Transformer encoder layer to obtain mixed feature data; performing feature extraction on the mixed feature data through a 16-layer Visual Geometry Group network model (VGG16) to obtain deep mixed feature data; and, based on the deep mixed feature data, using an antisymmetric structure expansion model to obtain the multi-source heterogeneous information with missing values filled. The method overcomes the limitation of traditional single-modality filling, can adaptively handle the complex missing patterns of multi-source heterogeneous data, and significantly improves the filling results in terms of structural consistency, semantic consistency, and contextual adaptability.
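The first stage summarized above (independent feature mapping followed by splicing) can be illustrated with a minimal sketch. It follows the per-modality mappings named in claim 2 below: a fully connected layer plus activation for numerical data, an embedding lookup for categories, and a fully connected alignment layer for text. Everything concrete here is an assumption made for illustration and is not taken from the patent: the hidden dimension, the random weights, the choice of ReLU as the activation, the sample values, and the fixed `text_embedding` vector that stands in for the output of a sentence-level encoder.

```python
import random

random.seed(0)
HIDDEN = 4  # hypothetical hidden dimension shared by all modalities


def linear(x, weight, bias):
    """Fully connected layer for one input vector: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Activation function (ReLU is an assumption; the source names none)."""
    return [max(0.0, v) for v in x]


def rand_matrix(rows, cols):
    """Random stand-in for learned parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical record with three modalities.
numeric_row = [0.7, 1.3, 0.2]                  # one row of the numerical matrix
category_id = 2                                # index of one discrete category
text_embedding = [0.1, -0.4, 0.3, 0.8, -0.2]   # stand-in for a sentence-encoder output

# Parameters (random here; they would be learned in a real model).
w_num, b_num = rand_matrix(HIDDEN, len(numeric_row)), [0.0] * HIDDEN
embedding_table = rand_matrix(5, HIDDEN)       # 5 hypothetical categories
w_txt, b_txt = rand_matrix(HIDDEN, len(text_embedding)), [0.0] * HIDDEN

# Independent feature mapping, one path per modality.
numeric_vec = relu(linear(numeric_row, w_num, b_num))   # projection + activation
category_vec = embedding_table[category_id]             # embedding lookup
text_vec = linear(text_embedding, w_txt, b_txt)         # align to hidden dimension

# Splice the three mapped vectors along the feature dimension.
spliced = numeric_vec + category_vec + text_vec
print(len(spliced))  # 12
```

Mapping every modality to the same hidden dimension before splicing is what lets a single downstream layer treat numerical, categorical, and text features uniformly.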

Inventors

  • ZHU CHANGHUI
  • TIAN CHANGYUAN
  • SUN SHOUYU
  • LIU LINJUN
  • TIAN YUE
  • YANG JIAN
  • WANG YIZHANG
  • DAI JIANLI
  • LIANG XIAOQIAN
  • HUANG LIHUANG
  • ZHANG ZHAO

Assignees

  • Guizhou Power Grid Co., Ltd. (贵州电网有限责任公司)

Dates

Publication Date
2026-05-12
Application Date
2025-12-26

Claims (10)

  1. A method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data, characterized by comprising the following steps: acquiring multi-source heterogeneous information and performing independent feature mapping; splicing the mapped independent features, and performing interaction processing with a multimodal Transformer encoder layer to obtain mixed feature data; performing feature extraction on the mixed feature data through a 16-layer Visual Geometry Group network model (VGG16) to obtain deep mixed feature data; and, based on the deep mixed feature data, using an antisymmetric structure expansion model to obtain the multi-source heterogeneous information with missing values filled.
  2. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 1, wherein acquiring multi-source heterogeneous information and performing independent feature mapping comprises: acquiring multi-source heterogeneous information, which comprises a numerical matrix, a category sequence, and a text sequence; projecting the numerical matrix to the hidden dimension through a fully connected layer and applying an activation function to obtain a numerical feature vector; mapping the discrete categories of the category sequence to dense vectors through an embedding layer to obtain category feature vectors; and generating an initial vector of the text sequence with a sentence-level bidirectional encoder representation model (Sentence-BERT), and aligning the hidden dimension of the initial vector with a fully connected layer to obtain a text feature vector.
  3. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 2, wherein splicing the mapped independent features and performing interaction processing with a multimodal Transformer encoder layer to obtain the mixed feature data comprises: splicing the numerical feature vector, the category feature vector, and the text feature vector along the feature dimension; linearly transforming the spliced feature vectors to generate the query, key, and value vectors of a single-head attention mechanism; obtaining the interaction strength among the modalities of the multi-source heterogeneous information by computing and scaling the dot product of the query vector and the key vector; generating a cross-modal context vector based on the interaction strength and the value vector; and applying a residual connection and layer normalization to the cross-modal context vector, and obtaining the mixed feature data through a multi-layer Transformer.
  4. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 3, wherein obtaining the mixed feature data through the multi-layer Transformer comprises: obtaining the mixed feature data through the multi-layer Transformer according to an expression [formula and symbols not reproduced in this text], whose terms denote, in order: the n-th Transformer layer; the processed cross-modal context vector; and the transformation that the n-th Transformer layer performs on its input.
  5. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 3, wherein performing feature extraction on the mixed feature data through the 16-layer Visual Geometry Group network model (VGG16) to obtain deep mixed feature data comprises: expressing the deep mixed feature data by an expression [formula and symbols not reproduced in this text], whose terms denote, in order: the VGG16 network model; an activation function; a pooling operation; the convolution kernel of a given layer of the VGG16 network model; and the bias parameters of that layer.
  6. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 5, wherein obtaining an intermediate variable for missing-value filling based on the deep mixed feature data by using the antisymmetric structure expansion model comprises: based on the deep mixed feature data, applying up-sampling transposed convolution layer 12 of the antisymmetric structure expansion model to obtain the intermediate variable for missing-value filling of the multi-source heterogeneous information, according to an expression [formula and symbols not reproduced in this text], whose terms denote, in order: up-sampling transposed convolution layer 12; the transposed convolution operation; and the convolution kernel of up-sampling transposed convolution layer 12.
  7. The method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data according to claim 6, wherein obtaining the multi-source heterogeneous information with missing values filled from the intermediate variable comprises: based on the intermediate variable, applying up-sampling transposed convolution layer 13 of the antisymmetric structure expansion model to obtain the multi-source heterogeneous information with missing values filled, according to an expression [formula and symbols not reproduced in this text], whose terms denote, in order: up-sampling transposed convolution layer 13; and the convolution kernel of up-sampling transposed convolution layer 13.
  8. A system for missing-value filling of multi-source heterogeneous information oriented to mixed feature data, applying the method of any one of claims 1 to 7, comprising: a feature mapping module, used for acquiring multi-source heterogeneous information and performing independent feature mapping; an interaction processing module, used for splicing the mapped independent features and performing interaction processing with the multimodal Transformer encoder layer to obtain mixed feature data; a feature extraction module, used for performing feature extraction on the mixed feature data through the 16-layer Visual Geometry Group network model (VGG16) to obtain deep mixed feature data; and a missing-value filling module, used for obtaining the multi-source heterogeneous information with missing values filled by using the antisymmetric structure expansion model based on the deep mixed feature data.
  9. An electronic device, comprising: a memory and a processor; the memory is configured to store computer-executable instructions that, when executed by the processor, implement the steps of the method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data of any one of claims 1 to 7.
  10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the steps of the method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data of any one of claims 1 to 7.
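The interaction step recited in claim 3 (query, key, and value vectors; scaled dot product; cross-modal context vector; residual connection and layer normalization) can be sketched in plain Python. This is an illustration under stated assumptions, not the patented implementation: each modality's feature vector is treated as one token (one row of `tokens`), a softmax is applied to the scaled dot products as in standard attention (the claim does not name it), the projection matrices are identity matrices rather than learned weights, and all function names and sample values are hypothetical.

```python
import math


def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax (an assumption; not named in the claim)."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(row, eps=1e-5):
    """Layer normalization without learned scale and shift."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def single_head_attention(tokens, w_q, w_k, w_v):
    """Return (interaction weights, normalized output) for one attention head."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))                   # dot products q . k
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    context = matmul(weights, v)                       # cross-modal context vectors
    out = [layer_norm([t + c for t, c in zip(t_row, c_row)])  # residual + norm
           for t_row, c_row in zip(tokens, context)]
    return weights, out


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Three hypothetical modality tokens: numerical, category, text (hidden size 4).
tokens = [[1.0, 0.0, 0.5, 0.2],
          [0.3, 0.8, 0.0, 0.1],
          [0.0, 0.4, 0.9, 0.6]]
weights, mixed = single_head_attention(tokens, identity(4), identity(4), identity(4))
print(len(weights), len(weights[0]), len(mixed), len(mixed[0]))  # 3 3 3 4
```

Each row of `weights` sums to 1 and can be read as how strongly one modality attends to the others, which is one way to interpret the "interaction strength" in the claim.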

Description

Method, system, device, and medium for missing-value filling of multi-source heterogeneous information oriented to mixed feature data

Technical Field

The invention relates to the field of missing-value filling for multi-source heterogeneous information, and in particular to a method, system, device, and medium for missing-value filling of multi-source heterogeneous information oriented to mixed feature data.

Background

With the rapid development of information technology, all industries have accumulated massive amounts of multi-source heterogeneous data. However, during actual data acquisition, transmission, and storage, limitations of the data sources, human error, network interruptions, equipment faults, and other factors commonly leave a large number of missing values in multi-source heterogeneous information, which degrades the consistency and integrity of the data. Missing data not only reduces the usability of the data but can also harm key downstream steps such as decision support, data mining, and analysis, and thereby hinders the digital transformation and intelligent development of these industries.

Current missing-value filling approaches, however, have the following defects. ① The hyperparameters of the multi-source heterogeneous missing-value filling task are numerous and intricately correlated, and the optimization algorithm cannot accurately find the optimal solution in the huge hyperparameter space, so the model cannot reach its best performance, which in turn degrades filling quality. ② The attention mechanism unit and the dynamic weighted fusion unit process mixed feature data inadequately, so the generated features cannot accurately and completely reflect the true characteristics of the multi-source heterogeneous information.
Using such defective features to replace the data at missing positions cannot fill the missing values accurately, so the method performs poorly in application. ③ Interactions among features are not fully considered, so filling values that accurately reflect the interaction relationships of the mixed features cannot be generated, which reduces filling quality. ④ Only missing values of a single modality are filled; without a multi-source feature interaction layer, missing-value filling for multi-source heterogeneous information cannot be achieved. To solve these problems, a method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data is designed.

Disclosure of Invention

The present invention has been made in view of the above problems in the prior art. Accordingly, the invention provides a method, system, device, and medium for missing-value filling of multi-source heterogeneous information oriented to mixed feature data, which solve the prior-art problems that mixed feature data is difficult to extract because of nonlinear coupling among multi-source data, and that the quality of missing-value filling for multi-source heterogeneous information is reduced because only single-modality features are extracted.
To solve the above technical problems, the invention provides the following technical solutions.

In a first aspect, the invention provides a method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data, comprising: acquiring multi-source heterogeneous information and performing independent feature mapping; splicing the mapped independent features, and performing interaction processing with a multimodal Transformer encoder layer to obtain mixed feature data; performing feature extraction on the mixed feature data through a 16-layer Visual Geometry Group network model (VGG16) to obtain deep mixed feature data; and, based on the deep mixed feature data, using an antisymmetric structure expansion model to obtain the multi-source heterogeneous information with missing values filled.

In the method for missing-value filling of multi-source heterogeneous information oriented to mixed feature data, acquiring multi-source heterogeneous information and performing independent feature mapping comprises: acquiring multi-source heterogeneous information, which comprises a numerical matrix, a category sequence, and a text sequence; projecting the numerical matrix to the hidden dimension through a fully connected layer and applying an activation function to obtain a numerical feature vector; mapping the discrete categories of the category sequence to dense vectors through an embedding layer to obtain category feature vectors; and generating an initial vector of the text sequence with a sentence-level bidirectional encoder representation model (Sentence-BERT), and aligning the hidden dimension of the initial vector with a fully connected layer to obtain a text feature vector. The invention relates to a