CN-122019781-A - Multi-label text classification method and system for electric power data
Abstract
The invention discloses a multi-label text classification method and system for electric power data, relating to the technical field of electric power data processing. Electric power text data is acquired, and the vocabulary, grammatical structure and semantic information of the text are converted into quantum state representations by a quantum encoder to obtain a text quantum state. Based on the text quantum state and a predefined power-domain knowledge quantum state, feature fusion is performed by a multimodal quantum fusion model to obtain a fused quantum representation. Based on the fused quantum representation, analysis is performed by a quantum topology reasoning network to obtain a quantum hypergraph containing the label association relations. Based on the quantum hypergraph, multi-label classification is performed through topological quantum gradients to obtain an initial classification result. Based on the initial classification result, pseudo-correlations are eliminated through dynamic quantum causal intervention to obtain a final classification result, thereby improving the accuracy of multi-label text classification.
Inventors
- SHI YANHUI
- HUANG HUA
- PENG TENG
- LIU YUCHAO
- ZHANG ZHEN
- WANG NING
- ZHANG BO
- HU MENGZHU
- SONG YU
- ZHENG XING
- KONG WEIQI
- XU CHENG
- YU JUNSONG
- ZHANG WEN
- RUAN YANJUN
- LIAO YI
- WANG HAO
- ZHAO HANGHANG
- GU ZHIPENG
- MAO XIONG
Assignees
- Guangzhou Bureau of EHV Power Transmission Company, China Southern Power Grid Co., Ltd. (中国南方电网有限责任公司超高压输电公司广州局)
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. A multi-label text classification method for electric power data, the method comprising the following steps: step S1, acquiring electric power text data, and converting the vocabulary, grammatical structure and semantic information of the text in the electric power text data into a quantum state representation through a quantum encoder to obtain a text quantum state; step S2, based on the text quantum state and a predefined power-domain knowledge quantum state, performing feature fusion through a multimodal quantum fusion model to obtain a fused quantum representation; step S3, based on the fused quantum representation, performing analysis through a quantum topology reasoning network to obtain a quantum hypergraph containing the label association relations; step S4, based on the quantum hypergraph, performing multi-label classification through topological quantum gradients to obtain an initial classification result; and step S5, based on the initial classification result, eliminating pseudo-correlations through dynamic quantum causal intervention to obtain a final classification result.
- 2. The multi-label text classification method for electric power data according to claim 1, wherein in step S1, converting the electric power text data into a quantum state representation through a quantum encoder specifically comprises: mapping the vocabulary in a text sequence to the amplitudes of quantum bits by a quantum amplitude encoding method, and encoding a text sequence of length N in parallel by exploiting quantum parallelism; parsing the grammar of the text, establishing a grammar dependency graph among the words, converting the grammar dependencies into entanglement relations among quantum bits in the quantum encoding, and, for any two words with a direct grammar dependency, establishing a maximally entangled state between the corresponding quantum bits; and representing the semantic information by the phase relations among quantum states, wherein semantic relations are encoded into the relative phases among the quantum states based on a semantic similarity calculation over the words, and the phase parameters of the quantum states are dynamically adjusted according to the context of each word in the text.
- 3. The multi-label text classification method for electric power data according to claim 2, wherein in step S2, performing feature fusion through the multimodal quantum fusion model to obtain the fused quantum representation specifically comprises: initializing the text quantum state and the power-domain knowledge quantum state into two independent quantum registers, wherein the text quantum state register contains the text feature representation obtained through quantum amplitude encoding, and the power-domain knowledge quantum state register contains a quantum representation of pre-trained power-domain expertise; constructing a standard Bell state preparation circuit by first applying a Hadamard operation to the first quantum bit in the text quantum state register to generate a uniform superposition state, then applying a controlled-NOT gate with that quantum bit as the control bit and the corresponding quantum bit in the power-domain knowledge quantum state register as the target bit, thereby establishing a maximally entangled state between the two quantum bits; extracting multi-scale semantic features from the text quantum state, the multi-scale semantic features comprising local features at the word level, intermediate features at the phrase level and global features at the sentence level; and performing quantum fusion on the multi-scale semantic features by combining controlled-NOT gates with controlled-phase gates to obtain the fused quantum representation.
- 4. The multi-label text classification method for electric power data according to claim 3, wherein in step S3, performing analysis through the quantum topology reasoning network to obtain the quantum hypergraph containing the label association relations specifically comprises: according to the fused quantum representation, defining each label in the multi-label classification task as a quantum node, and initializing a corresponding quantum state representation for each quantum node through a quantum state preparation circuit, wherein the amplitude distribution of the node quantum state represents the prior probability of the label in the training data, and the phase relation represents the semantic features of the label; and constructing multi-body entangled states according to the association relations between the labels, and taking the entangled states as the hyperedges of the quantum hypergraph.
- 5. The multi-label text classification method for electric power data according to claim 4, wherein in step S4, performing multi-label classification through topological quantum gradients based on the quantum hypergraph to obtain the initial classification result specifically comprises: constructing, based on the quantum hypergraph, a set of quantum decision trees built from controlled gate operations as weak classifiers; and constructing a quantum loss function for multi-label classification, calculating the gradient of the quantum loss function through quantum phase estimation, optimizing the parameters of the weak classifiers, and fusing the decision results of the weak classifiers through a quantum ensemble learning mechanism to obtain the initial classification result.
- 6. The multi-label text classification method for electric power data according to claim 5, wherein in step S5, eliminating pseudo-correlations through dynamic quantum causal intervention based on the initial classification result to obtain the optimized classification result specifically comprises: calculating, based on the initial classification result and the quantum hypergraph, the conditional mutual information between any two labels while controlling for the other labels; introducing a quantum operation operator according to the structural characteristics of the quantum causal graph and, for each label requiring intervention, forcing the target label into a specific state by quantum projective measurement, breaking the original pseudo-correlation by a quantum state reset operation while leaving the quantum states of the other labels unaffected, to obtain the quantum state after a single intervention; calculating the quantum causal effect according to the layered intervention results, applying, according to the analysis of the causal effect quantification index, a quantum damping operation to association edges whose causal effect is not significant and a quantum enhancement operation to true causal edges, and optimizing the causal weight matrix through quantum gradient descent to obtain the topology of the purified quantum hypergraph; and reclassifying the text labels according to the purified quantum hypergraph to obtain a final classification result.
- 7. A multi-label text classification system for electric power data, the system comprising: a quantum data preprocessing module, configured to acquire electric power text data and convert the vocabulary, grammatical structure and semantic information of the text into a quantum state representation through a quantum encoder to obtain a text quantum state; a multimodal quantum fusion module, configured to perform feature fusion through a multimodal quantum fusion model based on the text quantum state and a predefined power-domain knowledge quantum state to obtain a fused quantum representation; a quantum topology reasoning module, configured to perform analysis through a quantum topology reasoning network based on the fused quantum representation to obtain a quantum hypergraph containing the label association relations; an initial classification module, configured to perform multi-label classification through topological quantum gradients based on the quantum hypergraph to obtain an initial classification result; and a final classification module, configured to eliminate pseudo-correlations through dynamic quantum causal intervention based on the initial classification result to obtain a final classification result.
- 8. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-label text classification method for electric power data according to any one of claims 1 to 6.
- 9. An electronic device comprising a processor and a memory, the memory storing a plurality of instructions, and the processor loading the instructions from the memory to perform the steps of the multi-label text classification method for electric power data according to any one of claims 1 to 6.
- 10. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the multi-label text classification method for electric power data according to any one of claims 1 to 6.
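The amplitude encoding recited in claim 2 and the Bell state preparation circuit recited in claim 3 (a Hadamard gate followed by a controlled-NOT gate) can be illustrated with a small classical state-vector simulation. This is only a sketch under stated assumptions and not the claimed implementation: the function names are hypothetical, the input to `amplitude_encode` is assumed to be a list of real-valued term weights (for example term frequencies), and qubit 0 is taken to be the most significant bit of the basis-state index.

```python
import math


def amplitude_encode(weights):
    """Normalize real term weights into an amplitude vector.

    The vector is zero-padded to the next power of two so that N weights
    fit into ceil(log2(N)) qubits; the squared amplitudes sum to 1.
    """
    if not weights or all(w == 0 for w in weights):
        raise ValueError("need at least one non-zero weight")
    dim = 1
    while dim < len(weights):
        dim *= 2
    padded = list(weights) + [0.0] * (dim - len(weights))
    norm = math.sqrt(sum(w * w for w in padded))
    return [w / norm for w in padded]


def apply_hadamard(state, qubit, n_qubits):
    """Apply a Hadamard gate to `qubit` (0 = most significant bit)."""
    s = 1.0 / math.sqrt(2.0)
    mask = 1 << (n_qubits - 1 - qubit)
    out = list(state)
    for i in range(len(state)):
        if (i & mask) == 0:
            a, b = state[i], state[i | mask]
            out[i] = s * (a + b)
            out[i | mask] = s * (a - b)
    return out


def apply_cnot(state, control, target, n_qubits):
    """Swap amplitudes so that `target` flips wherever `control` is 1."""
    cmask = 1 << (n_qubits - 1 - control)
    tmask = 1 << (n_qubits - 1 - target)
    out = list(state)
    for i in range(len(state)):
        if (i & cmask) and not (i & tmask):
            out[i], out[i | tmask] = state[i | tmask], state[i]
    return out


def bell_state():
    """Prepare (|00> + |11>) / sqrt(2) from |00> with H on qubit 0, then CNOT(0 -> 1)."""
    state = [1.0, 0.0, 0.0, 0.0]  # |00>
    state = apply_hadamard(state, 0, 2)
    return apply_cnot(state, 0, 1, 2)


if __name__ == "__main__":
    print(amplitude_encode([3.0, 4.0]))
    print(bell_state())
```

In this sketch, qubit 0 stands in for the first quantum bit of the text quantum state register and qubit 1 for the corresponding quantum bit of the power-domain knowledge register. A simulation of this kind stores 2^n amplitudes for n qubits, so it is only practical for very small registers.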
Description
Multi-label text classification method and system for electric power data
Technical Field
The invention relates to the technical field of electric power data processing, and in particular to a multi-label text classification method and system for electric power data.
Background
Currently, mainstream multi-label classification methods fall roughly into two strategies: algorithm adaptation and problem transformation. The algorithm adaptation strategy handles multi-label data directly by modifying and extending a common learning algorithm, thereby solving the multi-label learning problem. Well-known algorithm adaptation methods include the ML-kNN algorithm and the ranking support vector machine (Rank-SVM). However, extending the original algorithm model in this way often makes the model relatively complex and raises its computational complexity. The problem transformation strategy decomposes the multi-label classification problem into one multi-class problem or several binary classification problems, thereby simplifying the classification task. Common problem transformation methods include binary relevance (BR) and classifier chains (CC). However, BR ignores potential dependencies between classification labels and therefore has difficulty achieving good multi-label classification accuracy. Text data in the electric power field is high-dimensional and sparse; if an algorithm adaptation strategy is adopted, the high dimensionality of the data leads to excessive computational complexity, which adds further difficulty to the classification task.
On the other hand, if BR is combined with a simple binary classifier (such as a support vector machine or binary logistic regression (LR)), the overall algorithm is relatively simple, but the semantic associations between classification labels are ignored. Considering that a text may carry numerous labels spanning multiple knowledge domains, ignoring the semantic associations between labels inevitably reduces the accuracy of multi-label classification. The conventional BR-GBDT-based multi-label text classification method has the following technical defects: first, the binary relevance method, which rests on a label independence assumption, cannot effectively capture the inherent associations between power fault labels; second, text feature representations based on conventional machine learning have difficulty capturing the deep semantic information of technical terms in the electric power field; third, as the number of labels increases, the number of classifiers to be trained increases linearly, resulting in low computational efficiency; and finally, existing methods cannot make effective use of the hierarchical knowledge structure that exists in the power fault diagnosis field.
Disclosure of Invention
In view of the prior art, the invention aims to provide a multi-label text classification method and system for electric power data that mainly solve the technical problems described in the Background.
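For reference, the two problem transformation strategies named in the Background, binary relevance (BR) and classifier chains (CC), can be sketched as plain dataset transformations. This is a minimal illustration under stated assumptions: the toy corpus, the label names and both function names are hypothetical and do not come from the patent, and no classifier is actually trained.

```python
# Hypothetical toy corpus of (text, set of labels); not taken from the patent.
SAMPLES = [
    ("transformer oil temperature alarm", {"transformer", "overheating"}),
    ("line insulator flashover after rain", {"line", "insulation"}),
    ("transformer bushing insulation aging", {"transformer", "insulation"}),
]
LABELS = ["transformer", "line", "insulation", "overheating"]


def binary_relevance_tasks(samples, labels):
    """Build one independent binary dataset per label.

    Each task sees only the text, so any dependency between labels is lost,
    and the number of tasks grows linearly with the number of labels.
    """
    return {
        label: [(text, int(label in tags)) for text, tags in samples]
        for label in labels
    }


def classifier_chain_tasks(samples, labels):
    """Build one binary dataset per label, chained in the given label order.

    The k-th task receives the values of the k earlier labels as extra
    features, which lets a downstream classifier model label dependencies.
    """
    tasks = {}
    for k, label in enumerate(labels):
        earlier = labels[:k]
        tasks[label] = [
            ((text, tuple(int(p in tags) for p in earlier)), int(label in tags))
            for text, tags in samples
        ]
    return tasks


br = binary_relevance_tasks(SAMPLES, LABELS)
cc = classifier_chain_tasks(SAMPLES, LABELS)

if __name__ == "__main__":
    print(len(br), "binary tasks for", len(LABELS), "labels")
    print(br["insulation"])
    print(cc["insulation"][0])
```

Both transformations produce one binary task per label, which reflects the linear growth in the number of classifiers noted above. Only the chained variant exposes earlier labels to later tasks, and its result depends on the chosen label order.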
In order to achieve the above object, the technical solution of the embodiments of the invention is as follows:
In a first aspect, the invention provides a multi-label text classification method for electric power data, the method comprising the following steps: step S1, acquiring electric power text data, and converting the vocabulary, grammatical structure and semantic information of the text in the electric power text data into a quantum state representation through a quantum encoder to obtain a text quantum state; step S2, based on the text quantum state and a predefined power-domain knowledge quantum state, performing feature fusion through a multimodal quantum fusion model to obtain a fused quantum representation; step S3, based on the fused quantum representation, performing analysis through a quantum topology reasoning network to obtain a quantum hypergraph containing the label association relations; step S4, based on the quantum hypergraph, performing multi-label classification through topological quantum gradients to obtain an initial classification result; and step S5, based on the initial classification result, eliminating pseudo-correlations through dynamic quantum causal intervention to obtain a final classification result.
As a preferred embodiment of the invention, in step S1, converting the electric power text data into a quantum state representation through a quantum encoder specifically comprises: mapping the vocabulary in a text sequence to the amplitudes of quantum bits by a quantum amplitude encoding method, and encoding a text sequence of length N in parallel by exploiting quantum parallelism; parsing the grammar of the text, establishing a grammar