CN-121984538-A - Smart electric meter signal enhancement method and system based on reinforcement learning


Abstract

The invention relates to the field of data processing, and in particular to a method and system for enhancing smart electric meter signals based on reinforcement learning. The method comprises: collecting a historical noise-containing carrier signal and a historical noise-free reference signal of an electric meter; adaptively dividing frequency bands based on the spectral difference between the historical signals; calculating a reward participation degree for each frequency band; applying a candidate filtering action to the current signal and calculating time domain and frequency domain enhancement effect values for each frequency band; computing a weighted sum of the two effect values based on the reward participation degree and deducting an action complexity penalty to obtain an adaptive reward function value; calculating a comprehensive return value from that reward value and the discounted rewards obtained with a future state prediction model; and determining the optimal filtering action vector accordingly, thereby achieving signal enhancement. The method avoids the signal distortion caused by conventional evaluation based on the overall signal-to-noise ratio, and improves the reliability and signal fidelity of power line carrier communication.
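The reward structure summarized in the abstract can be written compactly as follows. This is a sketch under stated assumptions, not a formula given in the source: the symbols P_k (reward participation degree of band k), T_k and F_k (time domain and frequency domain enhancement effect values), C(a) (complexity penalty of action vector a), s_0 (current signal), \hat{s}_j (the j-th predicted future signal), and \gamma_{a,j} (reward discount factor) are introduced here for illustration, and the unweighted sum T_k + F_k inside each band is an assumed form of the weighting.

```latex
% Adaptive reward of candidate action vector a on signal s (assumed form)
R(a; s) = \sum_{k=1}^{K} P_k \bigl( T_k(a; s) + F_k(a; s) \bigr) - C(a)

% Comprehensive return: current reward plus discounted rewards on J predicted signals
G(a) = R(a; s_0) + \sum_{j=1}^{J} \gamma_{a,j} \, R(a; \hat{s}_j)

% Optimal filtering action vector
a^{*} = \arg\max_{a} G(a)
```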

Inventors

XU LINGXIANG
YANG JUN
CHEN JIE
YU XINWEI
JIN XIAOFU
YOU XIAOBO
GONG YU
JI JINENG
ZHENG YI
FANG GUOCHANG

Assignees

杭州百富电子技术有限公司

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (9)

1. A smart electric meter signal enhancement method based on reinforcement learning, characterized by comprising the following steps: collecting, for a historical communication period, a first carrier signal sent by a current electric meter and a second carrier signal received by a concentrator, and collecting the second carrier signal received by the concentrator in the current communication period, wherein the first carrier signal is a noise-free signal and the second carrier signal is a noise-containing signal; performing adaptive frequency band division on both the first carrier signal and the second carrier signal of the historical communication period; calculating a reward participation degree of each frequency band from the similarity degree of the current communication period and the historical communication period, the estimated noisiness of the frequency band in the historical communication period, and the estimated noisiness of the frequency band in the current communication period; for each frequency band, calculating a time domain enhancement effect value and a frequency domain enhancement effect value between the second carrier signal of the current communication period and the filtered signal produced by a candidate filtering action vector on the frequency band; weighting the time domain enhancement effect value and the frequency domain enhancement effect value of each frequency band based on the reward participation degree of the frequency band, accumulating the weighted results of all the frequency bands, and obtaining the adaptive reward function value of the candidate filtering action vector by combining the accumulated result with the computational complexity penalty of the candidate filtering action vector; and determining an optimal filtering action vector from all candidate filtering action vectors based on the adaptive reward function value, and filtering the second carrier signal of the current communication period with the optimal filtering action vector to achieve signal enhancement.
2. The reinforcement learning-based smart meter signal enhancement method of claim 1, wherein performing adaptive frequency band division on both the first carrier signal and the second carrier signal of the historical communication period comprises: presetting a plurality of frequency band segmentation schemes; for each segmentation scheme, calculating the average of the frequency amplitude differences between the first carrier signal and the second carrier signal on each frequency band, taking the average as the estimated noisiness of that frequency band under the segmentation scheme, and calculating the standard deviation of the estimated noisiness of all the frequency bands as the segmentation score of the segmentation scheme in the historical communication period; and selecting the segmentation scheme with the maximum segmentation score as the final frequency band division scheme, and dividing the first carrier signal and the second carrier signal of the historical communication period according to the final frequency band division scheme.
3. The reinforcement learning-based smart meter signal enhancement method of claim 1, wherein obtaining the similarity degree of the current communication period and the historical communication period comprises: calculating the ratio of the size of the intersection to the size of the union of the sets of electric meters communicating in the two periods as a first characteristic value of the current communication period and the historical communication period; calculating the Euclidean distance between the frequency amplitudes of the second carrier signals of the current communication period and the historical communication period as a second characteristic value of the two periods; and taking the product of the first characteristic value and the second characteristic value as the similarity degree of the current communication period and the historical communication period.
4. The reinforcement learning-based smart meter signal enhancement method of claim 3, wherein calculating the reward participation degree comprises: selecting any frequency band as a target frequency band, and calculating the average of the frequency amplitude differences between the first carrier signal and the second carrier signal of the historical communication period on the target frequency band as the first estimated noisiness of the target frequency band in the historical communication period; and multiplying the sum of the first estimated noisiness and the second estimated noisiness, the latter being the estimated noisiness of the target frequency band in the current communication period, by the similarity degree of the current communication period and the historical communication period, and taking the product as the reward participation degree of the target frequency band.
5. The reinforcement learning-based smart meter signal enhancement method of claim 1, wherein obtaining the time domain enhancement effect value comprises: selecting any frequency band as a target frequency band, and calculating the autocorrelation sequence of the second carrier signal of the current communication period and the autocorrelation sequence of the filtered signal on the target frequency band; calculating, for each of the two autocorrelation sequences, the accumulated sum of its first-order differences; and taking the difference between the accumulated sum corresponding to the second carrier signal of the current communication period and the accumulated sum corresponding to the filtered signal as the time domain enhancement effect value of the target frequency band.
6. The reinforcement learning-based smart meter signal enhancement method of claim 5, wherein obtaining the frequency domain enhancement effect value comprises: selecting any frequency band as a target frequency band, and calculating the standard deviation of the frequency amplitudes of the second carrier signal of the current communication period and the standard deviation of the frequency amplitudes of the filtered signal on the target frequency band; and taking the difference between the standard deviation corresponding to the second carrier signal of the current communication period and the standard deviation corresponding to the filtered signal as the frequency domain enhancement effect value of the target frequency band.
7. The reinforcement learning-based smart meter signal enhancement method of claim 1, wherein determining the optimal filtering action vector further comprises: training a prediction model with the second carrier signals of the historical communication period and the current communication period as input, and predicting a plurality of future second carrier signals following the current communication period with the trained prediction model; calculating a reward discount factor of each candidate filtering action vector under each future second carrier signal; for each candidate filtering action vector, adding the adaptive reward function value of the candidate filtering action vector under the second carrier signal of the current communication period to the sum of the discounted rewards of the candidate filtering action vector under all future second carrier signals to obtain the comprehensive return value of the candidate filtering action vector, wherein each discounted reward is obtained by multiplying the reward discount factor of the candidate filtering action vector under the corresponding future second carrier signal by the adaptive reward function value of the candidate filtering action vector under that future second carrier signal; and taking the candidate filtering action vector with the largest comprehensive return value as the optimal filtering action vector.
8. The reinforcement learning-based smart meter signal enhancement method of claim 7, wherein the predictive model is a deep neural network model.
9. A reinforcement learning based smart meter signal enhancement system comprising a processor and a memory, the memory storing computer program instructions that when executed by the processor implement the reinforcement learning based smart meter signal enhancement method of any one of claims 1-8.
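
The per-band quantities recited in claims 2 to 6 can be sketched in a few lines of Python. This is an illustrative sketch, not the patented implementation. All function names are hypothetical, and several details the claims leave open are assumptions: band edges are given as spectrum bin indices, amplitude differences are taken as absolute values, the standard deviation is the population form, and the autocorrelation is unnormalized. The first-order differences in claim 5 are also taken as absolute values, since a signed sum would telescope to the last autocorrelation value minus the first.

```python
from math import dist
from statistics import mean, pstdev


def band_noisiness(clean_amp, noisy_amp, edges):
    """Claim 2: mean amplitude difference per band (absolute value assumed)."""
    result = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        diffs = [abs(n - c) for c, n in zip(clean_amp[lo:hi], noisy_amp[lo:hi])]
        result.append(mean(diffs))
    return result


def pick_segmentation(clean_amp, noisy_amp, schemes):
    """Claim 2: keep the scheme whose per-band noisiness varies the most."""
    return max(
        schemes,
        key=lambda edges: pstdev(band_noisiness(clean_amp, noisy_amp, edges)),
    )


def period_similarity(meters_now, meters_hist, amp_now, amp_hist):
    """Claim 3: intersection-over-union of meter sets times Euclidean distance."""
    ratio = len(meters_now & meters_hist) / len(meters_now | meters_hist)
    return ratio * dist(amp_now, amp_hist)


def participation(first_noisiness, second_noisiness, similarity):
    """Claim 4: (historical noisiness + current noisiness) * similarity."""
    return (first_noisiness + second_noisiness) * similarity


def autocorr(x):
    """Unnormalized autocorrelation for lags 0 .. len(x) - 1 (assumed form)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(n)]


def diff_sum(seq):
    """Accumulated sum of absolute first-order differences (assumption)."""
    return sum(abs(b - a) for a, b in zip(seq, seq[1:]))


def time_effect(noisy_band, filtered_band):
    """Claim 5: difference of the two accumulated sums, noisy minus filtered."""
    return diff_sum(autocorr(noisy_band)) - diff_sum(autocorr(filtered_band))


def freq_effect(noisy_amp_band, filtered_amp_band):
    """Claim 6: difference of amplitude standard deviations, noisy minus filtered."""
    return pstdev(noisy_amp_band) - pstdev(filtered_amp_band)


# Toy data: the noise sits entirely in the last two spectrum bins.
clean = [1, 1, 1, 1, 1, 1]
noisy = [1, 1, 1, 1, 3, 3]
best_edges = pick_segmentation(clean, noisy, [[0, 3, 6], [0, 4, 6], [0, 2, 6]])
```

In the toy data, the scheme with edges `[0, 4, 6]` isolates the two noisy bins in their own band, so it has the largest spread of per-band noisiness and is the one selected.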

Description

Smart electric meter signal enhancement method and system based on reinforcement learning

Technical Field

The present invention relates to the field of data processing, and more particularly to a smart electric meter signal enhancement method and system based on reinforcement learning.

Background

Power line carrier communication is a key technology by which smart meters return data, but its channel environment is harsh. The inherent power-frequency current of the power line, load switching, and appliance interference introduce complex composite noise that severely contaminates communication signals. To cope with non-stationary noise, the prior art has attempted adaptive filtering methods based on reinforcement learning, which typically use the overall signal-to-noise ratio of the filtered signal as the reward function and dynamically adjust the filter parameters by training a model to improve communication quality.

However, the existing methods have significant drawbacks. Because the distribution and influence of power line noise differ markedly across frequency bands, evaluating the filtering effect by the overall signal-to-noise ratio alone masks how differently the filter performs, in noise suppression and in signal fidelity, on each frequency sub-band. As a result, the model may choose a filtering strategy that looks better globally but causes excessive distortion or information loss on local frequency bands, which limits the practical effect and reliability of signal enhancement.

Disclosure of Invention

To solve the technical problem of signal distortion caused by these limitations of the prior art, the present invention provides the following aspects.
In a first aspect, a smart meter signal enhancement method based on reinforcement learning includes: collecting, for a historical communication period, a first carrier signal sent by a current electric meter and a second carrier signal received by a concentrator, and collecting the second carrier signal received by the concentrator in the current communication period, wherein the first carrier signal is a noise-free signal and the second carrier signal is a noise-containing signal; performing adaptive frequency band division on both the first carrier signal and the second carrier signal of the historical communication period; calculating a reward participation degree of each frequency band from the similarity degree of the current communication period and the historical communication period, the estimated noisiness of the frequency band in the historical communication period, and the estimated noisiness of the frequency band in the current communication period; for each frequency band, calculating a time domain enhancement effect value and a frequency domain enhancement effect value between the second carrier signal of the current communication period and the filtered signal produced by a candidate filtering action vector on the frequency band; weighting the time domain enhancement effect value and the frequency domain enhancement effect value of each frequency band based on the reward participation degree of the frequency band, accumulating the weighted results of all the frequency bands, and obtaining the adaptive reward function value of the candidate filtering action vector by combining the accumulated result with the computational complexity penalty of the candidate filtering action vector; and determining an optimal filtering action vector from all candidate filtering action vectors based on the adaptive reward function value, and filtering the second carrier signal of the current communication period with the optimal filtering action vector to achieve signal enhancement.
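The last two steps of the first aspect, scoring each candidate filtering action vector and keeping the best one, can be sketched as below. This is a minimal illustration under stated assumptions rather than the method as implemented by the applicant: the names `adaptive_reward`, `select_action`, `PARTICIPATION`, and `EFFECTS` are hypothetical, the weighting is assumed to be the participation degree multiplied by the sum of the two effect values, the penalty is assumed to be subtracted, and the per-band effect values are toy numbers rather than measured data.

```python
def adaptive_reward(participation, time_effects, freq_effects, penalty):
    """Participation-weighted sum of per-band effects minus a complexity penalty.

    Assumed form: sum_k P_k * (T_k + F_k) - C.
    """
    weighted = sum(
        p * (t + f)
        for p, t, f in zip(participation, time_effects, freq_effects)
    )
    return weighted - penalty


def select_action(candidates, participation, evaluate, complexity):
    """Return (best_action, best_reward) over all candidate action vectors.

    evaluate(action)   -> (time_effects, freq_effects), one value per band
    complexity(action) -> scalar penalty for that action vector
    Both callables are hypothetical hooks supplied by the caller.
    """
    scored = [
        (adaptive_reward(participation, *evaluate(a), complexity(a)), a)
        for a in candidates
    ]
    best_reward, best_action = max(scored, key=lambda pair: pair[0])
    return best_action, best_reward


# Toy setup with two bands; band 0 has the higher participation degree.
PARTICIPATION = [2.0, 0.5]

# Hypothetical action vectors mapped to (time effects, frequency effects).
EFFECTS = {
    (1, 1): ([1.0, 1.0], [1.0, 1.0]),
    (3, 1): ([3.0, 1.0], [2.0, 1.0]),
    (3, 5): ([3.0, 2.0], [2.0, 4.0]),
}

best_action, best_reward = select_action(
    candidates=list(EFFECTS),
    participation=PARTICIPATION,
    evaluate=EFFECTS.__getitem__,
    complexity=lambda action: 0.6 * sum(action),  # assumed penalty model
)
```

With these toy numbers, action `(3, 5)` has the largest weighted effect before the penalty, but its higher assumed complexity makes `(3, 1)` the selected action. That is the trade-off the complexity penalty is meant to capture.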
Preferably, performing adaptive frequency band division on both the first carrier signal and the second carrier signal of the historical communication period includes: presetting a plurality of frequency band segmentation schemes; for each segmentation scheme, calculating the average of the frequency amplitude differences between the first carrier signal and the second carrier signal on each frequency band, taking the average as the estimated noisiness of that frequency band under the segmentation scheme, and calculating the standard deviation of the estimated noisiness of all the frequency bands as the segmentation score of the segmentation scheme in the historical communication period; and selecting the segmentation scheme with the maximum segmentation score as the final frequency band division scheme, and dividing the first carrier signal and the second carrier signal of the historical communication period according to the final frequency band division scheme.
Preferably, the obtaining of the similarity degree of the current communication period and the historical communication period includes: For the current co