CN-122002165-A - Nonlinear distortion compensation method for radio frequency Passive Optical Network (PON) system


Abstract

The invention relates to the technical field of radio frequency passive optical networks, and in particular to a method for compensating nonlinear distortion in a radio frequency Passive Optical Network (PON) system. It addresses the following problems of the prior art: each node cannot be converted into a unified state parameter feature vector, so data consistency and computational feasibility cannot be guaranteed; link similarity and influence strength cannot be quantified; and the importance of neighbor nodes cannot be allocated dynamically by attention weights, which reduces the ability to capture complex nonlinear relations and the robustness and efficiency of feature propagation.

Inventors

ZHU YU
YAN JINGHAO
HUANG XIN
WU RUIDE
LING XIAOFENG
MAI TIANLE
YE JIONGYAO
WANG FU
YAO HAIPENG

Assignees

East China University of Science and Technology (华东理工大学)
雅泰歌思（上海）通讯科技有限公司
Beijing University of Posts and Telecommunications (北京邮电大学)

Dates

Publication Date: 2026-05-08
Application Date: 2026-03-18

Claims (8)

1. A method for compensating nonlinear distortion in a radio frequency PON system, characterized by comprising the following steps:
Step one: collecting system state characteristic parameters during PON link transmission, including optical power, bit error rate, error vector magnitude, link delay and reflection loss, and preprocessing the collected data by outlier handling, filtering, time alignment and normalization;
Step two: constructing a graph structure of PON link units from the preprocessed system state characteristic parameters, determining edges and weights according to the physical connection relations and the state differences, performing feature propagation through a graph attention network, and introducing a self-modulation mechanism to dynamically adjust the propagation weights;
Step three: based on the link unit node features output by the graph attention network, identifying the nonlinear distortion type with a deep classification model, and generating a space-time-frequency heat map of the intensity distribution by combining the time sequence, the signal frequency points and the link positions;
Step four: configuring an independent reinforcement learning agent for each type of distortion according to the nonlinear distortion identification results and their space-time-frequency distribution characteristics, constructing an observation space from the distortion characteristics and the system state, designing a reward function that combines local error change with system-level performance indexes, and updating the compensation policy of each agent by a Nash equilibrium mechanism.
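By way of non-limiting illustration, the preprocessing of step one may be sketched as follows for a single parameter series. The claim names the operations without fixing their form, so the median-based outlier rule, the causal moving-average filter, the window size and all function names below are assumptions; time alignment is omitted.

```python
from statistics import median

def clip_outliers(samples, k=3.0):
    """Replace samples farther than k * MAD from the median with the median.
    The MAD rule and k = 3 are assumed; the claim does not specify the rule."""
    med = median(samples)
    mad = median(abs(s - med) for s in samples)
    if mad == 0:
        return list(samples)
    return [s if abs(s - med) <= k * mad else med for s in samples]

def moving_average(samples, window=3):
    """Causal moving-average filter (assumed filter type and window)."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def min_max_normalize(samples):
    """Map a series to the interval [0, 1]; a constant series maps to zeros."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return [0.0 for _ in samples]
    return [(s - lo) / (hi - lo) for s in samples]

def preprocess(samples):
    """Outlier handling, then filtering, then normalization."""
    return min_max_normalize(moving_average(clip_outliers(samples)))

# Hypothetical optical power readings with one spurious spike.
optical_power = [1.0, 1.1, 0.9, 50.0, 1.0, 1.2]
normalized = preprocess(optical_power)
```

Each of the five state parameters would be processed in the same way before being assembled into the node feature vector.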
2. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 1, wherein constructing the graph structure of PON link units and determining edges and weights according to the physical connection relations and the state differences in step two comprises:
acquiring the physical connection relations among the optical line terminal, the optical splitters and the optical network units, which form a tree topology; taking each link unit as one node of the graph to construct a node set V, wherein each node v_i ∈ V represents one link unit; and, where a direct optical fiber connection exists between two link units, adding an undirected edge between the corresponding nodes, the edges forming an edge set E and yielding the graph structure G = (V, E);
collecting the system state characteristic parameters of each node v_i, including optical power, bit error rate, error vector magnitude, link delay and reflection loss, organizing the five state parameters into a five-dimensional real vector in a fixed order, normalizing each component and mapping it to the interval [0, 1], and using the result as the feature vector of the node;
determining the weight of each edge e_ij from the difference between the state vectors of the two nodes it connects, and assigning the computed weight to the corresponding edge to obtain the weighted undirected graph G = (V, E, W), wherein W is the set of edge weights.
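The graph construction of claim 2 may, for example, be sketched as below. The claim states only that each edge weight is determined by the difference of the two node state vectors; the mapping w_ij = exp(-||x_i - x_j||) used here, the node names and the function name are assumptions chosen for illustration.

```python
import math

def build_weighted_graph(features, fiber_links):
    """Return {frozenset({u, v}): weight} for every direct fiber connection.

    features: {node: five-dimensional normalized state vector}
    fiber_links: iterable of (u, v) pairs taken from the physical topology
    """
    weights = {}
    for u, v in fiber_links:
        dist = math.dist(features[u], features[v])
        # Assumed mapping: identical states give weight 1, distant states approach 0.
        weights[frozenset((u, v))] = math.exp(-dist)
    return weights

# Hypothetical tree: OLT -> splitter -> two ONUs.
# Vector order: optical power, bit error rate, EVM, link delay, reflection loss.
features = {
    "olt":      [0.9, 0.1, 0.1, 0.2, 0.3],
    "splitter": [0.9, 0.1, 0.1, 0.2, 0.3],
    "onu1":     [0.5, 0.4, 0.3, 0.6, 0.3],
    "onu2":     [0.7, 0.2, 0.2, 0.4, 0.5],
}
links = [("olt", "splitter"), ("splitter", "onu1"), ("splitter", "onu2")]
W = build_weighted_graph(features, links)
```

Keying each edge by a frozenset keeps the graph undirected: the pair (u, v) and the pair (v, u) refer to the same weight.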
3. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 2, wherein performing feature propagation through the graph attention network and introducing the self-modulation mechanism to dynamically adjust the propagation weights in step two comprises:
acquiring the normalized state vector X_i of each node v_i, the connection relations between the nodes that form the graph structure G = (V, E), a preset hidden layer dimension, and trainable parameters including a trainable weight matrix, an attention vector and gating parameter vectors;
multiplying the normalized state vector of each node v_i by the weight matrix to obtain its initial feature representation h_i;
for each node v_i and each of its neighbor nodes v_j, concatenating h_i and h_j into a joint vector, computing the dot product of the joint vector and the attention vector to obtain a scalar value, and applying the LeakyReLU function to the scalar value to obtain an unnormalized attention score;
applying the Softmax function to the unnormalized attention scores of all neighbor nodes of v_i, normalizing them into a set of non-negative weights;
generating a gating factor g_ij for each edge e_ij as follows: computing the dot product of X_i - X_j and the trainable weight vector θ; computing the dot product of the edge direction feature vector d_ij and the trainable weight vector φ; adding the two dot products, passing the sum through the Sigmoid function, and outputting g_ij;
updating the representation of each node v_i from the features of its neighbor nodes and a composite propagation coefficient, which is the product of the attention weight and the gating factor.
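One possible reading of the gated propagation of claim 3 is sketched below in plain Python. The LeakyReLU slope of 0.2, the plain weighted sum used for the final update, and all function and variable names are assumptions; the claim does not say whether a further activation follows the aggregation.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def leaky_relu(z, slope=0.2):  # slope is an assumed value
    return z if z > 0 else slope * z

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def propagate(x, neighbors, d, W, a, theta, phi):
    """One gated attention update.

    x: {node: normalized state vector}; neighbors: {node: [neighbor, ...]}
    d: {(i, j): edge direction feature vector}
    W, a, theta, phi: trainable parameters, supplied here as plain lists
    """
    h = {i: matvec(W, xi) for i, xi in x.items()}  # initial representations
    updated = {}
    for i, nbrs in neighbors.items():
        if not nbrs:
            updated[i] = h[i]
            continue
        # h[i] + h[j] is list concatenation, i.e. the joint vector.
        scores = [leaky_relu(dot(a, h[i] + h[j])) for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]  # softmax over the neighbors of i
        out = [0.0] * len(h[i])
        for j, alpha in zip(nbrs, alphas):
            diff = [p - q for p, q in zip(x[i], x[j])]
            gate = sigmoid(dot(theta, diff) + dot(phi, d[(i, j)]))
            coeff = alpha * gate  # composite propagation coefficient
            out = [o + coeff * hj for o, hj in zip(out, h[j])]
        updated[i] = out
    return updated

# Toy example with two-dimensional states and an identity weight matrix.
x = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
d = {(0, 1): [1.0], (0, 2): [1.0], (1, 0): [1.0], (2, 0): [1.0]}
new_h = propagate(x, neighbors, d, W=[[1.0, 0.0], [0.0, 1.0]],
                  a=[0.1, 0.2, 0.3, 0.4], theta=[0.0, 0.0], phi=[0.0])
```

With θ and φ set to zero every gate equals 0.5, so the gate simply halves the attention-weighted contribution of each neighbor; non-zero parameters let the state difference and the edge direction raise or lower that contribution per edge.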
4. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 1, wherein identifying the nonlinear distortion type with the deep classification model based on the link unit node features output by the graph attention network in step three comprises:
acquiring an original feature vector of each link unit node, the original feature vector comprising four components, namely a transmission power value, a modulation error vector magnitude value, a spectrum envelope value and a nonlinear disturbance index value; forming the connection relations among the nodes of the link system into a directed graph G = (V, E), wherein each directed edge represents a signal transmission direction; and normalizing all features before input and mapping them to the interval [0, 1] to generate normalized feature vectors;
classifying the nonlinear distortion of the link with a graph attention network, wherein the graph attention network model updates the node representations as follows:
T1: applying a linear transformation to the features of each node and of its neighbor nodes to generate intermediate representations;
T2: concatenating the intermediate representation of each target node with that of each of its incoming-edge neighbors, feeding the result into a trainable linear function that outputs a scalar value, and applying the LeakyReLU activation function to obtain an unnormalized attention score;
T3: normalizing the attention scores over all incoming-edge neighbors of the target node;
T4: computing a weighted sum of the intermediate representations of the neighbor nodes with the normalized weights, and applying an activation function to the result to generate the new feature representation of the target node;
the graph attention network model adopts a multi-head attention mechanism in which the outputs of the heads are concatenated along the feature dimension to form the final output; a plurality of graph attention network layers are connected in sequence, the output of each layer serving as the input of the next, and the output of the final layer serving as the embedded representation of each node;
for the embedded representation of each node, computing classification scores with a fully connected layer, converting the scores into a probability distribution with a softmax function, taking the category with the highest probability as the nonlinear distortion type of the node, and outputting the nonlinear distortion type determination result of each link unit node.
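As a non-limiting illustration of the final classification stage of claim 4 (fully connected layer, softmax, highest-probability category), a minimal sketch follows. The weights, the embedding and the class labels are hypothetical placeholders; the claim does not enumerate the distortion types.

```python
import math

def classify(embedding, weight, bias, labels):
    """Fully connected layer followed by softmax and argmax.

    weight: one row of coefficients per class; bias: one value per class.
    Returns (predicted label, probability distribution).
    """
    logits = [sum(w * e for w, e in zip(row, embedding)) + b
              for row, b in zip(weight, bias)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical two-dimensional node embedding and three placeholder classes.
label, probs = classify(
    embedding=[1.0, 0.0],
    weight=[[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]],
    bias=[0.0, 0.0, 0.0],
    labels=["type_a", "type_b", "type_c"],
)
```

The per-class probabilities returned here are the classification scores that the heat map construction later reuses as intensity values.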
5. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 4, wherein generating the space-time-frequency heat map of the intensity distribution by combining the time sequence, the signal frequency points and the link positions in step three comprises:
obtaining the embedded representation of each node at each time step and the corresponding nonlinear distortion type classification scores, and obtaining the number of nodes in the link system, the total number of observed time steps and the total number of discrete frequency points;
constructing a three-dimensional data structure from the number of link nodes, the number of observed time steps and the total number of discrete frequency points, the structure having three dimensions, namely a time dimension, a frequency dimension and a space dimension;
for the embedded representation of each node at each time step, extracting the classification score of the target distortion type as the original intensity value, determining the main frequency of each node at each time step, the main frequency corresponding to a fixed index in the frequency dimension, and writing the intensity value of the node at that time step into the corresponding time, frequency and node position of the three-dimensional structure;
for each time step, calculating the arithmetic mean and standard deviation of the intensity values of all nodes, normalizing the intensity values with that mean and standard deviation, and writing the normalized values back into the three-dimensional structure;
extracting two-dimensional slices from the three-dimensional structure to generate heat maps, namely a time-space heat map, a space-frequency heat map and a time-frequency heat map.
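The three-dimensional structure and its slices in claim 5 may be sketched as follows. The claim mentions the mean and standard deviation without giving the normalization formula, so the z-score with population standard deviation used here is an assumption, as are the record layout and the function names.

```python
from statistics import mean, pstdev

def build_intensity_cube(records, n_time, n_freq, n_nodes):
    """Return cube[t][f][n] holding normalized distortion intensities.

    records: iterable of (time step, main-frequency index, node index, score)
    """
    cube = [[[0.0] * n_nodes for _ in range(n_freq)] for _ in range(n_time)]
    by_time = {}
    for t, f, n, score in records:
        by_time.setdefault(t, []).append((f, n, score))
    for t, entries in by_time.items():
        values = [s for _, _, s in entries]
        mu, sigma = mean(values), pstdev(values)
        for f, n, s in entries:
            # Assumed z-score normalization; a zero spread maps to 0.
            cube[t][f][n] = (s - mu) / sigma if sigma > 0 else 0.0
    return cube

def time_space_slice(cube, f):
    """Rows are time steps, columns are nodes, at a fixed frequency index."""
    return [[cube[t][f][n] for n in range(len(cube[t][f]))]
            for t in range(len(cube))]

def time_frequency_slice(cube, n):
    """Rows are time steps, columns are frequency indices, at a fixed node."""
    return [[cube[t][f][n] for f in range(len(cube[t]))]
            for t in range(len(cube))]

# Hypothetical scores: two time steps, two frequency points, two nodes.
records = [(0, 1, 0, 0.2), (0, 1, 1, 0.8), (1, 0, 0, 0.5), (1, 0, 1, 0.5)]
cube = build_intensity_cube(records, n_time=2, n_freq=2, n_nodes=2)
tf_map = time_frequency_slice(cube, n=1)
```

A space-frequency heat map is obtained in the same way by fixing the time index, that is, by reading cube[t] directly.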
6. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 1, wherein configuring an independent reinforcement learning agent for each type of distortion and constructing the observation space from the distortion characteristics and the system state in step four comprises:
obtaining the nonlinear distortion identification result and its space-time-frequency distribution characteristics, and, in a system with multiple types of nonlinear distortion, configuring an independent reinforcement learning agent for each type of distortion: the system is defined as containing M types of nonlinear distortion, and M reinforcement learning agents are established, one for each type;
the m-th agent executes the compensation task for the m-th distortion, its policy function receiving the current observation state as input and outputting a probability distribution over compensation actions;
the observation state received by each agent at time step t is a vector of fixed dimension formed by concatenating three parts in sequence, namely a local feature vector of the m-th distortion, a system-level state feature vector and a positioning information vector, the concatenated vector forming the complete observation state vector that serves as the input of the m-th agent at time step t.
7. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 6, wherein designing the reward function by combining the local error change and the system-level performance indexes in step four comprises:
acquiring the intensity index of the m-th distortion before and after the compensation action, acquiring the changes of four system performance indexes before and after compensation together with their respective weight coefficients, and acquiring the cost information introduced by the compensation action;
the instant reward of the m-th agent at time step t consists of a linear combination of three parts: the distortion suppression effect, calculated as the difference between the intensity index of the m-th distortion before the compensation action is executed and the intensity index after it is executed; the system performance gain, calculated by multiplying the change of each of the four performance indexes before and after compensation by its preset weight coefficient and summing the weighted results; and the action cost, calculated as the cost introduced by the compensation action, which is the sum of three items, namely the order of the compensation filter, the overhead of the additional signal and the computational burden;
the three results are each multiplied by a non-negative weight coefficient and added to form the final reward value, and the instant reward value R of the m-th agent at time step t is output.
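The three-part reward of claim 7 may be illustrated as below. The claim says the weighted terms are added; the sketch subtracts the cost term on the assumption that it acts as a penalty, which is an interpretation and not stated in the claim. The function name, parameter names and example values are hypothetical.

```python
def instant_reward(d_before, d_after, metric_deltas, metric_weights,
                   filter_order, signal_overhead, compute_load,
                   w_suppress=1.0, w_gain=1.0, w_cost=1.0):
    """Instant reward of one agent at one time step."""
    # Distortion suppression effect: intensity before minus intensity after.
    suppression = d_before - d_after
    # System performance gain: weighted sum of the four index changes.
    gain = sum(w * delta for w, delta in zip(metric_weights, metric_deltas))
    # Action cost: filter order + additional signal overhead + compute burden.
    cost = filter_order + signal_overhead + compute_load
    # Assumption: the cost enters with a negative sign (penalty).
    return w_suppress * suppression + w_gain * gain - w_cost * cost

# Hypothetical values, all on normalized scales.
R = instant_reward(
    d_before=0.8, d_after=0.5,
    metric_deltas=[0.1, 0.2, 0.0, -0.1], metric_weights=[1.0, 1.0, 1.0, 1.0],
    filter_order=0.05, signal_overhead=0.02, compute_load=0.03,
    w_suppress=1.0, w_gain=0.5, w_cost=2.0,
)
```

In this example the suppression term contributes 0.3, the gain term 0.5 × 0.2 = 0.1 and the cost term 2.0 × 0.1 = 0.2, giving a reward of about 0.2.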
8. The method for compensating nonlinear distortion of a radio frequency PON system according to claim 7, wherein updating the compensation policy of each agent by the Nash equilibrium mechanism in step four comprises:
acquiring, as input data, the instant reward sequence obtained by each agent over all time steps from the initial time to the end of the task, the current policy parameters of each agent, the discount factor, the global state trajectory, the parameter update step size and the policy network structure configuration;
traversing the instant rewards of each agent over all time steps in chronological order, multiplying the reward of each time step by the discount factor raised to the corresponding time power, and adding all results to obtain the cumulative reward of the agent over the whole period;
for each agent, using its policy network to receive the global state and output the action selection probabilities, calculating the log probability of the selected action, and multiplying the log probability by the cumulative reward value to obtain the adjustment direction vector of the policy parameters;
multiplying the current policy parameters of each agent by the adjustment direction vector, multiplying the product by the parameter update step size, adding the resulting vector to the original parameters to obtain the updated policy parameters, and writing the updated parameters into the policy network;
for each agent, keeping the policy parameters of all other agents unchanged and comparing the cumulative reward the agent obtains with the updated parameters against the cumulative reward it obtains with any other parameters, and marking the agent as policy stable if the cumulative reward obtained with the updated parameters is not smaller than the value obtained with any other parameter combination;
checking whether all agents are marked as policy stable; if so, terminating the update process and outputting the updated policy parameters of all agents; otherwise, continuing with the next round of updating.
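Two parts of claim 8, the discounted cumulative reward and the policy stability check, may be sketched as follows. The claim compares against any other parameters; the finite candidate set used here is an approximation, and the evaluator `returns_fn`, the toy reward and all names are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over all time steps t = 0, 1, ..."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def all_policy_stable(returns_fn, params, candidates):
    """Check every agent against unilateral deviations.

    returns_fn(agent, joint_params) -> cumulative reward of that agent
    params: {agent: current parameters}
    candidates: {agent: [alternative parameters, ...]} (finite set, assumed)
    """
    for agent in params:
        baseline = returns_fn(agent, params)
        for alt in candidates[agent]:
            trial = dict(params)
            trial[agent] = alt  # all other agents keep their parameters
            if returns_fn(agent, trial) > baseline:
                return False  # this agent could gain by deviating
    return True

# Toy evaluator: each agent's return peaks when its scalar parameter is 1.
def toy_returns(agent, joint):
    return -(joint[agent] - 1.0) ** 2

G = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
stable = all_policy_stable(toy_returns, {"a": 1.0, "b": 1.0},
                           {"a": [0.0, 2.0], "b": [0.0, 2.0]})
unstable = all_policy_stable(toy_returns, {"a": 0.0, "b": 1.0},
                             {"a": [1.0], "b": [0.0, 2.0]})
```

The update loop would continue while the check returns False and terminate, outputting the current parameters of all agents, once it returns True.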

Description

Nonlinear distortion compensation method for radio frequency Passive Optical Network (PON) system

Technical Field

The invention relates to the technical field of radio frequency passive optical networks, and in particular to a method for compensating nonlinear distortion in a radio frequency Passive Optical Network (PON) system.

Background

With the development of communication technology, radio frequency PON systems are widely used in fields such as 5G deployment and the Internet of Things thanks to their high bandwidth and low delay. However, nonlinear elements in the system, such as optical amplifiers, saturate easily at high power, and nonlinear transmission media and multipath effects cause further nonlinear distortion, so that signal quality degrades, the bit error rate rises and the communication distance shrinks. The existing remedies each have drawbacks: the power back-off method is inefficient; negative feedback and feedforward systems are demanding to design and debug and have difficulty handling broadband signals; and digital predistortion places strict requirements on the ADC/DAC sampling rate, which increases implementation difficulty and cost.

A prior patent discloses a compensation method, a compensator and a system for nonlinear distortion of a pulse field source. The method comprises: constructing a Volterra series model that simulates the nonlinear distortion of a digital receiver; loading the signal to be compensated, as received by the digital receiver, into the Volterra series model to obtain a distorted signal that carries a nonlinear distortion quantity; constructing a compensation model that characterizes the nonlinear distortion quantity; eliminating the nonlinear distortion quantity from the distorted signal with the compensation model to obtain a compensated output; updating the compensation kernel vector of the compensation model by the least squares method based on the compensated output; and eliminating the nonlinear distortion quantity from the distorted signal in real time with the compensation model updated by the compensation kernel vector.

The above reference patent analyzes the frequency distribution of the nonlinear components in the output signal of the digital receiver, constructs a filter to extract the nonlinear quantity, and builds a compensation model with nonlinear distortion energy as the cost function. It can identify and update parameters effectively without an additional ADC to collect the original input signal, and it markedly improves the spurious-free dynamic range of the system. However, it cannot convert each node into a unified state parameter feature vector, so data consistency and computational feasibility cannot be guaranteed; it cannot quantify link similarity and influence strength; it cannot allocate the importance of neighbor nodes dynamically by attention weights, which reduces the ability to capture complex nonlinear relations and the robustness and efficiency of feature propagation; it cannot configure an independent reinforcement learning agent for each type of nonlinear distortion, so differentiated fine compensation cannot be realized; it cannot jointly weigh the distortion suppression effect, the system performance gain and the action cost; and it cannot introduce a Nash equilibrium mechanism to coordinate the policy updates of multiple agents, so efficient, accurate and scalable dynamic compensation of nonlinear distortion cannot be achieved.

For this reason, a method for compensating nonlinear distortion of the radio frequency PON system is proposed to address the above problems.

Disclosure of Invention

The invention aims to provide a compensation method for nonlinear distortion of a radio frequency Passive Optical Network (PON) system that solves the following problems of the prior art: each node cannot be converted into a unified state parameter feature vector, so data consistency and computational feasibility cannot be guaranteed; link similarity and influence strength cannot be quantified; the importance of neighbor nodes cannot be allocated dynamically by attention weights, which reduces the ability to capture complex nonlinear relations and the robustness and efficiency of feature propagation; an independent reinforcement learning agent cannot be configured for each type of nonlinear distortion, so differentiated fine compensation cannot be realized; the distortion suppression effect, the system performance gain and the action cost cannot be weighed jointly; and a Nash equilibrium mechanism cannot be introduced to coordinate multi-agent policy updates, so efficient, accurate and scalable dynamic compensation of nonlinear distortion cannot be achieved.

The aim of the invention is achieved by the following technical scheme:

A compensation method for nonlinear distortion of a radio frequency PON system comprises the following steps: Collecting system state characteristic parameters in the transmission process of a PON link, including optical power, error