CN-121997782-A - Three-dimensional spatiotemporal distribution prediction method for algae in water bodies based on a physics-informed neural network
Abstract
The invention discloses a method for predicting the three-dimensional spatiotemporal distribution of algae in a water body based on a physics-informed neural network (PINN), and belongs to the technical field of algal bloom prediction, prevention and control. Existing algae distribution prediction methods have three problems: purely data-driven models generalize poorly, purely mechanistic models are difficult to fit, and fitting capability in the vertical direction is insufficient. To address these problems, the invention combines the advantages of physical mechanisms and data-driven algorithms by embedding an advection-diffusion-reaction (ADR) equation of algae growth into the loss function of the PINN. The resulting hybrid prediction model offers both data-fitting accuracy and physical-law constraints, accurately predicts the three-dimensional spatiotemporal distribution of algae, and obtains physically meaningful ecological parameters by inversion. The method improves the accuracy, generalization and physical interpretability of algae distribution prediction, and can be applied to the monitoring and early warning of harmful algal blooms and to aquatic ecological environment management.
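The abstract names an ADR equation and a PINN loss function, but this text does not reproduce either formula. As a hedged illustration only, a generic form consistent with the symbols defined in the claims (C, u, v, w, w_s, D, μ, I, N, P, T, λ) might read as follows. The Laplacian diffusion term, the sign convention for w_s, the residual r, the network output \hat{C}, the sample counts n_d and n_r, and the names \mathcal{L}_{\text{data}} and \mathcal{L}_{\text{PDE}} are assumptions, not the patent's own notation.

```latex
\begin{aligned}
r(\hat{C}) &= \frac{\partial \hat{C}}{\partial t}
  + u\,\frac{\partial \hat{C}}{\partial x}
  + v\,\frac{\partial \hat{C}}{\partial y}
  + (w + w_s)\,\frac{\partial \hat{C}}{\partial z}
  - D\,\nabla^{2}\hat{C}
  - \mu(I, N, P, T)\,\hat{C}, \\
\mathcal{L}_{\text{data}} &= \frac{1}{n_d}\sum_{i=1}^{n_d}\bigl(\hat{C}_i - C_i\bigr)^{2},
\qquad
\mathcal{L}_{\text{PDE}} = \frac{1}{n_r}\sum_{j=1}^{n_r} r_j^{2}, \\
\mathcal{L} &= \mathcal{L}_{\text{data}} + \lambda\,\mathcal{L}_{\text{PDE}}.
\end{aligned}
```

Under this reading, driving r toward zero at collocation points is what ties the network output to the transport and growth mechanism in regions where no measurements exist.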
Inventors
- LIN LI
- GAO YU
- LI XIAOMENG
- DONG LEI
- PAN XIONG
- GUO ZIWEI
- CHEN SHIBO
Assignees
- Changjiang River Scientific Research Institute, Changjiang Water Resources Commission (长江水利委员会长江科学院)
Dates
- Publication Date
- 20260508
- Application Date
- 20260410
Claims (10)
- 1. A method for predicting the three-dimensional spatiotemporal distribution of algae in a water body based on a physics-informed neural network, characterized by comprising the following steps:
  S1, constructing a four-dimensional spatiotemporal feature tensor: acquiring multi-source monitoring data of a target water area, preprocessing the multi-source monitoring data, and constructing a four-dimensional spatiotemporal feature tensor indexed by the three-dimensional spatial coordinates (x, y, z) of the water body and the time dimension t, wherein the tensor comprises the four-dimensional spatiotemporal coordinates (x, y, z, t), the values of key environmental factors influencing algae growth, and the logarithmically transformed algae biomass index C;
  S2, establishing a partial differential equation mechanism model of algae spatiotemporal variation: based on the hydrodynamic characteristics of the water body and the ecological rules of algae growth and migration, establishing a partial differential equation mechanism model PDE that describes advective transport, spatial diffusion, active vertical migration and growth coupled to multiple environmental factors in the four-dimensional spatiotemporal domain;
  S3, constructing and training a physics-informed neural network (PINN) model with mechanism constraints: constructing a physics-informed neural network that takes the four-dimensional spatiotemporal coordinates (x, y, z, t) obtained in step S1 as input and the logarithmically transformed algae biomass index C as output, calculating the partial derivatives of the output with respect to the input by automatic differentiation, combining the mechanism residual of the PDE obtained in step S2 with the error against measured data to construct a composite loss function comprising a data fidelity term and a mechanism constraint term, training the model with an adaptive gradient balancing strategy to obtain a trained PINN model M, and obtaining by inversion a set of algae ecological parameters that are consistent with reality and common sense;
  S4, diagnosing growth-limiting factors and analyzing the ecological control threshold interval: using the PINN model M trained in step S3, calculating the partial derivatives of the algae biomass C with respect to the environmental factor features Xi to generate a Jacobian matrix J, and identifying the dominant growth-limiting factor F at different depths by analyzing the distribution of J along the vertical depth z;
  S5, predicting the three-dimensional spatiotemporal distribution of algae in the water body and evaluating the model: inputting standardized water environment data of the area to be predicted into the PINN model M trained in step S3, outputting the three-dimensional spatiotemporal distribution prediction result R of algae biomass, extracting the algae ecological parameters inverted during model training, interpreting and verifying the prediction result ecologically in combination with the dominant growth-limiting factor F and the ecological control threshold interval TS obtained in step S4, and evaluating and verifying the physical consistency of the model.
- 2. The method of claim 1, wherein in step S1 the multi-source monitoring data comprise in-situ observation data, satellite remote sensing data and water environment reanalysis data, wherein the in-situ observation data comprise algae concentration, water temperature, nutrient concentration and flow velocity data at different depths obtained by a profiler; the satellite remote sensing data comprise sea surface temperature, satellite-retrieved algae concentration and photosynthetically active radiation data; and the water environment reanalysis data comprise mixed layer depth, three-dimensional flow field, three-dimensional nutrient concentration field and vertical underwater irradiance distribution data.
- 3. The method according to claim 2, wherein in step S1 the data preprocessing comprises: cleaning records that contain null values, outliers beyond 3σ, duplicates or inconsistencies; filling missing values using an optimal interpolation method and the Data Interpolating Empirical Orthogonal Functions (DINEOF) method; interpolating the satellite remote sensing data and the water environment reanalysis data to the spatiotemporal nodes of the in-situ observations to achieve spatiotemporal registration; standardizing all data by standard scaling; and logarithmically transforming the algae concentration as C = ln(1 + C_g), wherein C_g is the measured algae concentration.
- 4. The method according to claim 1, wherein the partial differential equation in step S2 is: [equation not preserved in this text]; wherein C represents the algae biomass; u, v and w are the longitudinal, transverse and vertical velocities of the water flow, respectively; w_s is the net vertical migration velocity of the algae; D represents the diffusion coefficient; μ represents the physiological growth rate; I represents the light intensity; N and P represent the concentrations of representative nutrients such as nitrogen and phosphorus; and T represents the water temperature.
- 5. The method according to claim 1, wherein in step S3 the composite loss function is expressed as: [equation not preserved in this text]; wherein the data fidelity term is the mean square error between the model-predicted algae biomass and the measured values; the mechanism constraint term is the mean square error of the difference between the left and right sides of the partial differential equation; and λ is a weighting coefficient used to balance the contributions of data fitting and physical mechanism constraints.
- 6. The method of claim 4, wherein in step S3 the adaptive gradient balancing strategy defines the weight coefficient λ as the ratio of the exponential moving averages of the L2 norms of the model-parameter gradients of the data fidelity term and the mechanism constraint term, expressed as: [equation not preserved in this text]; wherein ∇_θ L is the gradient of the loss with respect to the model parameters θ and the averaging operator is an exponential moving average, the strategy ensuring that the gradient magnitudes of the data loss and the mechanism loss remain comparable during training; the network weight parameters are initialized by uniform initialization with the biases set to 0; the automatic hyperparameter optimization framework Optuna is used to search for the optimal number of hidden layers, number of neurons per layer, activation function, learning rate and dropout rate; the dataset is divided into a training set and a validation set at a ratio of 8:2; the root mean square error on the validation set is taken as the optimization objective; and MedianPruner is used to prune poorly performing trials early.
- 7. The method of claim 1, wherein in step S4 the logic for identifying the dominant limiting factor is to compare the partial derivatives ∂C/∂Xi of the environmental factor features at a specific depth z, wherein the environmental factor with the greatest term is the dominant limiting factor at that depth z, namely the environmental factor that most affects algae growth there.
- 8. The method of claim 7, wherein the analysis of the ecological control threshold interval TS comprises: ① the point identified from the second partial derivative ∂²C/∂Xi² [condition not preserved in this text] is the physiological early-warning threshold, near which the algae respond most sensitively to factor Xi; and ② the region where the first partial derivative ∂C/∂Xi tends to 0 and the algae biomass C exceeds 75% of its predicted full scale is the saturation region.
- 9. The method of claim 1, wherein the algae ecological parameters include the maximum algae growth rate μ_max, the net vertical migration velocity w_s, the light half-saturation constant K_I, the nitrogen half-saturation constant K_N and the phosphorus half-saturation constant K_P, all of which converge to physically reasonable intervals defined by the partial differential equation mechanism model PDE in steps S2 and S3 and have practical ecological meaning.
- 10. The method of claim 1, wherein in step S5 the evaluation and verification of the physical consistency of the model specifically comprises: ① physical consistency verification, namely analyzing the evolution of the weight coefficient λ during training, verifying whether the output conforms to the mass conservation law of the advection-diffusion-reaction equation, and checking the analytical continuity of the vertical distribution profile in regions without measured data; ② quantitative accuracy evaluation, namely calculating the RMSE and R² metrics between predicted and measured values to ensure that the accuracy meets early-warning requirements; and ③ spatiotemporal generalization verification, namely evaluating the robustness of the model's predictions in different time periods or in adjacent unsampled water areas.
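The adaptive gradient balancing described in claim 6 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the patented implementation: the class name `AdaptiveWeight`, the helpers `l2_norm` and `composite_loss`, the decay factor `beta` and the stabilizer `eps` are all hypothetical, the additive form of the loss is assumed, and gradients are passed in as plain lists rather than computed by automatic differentiation.

```python
import math
from typing import Iterable, Optional, Sequence


def l2_norm(grads: Iterable[Sequence[float]]) -> float:
    """L2 norm over all entries of a collection of per-layer gradients."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


class AdaptiveWeight:
    """Tracks lambda as a ratio of smoothed gradient norms (assumed form)."""

    def __init__(self, beta: float = 0.9, eps: float = 1e-12) -> None:
        self.beta = beta  # EMA decay factor (hypothetical default)
        self.eps = eps    # guards against division by zero
        self.ema_data: Optional[float] = None
        self.ema_phys: Optional[float] = None

    def update(self, grads_data, grads_phys) -> float:
        """Update both moving averages and return the new lambda."""
        n_data = l2_norm(grads_data)
        n_phys = l2_norm(grads_phys)
        if self.ema_data is None or self.ema_phys is None:
            # First call: seed the averages with the observed norms.
            self.ema_data, self.ema_phys = n_data, n_phys
        else:
            b = self.beta
            self.ema_data = b * self.ema_data + (1.0 - b) * n_data
            self.ema_phys = b * self.ema_phys + (1.0 - b) * n_phys
        # lambda scales the mechanism term so both gradient sizes match.
        return self.ema_data / (self.ema_phys + self.eps)


def composite_loss(loss_data: float, loss_phys: float, lam: float) -> float:
    """Data fidelity term plus lambda-weighted mechanism constraint term."""
    return loss_data + lam * loss_phys
```

In a training loop, `update` would be called once per step with the two gradient sets, and the returned value would weight the mechanism term in `composite_loss`. When the mechanism gradients are much smaller than the data gradients, the weight grows, which is the balancing effect the claim describes.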
Description
Three-dimensional spatiotemporal distribution prediction method for algae in water bodies based on a physics-informed neural network
Technical Field
The invention relates to the technical field of algal bloom prevention and control, and in particular to a method for predicting the three-dimensional spatiotemporal distribution of algae in a water body based on a physics-informed neural network. The method is suitable for algal bloom prediction, early warning and risk control in water areas such as rivers, lakes and reservoirs.
Background
Water areas such as rivers, lakes and reservoirs serve as sources of human water supply. Rapid and accurate prediction of the three-dimensional spatiotemporal distribution of algae biomass in the water environment has become a key technical requirement for improving the aquatic ecological environment, advancing ecological civilization and supporting the high-quality development of water conservancy. At present, methods for predicting algae biomass fall mainly into two types. The first type is numerical mechanism models based on hydrodynamic variation, such as EFDC and Delft3D; although their physical meaning is clear, they consume substantial computing resources and are extremely difficult to calibrate. The second type is data-driven artificial intelligence models, such as LSTM and CNN; they predict quickly, but because they lack mechanistic constraints they often produce results that violate mass conservation, physical laws and biological processes. Because their predictions come from a black box, the underlying mechanism is hard to explain, which hinders investigation of the root causes of algal blooms and the planning of targeted prevention and control in advance.
In addition, the techniques disclosed in existing practical applications and patents have the following limitations: (1) methods based on time-series frameworks such as LSTM have difficulty recovering the continuous distribution profile of algae along the vertical direction (z axis) of the water body and cannot capture the aggregation of algae at specific depths; (2) only the biological proliferation process is considered, and a partial differential equation (PDE) description of coupled physical-biological processes, such as active vertical migration of algae (gas vesicle regulation, sedimentation) and spatial turbulent diffusion, is lacking; (3) the prediction result is a single concentration value, dynamic diagnosis of the degree of limitation imposed by different environmental factors (light, temperature, nutrients and the like) is lacking, and no specific ecological control threshold can be provided for water management.
Disclosure of Invention
The invention provides a method for predicting the three-dimensional spatiotemporal distribution of algae in a water body based on a physics-informed neural network (PINN). It addresses the problems that existing models ignore differences between vertical depths and have low prediction accuracy, poor physical consistency and weak causal interpretability, and it has application value for rapidly predicting algal bloom processes in aquatic environments and for mechanism-based intervention against bloom outbreaks.
A method for predicting the three-dimensional spatiotemporal distribution of algae in a water body based on a physics-informed neural network comprises the following steps:
S1, constructing a four-dimensional spatiotemporal feature tensor: acquiring multi-source monitoring data of a target water area, preprocessing the multi-source monitoring data, and constructing a four-dimensional spatiotemporal feature tensor indexed by the three-dimensional spatial coordinates (x, y, z) of the water body and the time dimension t, wherein the tensor comprises the four-dimensional spatiotemporal coordinates (x, y, z, t), the values of key environmental factors influencing algae growth, and the logarithmically transformed algae biomass index C;
S2, establishing a partial differential equation mechanism model of algae spatiotemporal variation: based on the hydrodynamic characteristics of the water body and the ecological rules of algae growth and migration, establishing a partial differential equation mechanism model PDE that describes advective transport, spatial diffusion, active vertical migration and growth coupled to multiple environmental factors in the four-dimensional spatiotemporal domain;
S3, constructing and training a physics-informed neural network (PINN) model with mechanism constraints: constructing a physics-informed neural network that takes the four-dimensional spatiotemporal coordinates (x, y, z, t) obtained in step S1 as input and the logarithmically transformed algae biomass index C as output, calculating the partial derivatives of the output with respect to the input by automatic differentiation, comb