CN-121983177-A - Phosphate oxygen isotope prediction method based on machine learning

Abstract

The invention belongs to the technical field of isotope prediction and provides a phosphate oxygen isotope prediction method based on machine learning. The method comprises collecting target factors and inputting them into a prediction model. The prediction model is constructed through the following steps: study area selection, hydrologic and water quality data measurement, compilation of basic watershed information, Spearman correlation analysis, core factor expansion, variance inflation factor (VIF) testing, and random forest (RF) machine learning model training. Spearman correlation analysis, core factor expansion and VIF testing optimize the input variables of the model; the non-significant factors introduced during expansion account for potential regulatory mechanisms and improve the reliability of the model. The RF machine learning model effectively captures complex interaction effects and nonlinear relationships among the variables, improving isotope prediction accuracy.
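
To make the RF step named in the abstract concrete, the following is a minimal sketch of a random forest regressor: each tree is grown on a bootstrap resample, each split considers a random subset of features, and the forest prediction is the average over trees. All function names, hyperparameters and the toy data are assumptions for illustration and do not come from the patent, which does not say which software was used. In practice a library implementation such as scikit-learn's RandomForestRegressor would normally be preferred.

```python
import random
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets, feature_ids):
    """Return the (feature, threshold) with the lowest total SSE, or None."""
    best, best_cost = None, float("inf")
    for f in feature_ids:
        # Skip the largest observed value so both sides stay non-empty.
        for t in sorted({row[f] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, targets) if row[f] <= t]
            right = [y for row, y in zip(rows, targets) if row[f] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (f, t), cost
    return best


def subset(rows, targets, keep):
    """Select the rows and targets at the given indices."""
    return [rows[i] for i in keep], [targets[i] for i in keep]


def build_tree(rows, targets, rng, n_try, max_depth, depth=0):
    """Grow one regression tree: leaves are floats, inner nodes are tuples."""
    if depth >= max_depth or len(rows) < 4:
        return fmean(targets)
    features = rng.sample(range(len(rows[0])), n_try)  # random feature subset
    split = best_split(rows, targets, features)
    if split is None:  # every tried feature is constant in this node
        return fmean(targets)
    f, t = split
    left = [i for i, row in enumerate(rows) if row[f] <= t]
    right = [i for i, row in enumerate(rows) if row[f] > t]
    return (
        f,
        t,
        build_tree(*subset(rows, targets, left), rng, n_try, max_depth, depth + 1),
        build_tree(*subset(rows, targets, right), rng, n_try, max_depth, depth + 1),
    )


def fit_forest(rows, targets, n_trees=25, n_try=None, max_depth=5, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    if n_try is None:
        # One third of the features is a common choice for regression.
        n_try = max(1, len(rows[0]) // 3)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        boot_rows, boot_targets = subset(rows, targets, picks)
        forest.append(build_tree(boot_rows, boot_targets, rng, n_try, max_depth))
    return forest


def predict_tree(node, row):
    """Follow the splits from the root down to a leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def predict(forest, row):
    """Average the predictions of all trees."""
    return fmean(predict_tree(tree, row) for tree in forest)


# Toy data (made up): the target jumps from 10 to 20 once feature 0 reaches 15,
# a threshold effect that a linear model cannot represent.
rows = [[float(i), float(i % 3)] for i in range(30)]
targets = [10.0 if i < 15 else 20.0 for i in range(30)]
forest = fit_forest(rows, targets, n_try=2)
print(predict(forest, [2.0, 0.0]), predict(forest, [27.0, 0.0]))
```

In an application of the method, each row would hold the measured target factors for one sample and the target would be the measured phosphate oxygen isotope value.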

Inventors

  • LIU HUANYAO
  • MENG CEN
  • YU SHAOBO
  • LI YUYUAN
  • LIU JINGKE

Assignees

  • Hunan Agricultural University (湖南农业大学)

Dates

Publication Date
20260505
Application Date
20260114

Claims (4)

  1. A machine learning-based phosphate oxygen isotope prediction method, comprising:
     collecting target factors of a target area, wherein the target factors comprise water oxygen isotopes, water temperature, soluble reactive phosphorus, potassium ions, dissolved oxygen, residential land proportion and water level data;
     inputting the target factors into a prediction model for calculation to obtain a phosphate oxygen isotope prediction result, wherein the construction process of the prediction model comprises the following steps:
     selecting a study area;
     collecting and measuring river water samples in the study area at a fixed period to obtain hydrologic and water quality data, wherein the hydrologic and water quality data comprise nutrient substance indexes, relevant water chemical variables, element indexes, water level data, flow rate data and water oxygen isotopes, the nutrient substance indexes comprise soluble reactive phosphorus, ammonium nitrogen and nitrate nitrogen, the relevant water chemical variables comprise water temperature, pH value, dissolved oxygen and electrical conductivity, and the element indexes comprise calcium ions, magnesium ions, sodium ions, potassium ions, chloride ions, sulfate ions and dissolved organic carbon;
     acquiring digital elevation data, spatial data on livestock and poultry farming, and population distribution data of the study area to obtain basic information of the river basin, wherein the basic information of the river basin comprises area, population density, livestock and poultry density and land utilization type;
     performing Spearman correlation analysis on the variables in the hydrologic and water quality data to obtain Spearman correlation coefficients, and selecting the variables whose Spearman correlation coefficients have absolute values greater than or equal to 0.25 as core factors, wherein the statistical significance (p-value) of the core factors is less than 0.01;
     merging variables from the basic information of the river basin into the core factors, wherein the core factors then comprise the water oxygen isotopes, the water temperature, the soluble reactive phosphorus, the potassium ions, the dissolved oxygen, the residential land proportion and the water level data;
     performing a variance inflation factor (VIF) test on the core factors to obtain VIF values, and selecting the variables whose VIF values are less than 5 as input variables;
     training a preset machine learning model with the phosphate oxygen isotope value of the study area as the target variable and the input variables as predictor variables to obtain the prediction model, wherein the preset machine learning model is either an RF model or an XGBoost model.
  2. The machine learning-based phosphate oxygen isotope prediction method of claim 1, wherein the fixed period is one collection in the middle of each month.
  3. The machine learning-based phosphate oxygen isotope prediction method of claim 1, wherein collecting and measuring river water samples in the study area at a fixed period to obtain hydrologic and water quality data comprises: measuring the collected river water samples with a fully automatic flow injection analyzer to obtain the nutrient substance indexes; measuring the river water samples with a water quality parameter meter to obtain the relevant water chemical variables; measuring the river water samples with an inductively coupled plasma emission spectrometer, an ion chromatograph and a total organic carbon analyzer, respectively, to obtain the element indexes; and collecting the data recorded by water pressure sensors installed at the sampling points of the study area to obtain the water level data and the flow velocity data.
  4. The machine learning-based phosphate oxygen isotope prediction method of claim 1, wherein the land utilization type includes woodland, dry land, paddy field, tea garden, and residential land.
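
The two screening steps of claim 1 (Spearman coefficient with absolute value of at least 0.25, then VIF below 5) can be sketched as follows. This is an illustrative sketch only: it assumes the correlation is taken between each candidate variable and the measured phosphate oxygen isotope value, which the claim does not state explicitly; it omits the p < 0.01 significance check; and all function names and sample values are hypothetical. The VIF is computed as the diagonal of the inverse correlation matrix of the predictors, which equals 1 / (1 - R²) for each variable regressed on the others.

```python
import math


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def screen_core_factors(columns, target, min_abs_rho=0.25):
    """Keep variables with |rho| >= min_abs_rho against the target.

    The p < 0.01 check from claim 1 is not implemented in this sketch.
    """
    kept = {}
    for name, values in columns.items():
        rho = spearman(values, target)
        if abs(rho) >= min_abs_rho:
            kept[name] = rho
    return kept


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variables are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each variable: the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def select_inputs(columns, max_vif=5.0):
    """Keep variables whose VIF is below max_vif (single pass)."""
    return [name for name, value in vif(columns).items() if value < max_vif]


# Made-up monthly values, for illustration only.
d18o_p = [14.2, 15.1, 16.0, 15.4, 17.3, 16.8, 18.1, 17.5]
candidates = {
    "water_temp": [8.0, 10.5, 14.0, 13.2, 21.0, 19.5, 27.0, 24.1],
    "dissolved_oxygen": [11.2, 10.1, 9.0, 9.6, 7.2, 7.9, 6.1, 6.8],
    "ph": [7.4, 7.1, 7.6, 7.2, 7.5, 7.0, 7.3, 7.7],
}
core = screen_core_factors(candidates, d18o_p)
inputs = select_inputs({name: candidates[name] for name in core})
print(core, inputs)
```

The single-pass filter follows the wording of the claim, which only gives the threshold. A common alternative is to drop the variable with the largest VIF and recompute until all remaining values fall below the threshold; whether the inventors did so is not stated. A statistics library such as SciPy (`scipy.stats.spearmanr`) returns the p-value together with the coefficient and could supply the omitted significance check.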

Description

Phosphate oxygen isotope prediction method based on machine learning

Technical Field

The invention relates to the technical field of isotope prediction, and in particular to a phosphate oxygen isotope prediction method based on machine learning.

Background

Phosphorus is a key limiting nutrient in aquatic ecosystems, and its excessive input is the primary driver of eutrophication in freshwater systems worldwide. Eutrophication caused by phosphorus pollution not only leads to ecological problems such as frequent harmful algal blooms, loss of aquatic biodiversity and hypoxia in bottom waters, but also seriously threatens drinking water safety and human health. The migration, transformation and retention of phosphorus in water are affected not only by external inputs but also by temperature, hydrological processes, microbial activity, reactions at the sediment-water interface and other biogeochemical processes. In agriculture-dominated watersheds, fertilizer application, livestock and poultry farming and agricultural runoff are the main pathways of phosphorus input, and their complex source-sink relationships and biogeochemical transformation processes make phosphorus pollution control a great challenge.

Phosphate oxygen isotopes (δ18OP), an emerging environmental tracer, can effectively identify phosphorus sources, transformation pathways and biogeochemical cycling processes. Compared with traditional chemical analysis methods, δ18OP has a distinctive fingerprint, can distinguish phosphorus inputs from different sources, and reflects the intensity of microbially mediated phosphorus cycling. During biological transformation, the oxygen atoms in phosphate exchange isotopes with water molecules, so the isotopic composition records complex biogeochemical information and provides unprecedented molecular-level insight into the mechanisms of phosphorus cycling.

However, measuring δ18OP relies on high-precision analytical equipment such as stable isotope ratio mass spectrometers and on complex sample pretreatment procedures; the high cost and high technical threshold limit the feasibility of monitoring at high spatial and temporal resolution at the watershed scale. Furthermore, current research has focused on single-point observations over specific time periods or in local regions, and lacks a multi-factor, quantitative analysis of the mechanisms driving δ18OP variation, especially in agricultural watersheds subject to multiple human disturbances.

Traditional δ18OP research relies primarily on laboratory analysis and linear statistical models to explore the relationship between environmental factors and isotope values. However, phosphorus cycling in agricultural watersheds is often highly nonlinear and involves multi-factor interactions; threshold effects and synergies may exist between environmental variables, so linear methods fall short in explanatory power and prediction accuracy. In addition, the mode of action of some key drivers may vary significantly with season and event scale, which further complicates the analysis. It is therefore necessary to introduce new methods that can handle complex nonlinear relationships and quantify variable contributions in order to reveal the driving mechanisms of δ18OP.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention aims to provide a phosphate oxygen isotope prediction method based on machine learning, which addresses the problems that conventional linear methods are insufficient in explanatory power and prediction accuracy and that the mode of action of some key drivers can change markedly with season and event scale.

To achieve the above object, the invention provides the following solution: a machine learning-based phosphate oxygen isotope prediction method, comprising: collecting target factors of a target area, wherein the target factors comprise water oxygen isotopes, water temperature, soluble reactive phosphorus, potassium ions, dissolved oxygen, residential land proportion and water level data; inputting the target factors into a prediction model for calculation to obtain a phosphate oxygen isotope prediction result, wherein the construction process of the prediction model comprises the following steps: selecting a study area; collecting and measuring river water samples in the study area at a fixed period to obtain hydrologic and water quality data, wherein the hydrologic and water quality data comprise nutrient substance indexes, relevant water chemical variables, element indexes, water level data, flow rate data and water oxygen isotopes, the nutrient substance indexes comprise soluble reactive phosphorus,