CN-121980199-A - Geological disaster prediction method and system based on optimized PU-XGBoost model

Abstract

The invention relates to a geological disaster prediction method and system based on an optimized PU-XGBoost model, belonging to the technical field of geological disaster monitoring and risk assessment. The method comprises: obtaining multi-source data including InSAR deformation monitoring data and historical geological disaster point data; taking the historical geological disaster points and the InSAR significant deformation points in the InSAR deformation monitoring data as a positive sample set P; introducing a PU-Learning strategy with Spy samples to construct a reliable negative sample set RN; searching the XGBoost hyperparameters with a BTO optimization algorithm to obtain an optimal hyperparameter combination; training on the positive sample set P and the negative sample set RN to obtain an optimal model; and performing geological disaster susceptibility prediction and graded mapping of the study area based on the optimal model, thereby achieving a high-precision, high-reliability assessment of geological disaster susceptibility in the study area.

Inventors

  • WANG BO
  • CHEN SIRUI
  • SHENG QINGHONG
  • LIU YU

Assignees

  • Nanjing University of Aeronautics and Astronautics

Dates

Publication Date
20260505
Application Date
20260407

Claims (10)

  1. A geological disaster prediction method based on an optimized PU-XGBoost model, characterized by comprising the following steps: acquiring multi-source data of a study area, wherein the multi-source data comprises at least InSAR deformation monitoring data and historical geological disaster point data, and preprocessing the multi-source data; based on the preprocessed multi-source data, taking the historical geological disaster points and the InSAR significant deformation points in the InSAR deformation monitoring data as a positive sample set P, and the remaining data of the study area as an unlabeled sample set U; based on this PU sample system, screening a negative sample set RN from the unlabeled sample set U by introducing a PU-Learning strategy with Spy samples; establishing an XGBoost susceptibility assessment model, introducing a BTO optimization algorithm to search the XGBoost hyperparameters for an optimal hyperparameter combination, and training on the positive sample set P and the negative sample set RN to obtain an optimal model; and performing geological disaster susceptibility prediction and graded mapping of the study area based on the optimal model.
  2. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 1, wherein the multi-source data further comprises DEM data, rainfall and hydrological data, land cover data and human engineering activity data; and the preprocessing comprises SBAS-InSAR processing of the study area to output deformation rate and cumulative deformation rasters, derivation of terrain factors from the DEM data, raster resampling of the rainfall data, and applying at least a unified coordinate system, a unified resolution and study-area clipping to all factors.
  3. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 2, wherein, for the InSAR deformation time series in the InSAR deformation monitoring data, robust statistics are used to construct derived factors, the derived factors comprising at least one of the following: the deformation mean, calculated as: [formula not reproduced in source text]; the deformation standard deviation, calculated as: [formula not reproduced in source text]; and the absolute value of the deformation rate, calculated as: [formula not reproduced in source text]; wherein the quantities appearing in the formulas are the cumulative deformation at the i-th time phase, the number of time phases, the mean, the standard deviation, the start and end times of the monitoring period, and the absolute value of the deformation rate.
  4. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 1, wherein the positive sample set P comprises cataloged historical geological disaster points and InSAR significant deformation points, the latter being points monitored by InSAR technology whose annual deformation rate has an absolute value of more than 15 mm/year and which are located in a slope area; the target labels of the samples in P are assigned the value 1; the unlabeled sample set U consists of all sample data other than the positive sample set P; and all data in the unlabeled sample set U are uniformly assigned an initial label of 0.
  5. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 1, wherein introducing the PU-Learning strategy with Spy samples specifically comprises: randomly extracting Spy samples from the positive sample set P, temporarily resetting the labels of the Spy samples to negative, and mixing them into the unlabeled sample set U to form a mixed dataset; predicting the class probability distribution of all Spy samples with a classifier, and taking a quantile of that distribution (quantile level not reproduced in source text) as the screening threshold; and traversing the unlabeled sample set U and collecting the samples in U whose predicted probability is below the screening threshold to form the negative sample set RN.
  6. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 5, wherein the Spy samples are randomly extracted from the positive sample set P at a preset proportion (value not reproduced in source text); and a ratio between the positive sample set P and the negative sample set RN is preset and, based on that ratio, from the samples whose predicted probability satisfies the screening condition, the required number of samples with the lowest probability are selected to constitute the negative sample set RN.
  7. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 1, wherein the model parameters are optimized by minimizing an objective function containing regularization terms, and searching the XGBoost hyperparameters with the BTO optimization algorithm specifically comprises: mapping each XGBoost hyperparameter combination to the staves of a wooden barrel; defining the fitness of the i-th hyperparameter combination in the population as the water-holding capacity of the barrel: [formula not reproduced in source text]; computing the normalized value of every dimension and identifying the short-stave dimension that limits the current water-holding capacity: [formula not reproduced in source text], wherein the quantities appearing in the formula are the value of the i-th individual in a given dimension, the maximum and minimum boundary values of the population in that dimension, and the dimension index; and performing directed repair only on the short-stave dimension, wherein the individual with the best fitness-function index found in the current iteration is taken as the global optimal solution, the repair is guided by the global optimal solution, and a random cooperative perturbation is introduced, the repair formula satisfying: [formula not reproduced in source text], wherein the quantities appearing in the formula are the model evaluation index, the cooperative learning coefficient, a random number, the corresponding dimension value of a randomly chosen reference individual, the iteration number, and the value of the i-th individual in the short-stave dimension.
  8. The geological disaster prediction method based on the optimized PU-XGBoost model according to claim 1, wherein geological disaster susceptibility prediction is performed for the study area based on the optimal model to output a geological disaster susceptibility probability raster of the study area, and the susceptibility probability raster is classified into high-susceptibility, medium-susceptibility, low-susceptibility and non-susceptible areas using the natural breaks method.
  9. A geological disaster prediction system based on an optimized PU-XGBoost model, characterized by comprising: a data acquisition module for acquiring multi-source data, including InSAR deformation monitoring data, historical geological disaster point data, DEM data, rainfall and hydrological data, land cover data and human engineering activity data; a preprocessing module for preprocessing the multi-source data, including unifying the coordinate system and resolution, clipping to the study area, handling invalid and missing values, and standardization; a Spy-PU sample construction module for constructing a positive sample set P and an unlabeled sample set U, dynamically determining a screening threshold by introducing Spy samples, screening a negative sample set RN from the unlabeled sample set U, and combining the positive sample set P and the negative sample set RN to establish a PU sample system; a model training and optimization module for training an XGBoost susceptibility assessment model based on the PU sample system and searching the hyperparameters with a BTO optimization algorithm to obtain an optimal model; and an output module for outputting the susceptibility probability raster and the graded zoning result.
  10. The geological disaster prediction system based on the optimized PU-XGBoost model according to claim 9, wherein the objects optimized by BTO in the model training and optimization module include one or more of the XGBoost learning rate, tree depth, sub-sampling rate, column sampling rate, minimum leaf sample weight, regularization coefficient, and number of base learners.
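
The Spy screening described in claims 5 and 6 can be illustrated with a small, self-contained Python sketch. It is not the patent's implementation. A toy centroid-distance score stands in for the classifier's predicted probability (the claims use a trained classifier, and the final model is XGBoost). The function names and the values `spy_ratio`, `spy_quantile` and `rn_per_positive` are all hypothetical, because the source text does not reproduce the proportions and quantile it uses.

```python
import random
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def toy_score(x, pos_c, neg_c):
    """Toy stand-in for a classifier's positive-class probability.

    The closer x lies to the positive centroid, the nearer the score is
    to 1. A real pipeline would use a trained model's prediction instead.
    """
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c)) ** 0.5
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c)) ** 0.5
    return d_neg / (d_pos + d_neg + 1e-12)


def spy_reliable_negatives(P, U, spy_ratio=0.15, spy_quantile=0.05,
                           rn_per_positive=None, seed=0):
    """Return (RN, threshold) following the Spy screening idea.

    spy_ratio, spy_quantile and rn_per_positive are hypothetical values,
    not taken from the source text.
    """
    if len(P) < 2:
        raise ValueError("need at least two positive samples")
    rng = random.Random(seed)
    n_spy = min(len(P) - 1, max(1, round(spy_ratio * len(P))))
    spy_idx = set(rng.sample(range(len(P)), n_spy))
    spies = [p for i, p in enumerate(P) if i in spy_idx]
    kept_pos = [p for i, p in enumerate(P) if i not in spy_idx]

    # Spies are temporarily treated as negatives and mixed into U.
    pos_c = centroid(kept_pos)
    neg_c = centroid(U + spies)

    # Screening threshold: a low quantile of the spies' scores.
    spy_scores = sorted(toy_score(s, pos_c, neg_c) for s in spies)
    threshold = spy_scores[int(spy_quantile * (len(spy_scores) - 1))]

    # Candidates: unlabeled samples scoring below the threshold, lowest first.
    scored = sorted((toy_score(u, pos_c, neg_c), i) for i, u in enumerate(U))
    candidates = [U[i] for s, i in scored if s < threshold]

    # Optional cap so that the size of RN follows a preset ratio to |P|.
    if rn_per_positive is not None:
        candidates = candidates[:round(rn_per_positive * len(P))]
    return candidates, threshold
```

Sorting the candidates by score before applying the cap mirrors claim 6, which keeps the lowest-probability samples when the ratio of P to RN is fixed in advance.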

Description

Geological disaster prediction method and system based on optimized PU-XGBoost model

Technical Field

The invention relates to the technical field of geological disaster monitoring and risk assessment, and in particular to a geological disaster prediction method and system based on an optimized PU-XGBoost model.

Background

Traditional geological disaster susceptibility evaluation relies mainly on field geological surveys combined with a single statistical or conventional machine learning model. These methods suffer from long data acquisition cycles, unreliable sample labels, and weak model generalization. Field surveys require substantial manpower and material resources, cannot readily provide large-scale, all-weather, real-time monitoring, and yield only discrete point data. When training samples are built for a single evaluation model (such as logistic regression or a support vector machine), areas where no disaster has been recorded are usually treated by default as non-landslide areas (negative samples). This ignores potential hazard points in unsurveyed areas and causes serious negative-sample contamination. In addition, traditional methods focus on static environmental factors (such as slope and lithology) and often do not fully account for the driving effect of dynamic factors such as surface deformation on disaster development, which makes high-precision susceptibility prediction difficult.
Despite the development of InSAR (Interferometric Synthetic Aperture Radar) technology and artificial intelligence in recent years, existing methods still have the following problems in multi-source data fusion and model optimization. First, they lack an effective sample cleaning mechanism, so reliable negative samples are difficult to identify accurately among unlabeled samples, which biases model training. Second, the hyperparameter combinations of common machine learning models are difficult to optimize: traditional grid search or random search is inefficient and struggles to find a global optimum, leaving the prediction results insufficiently accurate and stable. A geological disaster prediction method and system based on an optimized PU-XGBoost model is therefore needed to solve these problems.

Disclosure of Invention

The invention aims to provide a geological disaster prediction method and system based on an optimized PU-XGBoost model that solve the problems described in the background.
To solve the above technical problems, the geological disaster prediction method based on the optimized PU-XGBoost model adopts the following technical scheme, comprising the following steps: acquiring multi-source data of a study area, wherein the multi-source data comprises at least InSAR deformation monitoring data and historical geological disaster point data, and preprocessing the multi-source data; based on the preprocessed multi-source data, taking the historical geological disaster points and the InSAR significant deformation points in the InSAR deformation monitoring data as a positive sample set P, and the remaining data of the study area as an unlabeled sample set U; based on this PU sample system, screening a negative sample set RN from the unlabeled sample set U by introducing a PU-Learning strategy with Spy samples; establishing an XGBoost susceptibility assessment model, introducing a BTO optimization algorithm to search the XGBoost hyperparameters for an optimal hyperparameter combination, and training on the positive sample set P and the negative sample set RN to obtain an optimal model; and performing geological disaster susceptibility prediction and graded mapping of the study area based on the optimal model. Preferably, the multi-source data further comprises DEM data, rainfall and hydrological data, land cover data and human engineering activity data; and the preprocessing comprises SBAS-InSAR processing of the study area to output deformation rate and cumulative deformation rasters, derivation of terrain factors from the DEM data, raster resampling of the rainfall data, and applying at least a unified coordinate system, a unified resolution and study-area clipping to all factors.
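
The construction of the positive sample set P and the unlabeled sample set U described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's implementation. The 15 mm/year rate threshold and the labels 1 and 0 are those stated in claim 4. The `Cell` record, the `build_pu_sets` function and the 10 degree slope cutoff standing in for "located in a slope area" are hypothetical.

```python
from dataclasses import dataclass

RATE_THRESHOLD_MM_PER_YEAR = 15.0  # stated in claim 4
SLOPE_MIN_DEG = 10.0               # hypothetical definition of a "slope area"


@dataclass
class Cell:
    cell_id: int
    annual_rate_mm: float      # signed InSAR annual deformation rate
    slope_deg: float           # terrain slope derived from the DEM
    in_disaster_catalog: bool  # cataloged historical disaster point


def build_pu_sets(cells):
    """Split cells into (P, U) as lists of (cell, label) pairs.

    P: cataloged disaster points, plus cells whose absolute annual rate
       exceeds the threshold and that lie on a slope. Label 1.
    U: every other cell, with the initial label 0.
    """
    P, U = [], []
    for c in cells:
        significant = (abs(c.annual_rate_mm) > RATE_THRESHOLD_MM_PER_YEAR
                       and c.slope_deg >= SLOPE_MIN_DEG)
        if c.in_disaster_catalog or significant:
            P.append((c, 1))
        else:
            U.append((c, 0))
    return P, U
```

The initial label 0 on U is only a placeholder: unlabeled cells may still hide undiscovered hazard points, which is why a reliable negative set RN is screened from U before training.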
Preferably, for the InSAR deformation time series in the InSAR deformation monitoring data, robust statistics are used to construct derived factors, the derived factors comprising at least one of the following: the deformation mean, calculated as: [formula not reproduced in source text]; the deformation standard deviation, calculated as: [formula not reproduced in source text]; Absolute value o