CN-121638683-B - Fully automatic disaster susceptibility evaluation method and system based on PU learning

Abstract

The invention relates to the technical field of geological disaster susceptibility evaluation, and in particular to a fully automatic disaster susceptibility evaluation method and system based on PU (positive-unlabeled) learning. Because definite non-disaster samples are lacking, a semi-supervised strategy based on the PU learning idea is adopted: pseudo-negative samples are automatically identified from the unlabeled samples, model stability is improved through multiple rounds of adaptive sampling, and the buffer threshold is optimized adaptively. In the grid-cell-based susceptibility assessment, selection of the buffer threshold is incorporated into a Bayesian optimization mechanism that automatically determines the optimal buffer size, so that the points inside the buffer fully cover the disaster extent without introducing excessive boundary samples. To make the assessment result more practical, accurate and stable, a composite score (AUC + F1)/2 is computed under 5-fold cross-validation, and the cross-validation weighted score = mean of the composite score − 0.2 × standard deviation of the composite score is used as the optimization target when the fully automatic system searches for the optimal model parameters.
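
The scoring rule stated in the abstract can be illustrated with a minimal sketch. The function names and the per-fold (AUC, F1) values below are hypothetical, and the use of the population standard deviation is an assumption, since the source does not say which estimator is used.

```python
from statistics import mean, pstdev


def composite_score(auc: float, f1: float) -> float:
    """Composite score for one fold: (AUC + F1) / 2."""
    return (auc + f1) / 2.0


def weighted_cv_score(fold_metrics, penalty: float = 0.2) -> float:
    """Mean of the per-fold composite scores minus penalty * their standard deviation.

    fold_metrics is a sequence of (auc, f1) pairs, one pair per fold.
    """
    scores = [composite_score(auc, f1) for auc, f1 in fold_metrics]
    # Assumption: population standard deviation (pstdev); the source does not
    # specify whether the population or sample estimator is intended.
    return mean(scores) - penalty * pstdev(scores)


# Hypothetical (AUC, F1) pairs from a 5-fold cross-validation, for illustration only.
folds = [(0.90, 0.80), (0.92, 0.84), (0.88, 0.80), (0.91, 0.83), (0.89, 0.81)]
print(round(weighted_cv_score(folds), 4))
```

Subtracting a multiple of the standard deviation means that, of two parameter settings with the same mean composite score, the one whose folds agree more closely receives the higher weighted score.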

Inventors

  • NIU RUIQING
  • Huo Shuhan

Assignees

  • China University of Geosciences (Wuhan) (中国地质大学(武汉))

Dates

Publication Date
2026-05-12
Application Date
2026-02-02

Claims (8)

  1. A fully automatic disaster susceptibility evaluation method based on PU learning, characterized in that the method is applied to a fully automatic disaster susceptibility evaluation system based on PU learning, the system comprising a processing module and a storage module in communication connection with each other, and the method comprising the following steps: the processing module acquires raw geological disaster data from the storage module; the processing module obtains a positive sample set and an unlabeled sample set based on the raw geological disaster data, wherein a buffer is set around the positive samples based on geospatial distance to obtain buffer positive samples, and the optimal buffer threshold is searched within a set range using a Bayesian optimization algorithm; the processing module balances the positive sample set and the unlabeled sample set and divides them into a training set and a test set; the processing module builds and trains, based on the PU learning idea, a disaster susceptibility evaluation model that combines positive and unlabeled learning with an artificial intelligence model, wherein a buffer is built, all points inside the buffer are regarded as positive samples, all points outside the buffer are regarded as unlabeled samples, and pseudo-negative samples are identified from the unlabeled samples based on the PU learning idea, and wherein the disaster susceptibility evaluation model is a PU-T model, the PU-T model being a PU model comprising a base classifier T and specifically comprising a tree-based PU-RF model, a gradient-boosting-based PU-XGB model, a neural-network-based PU-MLP model and a convolutional-neural-network-based PU-CNN model; the processing module adaptively fine-tunes the disaster susceptibility evaluation model and tests it on the test set to obtain an optimal disaster susceptibility evaluation model; and the processing module outputs a disaster susceptibility grade grid result for the target area through the optimal disaster susceptibility evaluation model based on sample data of the target area; wherein the step in which the processing module adaptively fine-tunes the disaster susceptibility evaluation model and tests it on the test set to obtain the optimal disaster susceptibility evaluation model comprises: the processing module introduces a Bayesian optimization algorithm to adaptively fine-tune the parameters of the disaster susceptibility evaluation model, wherein the optimized parameters differ by model type: for models based on the positive and unlabeled ensemble learning algorithm (PU-Bagging), the optimized parameters comprise the buffer threshold, the relevant parameters of the PU-Bagging algorithm, and the structure and training parameters of the corresponding base learner; for deep learning models based on the prior-probability PU loss, which process positive and unlabeled samples directly through end-to-end training, the optimized parameters comprise the buffer threshold, the PU learning prior probability, the network structure and the training parameters; after a preset number of optimization iterations, the processing module outputs the optimal parameter configuration of the disaster susceptibility evaluation model; and the processing module compares the 5-fold cross-validation weighted composite scores of the disaster susceptibility evaluation models, selects the best-performing model, and takes it as the optimal disaster susceptibility evaluation model, wherein the 5-fold cross-validation weighted composite score is calculated as: mean of the composite score − 0.2 × standard deviation of the composite score, and the composite score is calculated as: (AUC + F1)/2.
  2. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 1, wherein the raw geological disaster data comprises historical geological disaster events and basic geological environment factors, and the raw geological disaster data is obtained through satellite remote sensing interpretation and field investigation.
  3. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 2, wherein the historical geological disaster events comprise one or more of landslide, collapse, debris flow, ground collapse, ground fissure or ground subsidence, and the basic geological environment factors comprise elevation, slope gradient, slope aspect, curvature, terrain roughness, mean annual precipitation, normalized vegetation index, engineering rock group, road, fault, water system and land use type.
  4. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 3, wherein the processing module obtaining a positive sample set and an unlabeled sample set based on the raw geological disaster data comprises: the processing module constructs a geospatial dataset point vector file; and the processing module extracts the historical geological disaster events and the basic geological environment factors into the geospatial dataset point vector file.
  5. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 4, wherein, after the processing module extracts the historical geological disaster events and the basic geological environment factors into the geospatial dataset point vector file, the method further comprises: the processing module reads the geospatial dataset point vector file, wherein the attribute table of the file comprises the basic geological environment factor columns and, in order to distinguish geological disaster points from non-disaster points, the value for geological disaster points is set to 1 and the value for non-disaster points is set to 0; and the processing module builds a buffer around the geological disaster points in the geospatial dataset point vector file, marks all samples inside the buffer as the positive sample set, and marks samples in all other areas as the unlabeled sample set.
  6. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 2, wherein the processing module balancing the positive sample set and the unlabeled sample set and dividing them into a training set and a test set comprises: the processing module normalizes the basic geological environment factors, removes factors whose correlation is greater than or equal to a preset value by means of Pearson correlation analysis, and outputs a correlation heat map; and the processing module uses undersampling to balance the ratio of the positive sample set to the unlabeled sample set, and then divides the positive sample set and the unlabeled sample set into a training set and a test set according to a preset ratio.
  7. The fully automatic disaster susceptibility evaluation method based on PU learning according to claim 1, wherein the processing module outputting a disaster susceptibility grade grid result for the target area through the optimal disaster susceptibility evaluation model based on sample data of the target area comprises: the processing module inputs the sample data of the target area into the optimal disaster susceptibility evaluation model and first obtains a susceptibility probability point vector result for the target area, then converts the point vector result into a grid result with the same resolution as the geological environment factors, and divides the grid result into susceptibility grades according to the natural breaks method, thereby outputting the final disaster susceptibility evaluation result, wherein the susceptibility grades comprise extremely high susceptibility, medium susceptibility, low susceptibility and extremely low susceptibility.
  8. A fully automatic disaster susceptibility evaluation system based on PU learning, characterized in that the system applies the fully automatic disaster susceptibility evaluation method based on PU learning according to any one of claims 1-7, and the system comprises a processing module and a storage module in communication connection with each other.
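
Claim 1 states that pseudo-negative samples are identified from the unlabeled samples and names PU-Bagging, but it does not spell out the procedure. The following is a minimal sketch of one common bagging-style formulation, given as an assumption rather than the patented implementation: in each round a random subset of unlabeled samples is treated as temporary negatives, a base classifier is fitted, and the remaining (out-of-bag) unlabeled samples are scored. A nearest-centroid scorer stands in here for the RF or XGB base learners named in the claim, and all function names, the sample data and the `fraction` parameter are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds=50, seed=0):
    """Average out-of-bag positive-likeness score for every unlabeled sample."""
    if len(unlabeled) < 2:
        raise ValueError("need at least two unlabeled samples")
    rng = random.Random(seed)
    pos_c = centroid(positives)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    bag_size = min(len(positives), len(unlabeled) - 1)
    for _ in range(rounds):
        # Treat a random subset of the unlabeled samples as temporary negatives.
        bag = set(rng.sample(range(len(unlabeled)), bag_size))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # score only samples left out of this round's bag
            d_pos, d_neg = sq_dist(x, pos_c), sq_dist(x, neg_c)
            # Stand-in base classifier: relative closeness to the positive centroid.
            score = d_neg / (d_pos + d_neg) if d_pos + d_neg > 0 else 0.5
            totals[i] += score
            counts[i] += 1
    return [t / c if c else 0.5 for t, c in zip(totals, counts)]


def pseudo_negatives(positives, unlabeled, fraction=0.5, **kwargs):
    """Indices of the unlabeled samples with the lowest average scores."""
    scores = pu_bagging_scores(positives, unlabeled, **kwargs)
    n_keep = max(1, int(len(unlabeled) * fraction))
    order = sorted(range(len(unlabeled)), key=scores.__getitem__)
    return order[:n_keep]


# Hypothetical two-factor samples: positives cluster near (1, 1).
positives = [(0.9, 0.9), (1.0, 1.0), (1.1, 0.95), (0.95, 1.05)]
unlabeled = [(1.0, 0.9), (0.05, 0.1), (0.0, 0.0), (0.1, 0.05), (0.9, 1.1), (0.15, 0.1)]
print(pseudo_negatives(positives, unlabeled))
```

Averaging only out-of-bag scores keeps a sample from being scored by a classifier that was trained with that same sample labeled as negative.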

Description

Fully automatic disaster susceptibility evaluation method and system based on PU learning

Technical Field

The invention relates to the technical field of disaster evaluation, and in particular to a fully automatic disaster susceptibility evaluation method and system based on PU learning.

Background

Geological disasters such as landslides, collapses and debris flows are often highly destructive and cause significant socioeconomic losses. They occur frequently, are numerous and widely distributed, and are difficult to prevent, so evaluating geological disaster susceptibility (that is, the level of risk that a geological disaster will occur in a given area) is very important. Geological disaster susceptibility evaluation comprehensively analyzes, under given geological environment conditions, the degree to which each disaster-causing factor influences the occurrence of geological disasters, and thereby determines the probability and tendency of their occurrence. Existing geological disaster susceptibility evaluation schemes mostly rely on settings based on manual experience and on repeated experiments, so their steps are cumbersome and their efficiency is low.

Disclosure of Invention

The main object of the invention is to provide a fully automatic disaster susceptibility evaluation method and system based on PU learning, so as to solve the problem that existing geological disaster susceptibility evaluation schemes rely on settings based on manual experience and on repeated experiments, with cumbersome steps and low efficiency.
The technical scheme provided by the invention is as follows: a fully automatic disaster susceptibility evaluation method based on PU learning is applied to a fully automatic disaster susceptibility evaluation system based on PU learning, wherein the system comprises a processing module and a storage module in communication connection with each other, and the method comprises the following steps: the processing module acquires raw geological disaster data from the storage module; the processing module obtains a positive sample set and an unlabeled sample set based on the raw geological disaster data; the processing module balances the positive sample set and the unlabeled sample set and divides them into a training set and a test set; the processing module builds and trains, based on the PU learning idea, a disaster susceptibility evaluation model that combines positive and unlabeled learning with an artificial intelligence model; the processing module adaptively fine-tunes the disaster susceptibility evaluation model and tests it on the test set to obtain an optimal disaster susceptibility evaluation model; and the processing module outputs a disaster susceptibility grade grid result for the target area through the optimal disaster susceptibility evaluation model based on sample data of the target area.

Preferably, the raw geological disaster data comprises historical geological disaster events and basic geological environment factors, and the raw geological disaster data is obtained through satellite remote sensing interpretation and field investigation.
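
The sample-preparation steps (obtaining the positive and unlabeled sets, then balancing and splitting them) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: positives are taken to be all points within a buffer distance of a recorded disaster point, as the claims describe; coordinates are assumed to be in a projected system so that Euclidean distance is meaningful; balancing uses the undersampling named in claim 6; and the function names, the 1.5-unit buffer and the 0.25 test ratio are hypothetical values chosen for the example.

```python
import math
import random


def label_by_buffer(points, disaster_points, buffer_dist):
    """Split points into positives (inside any buffer) and unlabeled (outside all buffers)."""
    positives, unlabeled = [], []
    for p in points:
        inside = any(math.dist(p, d) <= buffer_dist for d in disaster_points)
        (positives if inside else unlabeled).append(p)
    return positives, unlabeled


def balance_and_split(positives, unlabeled, test_ratio=0.25, seed=42):
    """Undersample the larger set to the size of the smaller, then split each class."""
    rng = random.Random(seed)
    n = min(len(positives), len(unlabeled))
    train, test = [], []
    # Label 1 = positive, label 0 = unlabeled (not a confirmed negative).
    for rows, label in ((positives, 1), (unlabeled, 0)):
        sampled = rng.sample(rows, n)  # random subset, already in random order
        n_test = round(n * test_ratio)
        test += [(x, label) for x in sampled[:n_test]]
        train += [(x, label) for x in sampled[n_test:]]
    return train, test


# Hypothetical 5 x 5 grid of candidate points with two recorded disaster points.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
disasters = [(0.0, 0.0), (4.0, 4.0)]
pos, unl = label_by_buffer(grid, disasters, buffer_dist=1.5)
train_set, test_set = balance_and_split(pos, unl)
print(len(pos), len(unl), len(train_set), len(test_set))
```

In the method described here the buffer distance is not fixed by hand: it is one of the parameters searched by Bayesian optimization, so `buffer_dist` would be supplied by the optimizer on each trial.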
Preferably, the historical geological disaster events comprise one or more of landslide, collapse, debris flow, ground collapse, ground fissure or ground subsidence, and the basic geological environment factors comprise elevation, slope gradient, slope aspect, curvature, terrain roughness, mean annual precipitation, normalized vegetation index, engineering rock group, road, fault, water system and land use type.

Preferably, the processing module obtaining a positive sample set and an unlabeled sample set based on the raw geological disaster data comprises: the processing module constructs a geospatial dataset point vector file; and the processing module extracts the historical geological disaster events and the basic geological environment factors into the geospatial dataset point vector file.

Preferably, after the processing module extracts the historical geological disaster events and the basic geological environment factors into the geospatial dataset point vector file, the method further comprises: the processing module reads the geospatial dataset point vector file, wherein the attribute table of the file comprises the basic geological environment factor columns and, in order to distinguish geological disaster points from non-disaster points, the value for geological disaster points is set to 1 and the value for non-disaster points is set to 0; the processing module builds a buffer around the geological disaster points in the geospatial dataset point vector file, marks all samples inside the buffer as the positive sample set, and marks samples of o