
CN-121980894-A - Urban canopy meteorological simulation data correction method based on combination of ML and corrected WRF-BEM model


Abstract

The invention discloses a method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model. The method comprises the following steps: step 1, acquiring observation data; step 2, acquiring simulation data; step 3, setting up a machine learning model; step 4, evaluating the initial simulation results; and step 5, obtaining the machine learning results. The invention addresses the large correction errors and large local deviations of meteorological data in the prior art.

Inventors

  • QU ZHONGKE
  • HONG CHAO
  • GU ZHAOLIN
  • LI CHENGWEI
  • XU WEN
  • XIAO RUIZHI

Assignees

  • Xi'an Jiaotong University (西安交通大学)

Dates

Publication Date
20260505
Application Date
20251125

Claims (8)

  1. A method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model, characterized by comprising the following steps: step 1, obtaining observation data; step 2, obtaining simulation data; step 3, setting up a machine learning model; step 4, evaluating the initial simulation results; and step 5, obtaining the machine learning results.
  2. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 1, wherein step 1 is implemented as follows: the observation data are hourly data, covering the whole of the selected year, from a plurality of weather stations distributed across the selected area, and comprise air temperature and humidity measured at 2 m and wind speed measured at 10 m; the observation data are preprocessed by detecting and removing outliers and interpolating missing values; because wind speed and rainfall show no obvious diurnal pattern, their missing values are filled by linear interpolation; air temperature and relative humidity follow a simple diurnal pattern, in which the maximum air temperature and minimum relative humidity usually occur around noon and the minimum air temperature and maximum relative humidity occur before sunrise, so this pattern is also taken into account when they are linearly interpolated: when a missing value falls at an extreme point, it is interpolated according to the diurnal pattern of several weather stations over several days with the same weather conditions; a total of 122,640 hourly meteorological observation records for the whole year are finally obtained from the plurality of weather stations.
  3. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 2, wherein step 2 is implemented as follows: simulated meteorological data are obtained with WRF-ARW 4.5; the center of the WRF nested computational domain is determined, and the model uses three nested grids; the parent domain D1 contains the center position of the first nested domain D2 and captures the background-field characteristics of the large-scale weather system, with a horizontal resolution of 9 km and a 120 × 120 horizontal grid; the first nested domain D2 contains the center position of D3 and mainly describes the interaction between the regional-scale meteorological field and topographic effects, with a resolution of 3 km and a 118 × 118 grid; the second nested domain D3 is the observation area, that is, the study domain, with a resolution of 1 km and a 118 × 118 grid, and is used for fine-scale simulation of the city-scale wind, humidity and temperature fields; all three domains use 45 vertical layers extending from the ground to 5000 Pa, denser near the surface and progressively sparser with height; for the physical parameterizations of the WRF model, the Thompson scheme is used for cloud microphysics, the RRTMG scheme for long-wave and short-wave radiation, the revised MM5 Monin-Obukhov scheme for the surface layer, the Noah-MP scheme for the land surface, the YSU scheme for the boundary layer, and the Kain-Fritsch scheme for cumulus convection; simulation data are generated separately with the standard WRF model and the corrected WRF-BEM model, and the canopy parameters of each local climate zone (LCZ) are set according to the relevant building-design documents of the city in the selected area and field survey results.
  4. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 3, wherein step 3 is implemented as follows: three base models are first trained separately, namely a random forest (RF) model, an extreme gradient boosting (XGB) model and a gradient boosting decision tree (GBDT) model; Bayesian optimization is used to obtain the best prediction performance of each base model under a specific parameter configuration, that is, predicted canopy meteorological data (air temperature, humidity and wind speed) closer to the actual observations; linear regression is then performed on the outputs of the base models to obtain the final ensemble model; machine learning training is carried out separately for each of the 3 meteorological elements, with mean absolute error (MAE), root mean square error (RMSE) and the coefficient of determination R² as the performance metrics, and the best model is selected by evaluating these metrics; the SHAP method from interpretable machine learning is introduced, so that SHAP values show directly which features have the greater influence on the model predictions and the direction of that influence; all meteorological data are divided into a training set and a test set, accounting for 80% and 20% of the total data respectively; the input to the machine learning model consists of 9 features, namely WRF-simulated temperature, moisture content, wind speed and rainfall, the station code, and the local LCZ feature variables (maximum building height, building fraction, grassland fraction and tree fraction); mappings to the weather-station observed temperature, moisture content and wind speed are established separately to obtain the machine learning predictions.
  5. The method according to claim 4, wherein the specific parameter configuration in step 3 is the optimal hyperparameter combination determined for each base model by a Bayesian optimization algorithm; for the random forest model, the tuned parameters comprise the number of trees n_estimators, the maximum depth max_depth, the minimum number of samples to split a node min_samples_split, the minimum number of samples per leaf node min_samples_leaf, and the random feature selection strategy max_features; for the XGBoost model, the tuned parameters comprise the learning rate, the number of boosting rounds n_estimators, the maximum tree depth max_depth, the sample and feature sampling ratios subsample and colsample_bytree, and the regularization parameter reg_alpha; for the GBDT model, the tuned parameters comprise the number of boosting stages n_estimators, the learning rate, the maximum tree depth max_depth, and the subsampling ratio; in each case the best parameter combination is found by search.
  6. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 5, wherein the final ensemble model in step 3 is built by stacked generalization (Stacking), and the specific process comprises: first, training the three base models (random forest, XGBoost and GBDT) separately with their respective optimal parameter configurations; then, using the three trained base models to predict on the validation set, yielding three sets of predictions that serve as meta-features; and finally, taking the predictions of the three base models as input features and the true labels of the validation set as the target variable to train a linear regression model as the meta-learner, wherein the linear regression model automatically learns the optimal weight coefficient for each base model's predictions, and the final ensemble prediction is obtained as a weighted linear combination of the three base models' predictions.
  7. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 6, wherein step 4 is implemented as follows: compared with the standard WRF model, WRF+BEM markedly reduces the simulation error of each meteorological element and improves the correlation; the simulated air temperature and specific humidity correlate strongly with the measured values, while the wind speed correlates weakly; the accuracy of the simulated wind speed in wind-level classification is therefore additionally evaluated with the TS index, which is defined on a classification basis and calculated as follows: a wind speed that falls within a given wind-level interval counts as a positive case and otherwise as a negative case; wind speeds are classified by level, with 0.1 m/s to 0.2 m/s being level 0, 0.3 m/s to 1.5 m/s level 1, 1.6 m/s to 3.3 m/s level 2, and 3.4 m/s to 5.4 m/s level 3; a TS value above 0.6 indicates good forecasting performance, a TS value between 0.2 and 0.4 indicates average performance, and a TS value below 0.2 indicates poor accuracy.
  8. The method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model according to claim 7, wherein step 5 is implemented as follows: 1) air temperature training and prediction error: ML WRF and ML WRF-UCM denote the machine learning results obtained with training data simulated by the standard WRF model and by the corrected WRF-BEM model, respectively; 2) specific humidity training and prediction error: the specific humidity error is reduced after processing by the machine learning statistical model; 3) wind speed training and prediction error: the wind speed error is reduced after processing by the machine learning statistical model.
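
The meta-learner step described in claims 4 and 6 can be sketched in isolation as follows. This is a minimal illustration, not the patented implementation: it assumes the three base models (RF, XGB, GBDT) have already produced validation-set predictions, replaces those predictions with synthetic stand-in numbers, and fits the linear-regression weights with NumPy least squares. The function names `fit_meta_learner`, `predict_meta`, `mae`, `rmse` and `r2` are hypothetical, and the inclusion of an intercept term is an assumption.

```python
import numpy as np

def fit_meta_learner(base_preds, y_true):
    """Fit linear-regression weights (plus an intercept) on base-model predictions."""
    # Prepend a column of ones so the first coefficient acts as the intercept.
    X = np.column_stack([np.ones(len(y_true)), base_preds])
    coef, *_ = np.linalg.lstsq(X, y_true, rcond=None)
    return coef  # [intercept, w_rf, w_xgb, w_gbdt]

def predict_meta(coef, base_preds):
    """Weighted linear combination of the base-model predictions."""
    return coef[0] + base_preds @ coef[1:]

def mae(y, p):
    return float(np.mean(np.abs(y - p)))

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

def r2(y, p):
    ss_res = np.sum((y - p) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic stand-in for observed 2 m air temperature on the validation set
# (illustrative numbers only, not data from the patent).
rng = np.random.default_rng(0)
y_val = 15.0 + 10.0 * np.sin(np.linspace(0.0, 6.0 * np.pi, 200))

# Synthetic stand-ins for the three base-model predictions (RF, XGB, GBDT),
# each with its own assumed bias and noise level.
base_val = np.column_stack([
    y_val + rng.normal(0.5, 1.0, 200),
    y_val + rng.normal(-0.3, 0.8, 200),
    y_val + rng.normal(0.0, 1.2, 200),
])

coef = fit_meta_learner(base_val, y_val)
stacked = predict_meta(coef, base_val)

for name, pred in [("RF", base_val[:, 0]), ("XGB", base_val[:, 1]),
                   ("GBDT", base_val[:, 2]), ("stacked", stacked)]:
    print(name, round(mae(y_val, pred), 3), round(rmse(y_val, pred), 3),
          round(r2(y_val, pred), 3))
```

On the data used to fit the weights, the stacked RMSE cannot exceed that of any single base model, because putting all weight on one base model is itself a candidate solution of the least-squares problem. Performance on the held-out 20% test set still has to be checked separately.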
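
Claim 7 names a TS index over wind levels, but the extracted text does not reproduce its formula. The sketch below therefore assumes the conventional threat score, hits / (hits + false alarms + misses), evaluated for one wind level at a time. The level boundaries come from claim 7; treating speeds below 0.1 m/s as level 0, returning `None` above 5.4 m/s, and the names `wind_level` and `threat_score` are assumptions made for illustration. The sample speeds are invented.

```python
from typing import Optional, Sequence

# Upper bounds (m/s) of the wind levels listed in claim 7, paired with the level.
LEVEL_UPPER_BOUNDS = [(0.2, 0), (1.5, 1), (3.3, 2), (5.4, 3)]

def wind_level(speed: float) -> Optional[int]:
    """Map a wind speed in m/s to its level; None above the listed range."""
    s = round(speed, 1)  # the claim states the intervals to 0.1 m/s
    for upper, level in LEVEL_UPPER_BOUNDS:
        if s <= upper:
            return level
    return None

def threat_score(sim: Sequence[float], obs: Sequence[float], level: int) -> float:
    """Assumed TS for one level: hits / (hits + false alarms + misses)."""
    hits = false_alarms = misses = 0
    for s, o in zip(sim, obs):
        sim_in = wind_level(s) == level  # simulated speed falls in this level
        obs_in = wind_level(o) == level  # observed speed falls in this level
        if sim_in and obs_in:
            hits += 1
        elif sim_in:
            false_alarms += 1
        elif obs_in:
            misses += 1
    denom = hits + false_alarms + misses
    return hits / denom if denom else float("nan")

# Invented sample speeds (m/s), for illustration only.
sim_speed = [1.0, 2.0, 2.5, 0.5, 3.0]
obs_speed = [1.2, 1.4, 2.8, 0.6, 3.6]

ts_level_1 = threat_score(sim_speed, obs_speed, level=1)
ts_level_2 = threat_score(sim_speed, obs_speed, level=2)
print(ts_level_1, ts_level_2)
```

Under this assumed definition, pairs in which neither value falls in the level are ignored, so the score is not inflated by the many hours in which a given level simply does not occur.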

Description

Urban canopy meteorological simulation data correction method based on combination of ML and corrected WRF-BEM model

Technical Field

The invention belongs to the technical field of correcting simulated urban canopy meteorological data, and in particular relates to a method for correcting urban canopy meteorological simulation data based on a combination of ML and a corrected WRF-BEM model.

Background

As urban populations grow worldwide, cities face a range of climatic and environmental problems, such as urban heat islands, air pollution and extreme weather. These problems greatly affect the health of urban residents, increase energy and resource consumption, and challenge sustainable urban development. Understanding urban climate and environmental characteristics in depth, and proposing effective mitigation strategies and sound urban planning advice, have therefore become a focus of current research. Achieving this, however, requires accurate city-scale canopy meteorological simulation data as a basis. The mesoscale model WRF (Weather Research and Forecasting Model) provides a powerful tool and scientific support for this purpose: WRF can simulate the meteorological conditions of urban areas, including temperature, humidity, wind speed and precipitation, at high resolution (e.g., 1 km or finer) while accounting for the complex topography and land-use patterns of the city. Such high-precision simulation data provide an important basis for studying urban climate phenomena such as the urban heat island effect and local circulation, and lay a solid foundation for formulating sound environmental policies and urban planning.
Over the last decade, researchers have done much work to improve the accuracy of WRF simulation. For example, to describe more accurately the physical processes of heat, momentum and water vapor exchange in urban environments, the Urban Canopy Model (UCM) was developed and coupled into the WRF model in 2004. UCM accounts for the effects of the complex urban underlying surface, including shading by urban canopy buildings, reflection of short-wave and long-wave radiation, wind profiles within the canopy, and multi-layer heat transfer equations for roofs, walls and road surfaces. Extensive research has shown that including UCM improves the correlation between WRF simulation data and observations and significantly reduces the RMSE of air temperature and humidity, giving a more accurate picture of the climate and environment of urban areas. However, because of the inherent complexity and uncertainty of the climate system, climate dynamic processes remain difficult to describe and characterize effectively, and the output variables of the physically based WRF model still deviate considerably from observations, which limits the reliability of the WRF model in practical applications. For this reason, some researchers have tried to reduce simulation errors by manually correcting the simulation data afterwards. For example, Bhati et al., drawing on their earlier results, lowered the WRF-simulated air temperature by about 2.78 ℃ and the relative humidity by about 11.23%, cutting the simulation errors by nearly half. This correction method has an obvious drawback: WRF simulation errors vary in time and space, so it is not reasonable to apply a uniform correction to the meteorological data at all times and all locations within the city.
Correcting the meteorological data differently at different times and places requires processing a large amount of simulation and observation data and establishing a mapping between the two, which exceeds the capacity of traditional methods. In recent years, machine learning (Machine Learning, ML) has shown significant advantages in the meteorological and environmental fields. With its strong data processing and pattern recognition capabilities, it can automatically learn and extract features from massive data and find the complex mapping between simulation and observation data. In addition, machine learning handles nonlinear relationships and computes efficiently, which helps overcome the limitations of manual correction and achieves more accurate and efficient data correction. Recently, ML models have been combined with WRF to successfully correct simulation results for rainfall, snowfall, ozone concentration, wind energy, solar energy and the like, providing more accurate guidance for local production and daily life. Although ML is increasingly used in the meteorological and environmental fields, studies on ML correction for urban canopy internal air temperatur