CN-122019987-A - Soil moisture content time sequence data processing method and device
Abstract
The invention provides a method and a device for processing soil moisture content time series data, relating to the technical field of data processing. The method comprises: collecting, through a preset sensor, initial soil moisture content time series data corresponding to one or more preset monitoring points; deleting data outside a first preset threshold interval from the initial soil moisture content time series data to obtain first soil moisture content time series data; deleting data that repeat consecutively for a preset number of times from the first soil moisture content time series data to obtain second soil moisture content time series data; deleting jump values from the second soil moisture content time series data to obtain third soil moisture content time series data; and deleting abnormal fluctuation values from the third soil moisture content time series data to obtain target soil moisture content time series data. The method and device provided by the invention can accurately identify and remove abnormal data in soil moisture content time series data and improve the quality of the processed data.
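The first two cleaning stages named in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Sample` layout, the function names `drop_out_of_range` and `drop_repeated_runs`, the threshold values, and the choice to delete an entire repeated run (rather than keep its first value) are all assumptions made for the example.

```python
from typing import List, Tuple

# (timestamp, moisture content); the layout and units are assumptions of this sketch.
Sample = Tuple[float, float]


def drop_out_of_range(series: List[Sample], low: float, high: float) -> List[Sample]:
    """Stage 1: keep only samples whose value lies inside [low, high]."""
    return [(t, v) for t, v in series if low <= v <= high]


def drop_repeated_runs(series: List[Sample], min_run: int) -> List[Sample]:
    """Stage 2: delete every run of min_run or more identical consecutive values.

    Assumption: the whole run is removed, which is one reading of the text.
    """
    kept: List[Sample] = []
    i = 0
    while i < len(series):
        j = i
        # Extend j to the last sample of the run that starts at i.
        while j + 1 < len(series) and series[j + 1][1] == series[i][1]:
            j += 1
        if j - i + 1 < min_run:
            kept.extend(series[i:j + 1])
        i = j + 1
    return kept


# Hypothetical readings: one out-of-range spike and one stuck-sensor run.
raw = [(0, 0.20), (1, 0.21), (2, 1.50), (3, 0.22), (4, 0.22), (5, 0.22), (6, 0.23)]
first = drop_out_of_range(raw, low=0.0, high=0.6)
second = drop_repeated_runs(first, min_run=3)
```

In practice the interval bounds would come from the sensor range and the site's historical data, as the method states; the values 0.0 and 0.6 above are placeholders.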
Inventors
- Yu Jiashuo
- XIONG CHUNHUA
- LI RUI
- LIU HUANHUAN
- ZHAI SHUHUA
- XU SHANGZHI
- MAO JIAN
- WANG QIANGQIANG
- WANG YUNTAO
- MA SIMING
- WANG YANMEI
Assignees
- 北京市地质灾害防治研究所
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (10)
- 1. A soil moisture content time series data processing method, characterized by comprising the following steps: acquiring, through a preset sensor, initial soil moisture content time series data corresponding to one or more preset monitoring points; deleting data outside a first preset threshold interval from the initial soil moisture content time series data to obtain first soil moisture content time series data, wherein the first preset threshold interval is determined based on the measuring range of the preset sensor and the historical soil moisture data of the preset monitoring point; deleting data that repeat consecutively for a preset number of times from the first soil moisture content time series data to obtain second soil moisture content time series data; deleting jump values from the second soil moisture content time series data to obtain third soil moisture content time series data; and deleting abnormal fluctuation values from the third soil moisture content time series data to obtain target soil moisture content time series data.
- 2. The soil moisture content time series data processing method according to claim 1, wherein the step of determining the jump value comprises: determining a first window width and a second window width, respectively, based on the second soil moisture content time series data; taking the smaller of the first window width and the second window width as an adaptive moving window width; and determining the jump value based on the moving window width.
- 3. The soil moisture content time series data processing method according to claim 2, wherein determining the first window width and the second window width, respectively, based on the second soil moisture content time series data comprises: identifying, in the second soil moisture content time series data, response data whose moisture content change value is greater than a second preset threshold; taking the preset monitoring point corresponding to the response data as a target monitoring point; determining a soil moisture diffusivity based on the soil texture type of the target monitoring point; determining a soil moisture response time based on the soil layer depth of the target monitoring point and the soil moisture diffusivity; and determining the first window width based on the soil moisture response time.
- 4. The soil moisture content time series data processing method according to claim 2, wherein determining the first window width and the second window width, respectively, based on the second soil moisture content time series data further comprises: determining each monitoring period corresponding to the second soil moisture content time series data; extracting, from the second soil moisture content time series data, data corresponding to the median of each monitoring period as resampled data; performing autocorrelation analysis on the resampled data to obtain a target curve of the autocorrelation coefficient as a function of lag time, wherein the lag time is determined based on the monitoring period; and intersecting a horizontal line corresponding to a preset autocorrelation coefficient threshold with the target curve, and taking the maximum lag time corresponding to a point above the horizontal line as the second window width.
- 5. The soil moisture content time series data processing method according to claim 2, wherein determining the jump value based on the moving window width comprises: extracting time period data corresponding to each piece of data to be processed in the second soil moisture content time series data, wherein the time period data are the data within a moving interval corresponding to the data to be processed, the moving interval is centered on the collection time of the data to be processed and comprises a first interval before the collection time and a second interval after the collection time, and the first interval and the second interval each have a length of half the moving window width; calculating a difference value between the average value of the time period data and the corresponding data to be processed; calculating the multiple of a first standard deviation of the time period data that the difference value represents; calculating the absolute deviation between the average value of the time period data and the corresponding data to be processed; and taking, as the jump value, the data to be processed for which the multiple is greater than or equal to a third preset threshold and the absolute deviation is greater than or equal to a fourth preset threshold.
- 6. The soil moisture content time series data processing method according to claim 5, wherein the step of determining the abnormal fluctuation value comprises: dividing the third soil moisture content time series data into a plurality of window data based on the moving window width; calculating a data turning rate and a second standard deviation of the window data; performing linear trend regression on the window data to obtain a corresponding regression line; performing a t-test on the slope of the regression line to obtain a significance test P value; and taking, as the abnormal fluctuation value, the window data for which the data turning rate is greater than or equal to a fifth preset threshold, the second standard deviation is greater than or equal to a sixth preset threshold, and the significance test P value is greater than or equal to a seventh preset threshold.
- 7. A soil moisture content time series data processing device, characterized by comprising: an acquisition module, configured to acquire, through a preset sensor, initial soil moisture content time series data corresponding to one or more preset monitoring points; a first processing module, configured to delete data outside a first preset threshold interval from the initial soil moisture content time series data to obtain first soil moisture content time series data, wherein the first preset threshold interval is determined based on the measuring range of the preset sensor and the historical soil moisture data of the preset monitoring point; a second processing module, configured to delete data that repeat consecutively for a preset number of times from the first soil moisture content time series data to obtain second soil moisture content time series data; a third processing module, configured to delete jump values from the second soil moisture content time series data to obtain third soil moisture content time series data; and a fourth processing module, configured to delete abnormal fluctuation values from the third soil moisture content time series data to obtain target soil moisture content time series data.
- 8. An electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the soil moisture content time series data processing method according to any one of claims 1 to 6 when executing the computer program.
- 9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the soil moisture content time series data processing method according to any one of claims 1 to 6.
- 10. A computer program product comprising a computer program which, when executed by a processor, implements a soil moisture content time series data processing method as claimed in any one of claims 1 to 6.
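The jump-value test of claim 5 can be illustrated with a short sketch. It assumes that the window statistics include the point under test, that the population standard deviation is used, and that the "multiple" is the absolute difference divided by that standard deviation; the claims do not fix these details. The names `find_jump_indices`, `drop_jumps`, `k_sigma`, and `min_abs_dev` are hypothetical, and the threshold values are placeholders.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# (collection time, moisture content); the layout is an assumption of this sketch.
Sample = Tuple[float, float]


def find_jump_indices(series: List[Sample], window_width: float,
                      k_sigma: float, min_abs_dev: float) -> List[int]:
    """Return indices of samples flagged as jump values.

    A sample is flagged when its deviation from the mean of the centered
    window is at least k_sigma standard deviations (third threshold) AND at
    least min_abs_dev in absolute terms (fourth threshold).
    """
    half = window_width / 2.0
    jumps: List[int] = []
    for i, (t_i, v_i) in enumerate(series):
        # Data in the half-window before and after the collection time.
        window = [v for t, v in series if t_i - half <= t <= t_i + half]
        if len(window) < 3:
            continue  # too few neighbours for a meaningful statistic
        sigma = pstdev(window)
        if sigma == 0.0:
            continue  # flat window: no jump can be measured
        abs_dev = abs(v_i - mean(window))
        multiple = abs_dev / sigma
        if multiple >= k_sigma and abs_dev >= min_abs_dev:
            jumps.append(i)
    return jumps


def drop_jumps(series: List[Sample], window_width: float,
               k_sigma: float, min_abs_dev: float) -> List[Sample]:
    """Delete the flagged jump values to obtain the third time series."""
    bad = set(find_jump_indices(series, window_width, k_sigma, min_abs_dev))
    return [s for i, s in enumerate(series) if i not in bad]


# Hypothetical second-stage data with a single spike at t = 4.
values = [0.20, 0.21, 0.20, 0.21, 0.45, 0.21, 0.20, 0.21, 0.20]
second = [(float(t), v) for t, v in enumerate(values)]
third = drop_jumps(second, window_width=6.0, k_sigma=2.0, min_abs_dev=0.05)
```

Requiring both conditions is what keeps the neighbours of the spike in this example: their windows contain the spike, so their means shift, but their absolute deviations stay below the fourth threshold.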
Description
Soil moisture content time sequence data processing method and device
Technical Field
The invention relates to the technical field of data processing, and in particular to a method and a device for processing soil moisture content time series data.
Background
Soil moisture content is one of the core parameters in fields such as geological disaster monitoring and early warning, agricultural production and precision irrigation, and hydrological research. The quality of the monitoring data directly affects the accuracy of analysis results and the reliability of decisions based on them. Soil moisture content time series data therefore need accurate anomaly identification and effective cleaning to improve data quality.
Existing methods for identifying abnormal soil moisture data mostly rely on range checks against a fixed threshold or on simple statistical outlier detection. Although these methods can remove some obvious abnormal values (such as out-of-range data produced by sensor failure), their identification accuracy for most abnormal data is low, with frequent false positives and missed detections. As a result, soil moisture content time series data cannot be cleaned effectively, and their quality remains low.
Disclosure of Invention
The invention provides a method and a device for processing soil moisture content time series data, to solve the technical problem in the prior art that low accuracy in identifying abnormal data leads to low quality of soil moisture content time series data.
The invention provides a soil moisture content time series data processing method, which comprises the following steps: acquiring, through a preset sensor, initial soil moisture content time series data corresponding to one or more preset monitoring points; deleting data outside a first preset threshold interval from the initial soil moisture content time series data to obtain first soil moisture content time series data, wherein the first preset threshold interval is determined based on the measuring range of the preset sensor and the historical soil moisture data of the preset monitoring point; deleting data that repeat consecutively for a preset number of times from the first soil moisture content time series data to obtain second soil moisture content time series data; deleting jump values from the second soil moisture content time series data to obtain third soil moisture content time series data; and deleting abnormal fluctuation values from the third soil moisture content time series data to obtain target soil moisture content time series data.
According to the soil moisture content time series data processing method provided by the invention, the step of determining the jump value comprises: determining a first window width and a second window width, respectively, based on the second soil moisture content time series data; taking the smaller of the first window width and the second window width as an adaptive moving window width; and determining the jump value based on the moving window width.
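The window selection can be sketched as below, following the autocorrelation procedure of claim 4 for the second window width and taking the minimum of the two widths. This is a hedged illustration: it assumes the input has already been resampled to a uniform monitoring period, uses the common biased sample autocorrelation estimator, and treats the first window width as a given input. The function names, `max_lag`, and the numeric values are hypothetical.

```python
from typing import Sequence


def autocorrelation(values: Sequence[float], lag: int) -> float:
    """Biased sample autocorrelation coefficient at the given lag."""
    n = len(values)
    mu = sum(values) / n
    denom = sum((v - mu) ** 2 for v in values)
    if denom == 0.0:
        return 0.0  # constant series: treat as uncorrelated
    num = sum((values[i] - mu) * (values[i + lag] - mu) for i in range(n - lag))
    return num / denom


def second_window_width(values: Sequence[float], period: float,
                        threshold: float, max_lag: int) -> float:
    """Largest lag time whose autocorrelation lies above the threshold line.

    `period` is the (assumed uniform) monitoring period, so lag k corresponds
    to a lag time of k * period.
    """
    best_lag = 0
    for lag in range(1, min(max_lag, len(values) - 1) + 1):
        if autocorrelation(values, lag) > threshold:
            best_lag = lag
    return best_lag * period


def adaptive_window_width(first_width: float, second_width: float) -> float:
    """Adaptive moving window width: the smaller of the two candidate widths."""
    return min(first_width, second_width)


# Hypothetical resampled series (a steady ramp) sampled every 0.5 time units.
ramp = [float(i) for i in range(20)]
w2 = second_window_width(ramp, period=0.5, threshold=0.5, max_lag=10)
width = adaptive_window_width(first_width=2.0, second_width=w2)
```

Taking the minimum means the window never exceeds either the physical response time of the soil layer or the span over which the data remain correlated.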
According to the soil moisture content time series data processing method provided by the invention, determining the first window width and the second window width, respectively, based on the second soil moisture content time series data comprises: identifying, in the second soil moisture content time series data, response data whose moisture content change value is greater than a second preset threshold; taking the preset monitoring point corresponding to the response data as a target monitoring point; determining a soil moisture diffusivity based on the soil texture type of the target monitoring point; determining a soil moisture response time based on the soil layer depth of the target monitoring point and the soil moisture diffusivity; and determining the first window width based on the soil moisture response time.
According to the soil moisture content time series data processing method provided by the invention, determining the first window width and the second window width, respectively, based on the second soil moisture content time series data further comprises: determining each monitoring period corresponding to the second soil moisture content time series data; extracting, from the second soil moisture content time series data, data corresponding to the median of each monitoring period as resampled data; performing autocorrelation analysis on the resampled data to obtain a target curve of the autocorrelation coefficient as a function of lag time, wherein the lag time is determined based on the monitoring period; intersecting a horizontal line corresponding to a preset autocorrelation coefficient t