CN-122019682-A - Landmark isotherm extraction and semantic text generation method
Abstract
The invention discloses a method for extracting a landmark isotherm and generating semantic text. The method comprises: identifying valid points, isolated points and invalid points in an original data matrix A and marking them at the corresponding positions of an accompanying matrix B; interpolating the accompanying matrix B and modifying the corresponding positions of the original data matrix A; smoothing the interpolated data matrix A1 and modifying the marker values at the corresponding positions of the accompanying matrix B1; removing the interior points of the isothermal area from the smoothed accompanying matrix B2; storing the valid points of the fine-screened accompanying matrix B3; addressing and splicing the valid points of the accompanying matrix B3 through steps P6 and P7 to connect the isotherms; finding, through step P8, the first isotherm that crosses the 5 km buffer zone of the Chinese national border line, which is the landmark isotherm; and generating semantic information. The method addresses the problem that existing landmark isotherm extraction methods struggle to meet the application requirements of meteorological services.
Inventors
- CHEN JINGHUA
- CAO LEI
- ZHANG ZHIFU
- YANG HEPING
- CHEN NAN
- ZHOU XIAOKANG
Assignees
- 国家气象信息中心(中国气象局气象数据中心)
Dates
- Publication Date
- 20260512
- Application Date
- 20260305
Claims (10)
- 1. A method for extracting a landmark isotherm and generating semantic text, characterized by comprising the following steps: Step P1, coarse screening: storing air temperature values obtained by parsing meteorological grid data into an original data matrix A, screening and identifying valid points, isolated points and invalid points in the original data matrix A, and marking them at the corresponding positions in an accompanying matrix B; Step P2, interpolation: performing eight-neighborhood interpolation around the valid points of the accompanying matrix B to obtain an accompanying matrix B1, traversing the accompanying matrix B1, and changing the air temperature value at the position in the original data matrix A corresponding to each valid point to the temperature value of the landmark isotherm, to obtain a data matrix A1; Step P3, smoothing: performing eight-neighborhood smoothing around the valid points in the data matrix A1 to obtain a data matrix A2, traversing the data matrix A2, and modifying the marker values at the corresponding positions of the valid points in the accompanying matrix B1 to obtain an accompanying matrix B2; Step P4, fine screening: traversing the accompanying matrix B2 and, if all points in the eight-neighborhood of any point M in the accompanying matrix B2 are valid points, judging point M to be an interior point of the isothermal area and removing it, to obtain a fine-screened accompanying matrix B3; Step P5, warehousing: storing the valid points retained in the accompanying matrix B3 in a spatial database, sorting them by latitude, and creating a GiST index on the point-type field; Step P6, addressing: searching for a starting point in the accompanying matrix B3 through the spatial database, and searching for the next valid point according to the travel direction and a counterclockwise eight-neighborhood addressing method to connect the isotherm; Step P7, line connection: when addressing fails to find the target value, connecting the isotherm by neighborhood retrieval through the spatial database; Step P8, identification: ordering the extracted isotherms by generation sequence from low latitude to high latitude, and finding the first isotherm that crosses the 5 km buffer zone of the east and west national border lines, which is the landmark isotherm; and Step P9, semantic generation: performing overlay calculation on the extracted landmark isotherm and geospatial data to generate semantic information.
- 2. The method for extracting a landmark isotherm and generating semantic text according to claim 1, wherein step P1 comprises the following steps: Step P101: parsing the meteorological grid data to obtain air temperature values and storing them in a two-dimensional array A as the original data matrix A; defining a two-dimensional array B with exactly the same dimensions as the two-dimensional array A as the accompanying matrix B; and simultaneously obtaining the grid spatial resolution, data time, observation element code, data starting longitude, data starting latitude, number of observation element grid rows and number of observation element grid columns of the meteorological grid data; Step P102: defining an interval around the value of the landmark isotherm as the target value range, the target value range being the target value ±0.05 ℃; traversing the original data matrix A; if the air temperature value of any point M in the original data matrix A is within the target value range and the air temperature value of at least one of its eight neighborhood positions in the original data matrix A is also within the target value range, marking the corresponding position of point M in the accompanying matrix B as a valid point and assigning it the valid value; if the air temperature value of point M is within the target value range but none of the air temperature values at its eight neighborhood positions is within the target value range, marking the corresponding position of point M in the accompanying matrix B as an isolated point and assigning it the invalid value; and if the air temperature value of point M in the original data matrix A is NaN or 999999, marking the corresponding position of point M in the accompanying matrix B as an invalid point and assigning it the invalid value; wherein the eight-neighborhood consists of the neighboring grid points in the eight directions east, west, south, north, southeast, northeast, southwest and northwest of the position of any point.
- 3. The method for extracting a landmark isotherm and generating semantic text according to claim 2, wherein step P2 comprises the following steps: Step P201, setting the interpolation operator: selecting the grid points of the eight-neighborhood around any point to be interpolated and, if the air temperature values of at least two grid points in the eight-neighborhood are within the target value range, changing the point to be interpolated into a valid point and assigning it the valid value; Step P202: defining a temporary matrix C with exactly the same dimensions as the accompanying matrix B, traversing the accompanying matrix B, applying the interpolation operator to each point to be interpolated in the accompanying matrix B, and writing the result to the corresponding position of the temporary matrix C; Step P203: after interpolation of all points to be interpolated in the accompanying matrix B is finished, assigning the values of the temporary matrix C to the accompanying matrix B; Step P204: repeating steps P202 and P203 for 8 to 12 interpolation iterations to obtain the accompanying matrix B1; and Step P205: traversing the accompanying matrix B1 and, if the position of a point to be interpolated in the accompanying matrix B1 holds the valid value, changing the corresponding position in the original data matrix A to the temperature value of the landmark isotherm, to obtain the data matrix A1.
- 4. The method for extracting a landmark isotherm and generating semantic text according to claim 3, wherein step P3 comprises the following steps: Step P301, setting the smoothing operator: defining the average of the air temperature values of the eight neighborhood grid points around any point as the smoothing operator; Step P302: defining a temporary matrix D with exactly the same dimensions as the data matrix A1, traversing the data matrix A1, applying the smoothing operator to each point in the data matrix A1, and writing the result to the corresponding position of the temporary matrix D; Step P303: after smoothing of all points in the data matrix A1 is finished, assigning the values of the temporary matrix D to the data matrix A1; Step P304: repeating steps P302 and P303 for 25 to 35 smoothing iterations to obtain the data matrix A2; and Step P305: traversing the data matrix A2 and, according to the air temperature value of each point in the data matrix A2, re-marking the valid points, isolated points and invalid points in the accompanying matrix B1 by the method of step P102, to obtain the accompanying matrix B2.
- 5. The method for extracting a landmark isotherm and generating semantic text according to claim 4, wherein step P5 comprises the following steps: Step P501, constructing and assigning the data storage class: the variables of the data storage class comprise index number, row number, column number, longitude, latitude and point type; the corresponding parameters of each point in the accompanying matrix B3 are assigned to an object generated from the data storage class, and the objects are stored in sequence in an array list A of the spatial database; the index number of each point in the accompanying matrix B3 is: ID = i*n + j (1); in formula (1), ID is the index number, i is the row number, n is the number of columns, and j is the column number; Step P502, filtering by national boundary data: traversing the array list A and removing the objects whose geographic positions are not within the national boundary, to obtain an array list B; Step P503, spatial database storage and index creation: storing the array list B in the spatial database in batches; and Step P504, national border line buffer retrieval and latitude sorting: based on the retrieval function of the spatial database, computing the spatial containment relation between the points stored in step P503 and the 5 km buffer zone of China's national border line, extracting the target points obtained by the retrieval, and sorting them by latitude from low to high.
- 6. The method for extracting a landmark isotherm and generating semantic text according to claim 5, wherein the addressing in step P6 comprises the following steps: Step P601, determining the starting point and the initial addressing direction: the points sorted in step P504 are used as starting points, travel to the right from each starting point is defined as the initial travel direction of landmark isotherm addressing, and the first point on the right side of the travel direction is defined as the starting point of eight-neighborhood addressing; Step P602, calculating the travel direction: for any point other than the isotherm addressing starting point, the travel direction is determined from the azimuth between the longitude and latitude of the current point P2 and those of the previous point P1; Step P603: according to the travel direction calculated in step P602, taking the first neighborhood point on the right side of the current point P2 as the start, the eight neighborhood points around the current point are examined in turn, counterclockwise, in the accompanying matrix B3; the first valid point O found is the next target point of isotherm addressing; the valid point O is then marked as P2 and the original P2 is updated to P1; Step P604: steps P602 and P603 are executed iteratively in a loop; by continuously updating the travel direction, searching for neighboring valid points, updating the point coordinate marks and storing the point cloud data, automatic extension addressing of the isotherm is achieved, until the trajectory search and point cloud set construction of the entire isotherm are complete.
- 7. The method for extracting a landmark isotherm and generating semantic text according to claim 6, wherein in step P602, assuming that the longitude and latitude coordinates of the current point P2 are P2(lon2, lat2) and those of the previous point P1 are P1(lon1, lat1), the calculation is performed as follows: Step P6021, calculating the latitude and longitude differences between the current point P2 and the previous point P1: the latitude difference is $\Delta lat = lat_2 - lat_1$, where a positive latitude difference indicates that the current point P2 is north of the previous point P1 and a negative latitude difference indicates that P2 is south of P1; the longitude difference is $\Delta lon = lon_2 - lon_1$, where a positive longitude difference indicates that the current point P2 is east of the previous point P1 and a negative longitude difference indicates that P2 is west of P1; if Δlat = 0 and Δlon = 0, the current point P2 and the previous point P1 are directly determined to be the same position and the algorithm terminates; Step P6022, calculating the initial azimuth: the azimuth is the angle measured clockwise from the north direction at point P1 to the line P1P2; taking Δlat and Δlon as the vertical and horizontal legs of a right triangle, the initial azimuth θdeg0 is calculated as $\theta_{rad} = \operatorname{arctan2}(\Delta lon, \Delta lat)$, $\theta_{deg0} = \theta_{rad} \times 180/\pi$; Step P6023, correcting any initial azimuth smaller than 0, the corrected azimuth being $\theta_{deg} = \theta_{deg0} + 360^\circ$; Step P6024, determining the direction of the current point P2 relative to the previous point P1, i.e. the travel direction, from the interval in which the corrected azimuth θdeg lies, according to the following rules: (1) when θdeg is within 337.5° to 360° or 0° to 22.5°, P2 is north of P1, with Δlat > 0 and Δlon ≈ 0; (2) when θdeg is within 22.5° to 67.5°, P2 is northeast of P1, with Δlat > 0 and Δlon > 0; (3) when θdeg is within 67.5° to 112.5°, P2 is east of P1, with Δlon > 0 and Δlat ≈ 0; (4) when θdeg is within 112.5° to 157.5°, P2 is southeast of P1; (5) when θdeg is within 157.5° to 202.5°, P2 is south of P1, with Δlat < 0 and Δlon ≈ 0; (6) when θdeg is within 202.5° to 247.5°, P2 is southwest of P1, with Δlat < 0 and Δlon < 0; (7) when θdeg is within 247.5° to 292.5°, P2 is west of P1, with Δlon < 0 and Δlat ≈ 0; (8) when θdeg is within 292.5° to 337.5°, P2 is northwest of P1, with Δlat > 0 and Δlon < 0.
- 8. The method for extracting a landmark isotherm and generating semantic text according to claim 7, wherein in step P7, if none of the eight neighborhood points around the current point in step P603 is a valid point, neighborhood retrieval is performed through the spatial database to connect the isotherm, the connection comprising the following steps: Step P701: the current point is denoted p0(lon0, lat0); the adjacent points of the current point p0 are retrieved using the spatial database and sorted from near to far by their spatial distance from the current point p0, giving an ordered adjacent point sequence (p1, p2, ..., pn) whose longitude and latitude coordinates are p1(lon1, lat1), p2(lon2, lat2), ..., pn(lonn, latn); the spatial distance between each adjacent point and the current point p0 is calculated according to the following formula: $p_0p_n = \sqrt{(lon_n - lon_0)^2 + (lat_n - lat_0)^2}$ (4); in formula (4), p0pn is the spatial distance from the adjacent point pn to the current point p0, lonn and latn are the longitude and latitude of the adjacent point pn, and lon0 and lat0 are the longitude and latitude of the current point p0; Step P702: selecting an access point from the ordered adjacent point sequence by longitude, latitude and spatial distance, and storing the access point found in an array list S, the access point criterion being that the point in the ordered adjacent point sequence which has a larger longitude and a smaller latitude than the current point p0 and is closest to the current point p0 is selected as the access point; if no point satisfying the access point criterion is found, the point in the ordered adjacent point sequence which has a larger longitude than the current point p0 and is closest to it is selected as the access point; Step P703: repeating steps P602 to P702 to achieve automatic extension addressing of the isotherm; when steps P602 to P702 cannot find a next point, or the 5 km buffer zone of China's eastern national border line is reached, addressing of the first isotherm ends; Step P704: for the target points retrieved in step P504, repeating step P6 and step P7 in turn until addressing of all isotherms ends.
- 9. The method for extracting a landmark isotherm and generating semantic text according to claim 8, wherein in step P702 the access point is selected as follows: the adjacent points in the ordered adjacent point sequence are examined in sequence order against the access point criterion: if lon1 > lon0 and lat1 < lat0, the adjacent point p1 is directly selected as the access point; otherwise the next adjacent point is examined in order, and if lonn > lon0, latn < lat0 and the spatial distance p0pn is the smallest among such points, the adjacent point pn is selected as the access point; if none of the adjacent points in the ordered adjacent point sequence satisfies the access point criterion, the adjacent point whose longitude is greater than lon0 and which is closest to the current point p0 is selected as the access point; otherwise, addressing of the whole isotherm ends.
- 10. The method for extracting a landmark isotherm and generating semantic text according to any one of claims 1 to 9, characterized in that step P9 comprises the following steps: Step P901: performing overlay calculation on the extracted landmark isotherm and meteorological geospatial data through the spatial database, and determining through topological relations the provinces overlaid by the line segments of the landmark isotherm; and Step P902: arranging the provinces found in spatial order from west to east and, in combination with the time information of the meteorological grid data, organizing them into semantic text.
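The coarse screening of step P102 (claim 2) can be illustrated with a minimal Python sketch. The marker constants, the function names and the list-of-lists grid are assumptions made for illustration only. The claim assigns the invalid value to both isolated and invalid points; the sketch keeps three distinct labels so the three point types remain visible. Treating out-of-range temperatures as invalid points is also an assumption.

```python
import math

# Hypothetical point-type labels (not specified in the source).
VALID, ISOLATED, INVALID = 1, 2, 0

# Row/column offsets of the eight neighbouring grid points.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def in_target_range(t, target, tol=0.05):
    """True if t is a usable temperature within target +/- tol."""
    if math.isnan(t) or t == 999999:
        return False
    return abs(t - target) <= tol


def coarse_screen(a, target, tol=0.05):
    """Build the accompanying matrix B from the original data matrix A."""
    rows, cols = len(a), len(a[0])
    b = [[INVALID] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if not in_target_range(a[i][j], target, tol):
                continue  # NaN, 999999 or out of range: stays INVALID (assumption)
            has_neighbour = any(
                0 <= i + di < rows
                and 0 <= j + dj < cols
                and in_target_range(a[i + di][j + dj], target, tol)
                for di, dj in NEIGHBOURS
            )
            b[i][j] = VALID if has_neighbour else ISOLATED
    return b
```

For a 0 ℃ landmark isotherm, a call such as `coarse_screen(grid, 0.0)` labels every cell within ±0.05 ℃ that has at least one in-range neighbour as valid.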
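Steps P301 to P304 (claim 4) describe an iterated eight-neighborhood mean that is computed into a temporary matrix D before being copied back, so that every point in one pass is smoothed from the same input. The sketch below shows that pattern under stated assumptions: grid edges average only the neighbours that exist, the grid contains no NaN values, and the default of 30 iterations is simply one value inside the claimed range of 25 to 35. Function names are hypothetical.

```python
def smooth_once(a1):
    """One smoothing pass: each point becomes the mean of its eight neighbours."""
    rows, cols = len(a1), len(a1[0])
    d = [row[:] for row in a1]  # temporary matrix D, same dimensions as A1
    for i in range(rows):
        for j in range(cols):
            vals = [
                a1[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
                and 0 <= i + di < rows
                and 0 <= j + dj < cols
            ]
            if vals:  # a 1x1 grid has no neighbours and is left unchanged
                d[i][j] = sum(vals) / len(vals)
    return d


def smooth(a1, iterations=30):
    """Repeat the pass; reading from the old matrix and writing to D each time."""
    a = a1
    for _ in range(iterations):
        a = smooth_once(a)
    return a
```

Writing into D rather than into A1 directly is what keeps the result independent of traversal order.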
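The fine screening of step P4 (claim 1) judges a point to be an interior point when all eight of its neighbours are valid; the abstract indicates that such points are removed from B2 to give B3. A minimal sketch follows, assuming hypothetical marker values and assuming that edge cells, which lack a full eight-neighborhood, are never treated as interior points.

```python
VALID, INVALID = 1, 0  # hypothetical marker values


def fine_screen(b2):
    """Return B3: a copy of B2 with interior points of the isothermal area removed."""
    rows, cols = len(b2), len(b2[0])
    b3 = [row[:] for row in b2]
    # Only cells with a complete eight-neighbourhood are candidates (assumption).
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            if b2[i][j] != VALID:
                continue
            surrounded = all(
                b2[i + di][j + dj] == VALID
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            if surrounded:
                b3[i][j] = INVALID  # tests read B2, so removals do not cascade
    return b3
```

Applied to a solid block of valid points, the sketch leaves only the outline of the block, which is the part later traced as a line.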
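The travel direction calculation of claim 7 (latitude and longitude differences, arctan2 azimuth, negative-angle correction, eight 45° sectors) can be sketched as follows. The function name and the compass abbreviations are illustrative. The claim does not say which sector owns a boundary angle such as 22.5°; this sketch assigns each boundary to the sector that begins there, which is an assumption.

```python
import math

# Sector labels in clockwise order starting from north.
DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def travel_direction(lon1, lat1, lon2, lat2):
    """Direction of P2(lon2, lat2) relative to P1(lon1, lat1), or None if identical."""
    dlat = lat2 - lat1  # positive: P2 is north of P1
    dlon = lon2 - lon1  # positive: P2 is east of P1
    if dlat == 0 and dlon == 0:
        return None  # same position: the claim terminates here
    theta = math.degrees(math.atan2(dlon, dlat))  # clockwise from north
    if theta < 0:
        theta += 360.0  # corrected azimuth in [0, 360)
    # Shift by half a sector so that north spans 337.5..360 and 0..22.5.
    sector = int(((theta + 22.5) % 360.0) // 45.0)
    return DIRECTIONS[sector]
```

Passing the longitude difference as the first argument of `atan2` is what makes 0° point north and angles increase clockwise, matching the azimuth definition in step P6022.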
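The access point selection of claims 8 and 9 can be sketched in a few lines. In the claims the adjacent points come from a spatial database query; here they are passed in as a plain list of (lon, lat) tuples, and the spatial distance is taken to be the planar Euclidean distance in degrees. Both choices, and the function name, are assumptions of this sketch.

```python
import math


def select_access_point(p0, neighbours):
    """Pick the access point for current point p0 from candidate (lon, lat) tuples."""
    lon0, lat0 = p0
    # Ordered adjacent point sequence: nearest first.
    ordered = sorted(
        neighbours, key=lambda p: math.hypot(p[0] - lon0, p[1] - lat0)
    )
    # Primary criterion: larger longitude AND smaller latitude, nearest such point.
    for lon, lat in ordered:
        if lon > lon0 and lat < lat0:
            return (lon, lat)
    # Fallback: larger longitude only, nearest such point.
    for lon, lat in ordered:
        if lon > lon0:
            return (lon, lat)
    return None  # no candidate: addressing of this isotherm ends
```

Because the sequence is sorted before it is scanned, the first match in each loop is automatically the closest point that satisfies the criterion.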
Description
Landmark isotherm extraction and semantic text generation method
Technical Field
The invention relates to the technical field of meteorological grid data processing, and in particular to a method for extracting a landmark isotherm and generating semantic text.
Background
In meteorological services and related industry applications, a landmark isotherm is a complete, representative isotherm extending from west to east across China, typically 0 ℃, -5 ℃, 10 ℃ and so on. The position of the landmark isotherm provides a key reference for agricultural production planning, transport scheduling, and the arrangement of public production and daily life in China. Around the change of seasons, forecasters at the central meteorological observatory review observed weather and analyze future weather trends with reference to the landmark isotherm. Accurate extraction and complete plotting of the landmark isotherm, together with automatic generation of the related semantic text, have therefore become core requirements for ensuring the efficiency of weather consultations and improving service quality.
However, in the weather service consultation scenario, several problems remain in the processing and application of landmark isotherms. On the one hand, because service systems and statistical methods differ, the standards for marking landmark isotherms are not uniform, and a segmentation and convergence problem is common when isotherms are extracted and drawn from live grid data: owing to differences in extraction methods, the drawn line segments are uneven and chaotically arranged, which seriously reduces the efficiency with which forecasters read and analyze the charts. On the other hand, semantic text about the landmark isotherm is still mainly produced manually: a forecaster or consultation support staff member must read the chart to understand the isotherm and its features, and then write the text by hand.
At present, existing isotherm extraction techniques are mainly divided into numerical-calculation-based methods and image-processing-based methods, neither of which meets the specific requirements of meteorological service support work for landmark isotherm processing. Image-processing-based methods extract isotherms from carriers such as weather fax images by means of interference elimination, traversal recognition and similar techniques; they are simple to implement and efficient, but depend heavily on image quality: noise and interfering information can seriously degrade the extraction result, and complex isotherm shapes are difficult to handle. The two kinds of methods are designed for discrete point data and image products respectively, differ markedly from the application requirements of the landmark isotherm in meteorological services in terms of extraction precision, plotting completeness and suitability for semantic generation, and cannot meet the requirements of early weather consultation support and of various industries for refined meteorological data services. Accordingly, developing a method that adapts to grid data of different resolutions (including live, forecast and analysis grid data) and achieves accurate extraction, complete plotting and automatic semantic text generation for the landmark isotherm has become a key problem to be solved in the current field of meteorological services.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to provide a method for extracting a landmark isotherm and generating semantic text, so as to solve the problem that existing landmark isotherm extraction methods struggle to meet the application requirements of meteorological services in terms of extraction precision, plotting completeness and suitability for semantic generation, and thereby to meet the requirements of early weather consultation support and of various industries for refined meteorological data services.
In order to solve the above technical problem, the invention provides the following technical solution: a method for extracting a landmark isotherm and generating semantic text, comprising the following steps: Step P1, coarse screening: storing air temperature values obtained by parsing meteorological grid data into an original data matrix A, screening and identifying valid points, isolated points and invalid points in the original data matrix A, and marking them at the corresponding positions in an accompanying matrix B; Step P2, interpolation: performing eight-neighborhood interpolation around the valid points of the accompanying matrix B through an interpolation operator to obt