CN-116821732-B - Self-adaptive space-time dimension ship frequent activity area extraction method
Abstract
The invention belongs to the technical field of ship AIS data clustering and provides a self-adaptive space-time dimension method for extracting frequent ship activity areas. It addresses the problem that current extraction methods consider only the spatial information of track data and ignore time information, which makes the extracted frequent activity areas inaccurate. The method comprises the following steps: S1) preprocessing the ship AIS data; S2) extracting frequent ship activity areas at a single space-time granularity using a grid density peak clustering algorithm; and S3) extracting frequent ship activity areas at multiple space-time granularities. Cluster analysis is performed while keeping the ship AIS data information relatively complete. Compared with traditional methods, the method extracts frequent ship activity areas at multiple space-time granularities more effectively and improves the accuracy of the extraction.
Inventors
- XIONG XUANRUI
- ZHANG FAN
- XIONG LIAN
- JIA YUMEI
- ZHANG YUAN
Assignees
- Chongqing University of Posts and Telecommunications (重庆邮电大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20220318
Claims (1)
- 1. A method for extracting frequent ship activity areas in a self-adaptive space-time dimension, comprising the following steps:
  Step 1, ship AIS data preprocessing:
  Step S1.1, single-track extraction: single tracks are extracted from the ship AIS data by judging whether the ship's sailing speed is 0 and whether the acquisition times of two adjacent data points differ by more than a given time threshold; the i-th track is expressed as a sequence of n track points, where n is the number of track points contained in the track, and each track point records its acquisition time t together with the longitude, latitude and speed of ship i at time t;
  Step S1.2, outlier deletion: the track data are traversed and abnormal points that do not conform to the laws of physical motion are deleted, the abnormal points comprising numerically abnormal points, namely points with a negative speed, an absolute longitude greater than 180 degrees or an absolute latitude greater than 90 degrees, and track drift points, namely points whose sailing speed, calculated from the time interval and spatial distance between adjacent track points, exceeds a preset maximum sailing speed threshold;
  Step S1.3, track interpolation: missing data are filled by linear interpolation, as follows: a time threshold is set; if the time interval between adjacent track points exceeds this threshold, the number of points to insert is calculated using the average sampling time interval of normal sampling of the track, and the attribute values of each inserted track point are then calculated;
  Step 2, extracting frequent ship activity areas at a single space-time granularity:
  Step S2.1, extracting frequent ship activity areas at a single space-time granularity: for a ship track data set with a given time span, and given a time granularity and a spatial granularity, the start time and end time of the next time period are obtained;
  Step S2.2, extracting frequent ship activity areas: acquiring time
period data and adding them to a set; a grid density peak clustering algorithm is adopted to extract the frequent ship activity areas: the longitude and latitude ranges of the study area are each divided uniformly into equal parts to obtain grid objects, the number of ships in each grid object is used as the density statistic, and a density peak clustering algorithm is used to cluster the grid objects, the grid division being such that the number of grids containing data is not less than 1/5 of the total number of grids;
  a point that actually exists in the grid and has the smallest distance to the other data points in the grid is selected as the representative point of the grid object;
  the density of a grid object is the number of ships counted within the current grid;
  the relative distance is computed between grid objects instead of between data points, using the Euclidean distance between two grid objects;
  cluster centers are selected by combining the elbow method with a box plot: first, a density-relative-distance value is calculated for each grid object from its normalized grid density and normalized relative distance, where min-max normalization is used, that is, the normalized value of a data feature equals its value before normalization minus the minimum of that feature, divided by the difference between the maximum and the minimum of that feature; the density-relative-distance values are then sorted in descending order, an inflection point is found by the elbow method, and the grid objects to the left of the inflection point are added to a cluster
center candidate set; this cluster center candidate set is then screened with the box-plot method: all density-relative-distance values are added to a set, the upper quartile and lower quartile of that set are calculated, where n is the length of the set, and an outlier cutoff threshold is defined for the set; grid objects whose values exceed the threshold are added to a second cluster center candidate set; the intersection of the two cluster center candidate sets is taken as the preliminary candidate centers, and the remaining grid objects are assigned to the cluster of their nearest high-density neighbor to obtain the clusters;
  the grids are counted, and for each grid the number of times it serves as the nearest high-density grid of other grids within a preset time period is calculated, the nearest high-density grid being a grid whose number of ships within the preset time period is greater than that of its adjacent grids; when the number of times a grid serves as the nearest high-density grid of other grids equals 0, the maximum grid density of such grids is determined as the hot-spot area screening threshold;
  the frequent activity area of each cluster obtained by the clustering is then defined, the definition using a maximum function; the obtained frequent activity areas are saved to a set, and the density minimum and density maximum within the time period are determined;
  the next time period for extracting frequent activity areas is determined by updating the start time and end time, and step S2.2 is repeated to extract the frequent activity areas of the next time period until the time span has been traversed;
  Step 3, extracting frequent ship activity areas at multiple space-time granularities:
  Step S3.1, extracting frequent ship activity areas at
multiple space-time granularities: the frequent ship activity areas of consecutive adjacent time periods are fused to obtain frequent ship activity areas at a larger time granularity;
  Step S3.2, determining the fusion start time and calculating the next time;
  Step S3.3, determining the starting time period for fusing the frequent activity areas and determining the next time period to be fused;
  Step S3.4, if the frequent activity areas of the two time periods intersect, with intersection A, calculating the grid density minimum min_density and the grid density maximum max_density within the merged time period and adding (time period, A, min_density, max_density) to the set S, which yields the frequent activity area of the merged time period; if they do not intersect, fusing the frequent activity areas of the next time period and returning to step S3.3;
  Step S3.5, when the frequent activity areas of the two time periods in step S3.4 intersect, continuing to extend along the time axis and repeating step S3.4 until the time span has been traversed;
  Step S3.6, fusing the frequent activity areas of the next time period and returning to step S3.3;
  Step S3.7, when the time span has been traversed, ending the fusion of frequent activity areas to obtain the set S of frequent ship activity areas at multiple space-time granularities.
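The fusion loop of steps S3.2 to S3.7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation. The function name `fuse_frequent_areas`, the representation of a frequent activity area as a set of grid cell indices, and the rule that the density of a cell over a merged period is the sum of its per-period ship counts are all hypothetical choices made for the example; the claim text does not fix any of them.

```python
from typing import Dict, FrozenSet, List, Tuple

Cell = Tuple[int, int]    # (row, col) index of a grid cell (assumed representation)
Period = Tuple[int, int]  # (start, end) of a time period, in any consistent unit


def fuse_frequent_areas(
    periods: List[Period],
    areas: List[FrozenSet[Cell]],
    density: List[Dict[Cell, int]],
) -> List[Tuple[Period, FrozenSet[Cell], int, int]]:
    """Fuse frequent areas of consecutive periods while they keep intersecting.

    periods[k], areas[k] and density[k] describe the k-th single-granularity
    time period: its bounds, its frequent cells, and ship counts per cell.
    Returns tuples (merged_period, A, min_density, max_density), the set S.
    """
    fused = []
    for start in range(len(periods) - 1):          # S3.3: choose a starting period
        common = areas[start]
        for nxt in range(start + 1, len(periods)):  # S3.5: extend along the time axis
            common = common & areas[nxt]            # S3.4: intersection A
            if not common:
                break                               # no overlap: move to next start (S3.6)
            # Assumption: density over the merged span is the sum of per-period counts.
            totals = [
                sum(density[k].get(cell, 0) for k in range(start, nxt + 1))
                for cell in common
            ]
            merged_period = (periods[start][0], periods[nxt][1])
            fused.append((merged_period, common, min(totals), max(totals)))
    return fused
```

Each tuple in the returned list corresponds to one entry of the set S: a merged time period, the intersection A of the frequent cells, and the minimum and maximum density over that period. The sketch records an entry for every successful extension, so a run of three mutually overlapping periods yields both the two-period and the three-period result.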
Description
Self-adaptive space-time dimension ship frequent activity area extraction method
Technical Field
The invention belongs to the technical field of ship AIS data clustering and relates to a self-adaptive space-time dimension ship frequent activity area extraction method.
Background
The shipping industry plays an indispensable role in global economic development. As the global economy has developed, the number of ships has increased sharply, and collisions occur frequently in areas where ships are dense, so the safety of ship navigation has become a problem that must be considered. The Automatic Identification System (AIS) records various navigation information of a ship, such as ship number, longitude and latitude, speed, and acquisition time. By performing cluster analysis on historical ship AIS data, the frequent activity areas of ships can be extracted and the navigation patterns of ships in those areas analyzed. This supports orderly planning of ship navigation, improves the efficiency of ships entering and leaving port, effectively reduces collision accidents, and is of great significance for ship navigation safety.
At present, research on frequent activity area extraction focuses mainly on resident trajectory and vehicle trajectory data, and few methods extract frequent activity areas from ship tracks. Some methods use clustering, but the clustering considers only the spatial information of the track data and ignores time information. When applied to ship AIS data, which has both spatial and temporal attributes, such clustering performs poorly and cannot reflect the distribution of frequent ship activity areas at different time granularities.
Therefore, a self-adaptive space-time dimension ship frequent activity area extraction method is designed. The method considers spatial and temporal information together when extracting frequent activity areas. It uses grid density peak clustering, adaptively selects the density threshold of the frequent activity areas in each time period to obtain the frequent activity areas of different time periods, and automatically fuses the frequent activity areas of adjacent time periods along the time axis to obtain frequent activity areas at multiple space-time granularities. The method is simple to implement, reflects the distribution of frequent ship activity areas at different space-time granularities, and provides a new solution and research approach for extracting frequent ship activity areas from ship AIS data.
Disclosure of Invention
The invention aims to provide a method for extracting frequent ship activity areas in a self-adaptive space-time dimension. First, historical ship AIS data are preprocessed; next, frequent ship activity areas in different time periods are extracted by grid density peak clustering; finally, the frequent ship activity areas are adaptively fused along the time axis to obtain frequent activity areas at multiple space-time granularities.
To achieve the above purpose, the invention provides the following technical solution:
A method for extracting frequent ship activity areas with adaptive space-time dimensions, comprising the following steps:
Step 1) ship AIS data preprocessing;
Step 2) extracting frequent ship activity areas at a single space-time granularity;
Step 3) extracting frequent ship activity areas at multiple space-time granularities.
Further, step 1) specifically includes the following steps:
Step 11) Single track extraction.
A single track of a ship is a voyage of the ship from one port to another. The method targets a specific sea area, and a ship may form several single tracks during the observation period; single tracks are therefore extracted from the ship AIS data by judging whether the ship's sailing speed is 0 and whether the acquisition times of two adjacent data points differ by more than a given time threshold. The i-th track is expressed as a sequence of n track points, where n is the number of track points contained in the track, and each track point records its acquisition time t together with the longitude, latitude and speed of ship i at time t.
Step 12) Outlier deletion. Abnormal points in the track data are deleted, such as track points with a negative speed, a longitude beyond 180 degrees or a latitude beyond 90 degrees, and points that deviate from the overall track.
Step 13) Track interpolation. Part of the data in a single track of a ship may be missing, and deleting abnormal points leaves further gaps in the track, so that the track data quality is r