CN-121997077-A - Traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method
Abstract
The invention discloses a traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method. The method preprocesses track data, injects traffic state and POI semantics, and constructs a multi-mode feature sequence. It extracts speed dispersion, heading change intensity and traffic state fluctuation intensity to calculate a complexity score and adaptively determine the layering depth. Adaptive downsampling is realized through differentiable TopK, and discarded nodes are aggregated into segment-level attributes to generate multi-level representations. A traffic state-guided masking mechanism preferentially masks high-fluctuation areas, and intra-layer prediction is combined with cross-layer collaborative prediction to enhance traffic event perception. The model is optimized through a multi-task joint loss with adaptive weights, the representations of all levels are weighted and fused, and a unified track vector is output for similarity calculation. The method achieves adaptive compression and multi-scale semantic modeling while preserving the continuity of traffic events, and improves the robustness and discriminability of track similarity calculation.
Inventors
- LIU LIYAN
- ZHANG CHENG
- ZHANG HONGXIN
- LUO RUICHENG
- ZHANG ZHE
- QIU RUQIN
Assignees
- Hunan University of Technology and Business (湖南工商大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20260410
Claims (8)
- 1. A traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method, characterized by comprising the following steps: S1, acquiring original GPS track points, preprocessing them, retrieving traffic states from a historical traffic state database using the road section identifier and the timestamp as keys and assigning them to the track points, and associating the POI category distribution vector of the area indicated by the grid identifier, to obtain a semantically enhanced track sequence; S2, taking the track point as the unified alignment unit, performing multi-scale periodic time representation on the timestamp to obtain time-frequency semantics, converting the road section identifiers and the grid identifiers into space/topology semantic vectors through an embedding layer, mapping the POI category distribution vectors into regional function semantic vectors, mapping the continuous traffic state quantities into state semantic vectors, aligning the time-frequency semantics, the space/topology semantics and the regional function semantics with the traffic state semantics, and projecting them to a unified dimension for joint fusion to form time-aligned multi-mode feature vectors; S3, constructing a track self-adaptive layering mechanism: extracting complexity indexes from the geometric changes and traffic state changes of the track to form a complexity index vector comprising speed dispersion, heading change intensity and traffic state fluctuation intensity, inputting the complexity index vector into an evaluation module to obtain a complexity score, and mapping the complexity score to a target layer number; S4, constructing a traffic-aware differentiable sequence pooling mechanism: generating new levels from bottom to top according to the target layer number, calculating, at each level, a node importance score that combines the geometric changes and traffic state changes within a time neighborhood, generating soft selection weights through differentiable TopK to realize adaptive downsampling, outputting the reserved node sequence, and aggregating the discarded nodes, by the interval between adjacent reserved nodes, into segment-level attributes, to obtain a multi-level track representation with layer-by-layer length compression and layer-by-layer semantic summarization; S5, constructing a traffic state-guided mask and a cross-layer collaborative prediction mechanism: analyzing the change intensity and change rhythm of the traffic state along the track, identifying high-fluctuation areas where road conditions change abruptly, preferentially selecting the high-fluctuation areas as masking targets, constructing an intra-layer horizontal prediction task that restores the masked features from the unmasked context of the same layer, and constructing a cross-layer vertical prediction task that uses the high-level generalized representation to guide the restoration of the low-level fine-grained features; S6, constructing a multi-task collaborative optimization joint loss function, dynamically balancing the loss terms with an adaptive weight mechanism, and training to obtain a hierarchical track representation model and a stable representation of the track at each layer; and S7, calculating the fusion weights of the representations at different levels according to the complexity score, performing weighted fusion of the track representations of all levels, outputting a unified track vector representation, and outputting the final track similarity through a similarity calculation layer.
- 2. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein preprocessing the original GPS track points and generating the semantically enhanced track sequence specifically comprises the following steps: S101, denoising and cleaning the original GPS track points, removing abnormal drift points and invalid sampling points whose longitude, latitude or time is null, converting the longitude and latitude of the preprocessed track points so that the geographic coordinates are mapped to track point coordinates in a common planar coordinate system, and sorting the remaining sampling points in ascending order of timestamp to obtain a track point sequence, wherein the track point sequence contains the space-time information of each track point, namely its longitude, latitude, planar coordinates and time; S102, extracting a road network topological graph from OpenStreetMap data, projecting the track point sequence onto the road network topological graph according to longitude and latitude, performing map matching with a hidden Markov model, acquiring the road section identifier of each track point to obtain the optimal road section sequence of the track, and extracting the static features of each road section, comprising road section length, road type, number of lanes and speed limit; S103, given an H3 resolution parameter, defining a grid mapping function and using it to calculate the grid identifier of each track point, to obtain the grid sequence of the track; S104, after the road section identifier and the time are obtained, calculating a discrete time bucket and retrieving the traffic state of the road section from the historical traffic state database, wherein the traffic state comprises average speed, congestion index and road saturation; S105, counting the distribution of the K POI categories within the spatial extent of each grid cell to obtain the POI category distribution vector; and S106, combining the space-time information, the road section identifier, the grid identifier, the traffic state quantities and the POI category distribution vector of the track points at the same granularity to obtain the semantically enhanced track sequence.
- 3. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein constructing the time-aligned multi-mode feature vector specifically comprises the following steps: S201, converting the absolute timestamp into time of day and time of week and encoding them with sine and cosine periodic functions to construct the time semantic vector of the track point; S202, converting the road section identifiers and the grid identifiers of the track points into topological semantic vectors and spatial semantic vectors through learnable parameter matrices; S203, mapping the static road section features of the track points into static road semantic vectors through an MLP; S204, mapping the POI category distribution vector of the track point into a regional function semantic vector through an MLP; S205, mapping the historical traffic state features of the track points into traffic semantic vectors through an MLP; S206, after aligning the modal feature vectors to a unified dimension, fusing them at track point granularity into a point-level composite representation, and arranging the representations in time order to obtain the input sequence representation of the track.
- 4. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein the track self-adaptive layering mechanism comprises: S301, extracting complexity indexes from the track points and their corresponding traffic state vectors, wherein the complexity indexes comprise a speed dispersion index, a heading change intensity index and a traffic state fluctuation intensity index, which together form the complexity index vector; S302, inputting the complexity index vector into an evaluation module to obtain a complexity score, wherein the evaluation module is any one of a linear model, a two-layer fully-connected network or a lightweight multi-layer perceptron; S303, presetting a minimum layer number and a maximum layer number, converting the complexity score into a target layer number through a preset mapping function, and rounding and clipping the target layer number so that it falls within the preset range.
- 5. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein generating the multi-level track representation with the traffic-aware differentiable sequence pooling mechanism comprises the following steps: S401, determining the number of pooling operations from the multi-mode feature vectors of the track points and the target layer number, constructing the hierarchical representation from bottom to top according to the target layer number, and performing a sequence pooling operation on the previous layer sequence to generate each new layer; S402, constructing a fixed-window time neighborhood over the sequence indexes of each layer; S403, taking the current layer input sequence as the pooling object, constructing the pooling window/neighborhood relation within the time neighborhood, calculating the track movement speed from the displacement and time difference of adjacent points, defining the local geometric change intensity, calculating the magnitude of the difference between adjacent traffic state vectors, and normalizing the geometric change and the traffic state change and fusing them linearly to obtain the node importance score; S404, generating soft selection weights with a differentiable TopK selection operator, determining the target number of nodes to retain at the layer, and completing adaptive downsampling to obtain a pooling output sequence formed by the reserved nodes; S405, determining the reserved index set from the soft selection weights and connecting the reserved nodes in their original sequence order, maintaining the original temporal topology; S406, defining the coverage interval of each reserved node, calculating the segment duration and distance, performing weighted statistics on the traffic states within the interval to obtain segment state features, and mapping the segment attributes into segment semantic vectors; S407, pooling and aggregating the unreserved nodes by time neighborhood/interval between adjacent reserved nodes, converging the unreserved node features into segment-level attributes, attaching the segment-level attributes between the adjacent reserved nodes, fusing the point semantics and the segment semantics of the reserved nodes to obtain the next-layer node representation, and repeating steps S401 to S406 M times to obtain the multi-level track representation with layer-by-layer length compression and layer-by-layer semantic summarization.
- 6. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein the traffic state-guided mask and cross-layer collaborative prediction mechanism specifically comprises the following steps: S501, calculating the change intensity and change rhythm of the traffic state from the traffic state vectors along the track point sequence; S502, constructing a point-level fluctuation score based on the change intensity and the change rhythm, determining the index set of the high-fluctuation areas, and preferentially selecting the high-fluctuation areas as masking targets to generate a point-level mask vector; S503, establishing a point-to-level node coverage mapping for the multi-level track representation, mapping the point-level mask set into level mask sets and generating the level masks; S504, in any level, replacing the masked nodes with a mask token vector to obtain the masked input sequence, and setting an intra-layer horizontal predictor that recovers the masked features from the unmasked context of the same layer; S505, establishing a parent-child mapping from low-level nodes to high-level nodes, and setting a cross-layer vertical predictor that uses the high-level generalized representation to guide the recovery of the low-level masked features; S506, in any level, taking the masked input sequence as the training sample for mask reconstruction and the true representations of the masked nodes in the original sequence as the reconstruction target, thereby constructing training sample-target pairs.
- 7. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein constructing the multi-task collaborative optimization joint loss function and training specifically comprises: S601, constructing the masked input sequences from the multi-level track representation, the mask sets of all levels and the level mask vectors, and defining a supervisory traffic state vector for the nodes of each level; S602, constructing the mask reconstruction loss; S603, setting a traffic state regression decoder that decodes the latent representation into a traffic state prediction, and constructing the traffic state regression loss; S604, defining a track-level representation extraction operator, introducing a projection head, constructing two random views of the same track as a positive sample pair, and constructing the contrastive learning loss; S605, constructing a diversity regularization to prevent representation collapse and a hierarchical consistency regularization to prevent excessive or insufficient abstraction, and combining them into a structured regularization term; S606, introducing learnable uncertainty parameters, constructing an adaptively weighted joint loss, and updating the encoder, the pooling module, the predictors, the decoder and the weight parameters by minimizing the joint loss, wherein the adaptively weighted joint loss is expressed as follows: [formula not reproduced in the source text]; wherein the symbols denote, respectively, the weight parameters, the mask reconstruction loss, the traffic state regression loss, the contrastive learning loss and the structured regularization term; S607, after training is completed, obtaining a hierarchical track representation model that takes any track as input and outputs the stable representation of each layer and the track-level vector.
- 8. The traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method according to claim 1, wherein calculating the fusion weights of the representations at different levels according to the complexity score, performing weighted fusion of the track representations of all levels, and outputting the track similarity comprises: S701, constructing the track-level representation of each layer from the multi-level track representation and the track complexity score; S702, generating un-normalized weights for all layers according to the complexity score, and normalizing them with Softmax to obtain the hierarchical fusion weights; S703, performing weighted fusion of the track-level representations of all layers according to the weights to obtain the unified track vector representation; S704, inputting the unified vector representations of any two tracks into a similarity calculation layer, wherein the similarity calculation layer is any one of cosine similarity, bilinear similarity and MLP similarity; and S705, outputting the unified vector representation of each track and the similarity score of any two tracks.
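Steps S702 to S704 of claim 8 can be illustrated with a minimal sketch. It is not the claimed implementation: the mapping from complexity score to un-normalized layer weights (`layer_logits` and its `sharpness` parameter) is hypothetical, because the claim does not specify it, and the sketch assumes that the score lies in [0, 1], that layer 0 is the finest level, and that more complex tracks should favor finer layers. Cosine similarity is used because it is one of the three similarity layers the claim names.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable Softmax, used to normalize the layer weights (S702)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def layer_logits(complexity: float, num_layers: int, sharpness: float = 4.0) -> List[float]:
    """Hypothetical un-normalized layer weights (assumption, not from the patent).

    Assumes complexity is in [0, 1] and layer 0 is the finest level: a score
    near 1 prefers fine layers, a score near 0 prefers coarse layers.
    """
    preferred = (1.0 - complexity) * (num_layers - 1)
    return [-sharpness * abs(layer - preferred) for layer in range(num_layers)]


def fuse_layers(layer_vectors: Sequence[Sequence[float]], complexity: float) -> List[float]:
    """Weighted fusion of per-layer track vectors into one unified vector (S703)."""
    weights = softmax(layer_logits(complexity, len(layer_vectors)))
    dim = len(layer_vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, layer_vectors)) for d in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two unified track vectors (S704)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for degenerate zero vectors
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy per-layer vectors for two tracks (three layers, three dimensions).
    track_a = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.4], [0.6, 0.3, 0.3]]
    track_b = [[0.9, 0.1, 0.5], [0.7, 0.2, 0.5], [0.5, 0.4, 0.2]]
    vec_a = fuse_layers(track_a, complexity=0.9)
    vec_b = fuse_layers(track_b, complexity=0.3)
    print(round(cosine_similarity(vec_a, vec_b), 4))
```

In a trained model the per-layer vectors would come from the hierarchical encoder and the weight mapping would be learned; the fixed formula above only makes the data flow concrete.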
Description
Traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method
Technical Field
The invention relates to the technical field of intelligent traffic and space-time big data analysis, in particular to a traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method.
Background
With the popularization of ride-hailing vehicles, mobile terminals and urban sensing infrastructure, traffic systems continuously generate large-scale track data. A track is a continuous geometric space-time signal formed by longitude, latitude and timestamps, and it is naturally constrained by road network accessibility and by the running state of traffic: the same spatial position can correspond to very different average speeds, congestion indexes and road saturation in different time periods. Meanwhile, the road section sequence obtained by map matching reflects the road network topology and its turning/connectivity restrictions. To further describe travel purposes and regional activity intentions, the POI category distribution within grid cells has also become an important source of track semantics. Existing track similarity calculation and retrieval techniques are widely applied to path retrieval, travel profiling, anomaly detection and traffic management, but in real scenarios with strongly time-varying traffic states, heterogeneous multi-source semantics and large differences in track length, they still have shortcomings that affect the robustness and discriminability of the similarity results.
In terms of traffic state semantic modeling, many methods still rely mainly on geometric similarity or road section sequence similarity and treat the traffic state as an auxiliary attribute or ignore it entirely. As a result, tracks with similar spatial paths but very different traffic events are measured incorrectly in scenarios such as sudden congestion, tidal traffic and accident detours. In particular, the traffic state must be bound to a road identifier and a time granularity; without a state retrieval and injection mechanism keyed on the road section identifier and the time bucket/timestamp, information such as speed and congestion is difficult to align stably to track points or road section fragments, which in turn limits the ability of subsequent representation learning to express the physical semantics of traffic.
In terms of hierarchical track division, downsampling or hierarchical abstraction is often needed to reduce the computational cost of retrieving long tracks, but the fixed layer number and fixed-window pooling used in the prior art have two common problems. First, track complexity differs markedly across samples and road sections, so a fixed layer depth cannot both preserve detail for complex tracks and remove redundancy for simple ones. Second, hard selection such as max pooling is non-differentiable or yields sparse gradients, and mechanical partitioning by a fixed window/stride easily cuts a complete congestion or acceleration event apart and retains only local extremum points, causing semantic loss and distorted abstraction.
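The contrast between hard selection and a soft, trainable alternative can be made concrete with a small sketch. It is an illustration under stated assumptions, not the operator defined by the invention: `soft_topk_weights` is a hypothetical sigmoid relaxation whose threshold sits midway between the k-th and (k+1)-th largest scores and whose temperature `tau` controls how closely it approximates the hard mask. The hard mask is piecewise constant in the scores, so it passes no gradient; the soft weights vary smoothly with the scores almost everywhere.

```python
import math
from typing import List, Sequence


def hard_topk_mask(scores: Sequence[float], k: int) -> List[float]:
    """Hard selection: 1.0 for the k highest scores, 0.0 otherwise.

    The mask is piecewise constant in the scores, so its gradient is zero
    wherever it is defined.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(scores))]


def _sigmoid(x: float) -> float:
    """Overflow-safe logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_topk_weights(scores: Sequence[float], k: int, tau: float = 0.1) -> List[float]:
    """Hypothetical soft relaxation of top-k selection (assumption).

    Each node receives a weight in (0, 1); a smaller tau pushes the weights
    toward the hard mask, a larger tau spreads them out.
    """
    if not 0 < k < len(scores):
        raise ValueError("k must satisfy 0 < k < len(scores)")
    if tau <= 0.0:
        raise ValueError("tau must be positive")
    ranked = sorted(scores, reverse=True)
    threshold = 0.5 * (ranked[k - 1] + ranked[k])
    return [_sigmoid((s - threshold) / tau) for s in scores]


if __name__ == "__main__":
    importance = [0.1, 0.9, 0.4, 0.8, 0.2]  # toy node importance scores
    print(hard_topk_mask(importance, k=2))
    print([round(w, 3) for w in soft_topk_weights(importance, k=2, tau=0.2)])
```

With a small `tau` the soft weights round to the hard mask, while a larger `tau` keeps every node partially weighted, which is what lets a training signal reach the importance scores.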
In terms of mask strategies for self-supervised pre-training, mask reconstruction pre-training has been used in recent years to improve the generalization of representations, but random masking often cannot guarantee coverage of the areas where the traffic state changes abruptly. The model may therefore mainly reconstruct smooth fragments and learn key traffic events, such as sudden congestion and sharp speed drops, inadequately, so that similarity calculation is insensitive to event differences and the key fragments are easily diluted by noise.
Disclosure of Invention
In view of the above, the invention provides a traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method, which solves the prior-art problems of missing traffic state semantics, rigid hierarchical abstraction, non-differentiable pooling and pre-training masks that are unaware of traffic, and provides a technical scheme for track representation and similarity calculation that accounts for both traffic event depiction and multi-mode semantic consistency while ensuring efficiency. In order to achieve the above purpose, the present invention provides a traffic state-oriented multi-mode self-adaptive hierarchical track similarity calculation method, which includes the following steps: S1, acquiring original GPS track points, preprocessing the original GPS track points, performing coordinate conversion on the longitude and latitude of the preprocessed track points to obtain a time-ordered track point sequence, retrieving traffic states from a historical traffic state database using road section identifiers and timestamps as keys and assigning them to the track points, and obtaining a semantically enhanced track sequence according to POI cate