CN-121980614-A - Privacy protection and road network constraint track generation method based on double-guide diffusion model

Abstract

The invention discloses a privacy protection and road network constraint track generation method based on a dual-guide diffusion model. The method comprises the following steps: converting a geographic coordinate sequence into a structured road-network-constrained trajectory; constructing a graph structure that reflects real movement patterns and applying hybrid embedding to the trajectory; designing a preference extraction mechanism based on temperature scaling; constructing a denoising network based on conditional residual dilated convolution that learns the conditional distribution, with a spatial constraint forcing the model to learn the road network topology; and, because the output of the diffusion model is a numerical matrix in continuous space, accurately mapping that matrix back to a discrete road segment sequence to form usable trajectory data. By introducing a privacy-preserving semantic guidance mechanism and a road-network-constraint awareness mechanism, the invention generates high-quality synthetic trajectories that retain high-fidelity personalized semantics and strictly adhere to the topology of the urban road network, without depending at all on the input of any single sensitive trip of a user.

Inventors

  • CHENG JINGXIAN
  • ZHU YONGHUA
  • ZHAO HAOJIE
  • ZHAO JUAN

Assignees

  • Chang'an University

Dates

Publication Date
20260505
Application Date
20260202

Claims (7)

  1. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model, characterized by comprising the following steps: Step 1, source trajectory data acquisition and standardized preprocessing: converting the geographic coordinate sequence into a structured road-network-constrained trajectory; Step 2, graph construction based on group movement patterns and trajectory vectorization: constructing a graph structure that reflects real movement patterns, and applying hybrid embedding to the trajectory; Step 3, privacy-preserving semantic condition encoding: designing a preference extraction mechanism based on temperature scaling; Step 4, training a road-network-constraint-aware conditional diffusion model: constructing a denoising network based on conditional residual dilated convolution, with the aim of learning the conditional distribution, and forcing the model to learn the road network topology through a spatial constraint; Step 5, similarity-based discrete road network reconstruction: the output of the diffusion model is a numerical matrix in continuous space, which must be mapped accurately back to a discrete road segment sequence to form usable trajectory data.
  2. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 1, wherein step 1 specifically comprises the following steps: Step 1.1, map matching based on a hidden Markov model (HMM): an HMM-based map matching algorithm is used to find the most probable road segment sequence, and each original trajectory point is corrected to a triple consisting of the road segment ID, the moving ratio, which represents the relative position of the point on the road segment, and a timestamp, which indicates the specific time at which the trajectory point occurred; Step 1.2, multidimensional noise cleaning and outlier filtering: the following filtering strategies are applied: stationary point removal, in which, if the displacement between two consecutive points is smaller than a preset threshold, the point is treated as static drift and removed to eliminate redundant data; abnormal trip filtering, in which only trajectories whose duration lies within a reasonable interval [T_min, T_max] are kept, eliminating invalid short trips and abnormally long trips; and cold-start user filtering, in which users with an insufficient total number of historical trajectories are removed, so that user preferences can be constructed effectively; Step 1.3, temporal resampling: the trajectory is resampled in time, and linear interpolation is used to generate a standardized, equally spaced sequence that unifies the input dimensions of the model.
  3. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 2, wherein in step 1.1 the moving ratio is calculated as the ratio of the network distance to the road segment length.
  4. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 1, wherein step 2 specifically comprises the following steps: Step 2.1, constructing the user movement transition graph (UMTG): a user movement transition graph is defined in which the node set is the set of road segments and the edge set represents the transition relations between road segments observed in the real trajectory data; edges are constructed from the true transition behavior in the trajectory data set: if two road segments appear consecutively in a trajectory, a directed edge is established between them; the weight of an edge quantifies the prevalence of this transition pattern and is computed with an indicator function; Step 2.2, road segment embedding and hybrid vectorization: the Node2Vec algorithm is applied on the graph to learn an embedding vector for each road segment, yielding an embedding matrix; subsequently, each point of the trajectory is converted into a hybrid vector: the road segment embedding is looked up using the integer index that indicates which road segment the vehicle is on at the t-th time step of the trajectory sequence, and the moving ratio is normalized; the final trajectory representation is the concatenation of the two along the channel dimension.
  5. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 1, wherein step 3 specifically comprises the following steps: Step 3.1, Top-K road segment frequency statistics: for a user, the global visit frequency of every road segment in the user's historical trajectory set is counted, and only the K most frequently visited road segments are retained as a set; Step 3.2, temperature-scaled Softmax weighting and condition generation: a temperature parameter is introduced to smooth and normalize the frequencies of the Top-K road segments, giving a weight for each; the final user preference condition embedding is calculated as the weighted sum of the Top-K road segment embeddings.
  6. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 1, wherein step 4 specifically comprises the following steps: Step 4.1, network architecture and condition injection: a Conditional Residual Dilated Conv Block is used as the backbone network; the sinusoidal position encoding of the diffusion time step and the user preference vector are fused by an MLP into a unified condition embedding; in the residual blocks of the network, this condition is injected to modulate the features, and long-term dependencies are then captured by the dilated convolution (Dilated Conv) layer; Step 4.2, spatial validity loss: a UMTG-based constraint loss is introduced; during training, a road segment sequence is decoded from the prediction result using Argmax, and each pair of adjacent road segments is checked against the UMTG for a legal edge, where the indicator function takes the value 1 if a legal connection exists and 0 otherwise, so that illegal road network jumps are directly penalized; Step 4.3, joint optimization: the overall optimization objective combines the noise prediction loss, the structural reconstruction loss and the spatial validity loss, and training is performed end to end.
  7. The privacy protection and road network constraint track generation method based on the dual-guide diffusion model according to claim 1, wherein step 5 specifically comprises the following steps: Step 5.1, vector decomposition and similarity calculation: the generated matrix is split into a road segment embedding part and a moving ratio part; using the pre-trained road segment embedding matrix, the cosine similarity matrix between the generated vectors and all road segment embeddings is calculated; Step 5.2, Argmax mapping and coordinate recovery: for each time step of the trajectory, the road segment with the highest similarity score is selected as the final prediction result; combining the geographic information of the decoded road segment with the generated moving ratio, accurate longitude and latitude coordinates are recovered by linear interpolation, completing trajectory synthesis.
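
The graph construction in claim 4 and the adjacency check in claim 6 can be illustrated with a minimal Python sketch. The function names `build_umtg` and `invalid_transition_rate` are hypothetical and do not come from the patent. The sketch assumes that the edge weight is a raw count of consecutive occurrences and that staying on the same road segment is always legal; the claims do not confirm either choice. The second function is a plain, non-differentiable check on a decoded sequence, not the training loss itself.

```python
from collections import Counter


def build_umtg(trajectories):
    """Return {(seg_a, seg_b): count} for consecutive road segment pairs.

    Assumption: the edge weight is the raw count of observed transitions,
    and repeated samples on the same segment are not counted as transitions.
    """
    edge_counts = Counter()
    for segments in trajectories:
        for a, b in zip(segments, segments[1:]):
            if a != b:
                edge_counts[(a, b)] += 1
    return dict(edge_counts)


def invalid_transition_rate(segments, umtg):
    """Fraction of adjacent pairs in a decoded sequence with no UMTG edge.

    Assumption: staying on the same segment is always treated as legal.
    """
    pairs = list(zip(segments, segments[1:]))
    if not pairs:
        return 0.0
    illegal = sum(1 for a, b in pairs if a != b and (a, b) not in umtg)
    return illegal / len(pairs)


# Two toy trajectories given as sequences of road segment IDs.
umtg = build_umtg([[1, 2, 3], [1, 2, 2, 4]])
print(umtg)
print(invalid_transition_rate([1, 2, 1], umtg))
```

A directed edge is stored only for ordered pairs that were actually observed, so a jump such as 2 to 1 is flagged even though 1 to 2 is legal.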

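The preference encoding in claim 5 and the similarity-based decoding in claim 7 can be sketched in plain Python. This is a sketch under stated assumptions, not the patented implementation. The names `preference_embedding`, `cosine` and `decode_segments` are hypothetical, and the weight is assumed to be a softmax over raw visit counts divided by the temperature, which the claim text does not spell out. Embeddings are held in a dictionary from road segment ID to vector for readability.

```python
import math


def preference_embedding(freqs, seg_emb, k, tau):
    """Weighted sum of the Top-K segment embeddings.

    Assumption: weights are softmax(frequency / tau) over the Top-K segments.
    freqs maps segment ID -> visit count; seg_emb maps segment ID -> vector.
    """
    top = sorted(freqs, key=lambda s: (-freqs[s], s))[:k]
    peak = max(freqs[s] for s in top)  # subtract the max for numerical stability
    exps = [math.exp((freqs[s] - peak) / tau) for s in top]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(seg_emb[top[0]])
    cond = [sum(w * seg_emb[s][d] for w, s in zip(weights, top)) for d in range(dim)]
    return top, weights, cond


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def decode_segments(generated, seg_emb):
    """Map each generated vector to the most cosine-similar segment ID."""
    return [max(seg_emb, key=lambda s: cosine(vec, seg_emb[s])) for vec in generated]


# Toy data: three road segments with two-dimensional embeddings.
seg_emb = {10: [1.0, 0.0], 20: [0.0, 1.0], 30: [1.0, 1.0]}
freqs = {10: 5, 20: 1, 30: 3}
top, weights, cond = preference_embedding(freqs, seg_emb, k=2, tau=1.0)
print(top, weights, cond)
print(decode_segments([[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]], seg_emb))
```

A larger temperature flattens the weights toward a uniform average of the Top-K embeddings, while a smaller one concentrates the condition on the single most visited segment.
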
Description

Privacy protection and road network constraint track generation method based on double-guide diffusion model

Technical Field

The invention relates to the fields of artificial intelligence, spatio-temporal data mining and data privacy protection, and in particular to a privacy protection and road network constraint track generation method based on a dual-guide diffusion model.

Background

With the rapid development of the mobile internet and the internet of things, smartphones, vehicle navigation units and various wearable devices have become ubiquitous social sensors that continuously generate massive amounts of personal location trajectory data. These data reflect the spatio-temporal movement patterns of human populations and have extremely high mining value. In business intelligence, trajectory data provides accurate support for retail site selection, optimization of advertisement placement, and tourist flow prediction. In public health and safety, trajectory data plays an irreplaceable role in tracing epidemic transmission, managing crowds at large-scale events, and coordinating disaster emergency response.

However, high-value trajectory data is also extremely privacy-sensitive. Raw trajectory records often imply sensitive attributes such as a user's daily routine, home address and workplace, so releasing or sharing the data directly carries a serious risk of privacy leakage. Trajectory generation aims to synthesize trajectory data that preserves the statistical characteristics of real-world movement patterns. Existing methods fall mainly into models based on physical rules and models based on deep learning: the former simulate human movement by enforcing explicit domain constraints and general movement laws, while the latter use neural network architectures to capture movement patterns autonomously.
Early physical models relied on anchor point inference or exploration and preferential return mechanisms, but they struggled to capture complex semantic dependencies, so deep learning approaches became the mainstream. Deep learning methods mainly comprise variational autoencoders (VAEs), generative adversarial networks (GANs) and denoising diffusion probabilistic models (DDPMs). Because diffusion models, through their forward noising and reverse denoising processes, exhibit better sample quality and training stability than GANs and VAEs, most current research uses diffusion models for high-quality trajectory synthesis. Jiang et al., in the document "The TimeGeo modeling framework for urban mobility without travel surveys", propose the TimeGeo framework, which generates trajectories mainly by inferring anchor points and population-level temporal patterns. Wang et al., in the document "An extended exploration and preferential return model for human mobility simulation", propose EPR variant models that enhance spatial realism by introducing distance decay and heterogeneity modeling. With the advent of diffusion models, Zhu et al., in the document "DiffTraj: Generating GPS Trajectory with Diffusion Probabilistic Model", proposed DiffTraj, which achieves personalized generation by conditioning the generation process on the attributes of a single trip (e.g., a start-end pair). Wei et al., in the document "Diff-RNTraj: A Structure-Aware Diffusion Model for Road Network-Constrained Trajectory Generation", focus on physical constraints and enforce road network constraints by combining road segment embedding representations with a spatial validity loss, improving the road adherence and physical reachability of the generated trajectories.
However, the above techniques still have the following drawbacks in the trajectory generation process: (1) Methods based on physical rules (e.g., TimeGeo, EPR) typically enforce general movement laws and are essentially unable to capture personalized semantics or long-term behavioral dependencies, so the generated trajectories lack individual-level behavioral fidelity; (2) Early deep learning models such as VAEs often produce topologically distorted outputs, while GANs are prone to mode collapse, resulting in a lack of diversity in the generated samples and a breakdown of longitudinal consistency; (3) Existing diffusion model methods face a severe three-way trade-off (trilemma) among privacy, personalization and physical realism: DiffTraj achieves personalization through attribute conditioning, but its explicit condition inputs make trajectories vulnerable to linkage attacks that reveal sensitive information; Diff-RNTraj gives up instance-level conditional guidance to protect privacy, so the behavior patterns it generates are too generic to meet the needs of personalized modeling; and the existing method is difficult to simultaneously consid