CN-122024854-A - Spatial multi-omics data prediction method, model and application based on spatial constraints and adversarial learning


Abstract

The present disclosure relates to a spatial multi-omics data prediction method, model and application based on spatial constraints and adversarial learning. The prediction method comprises: modality-specific encoding and latent variable inference; spatial dependency modeling based on a sparse variational Gaussian process; latent variable modeling under spatial dependency constraints; multi-modal latent representation alignment based on adversarial learning; modality-specific decoding and cross-modal feature prediction; and joint objective function construction and model training. The embodiments make full use of spatial coordinate information to constrain the latent representation and introduce an adversarial learning mechanism to effectively align the latent features of different modalities. Cross-modal feature prediction is realized on this basis, improving the accuracy, stability and application value of spatial multi-omics data analysis.

Inventors

CHEN JING
WANG TAO
LIU ZHENGXUAN

Assignees

Northwestern Polytechnical University (西北工业大学)

Dates

Publication Date: 20260512
Application Date: 20260312

Claims (9)

1. A spatial multi-omics data prediction method based on spatial constraints and adversarial learning, characterized by comprising the following steps: Step 1, modality-specific encoding and latent variable inference: acquiring a spatial multi-modal omics data set and the corresponding coordinate information, designing a dedicated encoder for each modality, adopting differentiated distribution modeling, extracting the latent representation of each modality's data through the encoder, and generating variational Gaussian distribution parameters of the latent variables to form the approximate posterior distribution of each modality; Step 2, spatial dependency modeling based on a sparse variational Gaussian process: performing spatial dependency modeling on the coordinate information, decomposing the latent variables output by the encoder into the sum of spatially dependent dimensions and non-spatial dimensions, adopting a zero-mean Gaussian process prior for the spatially dependent dimensions and constructing the spatial covariance matrix with a Cauchy kernel function, and introducing a sparse variational Gaussian process to perform approximate inference on the spatial Gaussian process; Step 3, latent variable modeling under spatial dependency constraints: introducing the spatial dependency information into the latent variable modeling as a prior constraint, applying a multivariate Gaussian prior to the latent variables, linearly superposing the spatial latent dimensions and the non-spatial latent dimensions to obtain the final latent variable representation, and introducing a spatial smoothness regularization term to enhance the spatial smoothness and consistency of the latent representation; Step 4, multi-modal latent representation alignment based on adversarial learning: constructing a discriminator network to identify the modality source of a latent variable representation, and, through adversarial game training between the encoder and the discriminator, causing the encoder to learn to strip redundant information related to the modality source from the latent representation and to extract features shared among different modalities, thereby aligning the multi-modal latent representations in a unified latent space; Step 5, modality-specific decoding and cross-modal feature prediction: designing dedicated decoders for the different spatial multi-omics modalities, and feeding the latent variable, obtained by passing observed data of the source modality through its corresponding encoder, into the decoder of the target modality to generate predicted values of the target modality; and Step 6, constructing a joint objective function comprising an evidence lower bound loss, a spatial smoothness regularization term and an adversarial loss, initializing the model parameters, and training the encoders, decoders and discriminator to obtain the final inference model.
2. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 1 comprises the following specific steps: S101, acquiring a spatial multi-modal omics data set and the corresponding two-dimensional spatial coordinate information; S102, preprocessing the data of each modality and filtering out low-quality samples and sites; S103, organizing the preprocessed data into an input data set containing a feature matrix for each modality and the corresponding spatial coordinate matrix; S104, defining the data type of each modality and adopting differentiated distribution modeling for the different modalities; S105, designing a dedicated encoder for each modality, wherein each encoder adopts a deep fully connected network structure combined with nonlinear activation functions to extract the latent representation of the high-dimensional input, and generates variational Gaussian distribution parameters of the latent variables to form the approximate posterior distribution of the modality.
3. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 2 comprises the following specific steps: S201, systematically modeling the spatial correlation of the acquired two-dimensional spatial coordinate information; S202, decomposing the latent variables output by the encoder into the sum of spatially dependent dimensions and non-spatial dimensions, adopting a zero-mean Gaussian process prior for the spatially dependent dimensions, and constructing the spatial covariance matrix with a Cauchy kernel function; S203, performing clustering on the spatial coordinate domain and taking the cluster centers as inducing points, calculating the covariance matrix among the inducing points and the cross-covariance matrix between the inducing points and the sampling sites, and constructing a variational posterior distribution to approximate the true posterior distribution, thereby modeling the spatial dependency structure in large-scale spatial transcriptome data.
4. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 3 comprises the following specific steps: S301, introducing the spatial dependency information into the latent variable modeling process as a prior constraint, and applying a multivariate Gaussian prior with zero mean and unit covariance to the latent variables; S302, linearly superposing the spatial latent dimensions and the non-spatial latent dimensions to obtain the final latent variable representation; S303, introducing a spatial smoothness regularization term to enhance the spatial smoothness and consistency of the latent representation.
5. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 4 comprises the following specific steps: S401, introducing an adversarial learning mechanism and constructing a discriminator network to identify the modality source of a latent variable representation; S402, through adversarial game training between the encoder and the discriminator, causing the encoder to learn to strip redundant information related to the modality source from the latent representation and to extract features shared among different modalities, thereby aligning the multi-modal latent representations in a unified latent space.
6. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 5 comprises the following specific steps: S501, designing dedicated decoders for the different spatial multi-omics modalities; S502, inputting the observed data of the source modality into its corresponding encoder to infer the latent variable, and inputting the latent variable into the decoder of the target modality to generate predicted values of the target modality.
7. The spatial multi-omics data prediction method based on spatial constraints and adversarial learning according to claim 1, wherein Step 6 comprises the following specific steps: S601, constructing a joint objective function comprising an evidence lower bound loss, a spatial smoothness regularization term and an adversarial loss, and achieving overall convergence of the model parameters through an alternating optimization strategy; S602, initializing the model parameters with the He initialization method, initializing the inducing point positions on the spatial coordinate domain through k-means clustering, and setting initial values for the kernel function and the related hyperparameters in the joint loss; S603, updating the parameters with the Adam optimization algorithm, and training the encoders, decoders and discriminator with an alternating update mechanism during training to obtain the final inference model.
8. A spatial multi-omics data prediction model based on spatial constraints and adversarial learning, the model comprising: a modality encoding module, configured to acquire a spatial multi-modal omics data set and the corresponding coordinate information, design a dedicated encoder for each modality, adopt differentiated distribution modeling, extract the latent representation of each modality's data through the encoder, and generate variational Gaussian distribution parameters of the latent variables to form the approximate posterior distribution of each modality; a spatial dependency modeling module, configured to perform spatial dependency modeling on the coordinate information, decompose the latent variables output by the encoder into the sum of spatially dependent dimensions and non-spatial dimensions, adopt a zero-mean Gaussian process prior for the spatially dependent dimensions and construct the spatial covariance matrix with a Cauchy kernel function, and introduce a sparse variational Gaussian process to perform approximate inference on the spatial Gaussian process; a latent variable modeling module, configured to introduce the spatial dependency information into the latent variable modeling as a prior constraint, apply a multivariate Gaussian prior to the latent variables, linearly superpose the spatial latent dimensions and the non-spatial latent dimensions to obtain the final latent variable representation, and introduce a spatial smoothness regularization term to enhance the spatial smoothness and consistency of the latent representation; a multi-modal alignment module, configured to construct a discriminator network to identify the modality source of a latent variable representation and, through adversarial game training between the encoder and the discriminator, cause the encoder to learn to strip redundant information related to the modality source from the latent representation and to extract features shared among different modalities, thereby aligning the multi-modal latent representations in a unified latent space; a decoding and prediction module, configured to design dedicated decoders for the different spatial multi-omics modalities and to feed the latent variable, obtained by passing observed data of the source modality through its corresponding encoder, into the decoder of the target modality to generate predicted values of the target modality; and a model training module, configured to construct a joint objective function comprising an evidence lower bound loss, a spatial smoothness regularization term and an adversarial loss, initialize the model parameters, and train the encoders, decoders and discriminator to obtain the final inference model.
9. Use of a spatial multi-omics data prediction model based on spatial constraints and adversarial learning, characterized in that the model of claim 8 is used for the prediction of spatial multi-omics data.
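
Illustrative note (not part of the claims): the sparse Gaussian process ingredients named in claims 3 and 7 (a Cauchy kernel, inducing points taken from cluster centers on the coordinate domain, the inducing-point covariance matrix and the cross-covariance matrix with the sampling sites) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The claims name the Cauchy kernel without giving its formula, so the common form k(p, q) = 1 / (1 + ||p - q||^2 / l^2) is assumed here. The names `cauchy_kernel`, `kernel_matrix`, `kmeans_centers`, `length_scale`, and the toy 4 x 4 grid of sites are hypothetical and chosen only for illustration.

```python
import math
import random


def cauchy_kernel(p, q, length_scale=1.0):
    """Assumed Cauchy kernel form: 1 / (1 + ||p - q||^2 / length_scale^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return 1.0 / (1.0 + sq_dist / length_scale ** 2)


def kernel_matrix(rows, cols, length_scale=1.0):
    """Covariance matrix with entry [i][j] = k(rows[i], cols[j])."""
    return [[cauchy_kernel(r, c, length_scale) for c in cols] for r in rows]


def kmeans_centers(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; the returned centers serve as inducing points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
    return centers


# Toy 4 x 4 grid of two-dimensional sampling sites (hypothetical data).
coords = [(float(i), float(j)) for i in range(4) for j in range(4)]
inducing = kmeans_centers(coords, k=4)    # M = 4 inducing points
K_mm = kernel_matrix(inducing, inducing)  # M x M inducing-point covariance
K_nm = kernel_matrix(coords, inducing)    # N x M cross-covariance with sites
```

With M inducing points and N sampling sites, only the M x M matrix `K_mm` has to be inverted during approximate inference, which is what makes the sparse formulation practical for large numbers of sites.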

Description

Spatial multi-omics data prediction method, model and application based on spatial constraints and adversarial learning

Technical Field

The invention relates to the technical field of data processing, and in particular to a spatial multi-omics data prediction method, model and application based on spatial constraints and adversarial learning.

Background

In recent years, with the continuing maturation of high-throughput sequencing, in situ hybridization and spatial molecular imaging technologies, spatial multi-omics data such as spatial transcriptomes, spatial chromatin accessibility and spatial protein expression have become important data sources for studying complex biological tissues. Such data not only provide quantitative information at the molecular level but also retain the physical position of molecular signals within a tissue section, so that researchers can study molecular expression patterns and their spatial distribution at the in situ scale of the tissue. Compared with traditional single-cell sequencing, spatial multi-modal omics data have clear advantages in capturing tissue structure continuity, functional region division and intercellular spatial interaction, and provide a new means of analyzing the spatial organization mechanisms of complex tissues.

In biological tissues, changes in molecular expression levels are usually not randomly distributed. They are jointly shaped by tissue structure, the microenvironment and intercellular interactions, and therefore exhibit significant spatial correlation and hierarchical organization. Joint analysis of spatial multi-omics data helps reveal how different molecular layers change in coordination across space, thereby deepening the understanding of tissue development, homeostasis maintenance and disease mechanisms. For example, in oncology, developmental biology and neuroscience research, spatially coordinated changes in different molecular modalities typically correspond to a particular functional region or pathological state. How to predict multi-modal molecular features effectively while preserving spatial structure information has therefore become a key problem in spatial biological data analysis.

At present, prediction methods for spatial multi-omics data fall mainly into three types: methods based on graph neural networks, methods based on deep learning, and methods based on probabilistic modeling. These methods have been widely used but have shortcomings: (1) most methods are limited in modeling spatial dependence: they rely on manually defined neighborhood rules, which introduces subjective bias, so prediction results lack continuity and the complex dependency relationships of tissue space are difficult to characterize accurately; (2) many existing models adapt poorly to the heterogeneity of multi-modal data: modalities with different statistical properties are often handled by simple feature concatenation, which makes it difficult to balance modality alignment against modality specificity, so that modality-specific biological signals are lost or cross-modal mapping ability is reduced.

Accordingly, there is a need to improve on one or more of the problems in the related art described above.

It is noted that this section is intended to provide a background or context for the technical solutions of the present disclosure as set forth in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Disclosure of Invention

The invention aims to provide a spatial multi-omics data prediction method, model and application based on spatial constraints and adversarial learning, so as to overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art, for example the difficulty in existing methods of simultaneously characterizing spatial dependencies and achieving multi-modal feature alignment during spatial multi-omics data analysis.

The invention first provides a spatial multi-omics data prediction method based on spatial constraints and adversarial learning, which comprises the following steps: Step 1, modality-specific encoding and latent variable inference: acquiring a spatial multi-modal omics data set and the corresponding coordinate information, designing a dedicated encoder for each modality, adopting differentiated distribution modeling, extracting the latent representation of each modality's data through the encoder, and generating variational Gaussian distribution parameters of the latent variables to form the approximate posterior distribution of each modality; Step 2, spatial dependency modeling based on a sparse variational Gaussian process: performing spatial dependency modeling on coordinate in