CN-121983146-A - Reconstruction method, reconstruction device and storage medium for spatial protein expression profile


Abstract

The application discloses a reconstruction method, a reconstruction device and a storage medium for a spatial protein expression map. The method comprises: receiving a multichannel high-resolution tissue image of a tissue region obtained based on tissue sections; receiving aggregate values of the expression amounts of target proteins on first-direction tissue strips cut from a first tissue section and on second-direction tissue strips cut from a second tissue section; training a deep learning network based on the aggregate values of the protein expression amounts on the first-direction and second-direction tissue strips and on the multichannel high-resolution tissue image; and predicting, with the trained deep learning network and based on the multichannel high-resolution tissue image, the expression amount of each target protein on each image unit, so as to reconstruct a two-dimensional protein expression map. The application uses only aggregate values of protein expression amounts on easily obtained tissue strips from as few as two tissue sections, and can reconstruct the two-dimensional spatial protein map with high precision without requiring the protein expression amount of each individual unit.

Inventors

GUO TIANNAN
Wang Shuaiyao
CHEN YI
DONG ZHEN

Assignees

西湖实验室(生命科学和生物医学浙江省实验室)

Dates

Publication Date: 20260505
Application Date: 20260116

Claims (10)

1. A method for reconstructing a spatial protein expression profile, comprising: receiving a multi-channel high-resolution tissue image comprising a tissue region, obtained based on at least one tissue section; receiving an aggregate true value of the expression amount of each target protein on each first-direction tissue strip, obtained after first-direction tissue strips of the tissue region are cut on a first tissue section, and an aggregate true value of the expression amount of each target protein on each second-direction tissue strip, obtained after second-direction tissue strips of the tissue region are cut on a second tissue section; training a deep learning network based on the aggregate true values of the expression amounts of the target proteins on the first-direction tissue strips and the second-direction tissue strips and on the multi-channel high-resolution tissue image; and, based on the multi-channel high-resolution tissue image, predicting the expression amount of each target protein on each image unit of the tissue region using the trained deep learning network, for reconstructing a protein expression map of the two-dimensional space of the tissue region.
2. The reconstruction method according to claim 1, wherein no identification or marking related to cell phenotype or tissue structure morphology, whether performed manually by a human or automatically by a computer, is performed on the multi-channel high-resolution tissue image before the deep learning network is trained using the multi-channel high-resolution tissue image or before the protein expression map is predicted using the multi-channel high-resolution tissue image.
3. The reconstruction method according to claim 1 or 2, wherein the deep learning network comprises, in order, a feature extraction module and a protein-specific prediction module, wherein the feature extraction module is shared by all target proteins and the protein-specific prediction module comprises an independent prediction head corresponding to each target protein, and wherein training the deep learning network based on the aggregate true values of the expression amounts of the target proteins on the first-direction tissue strips and the second-direction tissue strips and on the multi-channel high-resolution tissue image specifically comprises: dividing the tissue region in the multi-channel high-resolution tissue image into a plurality of image units according to the cutting pattern of the first-direction tissue strips and the second-direction tissue strips; and training the deep learning network for multiple rounds until a training target or a specified number of rounds is reached, wherein in each round of training a specified number of first-direction tissue strips and second-direction tissue strips are randomly selected, the image units corresponding to the selected tissue strips and the aggregate true values of the expression amounts of the target proteins on those tissue strips are taken as training samples, and the following steps are executed: Step 1, generating, with the feature extraction module, a feature coding vector corresponding to each image unit in the current first-direction or second-direction tissue strip; Step 2, for each target protein, generating, with the corresponding independent prediction head of the protein-specific prediction module and based on the feature coding vector of each image unit, a predicted value of the expression amount of the target protein on each image unit in the current first-direction or second-direction tissue strip; Step 3, calculating a first loss function for each target protein based on the sum of the predicted values of the target protein expression amounts of the image units in the current first-direction or second-direction tissue strip and the aggregate true value of the expression amount of the target protein on that tissue strip; and Step 4, calculating a second loss function for the deep learning network training based on the first loss functions of each target protein on each first-direction tissue strip and each second-direction tissue strip, and judging whether the training target is reached based on the second loss function.
4. The reconstruction method according to claim 3, further comprising generating an effective tissue region mask based on the multi-channel high-resolution tissue image, wherein dividing the tissue region in the multi-channel high-resolution tissue image into a plurality of image units according to the cutting pattern of the first-direction tissue strips and the second-direction tissue strips further comprises applying the effective tissue region mask to the plurality of image units, so that the plurality of image units comprise only image units within the effective tissue region.
5. The reconstruction method according to claim 4, wherein predicting the expression amount of each target protein on each image unit of the tissue region using the trained deep learning network based on the multi-channel high-resolution tissue image, for reconstructing a protein expression map of the two-dimensional space of the tissue region, further comprises: inputting each image unit in the effective tissue region into the trained deep learning network in a preset order to generate a proteome abundance matrix composed of the expression amounts of each target protein on each image unit in the effective tissue region; and reconstructing the protein expression map of the two-dimensional space of the effective tissue region based on the proteome abundance matrix.
6. The reconstruction method according to claim 1 or 2, wherein the first-direction tissue strip cutting and the second-direction tissue strip cutting are capable of reaching a single-cell scale, and, where the first-direction tissue strip cutting and the second-direction tissue strip cutting reach the single-cell scale, the protein expression map of the two-dimensional space of the tissue region can be reconstructed at the single-cell scale.
7. The reconstruction method according to claim 3, further comprising, in the case where a target protein is newly added: adding an independent prediction head corresponding to the newly added target protein to the protein-specific prediction module; performing supplementary training on the deep learning network with the newly added independent prediction head, using the aggregate true values of the expression amount of the newly added target protein on each first-direction tissue strip and each second-direction tissue strip and the multi-channel high-resolution tissue image; and, based on the multi-channel high-resolution tissue image, predicting the expression amount of the newly added target protein on each image unit of the tissue region using the deep learning network after the supplementary training, for reconstructing a protein expression map of the two-dimensional space of the tissue region.
8. The reconstruction method according to claim 1 or 2, wherein the multi-channel high-resolution tissue image is acquired as follows: designing reference protein markers based on the tissue spatial structure of the tissue region, cell type heterogeneity, and complementarity between the protein markers; performing immunofluorescent antibody staining of the tissue section with the respective reference protein markers; and obtaining the multi-channel high-resolution tissue image based on the tissue section stained with the immunofluorescent antibodies.
9. A spatial protein expression profile reconstruction device, comprising: an interface configured to receive a multi-channel high-resolution tissue image comprising a tissue region, obtained based on at least one tissue section, an aggregate value of the expression amount of each target protein on each first-direction tissue strip, obtained after first-direction tissue strips of the tissue region are cut on a first tissue section, and an aggregate value of the expression amount of each target protein on each second-direction tissue strip, obtained after second-direction tissue strips of the tissue region are cut on a second tissue section; and at least one processor configured to perform the steps of the method for reconstructing a spatial protein expression profile according to any one of claims 1 to 8.
10. A non-transitory computer readable storage medium having stored thereon computer executable instructions, wherein the computer executable instructions, when executed by a processor, perform the steps of the method of reconstructing a spatial protein expression profile according to any one of claims 1 to 8.
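
The strip-level supervision described in Steps 3 and 4 of claim 3, together with the effective tissue region mask of claim 4, can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the applicant's implementation: it assumes the first-direction strips are the rows and the second-direction strips are the columns of a rectangular grid of image units, that the first loss is a squared error, and that the second loss is the mean of all first losses. The claims do not fix any of these choices, and the names `strip_sums`, `first_loss` and `second_loss` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]   # per-image-unit predictions for one target protein
Mask = List[List[bool]]    # True where the image unit lies in effective tissue


def strip_sums(pred: Grid, mask: Mask) -> Tuple[List[float], List[float]]:
    """Sum masked per-unit predictions along each row strip and each column strip.

    Assumption: rows stand in for first-direction strips and columns for
    second-direction strips.
    """
    n_rows, n_cols = len(pred), len(pred[0])
    rows = [sum(pred[i][j] for j in range(n_cols) if mask[i][j])
            for i in range(n_rows)]
    cols = [sum(pred[i][j] for i in range(n_rows) if mask[i][j])
            for j in range(n_cols)]
    return rows, cols


def first_loss(predicted_sum: float, aggregate_true: float) -> float:
    """Per-strip, per-protein loss (assumed here to be a squared error)."""
    return (predicted_sum - aggregate_true) ** 2


def second_loss(preds: List[Grid],
                row_truth: List[List[float]],
                col_truth: List[List[float]],
                mask: Mask) -> float:
    """Combine the first losses over all proteins and strips (assumed: mean).

    preds[p] is the prediction grid for protein p; row_truth[p] and
    col_truth[p] hold that protein's measured strip aggregates.
    """
    losses: List[float] = []
    for pred, rows_t, cols_t in zip(preds, row_truth, col_truth):
        rows_p, cols_p = strip_sums(pred, mask)
        losses += [first_loss(p, t) for p, t in zip(rows_p, rows_t)]
        losses += [first_loss(p, t) for p, t in zip(cols_p, cols_t)]
    return sum(losses) / len(losses)
```

In a real training loop the prediction grids would come from the shared feature extraction module followed by the per-protein prediction heads, and the second loss would be minimized by gradient descent over randomly selected strips; the sketch only shows how per-unit predictions are compared against strip aggregates without any per-unit ground truth.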

Description

Reconstruction method, reconstruction device and storage medium for spatial protein expression profile

Technical Field

The application relates to the technical field of spatial omics analysis. More particularly, the present application relates to a reconstruction method, a reconstruction device and a storage medium for a spatial protein expression profile.

Background

Spatially resolved proteomics techniques enable high-resolution measurement of protein expression and distribution in situ in tissues, which is of great importance for a deep understanding of the tissue microenvironment, cellular heterogeneity, and interactions between cells. However, achieving deep (thousands of proteins) and high-resolution (micron-scale) protein mapping at the whole-tissue level has been a great challenge in the art. The core bottlenecks are the throughput limitations of mass spectrometry analysis and the non-amplifiability of protein molecules, which make point-by-point scanning of whole tissues impractical in time and cost.

To break through this bottleneck, one representative advanced technique in the art uses a microfluidic chip to generate parallel microchannels at different angles on two adjacent tissue sections, and performs in situ lysis and proteomic analysis of the tissue in the channels, so as to obtain two sets of orthogonal protein projection data. The microfluidic chip is complex to operate, its processing yield is often low (for example, less than 30%), the submicron channels are easily blocked and deformed, and the chip cost and debugging cost are high. In addition, this technique relies on a "transcriptomic transfer learning" strategy: either a third adjacent section is used for another spatial omics analysis, such as hematoxylin-eosin (H&E) imaging or spatial transcriptomics, to serve as a reference map, or a pathologist must outline class labels such as tumor region and stroma region on histological images such as H&E images, and the model guides the reconstruction of the protein map by learning the known spatial patterns of the different classes.

It follows that this technique requires a large number of tissue sections (three consecutive sections) and a complex experimental procedure (two sections are processed with the microfluidic chip, and another section requires an independent spatial omics experiment as a reference). Where H&E histological images are used, it also requires a clinician to segment and label different regions on the image, which increases the experimental complexity and creates an additional manpower burden. Even more disadvantageously, its dependence on a strong assumption of cross-omics correlation (i.e., that the spatial pattern of the reference omics data or histological image is highly correlated with the spatial pattern of the target proteome) may have a significant impact on reconstruction accuracy, especially when the spatial expression of a protein is inconsistent with that of its transcript, which is not uncommon in living organisms.

Another representative frontier technique employs a sparse sampling strategy: laser microdissection is used to separate a series of parallel tissue strips at different angles from multiple (e.g., 8) consecutive tissue sections for proteomic analysis. The reconstruction algorithm uses the projection data from multiple angles and multiple sections to reconstruct a two-dimensional protein map by purely mathematical inversion with a deep learning model. This technique requires up to 8 consecutive sections to obtain enough projection angles to ensure reconstruction stability, which increases the experimental workload. Moreover, the resulting map is an "average projection" of the 8 sections (up to 80 μm thick), which may mask fine biological structures and cellular heterogeneity at the level of a single section, affecting the reconstruction accuracy of the protein map.

Therefore, although the above techniques have greatly advanced spatial proteomics and raised the resolution of reconstructed protein maps, the field still lacks a technical solution for reconstructing high-resolution protein maps that simplifies the experimental procedure, requires fewer tissue sections, needs no manual delineation, and does not depend on a cross-omics correlation assumption.

Disclosure of Invention

The present application has been made to solve the above-mentioned drawbacks of the prior art. There is a need for a method, apparatus and storage medium for reconstructing a spatial protein expression profile that can achieve high-precision reconstruction of a spatial protein expression profile using fewer tissue sections, without requiring any form of delineation of histological images by a clinician and without relying on a cross-omics correlation assumption.

According to a fi