CN-122025109-A - System and construction method for constructing a fusion model for classification and typing prediction of breast cancer patients

Abstract

The invention belongs to the field of medical detection, and particularly relates to a system and a construction method for constructing a fusion model for classification and typing prediction of breast cancer patients. The system comprises an image data acquisition and import module, a tumor segmentation module, a multi-source feature extraction module, a feature standardization and screening module, and a fusion prediction module. The image data acquisition and import module imports DCE-MRI image data. The tumor segmentation module obtains a three-dimensional volume of interest (VOI) of the breast tumor. The multi-source feature extraction module extracts, based on the VOI, radiomics features, deep learning features, and habitat features that characterize the spatial heterogeneity within the tumor. The invention allows the ER, PR, HER2, and Ki-67 status of a breast tumor to be predicted noninvasively, automatically, and repeatably from DCE-MRI before treatment starts, and the molecular subtype to be inferred from these statuses, so that high-value populations likely to benefit from intensified neoadjuvant therapy can be screened in advance.

Inventors

  • YI WENJUN
  • QU LIMENG
  • HE YI
  • LI YANHUI

Assignees

  • The Second Xiangya Hospital of Central South University (中南大学湘雅二医院)

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. A system for constructing a fusion model for classification and typing prediction of breast cancer patients, the system comprising: an image data acquisition and import module for acquiring and importing dynamic contrast-enhanced MRI (DCE-MRI) image data of a breast cancer patient; a tumor segmentation module for obtaining a three-dimensional volume of interest (VOI) of the breast tumor from the DCE-MRI image data; a multi-source feature extraction module for extracting, based on the VOI, three types of features, namely radiomics features, deep learning features, and habitat features that characterize the spatial heterogeneity within the tumor; a feature standardization and screening module for performing standardization, univariate screening, redundancy elimination, and dimensionality reduction on the three types of features to obtain a fused feature subset for modeling; and a fusion prediction module for inputting the fused feature subset into a late-fusion classifier and outputting the predicted probability and predicted status of each of ER, PR, HER2, and Ki-67.
  2. The system of claim 1, wherein the image data acquisition and import module selects the second dynamic contrast-enhanced phase acquired after gadolinium-based contrast agent injection as the image data for analysis input; wherein, in the tumor segmentation module, the VOI is obtained by manual segmentation or by deep-learning-based automatic segmentation followed by manual correction, the manual segmentation being performed in the software ITK-SNAP and the automatic segmentation being performed with a U-Net model; and wherein the late-fusion classifier in the fusion prediction module is a support vector machine (SVM).
  3. The system of claim 1, further comprising a molecular subtype inference module, used when the fusion model is applied, for inferring the molecular subtype of the breast cancer according to a predetermined typing rule based on the predicted statuses of ER, PR, HER2, and Ki-67, the molecular subtypes including HR positive, HER2 positive, and TNBC.
  4. The system of claim 1, further comprising a report output module, used when the fusion model is applied, for outputting a prediction report containing "molecular marker status, molecular subtype, and neoadjuvant therapy (NAT) preference hint".
  5. The system of claim 1, wherein the radiomics features comprise shape features, first-order intensity statistics, and texture matrix features; the deep learning features are obtained by encoding the VOI with a convolutional neural network; and the habitat features are obtained by computing local descriptors at the voxel level, dividing the tumor interior into a plurality of habitat sub-regions by unsupervised clustering, the unsupervised clustering being K-means clustering, and extracting the radiomics features again within each habitat sub-region.
  6. The system of claim 1, wherein the 19 local descriptors used for habitat partitioning fall into five classes: a first class of first-order intensity statistics, including entropy, mean absolute deviation, and median; a second class of gray level co-occurrence matrix features, including difference average, difference entropy, difference variance, informational measure of correlation 1, informational measure of correlation 2, inverse variance, joint energy, joint entropy, and entropy; a third class of gray level run-length matrix features, including long run emphasis, run entropy, and run variance; a fourth class of gray level size zone matrix features, including normalized size zone non-uniformity and small area high gray level emphasis; and a fifth class of neighborhood gray level difference matrix features, including contrast and strength.
  7. A construction method for constructing a fusion model for classification and typing prediction of breast cancer patients, comprising constructing the fusion model using the system according to any one of claims 1 to 6, and comprising: S1, acquiring DCE-MRI image data of a breast cancer patient; S2, performing three-dimensional tumor VOI segmentation on the DCE-MRI image data to obtain a three-dimensional mask of the tumor region; S3, extracting radiomics features, deep learning features, and habitat features based on the VOI; S4, performing standardization, screening, and dimensionality reduction on the features to obtain a fused feature subset; S5, inputting the fused feature subset into a late-fusion classifier to obtain the predicted probability and predicted status of each of ER, PR, HER2, and Ki-67, thereby obtaining four MRI-based fusion models, for ER, PR, HER2, and Ki-67, for the breast cancer patient.
  8. The method of claim 7, further comprising, in step S1, a preprocessing step for the DCE-MRI image data, the preprocessing step including resampling and intensity normalization.
  9. The construction method according to claim 7, wherein, in step S5, the threshold for determining ER positivity and PR positivity is a proportion of positive tumor cell nuclei of 1% or more, the threshold for determining high Ki-67 expression is Ki-67 of 20% or more, and HER2 positivity is defined as immunohistochemistry 3+ or positive FISH amplification.
  10. The construction method according to claim 7, wherein, in steps S3 to S5: the radiomics features selected in the ER fusion model comprise the large dependence high gray level emphasis feature of the gray level dependence matrix and the minimum feature of the first-order statistics, and the habitat features selected in the ER fusion model comprise the joint energy feature of the gray level co-occurrence matrix of habitat sub-region I, the long run low gray level emphasis feature of the gray level run-length matrix of habitat sub-region I, the difference variance feature of the gray level co-occurrence matrix of habitat sub-region III, the root mean square feature of the first-order statistics of habitat sub-region III, the long run low gray level emphasis feature of the gray level run-length matrix of habitat sub-region III, and the gray level non-uniformity feature of the gray level size zone matrix of habitat sub-region IV; the radiomics features selected in the PR fusion model comprise the strength feature of the neighborhood gray level difference matrix, and the habitat features selected in the PR fusion model comprise the coarseness feature of the neighborhood gray level difference matrix of habitat sub-region I, the flatness feature among the shape features of habitat sub-region III, and the surface area feature among the shape features of habitat sub-region IV; the radiomics features selected in the HER2 fusion model comprise the sphericity feature among the shape features, and the habitat features selected in the HER2 fusion model comprise the maximal correlation coefficient feature of the gray level co-occurrence matrix of habitat sub-region IV; the radiomics features selected in the Ki-67 fusion model comprise the elongation feature among the shape features, and the habitat features selected in the Ki-67 fusion model comprise the zone percentage feature of the gray level size zone matrix of habitat sub-region II and the flatness feature among the shape features of habitat sub-region III.
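
The positivity thresholds in claim 9 and the subtype inference in claim 3 can be illustrated with a minimal Python sketch. The cutoffs (1% for ER and PR, 20% for Ki-67, IHC 3+ or FISH amplification for HER2) come from claim 9. Everything else is an assumption made for illustration: the names `MarkerStatus`, `label_from_pathology`, and `infer_subtype` are hypothetical, and the claims do not disclose the "predetermined typing rule", so the precedence used here (HER2 positive first, then HR positive, otherwise TNBC) is only one plausible reading of the three subtypes described in the Background.

```python
from dataclasses import dataclass

# Cutoffs stated in claim 9.
ER_PR_POSITIVE_CUTOFF = 1.0   # percent of tumor cell nuclei staining positive
KI67_HIGH_CUTOFF = 20.0       # percent Ki-67


@dataclass(frozen=True)
class MarkerStatus:
    """Binary status of the four markers (hypothetical container)."""
    er_positive: bool
    pr_positive: bool
    her2_positive: bool
    ki67_high: bool


def label_from_pathology(er_pct: float, pr_pct: float, her2_ihc: int,
                         fish_amplified: bool, ki67_pct: float) -> MarkerStatus:
    """Turn pathology read-outs into binary labels using the claim 9 thresholds."""
    return MarkerStatus(
        er_positive=er_pct >= ER_PR_POSITIVE_CUTOFF,
        pr_positive=pr_pct >= ER_PR_POSITIVE_CUTOFF,
        # HER2 positive: IHC 3+ or positive FISH amplification.
        her2_positive=(her2_ihc == 3) or fish_amplified,
        ki67_high=ki67_pct >= KI67_HIGH_CUTOFF,
    )


def infer_subtype(status: MarkerStatus) -> str:
    """Assumed typing rule; the patent's actual rule is not given in the claims."""
    if status.her2_positive:
        return "HER2 positive"
    if status.er_positive or status.pr_positive:
        return "HR positive"
    # ER, PR, and HER2 are all negative.
    return "TNBC"


if __name__ == "__main__":
    status = label_from_pathology(er_pct=0.0, pr_pct=0.5, her2_ihc=2,
                                  fish_amplified=True, ki67_pct=35.0)
    print(status, infer_subtype(status))
```

In the system described by the claims, the same subtype rule would be applied to the statuses predicted by the four fusion models rather than to pathology read-outs; the pathology-based labels would serve as training targets.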

Description

System and construction method for constructing a fusion model for classification and typing prediction of breast cancer patients

Technical Field

The invention belongs to the field of medical detection, and particularly relates to a system and a construction method for constructing a fusion model for classification and typing prediction of breast cancer patients.

Background

Breast cancer shows significant molecular heterogeneity, and different molecular subtypes correspond to very different neoadjuvant therapy (NAT) pathways, drug combinations, and recurrence risks. Breast cancer is generally divided into three molecular subtypes. The first is HR positive, that is, hormone receptor positive, where HR comprises ER (estrogen receptor) and PR (progesterone receptor); HR-positive patients generally go on to receive endocrine therapy. The second is HER2 positive, that is, human epidermal growth factor receptor 2 positive; HER2-positive patients generally go on to receive targeted therapy. The third is TNBC, that is, triple-negative breast cancer, in which ER, PR, and HER2 are all negative. The pathological complete response (pCR) rate has been widely used as a key outcome indicator for NAT, because patients who achieve pCR subsequently show better event-free survival, lower distant recurrence rates, and greater survival benefit.
Numerous clinical studies have shown that HER2-positive breast cancer and triple-negative breast cancer (TNBC) tend to be highly sensitive to NAT regimens built from platinum-containing chemotherapy, anti-HER2 targeted agents, immune checkpoint inhibitors (e.g., PD-1/PD-L1 inhibitors), and similar drugs, so that relatively high pCR rates can be achieved. Even when pCR is not achieved, both high-risk subtypes have mature postoperative escalation strategies: HER2-positive patients with residual tumor after NAT can receive postoperative intensified treatment such as T-DM1, and TNBC patients with residual lesions can be considered for intensified chemotherapy such as capecitabine. In contrast, the pCR rate after NAT is generally lower in patients with the HR-positive subtype. In other words, if the "true HER2-positive / true TNBC high-risk population" can be accurately identified before treatment begins, that is, if breast cancer patients are accurately typed, the patients who are most likely to respond strongly and who still have a subsequent escalation route can be screened out first and directed to a personalized NAT pathway in advance. Current clinical determination of breast cancer molecular subtypes depends heavily on immunohistochemistry (IHC), FISH, and other tissue-based tests, covering ER, PR, HER2, and the Ki-67 proliferation index. These indicators determine whether NAT is recommended, whether a dual-target anti-HER2 regimen is used, whether immunotherapy is combined, and whether postoperative intensified therapy (e.g., T-DM1 or capecitabine) is required.
This approach has three key defects. First, it is invasive: a needle biopsy is required. Second, sampling is limited: the biopsy takes only part of the tumor and cannot represent the local heterogeneity of the whole tumor, so truly HER2-positive or highly proliferative regions are easily missed. For example, in breast tumors with uneven HER2 expression the biopsy sample may be falsely negative; the pathological result is a conclusion drawn from point sampling, whereas the NAT regimen acts on the whole tumor. Third, the time cost is high: pathology and immunohistochemistry reports take time, which can delay a high-risk patient's entry into the most suitable NAT regimen. The prior art also includes techniques that type the molecular subtypes of breast cancer patients using imaging information such as MRI. DCE-MRI images inherently carry three-dimensional "whole tumor" information covering the breast and axilla. However, most existing radiomics and deep learning studies either extract averaged texture or morphological features from a single ROI over the whole tumor, or are single-center, small-sample explorations. That is, although existing imaging-based typing methods can address invasiveness and high time cost, they do not address the spatial heterogeneity of the tumor, and they do not use pCR as a true downstream efficacy endpoint to show whether the "model-predicted typing" is actually consistent with treatment efficacy. Thus, there is a need in the art for a new method of typing molecular subtypes in breast cancer patients, or for a new system and method of constructing models for classification and typing prediction of breast cancer patients.
To solve the above-mentioned problems in the prior art, the inventors considered how to accurately screen the "neoadjuvant therapy preferred population", especially the highly responsive subtypes HER2 positive and TNBC, in advance by relying onl