CN-122024859-A - Method, system, equipment and medium for detecting drug resistance of klebsiella pneumoniae

Abstract

The invention relates to the technical field of computer information processing, and in particular to a method, a system, a device and a medium for detecting drug resistance of Klebsiella pneumoniae. The method comprises: acquiring and preprocessing raw mass spectrum data of a strain to be tested, determining topological persistence from the data to screen characteristic peaks, and constructing an ordered feature sequence; generating a view pair by data augmentation, mapping the views into positional encoding tensors and intensity embedding tensors, and combining them with an embedding token to construct an input sequence; inputting the input sequence into an interpretable contrastive learning network, and optimizing representation consistency with an attention mechanism and contrastive learning to update a feature extractor; then freezing the extractor parameters, attaching a classification head, and fine-tuning it with labeled samples so that the collaborative representation can be discriminated. By combining topological-persistence feature extraction with the interpretable contrastive learning network, the invention uses the attention mechanism and the aggregation logic to remove the black-box character of the decision process, thereby achieving drug resistance detection with high accuracy and clinical reliability.
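
The peak-screening step summarized above can be illustrated with a minimal sketch. It assumes the rule stated in the claims (persistence equals peak intensity minus the higher of the two nearest local minima, keep peaks with persistence greater than zero, sort in descending order, zero-pad to a fixed length). The function names `topological_persistence` and `build_ordered_sequence`, and the choice to treat the ends of the spectrum as minima, are assumptions made for illustration and do not come from the patent.

```python
def topological_persistence(intensity):
    """Return (index, persistence) for every strict local maximum.

    Persistence = peak intensity minus the higher of the nearest local
    minima on the left and right. Assumption: the first and last samples
    act as minima when the downhill walk reaches the end of the spectrum.
    """
    n = len(intensity)
    peaks = []
    for i in range(1, n - 1):
        if intensity[i] > intensity[i - 1] and intensity[i] > intensity[i + 1]:
            left = i
            while left > 0 and intensity[left - 1] < intensity[left]:
                left -= 1  # walk downhill to the left local minimum
            right = i
            while right < n - 1 and intensity[right + 1] < intensity[right]:
                right += 1  # walk downhill to the right local minimum
            reference = max(intensity[left], intensity[right])
            peaks.append((i, intensity[i] - reference))
    return peaks


def build_ordered_sequence(mz, intensity, length):
    """Keep peaks with persistence > 0, sort descending, pad to `length`."""
    valid = [(mz[i], p) for i, p in topological_persistence(intensity) if p > 0]
    valid.sort(key=lambda pair: pair[1], reverse=True)
    valid = valid[:length]
    # Pad with virtual zero peaks (zero m/z, zero persistence).
    return valid + [(0.0, 0.0)] * (length - len(valid))


if __name__ == "__main__":
    mz = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0]
    intensity = [0, 3, 1, 5, 2, 4, 0]
    print(build_ordered_sequence(mz, intensity, 4))
    # prints [(103.0, 3), (101.0, 2), (105.0, 2), (0.0, 0.0)]
```

In this toy spectrum the peak at m/z 103 ranks first because it rises 3 units above its higher neighbouring valley, while the two smaller peaks tie at 2 and keep their original order.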

Inventors

  • SU LI
  • GUO FENGKAI
  • Wu Cunjin
  • Jia Yuchi
  • LU JING

Assignees

  • Beijing University of Chemical Technology (北京化工大学)
  • The Second Hospital of Tianjin Medical University (天津医科大学第二医院)

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. A method for detecting drug resistance of Klebsiella pneumoniae, characterized by comprising the following steps:
     acquiring and preprocessing raw mass spectrum data of a strain to be tested to obtain a mass spectrum signal sequence;
     determining topological persistence based on the difference between a local maximum point in the mass spectrum signal sequence and the higher of the adjacent local minimum points on its two sides, and screening characteristic peaks according to the topological persistence to construct an ordered feature sequence;
     applying augmentation to the ordered feature sequence to obtain an augmented view pair, mapping each augmented view into a positional encoding tensor and an intensity embedding tensor respectively, and constructing an input sequence in combination with an embedding token;
     inputting the input sequence into an interpretable contrastive learning network comprising a feature extractor and a projection head, modeling spatial dependence and co-expression association between characteristic peaks with an attention mechanism, mapping, by the projection head, the association information globally aggregated by the embedding token to obtain contrastive representation vectors, and optimizing the consistency of the contrastive representation vectors by contrastive learning to update the feature extractor parameters;
     freezing the feature extractor parameters, removing the projection head, attaching a classification head, performing supervised fine-tuning with labeled samples, mapping, by the classification head, the association information re-aggregated by the embedding token to obtain a collaborative representation vector, and discriminating the collaborative representation vector to output a drug resistance detection result.
  2. The method for detecting drug resistance of Klebsiella pneumoniae according to claim 1, wherein acquiring and preprocessing raw mass spectrum data of a strain to be tested to obtain a mass spectrum signal sequence comprises:
     acquiring protein fingerprints of the strain with a mass spectrometer, and resampling the protein fingerprints of different samples so that all samples are aligned to a common mass-to-charge ratio axis, to generate the raw mass spectrum data;
     performing local polynomial least-squares fitting on the raw mass spectrum data within a preset sliding window using a polynomial smoothing filter algorithm, and updating the amplitude of each data point in the raw mass spectrum data through convolution mapping to generate smoothed mass spectrum data;
     performing background estimation on the smoothed mass spectrum data using an asymmetrically weighted penalized least-squares algorithm, wherein, during the iterative background estimation, a residual distribution is calculated from the initially established estimated baseline and the amplitudes of the smoothed mass spectrum data in order to assign asymmetric weights, the smoothed mass spectrum data are penalized with the asymmetric weights to construct a background baseline, and the background baseline is subtracted from the smoothed mass spectrum data to extract the true ion intensity signal;
     performing normalization and feature scaling on the background-subtracted true ion intensity signal to obtain a mass spectrum signal sequence consisting of a data label, mass-to-charge ratios and the corresponding ion intensities.
  3. The method for detecting drug resistance of Klebsiella pneumoniae according to claim 2, wherein determining the topological persistence based on the difference between a local maximum point in the mass spectrum signal sequence and the higher of the adjacent local minimum points on its two sides comprises:
     traversing each sampling point in the mass spectrum signal sequence, comparing the ion intensity of the current sampling point with that of the preceding and following adjacent points, identifying local maximum points whose ion intensity is greater than that of the adjacent points on both sides, and recording the corresponding sampling position indexes to construct a candidate peak set;
     for each candidate peak in the candidate peak set, performing bidirectional gradient detection along the mass-to-charge ratio axis starting from the corresponding sampling position index, until the first sampling point whose ion intensity is smaller than that of the adjacent points on both sides is found in each direction, and taking these points as the left local minimum point and the right local minimum point closest to the candidate peak point;
     obtaining the intensity values of the left local minimum point and the right local minimum point, comparing their amplitudes, and selecting the intensity of the side with the higher amplitude as the terrain reference boundary intensity representing the local waveform envelope;
     calculating the intensity difference between the candidate peak point and the terrain reference boundary to quantify the topological height of the candidate peak point relative to the local waveform envelope, thereby obtaining the topological persistence of each candidate peak point.
  4. The method for detecting drug resistance of Klebsiella pneumoniae according to claim 3, wherein screening characteristic peaks according to the topological persistence to construct an ordered feature sequence comprises:
     extracting candidate peak points whose topological persistence is greater than zero as valid characteristic peaks reflecting the heterogeneity of the discrete drug-resistance protein fingerprint of Klebsiella pneumoniae, and reordering the valid characteristic peaks by topological persistence from largest to smallest;
     taking the first preset number of valid characteristic peaks in the reordered sequence to form a fixed-length feature set;
     when the total number of valid characteristic peaks is less than the preset input quantity, padding the fixed-length feature set with virtual zero-peak nodes, each consisting of a zero mass-to-charge ratio and a zero topological persistence, until the fixed-length feature set meets the preset input quantity;
     extracting the mass-to-charge ratio and the topological persistence of each valid characteristic peak in the fixed-length feature set that meets the preset input quantity, and assembling them according to their positions in the reordered sequence to generate an ordered feature sequence represented by pairs of mass-to-charge ratio and topological persistence values.
  5. The method for detecting drug resistance of Klebsiella pneumoniae according to any one of claims 1 to 4, wherein applying augmentation to the ordered feature sequence to obtain an augmented view pair, mapping each augmented view into a positional encoding tensor and an intensity embedding tensor respectively, and constructing an input sequence in combination with an embedding token comprises:
     performing random peak dropout on the ordered feature sequence, wherein the number of peaks to remove is determined from a preset masking ratio and the characteristic peaks at the corresponding positions are removed by random sampling without replacement, to generate a masked feature sequence;
     performing amplitude perturbation on the topological persistence components of the masked feature sequence by superimposing random noise with zero mean and a standard deviation positively correlated with the average topological persistence, and correcting the resulting values against a lower limit of zero, to obtain a semantically consistent augmented view pair;
     extracting the topological persistence component of each characteristic peak in the augmented view pair, converting it into a feature intensity scalar representing the peak response amplitude, and mapping the feature intensity scalar into a multidimensional feature representation space by a linear dimension transformation to generate the intensity embedding tensor;
     extracting the mass-to-charge ratio of each characteristic peak in the augmented view pair, performing a periodic coordinate mapping based on sine and cosine functions, and synthesizing a multidimensional position vector from the periodic components generated by the mapping, to generate a positional encoding tensor reflecting the spatial distribution of the mass-to-charge ratios;
     initializing an embedding token representing full-spectrum global association information, prepending the embedding token to the combined sequence of the intensity embedding tensor and the positional encoding tensor, and constructing the input sequence by dimension concatenation.
  6. The method for detecting drug resistance of Klebsiella pneumoniae according to any one of claims 1 to 4, wherein inputting the input sequence into an interpretable contrastive learning network comprising a feature extractor and a projection head, modeling spatial dependence and co-expression association between characteristic peaks with an attention mechanism, mapping, by the projection head, the association information globally aggregated by the embedding token to obtain contrastive representation vectors, and optimizing the consistency of the contrastive representation vectors by contrastive learning to update the feature extractor parameters comprises:
     performing parallel interactive mapping, with a multi-head attention mechanism in the feature extractor, on the feature vectors in the input sequence that contain mass-to-charge ratio position information and response intensity information, and modeling spatial dependence and co-expression association between characteristic peaks by calculating association weights between different feature vectors and performing weighted summation;
     performing a position-wise nonlinear transformation on the weighted-sum sequence with a position-wise feed-forward network in the feature extractor, combined with layer normalization, and extracting the vector corresponding to the embedding token output by the last layer of the feature extractor as a global feature vector aggregating global association information;
     adjusting the dimension of the global feature vector with a linear adapter layer in the projection head, and inputting it into a multi-layer perceptron for nonlinear mapping to obtain the contrastive representation vectors corresponding to the different augmented views in the feature space;
     calculating the cosine similarity between the augmented view pair from the contrastive representation vectors, and scaling the similarity distribution with a temperature hyperparameter to construct a normalized consistency loss measuring the consistency of the distributions of the augmented views;
     optimizing representation consistency with the loss as the objective, and synchronously updating the weight parameters of the feature extractor and the projection head using the gradients of the loss with respect to the network model parameters computed by back-propagation.
  7. The method for detecting drug resistance of Klebsiella pneumoniae according to claim 2, wherein freezing the feature extractor parameters, removing the projection head, attaching a classification head, performing supervised fine-tuning with labeled samples, mapping, by the classification head, the association information re-aggregated by the embedding token to obtain a collaborative representation vector, and discriminating the collaborative representation vector to output a drug resistance detection result comprises:
     setting the model parameters of the feature extractor to a non-trainable state so as to lock in the capability, obtained by contrastive learning, of modeling collaborative associations between characteristic peaks, removing the projection head, and attaching a classification head comprising a linear layer at the output of the feature extractor;
     inputting a labeled Klebsiella pneumoniae drug-resistance sample sequence into the feature extractor, performing parallel mapping on the feature vectors with the multi-head attention mechanism, and globally aggregating, with the embedding token, the feature components reflecting global association information to obtain a feature vector aggregating global association information;
     mapping, with the classification head, the feature vector aggregating global association information into a collaborative representation vector for discriminating drug resistance, and constructing a classification loss measuring discrimination accuracy based on the difference between the collaborative representation vector and the drug resistance class in the data label;
     keeping the feature extractor parameters fixed, and iteratively optimizing only the model parameters of the classification head by back-propagation according to the classification loss, so as to fine-tune, under supervision, the ability to discriminate the collaborative representation vector;
     performing class probability mapping on the collaborative representation vector with the fine-tuned classification head, and outputting the drug resistance detection result for Klebsiella pneumoniae.
  8. A Klebsiella pneumoniae drug resistance detection system, comprising:
     a preprocessing module, configured to acquire and preprocess raw mass spectrum data of a strain to be tested to obtain a mass spectrum signal sequence;
     a feature screening module, configured to determine topological persistence based on the difference between a local maximum point in the mass spectrum signal sequence and the higher of the adjacent local minimum points on its two sides, and to screen characteristic peaks according to the topological persistence to construct an ordered feature sequence;
     an input construction module, configured to apply augmentation to the ordered feature sequence to obtain an augmented view pair, map each augmented view into a positional encoding tensor and an intensity embedding tensor respectively, and construct an input sequence in combination with an embedding token;
     a model training module, configured to input the input sequence into an interpretable contrastive learning network comprising a feature extractor and a projection head, model spatial dependence and co-expression association between characteristic peaks with an attention mechanism, map, by the projection head, the association information globally aggregated by the embedding token to obtain contrastive representation vectors, and optimize the consistency of the contrastive representation vectors by contrastive learning to update the feature extractor parameters;
     a classification and recognition module, configured to freeze the feature extractor parameters, remove the projection head, attach a classification head, perform supervised fine-tuning with labeled samples, map, by the classification head, the association information re-aggregated by the embedding token to obtain a collaborative representation vector, and discriminate the collaborative representation vector to output a drug resistance detection result.
  9. A Klebsiella pneumoniae drug resistance detection apparatus, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for detecting drug resistance of Klebsiella pneumoniae according to any one of claims 1 to 7.
  10. A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement the method for detecting drug resistance of Klebsiella pneumoniae according to any one of claims 1 to 7.
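
The augmentation recited in claim 5 (random peak dropout without replacement, followed by zero-mean noise on the persistence values with a floor at zero) might look like the sketch below. The names `augment` and `make_view_pair`, the default `mask_ratio` and `noise_scale` values, and the choice to compute the mean persistence after masking are illustrative assumptions. The claim only requires that the noise standard deviation be positively correlated with the average persistence.

```python
import random


def augment(sequence, mask_ratio=0.25, noise_scale=0.1, rng=None):
    """Return one augmented view of a list of (mz, persistence) pairs."""
    rng = rng or random.Random()
    # Number of peaks to drop, from the preset masking ratio.
    n_drop = int(len(sequence) * mask_ratio)
    # Sampling without replacement: each position is dropped at most once.
    dropped = set(rng.sample(range(len(sequence)), n_drop))
    kept = [pair for i, pair in enumerate(sequence) if i not in dropped]
    # Assumption: sigma is proportional to the mean persistence of kept peaks.
    mean_p = sum(p for _, p in kept) / len(kept) if kept else 0.0
    sigma = noise_scale * mean_p
    # Zero-mean Gaussian noise, then clamp at the zero lower limit.
    return [(mz, max(0.0, p + rng.gauss(0.0, sigma))) for mz, p in kept]


def make_view_pair(sequence, seed=0, **kwargs):
    """Two independently augmented views of the same ordered sequence."""
    rng = random.Random(seed)
    return augment(sequence, rng=rng, **kwargs), augment(sequence, rng=rng, **kwargs)


if __name__ == "__main__":
    ordered = [(103.0, 3.0), (101.0, 2.0), (105.0, 2.0), (0.0, 0.0)]
    view_a, view_b = make_view_pair(ordered, seed=42)
    print(len(view_a), len(view_b))  # each view keeps 3 of the 4 peaks
```

Only the persistence component is perturbed. The mass-to-charge ratios pass through unchanged, which is what keeps the two views semantically consistent.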
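
Claim 5 also describes how each peak becomes a token: a sine/cosine mapping of the mass-to-charge ratio, a linear map of the persistence scalar, and an embedding token placed first. The sketch below is one possible reading. The frequency schedule with `base=10000.0` is borrowed from the common Transformer positional encoding and is an assumption, as are the names `mz_position_code` and `build_input_sequence` and the decision to combine the two parts by concatenation.

```python
import math


def mz_position_code(mz, dim=4, base=10000.0):
    """Sine/cosine code of one m/z value (assumed frequency schedule)."""
    code = []
    for k in range(0, dim, 2):
        freq = 1.0 / (base ** (k / dim))
        code.append(math.sin(mz * freq))
        code.append(math.cos(mz * freq))
    return code[:dim]


def build_input_sequence(sequence, weight, bias, token, pos_dim=4):
    """Prepend the embedding token, then one vector per (mz, persistence).

    `weight` and `bias` stand in for a learned linear layer that lifts the
    scalar persistence to len(weight) dimensions. `token` must have
    len(weight) + pos_dim entries so that all rows share one width.
    """
    rows = [list(token)]
    for mz, persistence in sequence:
        intensity = [w * persistence + b for w, b in zip(weight, bias)]
        rows.append(intensity + mz_position_code(mz, pos_dim))
    return rows


if __name__ == "__main__":
    rows = build_input_sequence(
        [(0.0, 2.0)], weight=[1.0, 0.5], bias=[0.0, 1.0], token=[0.0] * 6
    )
    print(rows[1])  # prints [2.0, 2.0, 0.0, 1.0, 0.0, 1.0]
```

In a trained model the weights, biases and the embedding token would be learned parameters. Fixed numbers are used here only so the shapes are easy to follow.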
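
The consistency loss in claim 6 is described only as a normalized loss built from cosine similarity and a temperature hyperparameter. One widely used loss matching that description is the NT-Xent loss from SimCLR-style contrastive learning, and the sketch below assumes that form. The patent does not name it, so the function `nt_xent` and the default temperature of 0.5 are illustrative, and the vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z_a, z_b, temperature=0.5):
    """NT-Xent loss over a batch; z_a[i] and z_b[i] are views of sample i."""
    z = list(z_a) + list(z_b)
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same sample
        logits = [
            cosine(z[i], z[j]) / temperature for j in range(2 * n) if j != i
        ]
        pos_logit = cosine(z[i], z[positive]) / temperature
        # Numerically stable log-sum-exp over all other samples in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


if __name__ == "__main__":
    aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    swapped = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < swapped)  # prints True
```

A lower temperature sharpens the similarity distribution, so mismatched pairs are penalized more strongly. The loss is smallest when the two views of each sample map to the same direction and differ from every other sample.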

Description

Method, system, equipment and medium for detecting drug resistance of Klebsiella pneumoniae

Technical Field

The invention relates to the technical field of computer information processing, and in particular to a method, a system, a device and a medium for detecting drug resistance of Klebsiella pneumoniae.

Background

Klebsiella pneumoniae (KP) is an important gram-negative pathogen that can cause serious infections such as pneumonia and septicemia. With the worldwide spread of carbapenem-resistant Klebsiella pneumoniae (CRKP), clinical treatment faces a significant challenge. Because mortality after CRKP infection is extremely high, rapid and accurate identification of strain resistance is important for guiding rational clinical medication. At present, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has become a mainstream approach to microorganism identification because of its high detection speed and low cost.

However, the prior art has significant shortcomings when mass spectral data are used to build deep-learning classification models. Traditional feature extraction usually relies on uniform binning or on cluster-based feature centers, which limits its resolution, so that key adjacent peaks are easily merged by mistake. Meanwhile, mass spectrometers are susceptible to noise, peak positions often drift, and the cluster centers depend heavily on the distribution of the training set. When new or variant strains appear in the test set, the model therefore has difficulty capturing their unique mass spectrum characteristics, so that feature information is lost, classification performance drops, and generalization is weak.

In addition, traditional supervised learning models depend heavily on high-quality sample sets labeled by standard drug susceptibility testing, and acquiring large-scale labeled data in the clinic is costly and time-consuming. With small sample sizes, existing models are often insufficiently trained and cannot make use of massive unlabeled data to learn robust feature representations. Furthermore, existing deep-learning classification models are mostly regarded as a "black box": the decision process is invisible, key biomarkers related to drug resistance cannot be revealed, and the lack of clinical interpretability limits the trust in, and the depth of application of, such models in medical decision support.

Disclosure of Invention

(I) Technical problem to be solved

In view of the above shortcomings of the prior art, the invention provides a method, a system, a device and a medium for detecting drug resistance of Klebsiella pneumoniae. Existing techniques for detecting drug resistance of Klebsiella pneumoniae from mass spectrum data suffer from insufficient feature extraction resolution and noise immunity, heavy dependence on large-scale labeled samples, and a lack of interpretability in the prediction process. The invention solves the resulting technical problems that model generalization is limited when complex variant strains are processed and that clinical decision trust is difficult to obtain.

(II) Technical solution

To achieve the above purpose, the main technical solution adopted by the invention is as follows:

In a first aspect, an embodiment of the invention provides a method for detecting drug resistance of Klebsiella pneumoniae, comprising: acquiring and preprocessing raw mass spectrum data of a strain to be tested to obtain a mass spectrum signal sequence; determining topological persistence based on the difference between a local maximum point in the mass spectrum signal sequence and the higher of the adjacent local minimum points on its two sides, and screening characteristic peaks according to the topological persistence to construct an ordered feature sequence; applying augmentation to the ordered feature sequence to obtain an augmented view pair, mapping each augmented view into a positional encoding tensor and an intensity embedding tensor respectively, and constructing an input sequence in combination with an embedding token; inputting the input sequence into an interpretable contrastive learning network comprising a feature extractor and a projection head, using an attention mechanism and mapping the association information globally aggregated by the embedding token to obtain a contrastive representation vector, and optimizing the consistency of the contrastive representation vector by contrastive learning to update the feature extractor parameters; freezing the feature extractor parameters, r