
CN-121999916-A - Multi-view drug synergistic effect prediction model construction method based on contrastive learning framework


Abstract

The invention discloses a method for constructing a multi-view drug synergistic effect prediction model based on a contrastive learning framework. The method comprises: integrating multi-source biological information on drugs, cell lines and proteins to construct a unified hybrid graph; extracting a high-order collaborative association hypergraph view and a protein interaction context view, wherein the hypergraph view uses an HGNN-ECA module and the protein interaction context view uses a dual-aggregator encoder to extract local structure-dependent embedding features of each entity; aligning the embedding features of the same entity across the two views through contrastive learning by minimizing the contrastive loss for drugs and cell lines; fusing the aligned features; and inputting the fused features into a bilinear attention module, after which a multi-layer perceptron predicts the synergy score. By integrating complementary multi-view information, the invention overcomes the limitation of traditional methods, which neglect high-order relations and multi-source interactions, and provides a drug synergy prediction framework with high prediction accuracy and strong generalization ability, which helps to systematically identify clinically promising synergistic drug combinations.

Inventors

  • XING JIA
  • LI XIAN
  • YANG ZHENGUO
  • JIANG SHAN

Assignees

  • Qingdao Vocational and Technical College of Hotel Management (青岛酒店管理职业技术学院)

Dates

Publication Date
20260508
Application Date
20260104

Claims (10)

  1. A method for constructing a multi-view drug synergistic effect prediction model based on a contrastive learning framework, characterized by comprising the following steps: S1, integrating multi-source biological information on drugs, cell lines and proteins to construct a unified hybrid graph; S2, extracting from the unified hybrid graph a high-order collaborative association hypergraph view based on hyperedges and a protein interaction context view based on ordinary edges; S3, for the high-order collaborative association hypergraph view, capturing global collaborative embedding features of each entity with an HGNN-ECA module composed of a hypergraph neural network and efficient channel attention, and, for the protein interaction context view, extracting local structure-dependent embedding features of each entity with a dual-aggregator encoder composed of a LightGCN module and a bilinear aggregator; S4, introducing a contrastive learning mechanism that aligns the representations of the same entity in the two views, and optimizing the embedding features learned in the two views by minimizing cross-view contrastive loss functions for drugs and cell lines; S5, fusing the optimized embedding features of the same entity from the two views, inputting the fused features into a bilinear attention module to capture nonlinear pairwise interaction information within each drug-cell line triplet, and predicting the synergy score of the drug combination with a multi-layer perceptron.
  2. The method of claim 1, wherein the unified hybrid graph is represented as [formula missing in source], wherein the node set comprises three classes of entities: drugs, cell lines and proteins; and the edge set comprises drug-cell line association hyperedges, drug-protein association ordinary edges, cell line-protein association ordinary edges, and protein-protein interaction ordinary edges.
  3. The method of claim 1, wherein the high-order collaborative association hypergraph view is constructed on a hypergraph structure that characterizes high-order associations between entities by defining drug-cell line triplet hyperedges, expressed as [formula missing in source], wherein the hypergraph node set contains both drug and cell line entities, the hyperedge set is defined by the drug-cell line triplets, each hyperedge carries a hyperedge weight, and node-hyperedge membership is expressed as an incidence matrix.
  4. The method of claim 1, wherein, in S3, capturing the global collaborative embedding features of each entity in the high-order collaborative association hypergraph view with the HGNN-ECA module composed of a hypergraph neural network and efficient channel attention comprises the following steps: S31, aggregating multi-entity information with a convolution layer of the hypergraph neural network, expressed as: [formula missing in source], wherein the terms are the node feature matrix learned by the current convolution layer, the ReLU activation function, the diagonal matrices composed of the node degrees and the hyperedge degrees respectively, a learnable parameter matrix, and the input features from the previous layer; S32, introducing efficient channel attention and performing a weighted calculation, expressed as: [formula missing in source], wherein the terms are the node feature matrix learned by the current layer of the HGNN-ECA module, element-wise multiplication, the Sigmoid activation function, the global average pooling operation, and a one-dimensional convolution operation; S33, after multi-layer propagation, each node obtains high-order context information from multiple hyperedges, and the embedding features of the entities within a drug-cell line triplet are obtained, expressed as: [formula missing in source], wherein the terms are the global collaborative embedding features corresponding to the two drugs and the cell line of the triplet.
  5. The method of claim 4, wherein, in S3, extracting the local structure-dependent embedding features of each entity in the protein interaction context view with the dual-aggregator encoder composed of a LightGCN module and a bilinear aggregator comprises the following steps: S34, for a drug node under test and a cell line node under test, retrieving, in the unified hybrid graph, the protein nodes associated with the drug node under test, and extracting the corresponding local protein interaction subgraph according to a preset neighborhood depth; S35, the LightGCN module uses linear neighborhood aggregation to gather the base features of the protein neighbors at the central node, expressed as: [formula missing in source], wherein the terms are the feature of the central node after aggregation by the LightGCN module, the set of protein neighbors of the central node, the number of neighbor nodes of the central node, and the feature of each protein neighbor node; S36, the bilinear aggregator models pairwise interactions between protein neighbors, expressed as: [formula missing in source], wherein the terms are the feature of the central node after bilinear cross-aggregation, the bilinear mapping matrix, element-wise multiplication, which captures second-order interactions between neighboring nodes, and the Sigmoid activation function; S37, obtaining the embedding feature of the central node by a weighted fusion of the linear structure information from the LightGCN module and the bilinear interaction information from the bilinear aggregator, expressed as: [formula missing in source], wherein the formula includes a fusion weight coefficient; S38, using this feature learning scheme for central nodes in the protein interaction context view, obtaining the local structure-dependent embedding features of the entities within a drug-cell line triplet, expressed as: [formula missing in source], wherein the terms are the local structure-dependent embedding features corresponding to the two drugs and the cell line of the triplet.
  6. The method of claim 1, wherein, in S4, aligning the representations of the same entity in the two views through the contrastive learning mechanism and optimizing the embedding features learned in the two views by minimizing the cross-view contrastive loss functions for drugs and cell lines comprises: taking the embedding features learned for the same drug or cell line in the high-order collaborative association hypergraph view and in the protein interaction context view as a positive sample pair, taking the features of all other entities as negative samples, and computing the drug or cell line contrastive loss so as to maximize the representation similarity of positive sample pairs and minimize the representation similarity of negative sample pairs, thereby aligning the embedding features across the two views and making them more discriminative.
  7. The method of claim 6, wherein the drug contrastive loss is expressed as: [formula missing in source]; the cell line contrastive loss is expressed as: [formula missing in source]; and the overall contrastive loss is expressed as: [formula missing in source], wherein the similarity function is cosine similarity and the formulas include a temperature coefficient; the embedding features after contrastive learning optimization are denoted by separate symbols [symbols missing in source].
  8. The method of claim 5, wherein the bilinear attention module in S5 models the pairwise interactions among the two drugs and the cell line within a drug-cell line triplet, expressed as: [formula missing in source], wherein the formula uses a learnable bilinear parameter matrix and the fused features of the two drug nodes and the cell line node in the triplet; each fused feature is obtained by concatenating the high-order collaborative association hypergraph view embedding feature and the protein interaction context view embedding feature after contrastive learning optimization: [formula missing in source]; after mapping by the multi-layer perceptron, the final synergy prediction probability of the drug-cell line triplet is generated, which is used to evaluate the synergistic effect of the two drugs in the cell line.
  9. The method of claim 1, wherein the final loss function of the prediction model is defined as: [formula missing in source], wherein the terms are the binary cross-entropy loss function and a coefficient controlling the weight of the contrastive learning loss function; the model is trained end to end by minimizing the overall loss.
  10. The method of any one of claims 1-9, wherein the method performs classification prediction on the DrugCombDB benchmark dataset, wherein a drug synergy label is defined as positive when the zero interaction potency (ZIP) score is greater than 0 and as negative when it is less than 0.
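Claims 3 and 10 describe two data-preparation steps: expressing node-hyperedge membership as an incidence matrix, and deriving binary labels from the sign of the score. The following is a minimal sketch of both, not the patented implementation. The function names, the node names and the choice to leave a score of exactly 0 unlabeled (the claim does not cover that case) are assumptions made for illustration.

```python
def zip_label(zip_score):
    """Binary synergy label from a ZIP score: > 0 is positive, < 0 is negative."""
    if zip_score > 0:
        return 1
    if zip_score < 0:
        return 0
    return None  # exactly 0 is not covered by claim 10; left unlabeled (assumption)


def build_incidence(triplets, nodes):
    """Node-by-hyperedge incidence matrix: H[i][e] = 1 if node i belongs to triplet e."""
    index = {name: i for i, name in enumerate(nodes)}
    H = [[0] * len(triplets) for _ in nodes]
    for e, (drug_a, drug_b, cell_line) in enumerate(triplets):
        for name in (drug_a, drug_b, cell_line):
            H[index[name]][e] = 1
    return H


# Hypothetical toy data: three drugs, one cell line, two triplet hyperedges.
nodes = ["drugA", "drugB", "drugC", "cell1"]
triplets = [("drugA", "drugB", "cell1"), ("drugB", "drugC", "cell1")]
H = build_incidence(triplets, nodes)
labels = [zip_label(s) for s in (3.2, -1.5)]
```

Each column of `H` has exactly three non-zero entries, one per member of the triplet.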
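The formulas of claim 4 are not reproduced in this text, so the sketch below assumes the commonly used hypergraph convolution, ReLU(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta), followed by a channel gate of the form X * sigmoid(Conv1D(GAP(X))). Both forms match the terms listed in S31 and S32, but they are assumptions, as are all function and variable names.

```python
import numpy as np


def hgnn_conv(X, H, w, Theta):
    """One hypergraph convolution layer in the standard HGNN form (assumed)."""
    # X: (n_nodes, d_in) features; H: (n_nodes, n_edges) incidence matrix;
    # w: (n_edges,) hyperedge weights; Theta: (d_in, d_out) learnable parameters.
    node_deg = H @ w              # weighted number of hyperedges per node
    edge_deg = H.sum(axis=0)      # number of member nodes per hyperedge
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(node_deg))  # assumes every node is in >= 1 hyperedge
    De_inv = np.diag(1.0 / edge_deg)
    A = Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(A @ X @ Theta, 0.0)           # ReLU


def eca_gate(X, kernel):
    """Channel attention: pool over nodes, 1-D conv over channels, sigmoid gate."""
    pooled = X.mean(axis=0)                         # global average pooling -> (d,)
    k = len(kernel)                                 # kernel length assumed odd
    padded = np.pad(pooled, k // 2)                 # zero padding keeps length d
    conv = np.array([padded[i:i + k] @ kernel for i in range(len(pooled))])
    gate = 1.0 / (1.0 + np.exp(-conv))              # sigmoid
    return X * gate                                 # element-wise channel re-weighting


# Toy example: three drugs and one cell line, two triplet hyperedges.
H = np.array([[1, 0], [1, 1], [0, 1], [1, 1]], dtype=float)
w = np.ones(2)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))
Theta = rng.normal(size=(5, 3))
hidden = hgnn_conv(X, H, w, Theta)
out = eca_gate(hidden, kernel=np.array([0.25, 0.5, 0.25]))
```

Stacking several such layers gives each node context from multiple hyperedges, as S33 describes. In a trained model `Theta` and `kernel` would be learned rather than fixed.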
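Steps S35 to S37 of claim 5 can be sketched as follows. Because the formulas are missing, this assumes mean aggregation for the LightGCN-style branch, an average of element-wise products over all neighbor pairs for the bilinear branch, and a convex combination with a hypothetical weight `alpha` for the fusion. The actual normalization and fusion in the patent may differ.

```python
import itertools

import numpy as np


def linear_aggregate(neigh):
    """LightGCN-style linear aggregation: mean of neighbor features, no weights or activation."""
    return neigh.mean(axis=0)


def bilinear_aggregate(neigh, W):
    """Average element-wise product over all neighbor pairs, followed by a sigmoid."""
    proj = neigh @ W                                            # bilinear mapping of each neighbor
    pairs = list(itertools.combinations(range(len(proj)), 2))   # requires >= 2 neighbors
    inter = sum(proj[i] * proj[j] for i, j in pairs) / len(pairs)
    return 1.0 / (1.0 + np.exp(-inter))


def dual_aggregate(neigh, W, alpha=0.5):
    """Weighted fusion of the linear and bilinear branches (W must be square here)."""
    return alpha * linear_aggregate(neigh) + (1.0 - alpha) * bilinear_aggregate(neigh, W)


# Toy example: a central node with three protein neighbors of dimension 4.
rng = np.random.default_rng(1)
neigh = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4))
fused = dual_aggregate(neigh, W, alpha=0.6)
```

Setting `alpha=1.0` recovers the purely linear branch, which is a quick way to check the contribution of the pairwise interaction term.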
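The cross-view contrastive loss of claims 6 and 7 and the combined objective of claim 9 can be illustrated with an InfoNCE-style loss. The claims state only that cosine similarity and a temperature coefficient are used, so this is a sketch under assumptions: negatives are taken to be the other entities' embeddings in the opposite view, the drug and cell line losses are simply added, and the final loss is taken to be the binary cross-entropy plus a weighted contrastive term. The names `cross_view_loss`, `total_loss`, `tau` and `lam` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_loss(view_a, view_b, tau=0.5):
    """InfoNCE-style loss: entity i in view A should match entity i in view B."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / tau for other in view_b]
        peak = max(logits)                                   # stabilizes the log-sum-exp
        log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_denom - logits[i]                       # -log softmax of the positive pair
    return total / len(view_a)


def total_loss(bce, drug_cl, cell_cl, lam=0.1):
    """Assumed combination: prediction loss plus weighted contrastive loss."""
    return bce + lam * (drug_cl + cell_cl)


# Toy embeddings of three drugs in the hypergraph view and the protein context view.
hyper = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
context_aligned = [[0.9, 0.1], [0.1, 0.9], [-1.0, 0.1]]
context_shuffled = [context_aligned[1], context_aligned[2], context_aligned[0]]
loss_aligned = cross_view_loss(hyper, context_aligned)
loss_shuffled = cross_view_loss(hyper, context_shuffled)
```

The loss is lower when the two views agree on each entity than when the pairing is shuffled, which is the alignment behavior that S4 relies on.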

Description

Multi-view drug synergistic effect prediction model construction method based on contrastive learning framework

Technical Field

The invention relates to the technical field of drug synergistic effect prediction, and in particular to a method for constructing a multi-view drug synergistic effect prediction model based on a contrastive learning framework.

Background

Drug combination therapy has become an important means of treating complex diseases such as cancer, owing to its advantages of improving therapeutic efficacy, reducing the toxicity of single drugs and delaying the onset of drug resistance. With the rapid development of biomedical technology, massive multi-source biological information, such as drug target information, cell line gene expression data and protein interaction data, continues to accumulate. This provides a rich data foundation for predicting drug synergy computationally, and computational prediction of drug synergy has become a research hotspot in biomedicine.

Traditional drug synergy prediction methods mainly model pairwise relationships, such as drug-drug or drug-cell line relationships, and infer synergy from single-dimensional information such as drug molecular structure similarity or differences in cell line gene expression. However, drug synergy arises from a complex biological process involving multiple kinds of entities, including drugs, cell lines and proteins. It involves not only binary interactions between entities but also high-order associations, such as those among drugs and cell lines, and it is influenced by topological neighborhood relationships in protein interaction networks and by local intermolecular structural dependencies.

Existing methods have clear limitations. First, they ignore the high-order synergistic relationship between drug combinations and cell lines, so the complex biological mechanisms produced by multi-entity interaction are difficult to capture. Second, they do not fully fuse the complementary features in multi-source biological information and do not adequately characterize protein-mediated local structural dependence, which limits the prediction accuracy and generalization ability of the model. Third, some methods lack effective alignment and optimization of information from different views during feature learning, so the extracted features are weakly discriminative and do not accurately reflect the biological consistency among entities.

For example, application No. 202411715115.9 discloses a method for constructing a multimodal contrastive drug synergy prediction model, a prediction method and a device. That scheme can effectively reduce the influence of invalid features, in any modality, that contribute nothing or contribute negatively to drug synergy prediction, and it avoids the problem in the related art whereby treating all modalities equally impairs the accuracy of the prediction result. However, the scheme still has insufficient multi-level feature representation capability and poor high-order feature representation capability, and the discriminative power of its node embeddings needs improvement.

Therefore, there is a practical need for a drug synergy prediction method that comprehensively integrates high-order topological relationships and local protein context information and effectively captures complex synergy mechanisms through multi-view learning and feature optimization, thereby overcoming the limitations of traditional methods, improving the accuracy, robustness and generalization ability of the prediction model, and providing strong support for the systematic identification of clinically promising synergistic drug combinations.

Disclosure of Invention

In view of the above problems, the invention aims to provide a method for constructing a multi-view drug synergistic effect prediction model based on a contrastive learning framework, which predicts the synergistic effect of drug combinations in a cell line by constructing a high-order collaborative association hypergraph view and a protein interaction context view and combining multi-view contrastive learning with a bilinear attention mechanism.

An embodiment of the invention provides a method for constructing a multi-view drug synergistic effect prediction model based on a contrastive learning framework, comprising the following steps: S1, integrating multi-source biological information on drugs, cell lines and proteins to construct a unified hybrid graph; S2, respectively extracting a high-order collaborative association hypergraph view and a protein interaction context view based