CN-116401448-B - Data encoding method, device, electronic equipment and medium

Abstract

The invention discloses a data encoding method, a data encoding apparatus, an electronic device, and a medium, and relates to the field of computer technology. The method comprises: constructing a first graph network based on a first search path and first position information of a job seeker, wherein the first position information is the job position information in the recruitment information clicked by the job seeker; constructing a second graph network based on a second search path and second position information of a recruiter, wherein the second position information is the job-seeking information in the resume information clicked by the recruiter; fusing the first graph network and the second graph network to obtain a fused graph network; sampling nodes in the fused graph network based on a random walk strategy to obtain a plurality of node sequences; and performing vector learning on the plurality of node sequences to generate coding vectors for the nodes in the fused graph network. By building the graph network from information on both the recruitment side and the job-seeking side, the method strengthens the correlation among search words, improves encoding accuracy, and helps improve the performance of downstream task algorithms.

Inventors

  • HUANG YONGCONG

Assignees

  • 北京五八赶集信息技术有限公司

Dates

Publication Date
20260512
Application Date
20230314

Claims (11)

  1. A method of encoding data, comprising: constructing a first graph network based on a first search path of a job hunting terminal and first position information related to the job hunting terminal, wherein the first position information is position information in recruitment information clicked by the job hunting terminal; constructing a second graph network based on a second search path of a recruitment terminal and second position information related to the recruitment terminal, wherein the second position information is job seeking information in resume information clicked by the recruitment terminal; fusing the first graph network and the second graph network to obtain a fused graph network; normalizing the weights of the connecting edges in the fused graph network, including: normalizing the weight of a connecting edge between a starting node and an adjacent node, and determining the direction of the connecting edge as pointing from the starting node to the adjacent node; sampling nodes in the fused graph network to obtain a plurality of node sequences, wherein the sampling is based on a random walk strategy, the direction of the connecting edges, and the normalized weights; and carrying out vector learning by using the plurality of node sequences to generate the coding vectors of the nodes in the fused graph network.
  2. The method of claim 1, wherein the constructing a first graph network based on the first search path of the job hunting terminal and the first position information related to the job hunting terminal comprises: responding to an input operation of the job hunting terminal, and acquiring the search words input by the job hunting terminal; acquiring the first search path of the job hunting terminal based on the sequence of the search words input by the job hunting terminal; responding to a clicking operation of the job hunting terminal, determining the recruitment information clicked by the job hunting terminal, and acquiring the first position information in the recruitment information; and using the search words in the first search path and the first position information as nodes, connecting adjacent search words in the first search path, and connecting the first position information and the search words corresponding to the first position information to construct the first graph network.
  3. The method according to claim 2, wherein the method further comprises: determining the weight of a connecting edge between search words in the first graph network as a first weight; and determining the weight of a connecting edge between the first position information and the search word corresponding to the first position information in the first graph network as a second weight, wherein the second weight is smaller than the first weight.
  4. The method of claim 3, wherein the constructing a second graph network based on the second search path of the recruitment terminal and the second position information related to the recruitment terminal comprises: responding to an input operation of the recruitment terminal, and acquiring the search words input by the recruitment terminal; acquiring the second search path of the recruitment terminal based on the sequence of the search words input by the recruitment terminal; responding to a clicking operation of the recruitment terminal, determining the resume information clicked by the recruitment terminal, and acquiring the second position information in the resume information; and using the search words in the second search path and the second position information as nodes, connecting the search words in the second search path, and connecting the second position information and the search words corresponding to the second position information to construct the second graph network.
  5. The method according to claim 4, wherein the method further comprises: determining the weight of a connecting edge between the search words in the second graph network as a third weight; and determining the weight of a connecting edge between the second position information and the search word corresponding to the second position information in the second graph network as a fourth weight, wherein the fourth weight is smaller than the third weight.
  6. The method of claim 5, wherein the first weight is equal to the third weight.
  7. The method of claim 5 or 6, wherein the fusing the first graph network and the second graph network comprises: for a first target search word connected with first position information in the first graph network, taking the job hunting terminal that input the first target search word as a target job hunting terminal, and determining whether the recruitment terminal that issued the first position information has clicked the resume information of the target job hunting terminal; if the recruitment terminal that issued the first position information has clicked the resume information of the target job hunting terminal, taking the search word input by that recruitment terminal as a second target search word; and connecting the first target search word in the first graph network and the second target search word in the second graph network to fuse the first graph network and the second graph network.
  8. The method of claim 7, wherein the method further comprises: determining the weight of a connecting edge that connects the first graph network and the second graph network as a fifth weight, the fifth weight being greater than the first weight and greater than the third weight.
  9. A data encoding apparatus, comprising: a first construction module for constructing a first graph network based on a first search path of a job hunting terminal and first position information related to the job hunting terminal, wherein the first position information is position information in recruitment information clicked by the job hunting terminal; a second construction module for constructing a second graph network based on a second search path of a recruitment terminal and second position information related to the recruitment terminal, wherein the second position information is job seeking information in resume information clicked by the recruitment terminal; a fusion module for fusing the first graph network and the second graph network to obtain a fused graph network; a normalization module for normalizing the weights of the connecting edges in the fused graph network; a sampling module for determining a starting node and the adjacent nodes of the starting node, normalizing the weights of the connecting edges between the starting node and the adjacent nodes, and determining the direction of the connecting edges as pointing from the starting node to the adjacent nodes, the sampling module being further used for sampling nodes in the fused graph network based on a random walk strategy, the direction of the connecting edges, and the normalized weights to obtain a plurality of node sequences; and a coding module for carrying out vector learning by using the plurality of node sequences and generating the coding vectors of the nodes in the fused graph network.
  10. An electronic device, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-8.
  11. A computer readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any of claims 1-8.
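
The normalization and sampling steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the node labels, the toy graph, and the raw weight values are hypothetical and were chosen only so that they respect the orderings stated in claims 3, 5, 6 and 8 (position edges lighter than search-word edges, fusion edges heaviest). Each undirected connecting edge is treated as a directed edge from the starting node to each adjacent node, and the outgoing weights of every node are scaled to sum to 1 before the walk.

```python
import random
from collections import defaultdict

# Hypothetical raw weights that only respect the orderings in claims 3, 5, 6 and 8.
W_SEARCH = 2.0    # first weight == third weight (search word <-> search word)
W_POSITION = 1.0  # second / fourth weight (position information <-> search word)
W_FUSION = 3.0    # fifth weight (edge joining the two graph networks)

# Toy fused graph (assumed data): undirected connecting edges with raw weights.
EDGES = {
    ("java", "java backend"): W_SEARCH,                    # job-hunting side
    ("java backend", "job:backend engineer"): W_POSITION,
    ("spring developer", "senior java"): W_SEARCH,         # recruitment side
    ("senior java", "resume:java engineer"): W_POSITION,
    ("java backend", "senior java"): W_FUSION,             # fusion edge
}


def normalize(edges):
    """Return {start: {neighbor: probability}} with each row summing to 1."""
    neighbors = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbors[u][v] = weight  # directed u -> v
        neighbors[v][u] = weight  # directed v -> u
    transitions = {}
    for start, out in neighbors.items():
        total = sum(out.values())
        transitions[start] = {v: w / total for v, w in out.items()}
    return transitions


def random_walk(transitions, start, length, rng):
    """Sample one node sequence by following normalized outgoing weights."""
    walk = [start]
    while len(walk) < length:
        out = transitions.get(walk[-1], {})
        if not out:
            break  # dead end: no outgoing edge to follow
        walk.append(rng.choices(list(out), weights=list(out.values()), k=1)[0])
    return walk


transitions = normalize(EDGES)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
walks = [random_walk(transitions, node, 5, rng) for node in transitions]
```

In this toy graph the node "java backend" has raw outgoing weights 2, 1 and 3, so a walk leaving it follows the fusion edge with probability 3/6. That is the intended effect of giving the fusion edge the largest weight: walks cross between the job-hunting and recruitment sides often.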

Description

Data encoding method, device, electronic equipment and medium

Technical Field

The present invention relates to the field of computer technology, and in particular to a data encoding method, apparatus, electronic device, and medium.

Background

Embedding refers to representing an object with a numeric vector. The graph is a basic and common structure, and many scenarios can be abstracted into a graph structure, such as the relationship between users and items on an e-commerce platform. Graph embedding (graph coding) refers to expressing the nodes of a graph as low-dimensional dense vectors such that nodes that are similar in the graph are also close in the low-dimensional space; the resulting vectors can be used for downstream tasks such as node classification, link prediction, visualization, or reconstruction. In a search scenario, users' search words form a graph network structure through clicking, delivery, and the like. Starting from one node, a plurality of paths are built by random walk, the search words on each path form a sentence, and vector representations (namely coding vectors) of the search words are then trained from their context.

At present, most search scenarios match users with objects. Recruitment scenarios, however, are special in that both sides search: job seekers and recruiters each enter search words. The prior art considers only the search words of job seekers and ignores those of recruiters, so semantic information related to recruiters and the diversity of the text are lost, and the trained vector representations are not accurate enough, which hinders downstream data mining tasks.

Disclosure of Invention

In order to solve, or at least partially solve, the above technical problems, embodiments of the present invention provide a data encoding method, apparatus, electronic device, and medium.
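
As a rough illustration of the Background's remark that each random-walk path is treated as a sentence and vectors are trained from context, the sketch below turns one node sequence into (center, context) training pairs. The skip-gram style pairing and the `window` parameter are assumptions made for illustration; this section does not name the model used for vector learning.

```python
def context_pairs(walk, window=2):
    """List (center, context) pairs for every node in one walk.

    `window` is a hypothetical parameter: the number of positions on each
    side of the center node that count as its context.
    """
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# One walk over search words, read as a three-word "sentence".
pairs = context_pairs(["java", "java backend", "senior java"], window=1)
```

Pairs like these would then be fed to whichever embedding model is chosen, so that search words appearing in similar walk contexts end up with nearby vectors.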
In a first aspect, an embodiment of the present invention provides a data encoding method, including: constructing a first graph network based on a first search path of a job hunting terminal and first position information related to the job hunting terminal, wherein the first position information is position information in recruitment information clicked by the job hunting terminal; constructing a second graph network based on a second search path of a recruitment terminal and second position information related to the recruitment terminal, wherein the second position information is job seeking information in resume information clicked by the recruitment terminal; fusing the first graph network and the second graph network to obtain a fused graph network; sampling nodes in the fused graph network based on a random walk strategy to obtain a plurality of node sequences; and carrying out vector learning by using the plurality of node sequences to generate the coding vectors of the nodes in the fused graph network.

Optionally, constructing the first graph network based on the first search path of the job hunting terminal and the first position information related to the job hunting terminal comprises: acquiring, in response to an input operation of the job hunting terminal, the search words input by the job hunting terminal; acquiring the first search path of the job hunting terminal based on the sequence of the search words input by the job hunting terminal; determining, in response to a clicking operation of the job hunting terminal, the recruitment information clicked by the job hunting terminal, and acquiring the first position information in the recruitment information; and using the search words in the first search path and the first position information as nodes, connecting adjacent search words in the first search path, and connecting the first position information and the search words corresponding to the first position information to construct the first graph network.
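
The first-graph construction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `build_first_graph`, the `job:` prefix used to keep position nodes distinct from search-word nodes, and the default weight values are all hypothetical. The weights follow only the ordering in claim 3 (position edge lighter than search-word edge).

```python
def build_first_graph(search_path, clicks, first_weight=2.0, second_weight=1.0):
    """Build undirected weighted edges for one job-hunting terminal.

    search_path: search words in the order they were entered.
    clicks: (search_word, position) pairs, where `position` is the position
            information of a recruitment posting clicked after that search.
    Returns {(node_a, node_b): weight} with each key sorted for uniqueness.
    """
    edges = {}
    # Connect adjacent search words along the search path.
    for prev, nxt in zip(search_path, search_path[1:]):
        if prev != nxt:  # skip a repeated query instead of adding a self-loop
            edges[tuple(sorted((prev, nxt)))] = first_weight
    # Connect each clicked position to the search word that led to it.
    for word, position in clicks:
        edges[tuple(sorted((word, "job:" + position)))] = second_weight
    return edges


graph = build_first_graph(
    ["java", "java backend", "spring"],
    [("java backend", "backend engineer")],
)
```

The second graph network could be built the same way from the recruitment terminal's search path and the job seeking information in clicked resumes, using a different node prefix so the two sides stay distinguishable before fusion.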
Optionally, the method further comprises: determining the weight of a connecting edge between search words in the first graph network as a first weight; and determining the weight of a connecting edge between the first position information and the search word corresponding to the first position information in the first graph network as a second weight, wherein the second weight is smaller than the first weight.

Optionally, constructing the second graph network based on the second search path of the recruitment terminal and the second position information related to the recruitment terminal comprises: acquiring, in response to an input operation of the recruitment terminal, the search words input by the recruitment terminal; acquiring the second search path of the recruitment terminal based on the sequence of the search words input by the recruitment terminal; determining, in response to a clicking operation of the recruitment terminal, the resume information clicked by the recruitment terminal, and acquiring the second position information in the resume information; and using the search words in the second search path and the second position information as nodes, connecting the search words in the second search path, and connecting the second position information with the search words corresponding to the second