CN-121981114-A - Recruitment information standardization processing method and device
Abstract
The application relates to the technical field of information processing and discloses a recruitment information standardization processing method and device. The method comprises: obtaining a job publication data object containing a job identifier, a field version number and a structured field; matching the current field semantic version node according to the field version number; calculating an optimal evolution path to a target node based on a field evolution graph containing semantic version nodes and evolution paths with edge weights; performing version conversion on the structured field along the path to obtain a unified field object; generating a standard structure object conforming to a standard field structure; analyzing free text to obtain a semantic entity set; constructing, based on a constraint rule base, a semantic constraint graph containing field nodes, text entity nodes and edges; and detecting conflicts and correcting the standard structure object. The method achieves cross-version semantic unification of recruitment information, improves the accuracy and efficiency of data processing, and makes data statistics more reliable.
Inventors
- XU MIN
- WU HONGYU
- ZHANG WENJUAN
- XU HAN
- WU LINWEI
- GUO BO
- XUE XIAOJIE
- XU QIUXIA
- XIA CHENGJING
Assignees
- 浙江禾记电子科技有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-04-03
Claims (10)
- 1. A recruitment information standardization processing method, characterized by comprising the steps of: obtaining a job publication data object, wherein the job publication data object comprises a job identifier, a field version number and a structured field; querying a pre-stored field version mapping table with the field version number as the retrieval keyword, and matching to obtain a current field semantic version node corresponding to the structured field; calculating an optimal evolution path from the current field semantic version node to a predefined target field semantic version node based on a field evolution graph, wherein the field evolution graph comprises a plurality of field semantic version nodes and evolution paths connecting adjacent field semantic version nodes, and each evolution path has a corresponding edge weight; sequentially executing version conversion processing on the structured field according to the traversal order of the field semantic version nodes in the optimal evolution path to obtain a unified field object, wherein the version conversion processing is performed based on the conversion rule corresponding to the adjacent field semantic version node pair currently being processed; generating, based on the unified field object, a standard structure object conforming to the standard field structure of the target field semantic version node; carrying out semantic analysis on the free text in the job publication data object to obtain a semantic entity set; constructing a semantic constraint graph, wherein the nodes of the semantic constraint graph comprise field nodes obtained by converting the standard fields in the standard structure object and text entity nodes obtained by converting the semantic entities in the semantic entity set, and the edges of the semantic constraint graph are constructed based on a semantic constraint rule base; carrying out conflict detection on the semantic constraint graph to obtain a conflict list; and carrying out correction processing on the standard structure object based on the conflict list.
- 2. The recruitment information standardization processing method according to claim 1, wherein the edge weight is determined by: obtaining a first semantic definition text corresponding to a first node and a second semantic definition text corresponding to a second node in an adjacent field semantic version node pair; inputting the first semantic definition text and the second semantic definition text respectively into a pre-trained Sentence-BERT model to obtain a first semantic vector and a second semantic vector; calculating a semantic deviation degree based on the first semantic vector and the second semantic vector; obtaining a first constraint rule number corresponding to the first node and a second constraint rule number corresponding to the second node; calculating a rule difference degree based on the first constraint rule number and the second constraint rule number; and determining the edge weight of the evolution path between the adjacent field semantic version node pair based on a weighted sum of the semantic deviation degree and the rule difference degree.
- 3. The recruitment information standardization processing method according to claim 1, wherein calculating the optimal evolution path comprises: initializing a path weight table, setting the accumulated path weight of the current field semantic version node to 0 and the accumulated path weights of all other field semantic version nodes in the field evolution graph to infinity; traversing the adjacent field semantic version nodes with the current field semantic version node as the starting point, and updating the accumulated path weight of each adjacent field semantic version node in the path weight table based on the sum of the accumulated path weight of the current node and the edge weight of the corresponding evolution path; selecting, from the path weight table, the unvisited node with the smallest accumulated path weight as the new current node; repeating the traversing and updating steps until the new current node is the target field semantic version node; and backtracking to obtain the path with the smallest accumulated path weight from the current field semantic version node to the target field semantic version node as the optimal evolution path.
- 4. The recruitment information standardization processing method according to claim 1, wherein carrying out semantic analysis on the free text in the job publication data object to obtain the semantic entity set comprises: inputting the free text into a RoBERTa-CRF named entity recognition model to obtain an entity sequence, wherein the entity sequence comprises semantic entities and their corresponding entity types; obtaining a triplet list with a Biaffine relation extraction model, wherein each triplet in the triplet list comprises a first semantic entity, a second semantic entity and the semantic relation between the first semantic entity and the second semantic entity; and integrating the entity sequence and the triplet list to obtain the semantic entity set.
- 5. The recruitment information standardization processing method according to claim 1, wherein constructing the semantic constraint graph comprises: converting each standard field in the standard structure object into a field node, the node attributes of the field node including a field name, a field value and a field source; converting each semantic entity in the semantic entity set into a text entity node, the node attributes of the text entity node including an entity name, an entity type and an entity value; traversing the field nodes and the text entity nodes, and constructing, based on a semantic constraint rule base, a corresponding edge for each node pair that has a corresponding constraint rule, the type of the edge being a numerical constraint edge, a proportional constraint edge, a mutually exclusive edge or a time-dependent edge; and integrating the field nodes, the text entity nodes and the constructed edges in adjacency list form to generate the semantic constraint graph.
- 6. The recruitment information standardization processing method according to claim 5, wherein carrying out conflict detection on the semantic constraint graph to obtain the conflict list comprises: carrying out a depth-first search over the mutually exclusive edges in the semantic constraint graph, detecting any closed-loop path formed by connected mutually exclusive edges, and recording the closed-loop path as a mutually exclusive edge closed-loop conflict; traversing the numerical constraint edges or proportional constraint edges in the semantic constraint graph, obtaining the corresponding first node value and second node value, calculating a theoretical value according to the constraint rule of the numerical constraint edge or proportional constraint edge, calculating the deviation between the theoretical value and the actual value, and recording a numerical interval conflict if the deviation is larger than the rule threshold of the numerical constraint edge or proportional constraint edge; and integrating the mutually exclusive edge closed-loop conflicts and the numerical interval conflicts to generate the conflict list.
- 7. The recruitment information standardization processing method according to claim 1, wherein the correction processing of the standard structure object based on the conflict list comprises: determining, for each conflict in the conflict list, the field node and the text entity node involved in that conflict; calculating the field credibility of the field node and the text credibility of the text entity node; if the text credibility is greater than the field credibility, modifying the field value of the corresponding field node in the standard structure object according to the entity value of the text entity node; and if the field credibility is greater than the text credibility, recording a text conflict label for the corresponding text entity node in the semantic entity set.
- 8. The recruitment information standardization processing method according to claim 7, wherein the field credibility is determined by: obtaining, as a version stability, the ratio of the conflict-free operation duration to the total operation duration of the field semantic version corresponding to the field node within a preset time range; obtaining the historical values of the field node in the historical recruitment information of the enterprise that issued the job publication data object, and calculating the average similarity between the historical values and the current field value as an enterprise historical consistency rate; and determining the field credibility based on a weighted sum of the version stability and the enterprise historical consistency rate.
- 9. The recruitment information standardization processing method according to claim 7, wherein the text credibility is determined by: obtaining, as a model credibility, the mean of the entity credibility output by the named entity recognition model and the relation credibility output by the relation extraction model when the text entity node was extracted from the free text; obtaining, as a context integrity, the ratio of the text length corresponding to the text entity node in the free text to a preset text length; and determining the text credibility based on the product of the model credibility and the context integrity.
- 10. A recruitment information standardization processing device, characterized by comprising: a data acquisition module, used for obtaining a job publication data object, wherein the job publication data object comprises a job identifier, a field version number and a structured field; a version matching module, used for querying a pre-stored field version mapping table with the field version number as the retrieval keyword to obtain a current field semantic version node corresponding to the structured field; a path calculation module, used for calculating an optimal evolution path from the current field semantic version node to a predefined target field semantic version node based on a field evolution graph, wherein the field evolution graph comprises a plurality of field semantic version nodes and evolution paths connecting adjacent field semantic version nodes, and each evolution path has a corresponding edge weight; a version conversion module, used for sequentially executing version conversion processing on the structured field according to the traversal order of the field semantic version nodes in the optimal evolution path to obtain a unified field object, wherein the version conversion processing is performed based on the conversion rule corresponding to the adjacent field semantic version node pair currently being processed; a generation module, used for generating a standard structure object conforming to the standard field structure of the target field semantic version node; a text analysis module, used for carrying out semantic analysis on the free text in the job publication data object to obtain a semantic entity set; a graph construction module, used for constructing a semantic constraint graph, wherein the nodes of the semantic constraint graph comprise field nodes obtained by converting the standard fields in the standard structure object and text entity nodes obtained by converting the semantic entities in the semantic entity set, and the edges of the semantic constraint graph are constructed based on a semantic constraint rule base; and a conflict detection and correction module, used for carrying out conflict detection on the semantic constraint graph to obtain a conflict list and carrying out correction processing on the standard structure object based on the conflict list.
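The edge weighting of claim 2 and the path search of claim 3 can be illustrated with a minimal Python sketch. Several details here are assumptions not stated in the claims: the semantic deviation degree is taken as cosine distance, the rule difference degree is normalized by the larger rule count, the weights `alpha` and `beta` are arbitrary, and plain lists stand in for Sentence-BERT vectors. The function names `edge_weight` and `optimal_evolution_path` and the node names `v1` to `v3` are hypothetical. The search uses a priority queue, which is a common way to realize the "select the unvisited node with the smallest accumulated weight" step.

```python
import heapq
import math


def cosine_distance(u, v):
    """1 - cosine similarity; an assumed measure of semantic deviation."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def edge_weight(vec_a, vec_b, rules_a, rules_b, alpha=0.6, beta=0.4):
    """Weighted sum of semantic deviation and rule difference (weights assumed)."""
    semantic_deviation = cosine_distance(vec_a, vec_b)
    # Assumed normalization: difference in rule counts relative to the larger count.
    rule_difference = abs(rules_a - rules_b) / max(rules_a, rules_b, 1)
    return alpha * semantic_deviation + beta * rule_difference


def optimal_evolution_path(graph, start, target):
    """Return (path, accumulated weight); graph maps node -> {neighbor: weight}."""
    dist = {node: math.inf for node in graph}  # the path weight table
    dist[start] = 0.0
    prev, visited = {}, set()
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)  # unvisited node with the smallest weight
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nxt, w in graph[node].items():
            if d + w < dist[nxt]:
                dist[nxt] = d + w
                prev[nxt] = node
                heapq.heappush(heap, (d + w, nxt))
    if dist[target] == math.inf:
        return None, math.inf  # target not reachable from start
    path = [target]
    while path[-1] != start:  # backtrack through the predecessor table
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical three-version graph; every node must appear as a key.
graph = {"v1": {"v2": 0.2, "v3": 0.9}, "v2": {"v3": 0.3}, "v3": {}}
path, cost = optimal_evolution_path(graph, "v1", "v3")
print(path, cost)
```

In this example the two-hop route through `v2` is preferred over the direct edge because its accumulated weight is lower, so the structured field would be converted one version at a time along that route.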
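The graph construction of claim 5 and the conflict detection of claim 6 can be sketched as follows. This is a simplified illustration under stated assumptions: nodes are reduced to a name-to-value mapping, a constraint rule is modeled as "expected target value = `factor` × source value" with a `threshold`, and mutually exclusive edges are treated as undirected. The `Edge` class, the function names, and the node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    src: str
    dst: str
    kind: str               # "numerical", "proportional", "mutex" or "time"
    factor: float = 1.0     # assumed rule form: expected dst value = factor * src value
    threshold: float = 0.0  # maximum tolerated deviation for this rule


def mutex_cycles(nodes, edges):
    """Depth-first search over mutex edges; returns each closed loop as a node list."""
    adj = {name: [] for name in nodes}
    for e in edges:
        if e.kind == "mutex":
            adj[e.src].append(e.dst)
            adj[e.dst].append(e.src)
    visited, cycles = set(), []

    def dfs(node, parent, path):
        visited.add(node)
        path.append(node)
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in path:  # back edge to an ancestor closes a loop
                cycles.append(path[path.index(nxt):] + [nxt])
            elif nxt not in visited:
                dfs(nxt, node, path)
        path.pop()

    for name in nodes:
        if name not in visited:
            dfs(name, None, [])
    return cycles


def numeric_conflicts(nodes, edges):
    """Compare the theoretical value on each numeric edge with the actual value."""
    found = []
    for e in edges:
        if e.kind not in ("numerical", "proportional"):
            continue
        theoretical = e.factor * nodes[e.src]
        deviation = abs(theoretical - nodes[e.dst])
        if deviation > e.threshold:
            found.append((e.src, e.dst, deviation))
    return found


def detect_conflicts(nodes, edges):
    """Integrate both conflict kinds into one conflict list."""
    conflicts = [{"type": "mutex_cycle", "path": p} for p in mutex_cycles(nodes, edges)]
    conflicts += [{"type": "value_range", "src": s, "dst": d, "deviation": dev}
                  for s, d, dev in numeric_conflicts(nodes, edges)]
    return conflicts


# Hypothetical nodes: "field:" from the standard structure object, "text:" from free text.
nodes = {"field:monthly_salary": 10000, "text:annual_salary": 150000,
         "field:full_time": None, "text:part_time": None, "text:hourly_paid": None}
edges = [
    Edge("field:monthly_salary", "text:annual_salary", "proportional", factor=12, threshold=1000),
    Edge("field:full_time", "text:part_time", "mutex"),
    Edge("text:part_time", "text:hourly_paid", "mutex"),
    Edge("text:hourly_paid", "field:full_time", "mutex"),
]
result = detect_conflicts(nodes, edges)
print(result)
```

Here the annual salary stated in the free text deviates from twelve times the structured monthly salary by more than the threshold, and the three mutex edges form one closed loop, so the conflict list contains two entries.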
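The credibility comparison of claims 7 to 9 can also be sketched in a few lines of Python. The claims do not specify the similarity measure, the weights, or whether the context integrity is capped, so the numeric `similarity` function, the equal weights, and the cap at 1.0 below are assumptions made for illustration. All function and key names are hypothetical.

```python
from statistics import mean


def similarity(a, b):
    """Assumed numeric similarity in [0, 1] for values of the same sign."""
    if a == b:
        return 1.0
    return 1.0 - abs(a - b) / max(abs(a), abs(b))


def field_credibility(conflict_free, total, history, current,
                      w_stability=0.5, w_history=0.5):
    """Weighted sum of version stability and enterprise historical consistency."""
    version_stability = conflict_free / total
    consistency = mean(similarity(h, current) for h in history) if history else 0.0
    return w_stability * version_stability + w_history * consistency


def text_credibility(entity_conf, relation_conf, text_len, preset_len):
    """Product of model credibility and context integrity."""
    model_credibility = (entity_conf + relation_conf) / 2
    context_integrity = min(1.0, text_len / preset_len)  # cap is an assumption
    return model_credibility * context_integrity


def apply_correction(standard_object, entity, field_name, f_cred, t_cred):
    """Overwrite the field if the text is more credible, else label the entity."""
    if t_cred > f_cred:
        standard_object[field_name] = entity["value"]
    elif f_cred > t_cred:
        entity["conflict_label"] = True
    return standard_object, entity


# Hypothetical conflict between a structured salary field and a free-text salary.
f_cred = field_credibility(900, 1000, [10000, 10000], 10000)
t_cred = text_credibility(0.8, 0.6, 20, 40)
obj, ent = apply_correction({"salary": 10000}, {"value": 12000}, "salary", f_cred, t_cred)
print(f_cred, t_cred, obj, ent)
```

With these example inputs the field credibility exceeds the text credibility, so the structured value is kept and the text entity receives a conflict label. The claims do not say what happens when the two credibilities are equal; the sketch leaves both sides unchanged in that case.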
Description
Recruitment information standardization processing method and device
Technical Field
The application relates to the technical field of information processing, and in particular to a recruitment information standardization processing method and device.
Background
In practical applications of recruitment information management systems, standardized processing of recruitment information is a key step in ensuring accurate data statistics and analysis and in improving system operating efficiency. However, recruitment information processing in the prior art still has a number of unsolved problems.
On the one hand, within the same job publication version, latent semantic conflicts, mutually exclusive benefit statements, inconsistent time conditions and similar problems readily arise between the structured fields and the free-text description. Existing systems can only verify that fields are filled in; they cannot perform closed-loop semantic logic detection between the structured fields and the free text, so contradictions at the semantic level are difficult to identify and correct.
On the other hand, as systems are upgraded and policies adjusted, the field semantics of recruitment information keep evolving: the expression forms and definition standards of fields such as salary, age and benefits change, and historical data cannot be made consistent with the new semantic standard. As a result, data statistics are distorted, conflict detection rules become invalid, and year-over-year comparison of data for the same post loses its reference value.
In addition, in the prior art, field version evolution processing and semantic conflict detection are executed independently, and there is no mechanism linking the two. The internal logical consistency of data after field evolution conversion therefore cannot be guaranteed, and new semantic conflicts are easily introduced. It is difficult to build a recruitment information standardization framework that evolves dynamically while keeping a stable logical closed loop, which restricts the intelligent development of recruitment information management systems and the full use of data value.
Disclosure of Invention
The application aims to provide a recruitment information standardization processing method and device to solve the problems described in the background.
According to a first aspect of the present application, there is provided a recruitment information standardization processing method, comprising the steps of: obtaining a job publication data object, wherein the job publication data object comprises a job identifier, a field version number and a structured field; querying a pre-stored field version mapping table with the field version number as the search keyword, and matching to obtain a current field semantic version node corresponding to the structured field; calculating an optimal evolution path from the current field semantic version node to a predefined target field semantic version node based on a field evolution graph, wherein the field evolution graph comprises a plurality of field semantic version nodes and evolution paths connecting adjacent field semantic version nodes, and each evolution path has a corresponding edge weight; sequentially executing version conversion processing on the structured field according to the traversal order of the field semantic version nodes in the optimal evolution path to obtain a unified field object, wherein the version conversion processing is performed based on the conversion rule corresponding to the adjacent field semantic version node pair currently being processed; generating, based on the unified field object, a standard structure object conforming to the standard field structure of the target field semantic version node; carrying out semantic analysis on the free text in the job publication data object to obtain a semantic entity set; constructing a semantic constraint graph, wherein the nodes of the semantic constraint graph comprise field nodes obtained by converting the standard fields in the standard structure object and text entity nodes obtained by converting the semantic entities in the semantic entity set, and the edges of the semantic constraint graph are constructed based on a semantic constraint rule base; carrying out conflict detection on the semantic constraint graph to obtain a conflict list; and carrying out correction processing on the standard structure object based on the conflict list.
Preferably, a first semantic definition text corresponding to a first node and a second semantic definition text corresponding to a second node in the adjacent field semantic version node pair are obtained, the first semantic definition text and the second semantic definition text are respectively input into a pre-trained Sentence-BERT model to obtain a first semantic vector and a second semantic vector, a semantic deviation degree is calculated based on the first semantic vector and the second semantic vector, a first constraint rule number correspondin