
CN-121996715-A - Category knowledge graph construction and reasoning method for commodity classification


Abstract

The invention discloses a category knowledge graph construction and reasoning method for commodity classification. The method comprises: acquiring commodity data from multiple sources to generate attribute-category structured data pairs; constructing a knowledge graph and a calibration reference library through association mining; generating three types of abnormal samples based on the graph's association weights to build a training set and calculate a strategy threshold; constructing a dual-drive mechanism that combines graph reasoning with deep learning and, when the classification confidence is lower than the threshold, calls the association weights to correct the result and output a response coefficient; dynamically updating the association weights by sliding weighting of the correction results; reconstructing the sample generation rules according to the optimization effect to regenerate the training set and output a performance index; optimizing the fusion weight based on that training set and, if the performance does not reach the standard, triggering a bypass mechanism that backtracks and adjusts the association weights; and updating the reference library into an evolution strategy library through reinforcement learning, dynamically selecting an optimal self-calibration path to obtain a target knowledge graph. The invention achieves closed-loop self-evolution of knowledge graph construction and reasoning, and significantly improves commodity classification accuracy and anomaly identification capability.

Inventors

  • WANG BINBIN

Assignees

  • 无线生活(北京)信息技术有限公司

Dates

Publication Date
20260508
Application Date
20260129

Claims (10)

  1. A category knowledge graph construction and reasoning method for commodity classification, characterized by comprising the following steps:
     Step S1, collecting commodity data through multi-source channels, and generating attribute-category structured data pairs after preprocessing;
     Step S2, automatically extracting color, material and functional attributes based on the structured data pairs, calculating support, confidence and lift with an association mining algorithm to construct a knowledge graph containing association weights, and synchronously generating an initial calibration reference library;
     Step S3, generating three types of abnormal samples (mislabeled, missed-label and counterexample-label) based on the association weights, constructing a training set from the screened abnormal samples together with normal samples, and calculating a strategy threshold parameter;
     Step S4, constructing a dual-drive reasoning mechanism, performing association traversal and confidence calculation on input commodity data based on the initial knowledge graph to generate a classification result, and, when the comprehensive confidence of the classification result is smaller than the strategy threshold parameter, calling the association weights to correct the classification result and output an optimized response coefficient;
     Step S5, dynamically updating the association weights with a sliding weighting algorithm based on the correction result and the optimized response coefficient to obtain an optimization effect metric;
     Step S6, reconstructing an abnormal sample generation rule parameter set based on the optimization effect metric to adjust the generation proportion and variation intensity of the three types of abnormal samples, obtaining an optimized training set and outputting a system efficiency index;
     Step S7, optimizing the fusion weight of the dual-drive reasoning mechanism based on the optimized training set and the system efficiency index, and, if the system efficiency index does not reach the quality standard, triggering a bypass mechanism that directly backtracks and adjusts the association weights to form a system straight-through calibration;
     Step S8, updating the initial calibration reference library into an evolution strategy library with a reinforcement learning algorithm based on the optimized response coefficient, the optimization effect metric, the system efficiency index, the system straight-through calibration and the initial calibration reference library, and dynamically selecting an optimal self-calibration path to obtain a target knowledge graph.
  2. The category knowledge graph construction and reasoning method for commodity classification according to claim 1, wherein calculating support, confidence and lift with the association mining algorithm to construct the knowledge graph containing the association weights comprises: classifying and filtering candidate attribute combinations by introducing prior knowledge of the category distribution; mining multi-attribute joint distribution characteristics with a frequent pattern growth algorithm; calculating the support to measure attribute-category co-occurrence frequency, the confidence to evaluate conditional probability strength, and the lift to judge the significance of the association, and fusing the three into a multi-dimensional association evaluation system; and constructing a hierarchical topology of the knowledge graph based on the association evaluation system, in which nodes are divided into an attribute node domain and a category node domain, edges are divided into single-attribute mapping edges and combined-attribute mapping edges, and the association weight is generated as the product of the confidence and the lift.
  3. The category knowledge graph construction and reasoning method for commodity classification according to claim 2, wherein step S3 comprises: constructing generation rules for the three types of abnormal samples based on the association weights, wherein a mislabeled sample is produced by replacing the correct category with a low-weight category, a missed-label sample is constructed by randomly deleting a core attribute, a counterexample-label sample is generated by combining zero-weight attribute-category pairs, and the generation proportions are dynamically allocated according to the distribution characteristics of the association weights; screening the three types of abnormal samples with a sample validity evaluation mechanism that quantifies how far a sample deviates from the normal distribution by calculating its degree of abnormality; mixing the screened abnormal samples with the normal samples in proportion to construct a layered training set; and calculating the strategy threshold parameter with a statistical measurement method based on the confidence distribution of the normal samples in the training set and the degree of dispersion of the abnormal samples.
  4. The category knowledge graph construction and reasoning method for commodity classification according to claim 3, wherein calculating the strategy threshold parameter with a statistical measurement method based on the confidence distribution of the normal samples in the training set and the degree of dispersion of the abnormal samples comprises: performing layered feature statistics on the training set, and calculating the confidence distribution features of the normal sample layer and of the abnormal sample layer separately; calculating a differential metric value between the sample layers by weighted fusion of the distribution dispersion of the normal sample layer and the deviation index of the abnormal sample layer; introducing a time decay factor to dynamically adjust the differential metric value, wherein the time decay factor is calculated from the interval between two adjacent rounds of strategy self-calibration; combining a preset reference threshold from the initial calibration reference library, and performing weighted fusion of the differential metric value and the reference threshold with a sliding window mechanism to generate a dynamic threshold intermediate value; and applying bidirectional constraint verification to the dynamic threshold intermediate value to obtain the strategy threshold parameter.
  5. The category knowledge graph construction and reasoning method for commodity classification according to claim 4, wherein step S4 comprises: performing attribute-category association traversal on the commodity data according to the association weights to generate a graph confidence; performing feature extraction on the training set with a deep learning model to generate an AI confidence; dynamically fusing the graph confidence and the AI confidence according to attribute completeness to obtain a comprehensive confidence and a preliminary classification result; and, when the comprehensive confidence is smaller than the strategy threshold parameter, invoking a verification rule to correct the classification result and outputting an optimized response coefficient.
  6. The category knowledge graph construction and reasoning method for commodity classification according to claim 5, wherein invoking the verification rule to correct the classification result and outputting the optimized response coefficient when the comprehensive confidence is smaller than the strategy threshold parameter comprises: establishing a verification rule base from the identification patterns of the three types of abnormal samples; when the comprehensive confidence is smaller than the strategy threshold parameter, calling the verification rule base to perform logical verification on the preliminary classification result and identify a mislabeling, missed-labeling or counterexample-label error; calling a high-confidence association relation from the association weights to replace the erroneous category, thereby generating a corrected classification result; and obtaining the optimized response coefficient as the product of the correction accuracy and the degree of deviation from the strategy threshold parameter.
  7. The category knowledge graph construction and reasoning method for commodity classification according to claim 6, wherein step S6 comprises: calculating an adjustment factor for sample generation intensity based on the optimization effect metric; reconstructing the abnormal sample generation rule parameter set, wherein the mislabeled sample proportion is determined by the product of a base proportion and the adjustment factor, the missed-label sample proportion is determined by the complementary relation between the base proportion and the coverage rate of the sample library, and the variation intensity of the counterexample samples is determined by the coupling relation between a base intensity and the strategy threshold parameter; when executing sample generation, generating the three types of abnormal samples by applying mutation operations that perturb the attributes or categories of normal samples; screening the generated abnormal samples by calculating each sample's degree of abnormality to evaluate its deviation from the normal distribution, removing invalid samples with an automatic filtering mechanism based on a preset abnormality threshold, and mixing the screened abnormal samples with normal samples in proportion to construct the optimized training set; and calculating the system efficiency index as the product of the effective rate of the generated samples and the misjudgment suppression rate, wherein the effective rate is the proportion of valid samples among all generated samples, and the misjudgment suppression rate is determined by comparing the similarity between the generated samples and historical misjudged samples.
  8. The category knowledge graph construction and reasoning method for commodity classification according to claim 7, wherein reconstructing the abnormal sample generation rule parameter set comprises: calculating the base proportion of mislabeled samples from the magnitude of the optimization effect metric, wherein the base proportion keeps its initial set value when the optimization effect is significant and is raised proportionally when the optimization effect falls short of expectations; calculating the coverage rate of the sample library as the ratio of the number of categories covered in the training set to the total number of categories, and determining the generation proportion of missed-label samples by a complementary principle, wherein the generation proportion is negatively correlated with the coverage rate; normalizing the strategy threshold parameter to obtain an abnormality amplification reference value, and determining the variation intensity of the counterexample samples in combination with the sign of the optimization effect metric, wherein the variation intensity increases linearly as the threshold parameter rises; and combining the mislabeled sample proportion, the missed-label sample proportion and the counterexample sample variation intensity to form the abnormal sample generation rule parameter set.
  9. The category knowledge graph construction and reasoning method for commodity classification according to claim 8, wherein step S7 comprises: calculating the achievement level of the system efficiency index based on classification performance on the optimized training set; when the achievement level indicates that the system efficiency index reaches the quality standard, fine-tuning, with an incremental learning strategy, the weight allocation between graph reasoning and deep learning classification in the dual-drive reasoning mechanism; when the achievement level is lower than a preset reference, determining that the quality standard is not reached, triggering the bypass mechanism and skipping the conventional adjustment flow for the fusion weight; and, under the bypass mechanism, executing the backtracking adjustment of the association weights by directly retrieving the attribute-category mapping relations of samples with a high misjudgment rate in the optimized training set and applying a penalty correction to the association weights, wherein the penalty correction reduces the weight values of the misjudged paths in a back-propagation manner, and synchronously updating the knowledge graph to form the system straight-through calibration.
  10. The category knowledge graph construction and reasoning method for commodity classification according to claim 9, wherein the achievement level is determined by comparing how closely the improvement in classification accuracy before and after training approaches a preset expected target.
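The association weight described in claim 2 (the product of confidence and lift for an attribute-category pair) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: it counts single-attribute edges directly instead of running a frequent pattern growth algorithm, it omits combined-attribute edges and the prior-knowledge filtering step, and the function name `association_weights` and the sample records are hypothetical.

```python
from collections import Counter


def association_weights(records):
    """Compute support, confidence, lift and weight for each (attribute, category) pair.

    records: list of (attributes, category) tuples, one per commodity.
    Assumption: plain counting replaces FP-growth, single-attribute edges only.
    """
    n = len(records)
    attr_count = Counter()   # commodities carrying each attribute
    cat_count = Counter()    # commodities in each category
    pair_count = Counter()   # commodities carrying the attribute AND in the category

    for attrs, category in records:
        cat_count[category] += 1
        for attr in set(attrs):
            attr_count[attr] += 1
            pair_count[(attr, category)] += 1

    result = {}
    for (attr, category), joint in pair_count.items():
        support = joint / n                          # co-occurrence frequency
        confidence = joint / attr_count[attr]        # P(category | attribute)
        lift = confidence / (cat_count[category] / n)
        result[(attr, category)] = {
            "support": support,
            "confidence": confidence,
            "lift": lift,
            "weight": confidence * lift,             # claim 2: confidence x lift
        }
    return result


# Hypothetical toy data for illustration only.
records = [
    ({"red", "cotton"}, "children's clothing"),
    ({"red", "cotton"}, "children's clothing"),
    ({"red", "polyester"}, "adult clothing"),
    ({"blue", "cotton"}, "adult clothing"),
]
weights = association_weights(records)
```

In this toy data, an attribute that appears in only one category (here "polyester") receives a higher weight than one spread across categories (here "red"), which is the behavior the lift term is meant to capture.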
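The dual-drive decision of claims 5 and 6 can be sketched as follows. The claims state only that the graph confidence and AI confidence are fused "according to attribute completeness" and that the optimized response coefficient is the product of the correction accuracy and the degree of deviation from the strategy threshold; the exact formulas are not given. The linear blend and the relative-deviation definition below are therefore assumptions made for illustration, and all function names are hypothetical.

```python
def fuse_confidence(graph_conf, ai_conf, completeness):
    """Blend graph and AI confidence.

    Assumption: attribute completeness in [0, 1] is used directly as the
    graph's share of the blend; the patent text does not give the formula.
    """
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must lie in [0, 1]")
    return completeness * graph_conf + (1.0 - completeness) * ai_conf


def needs_correction(combined_conf, threshold):
    """Claim 5: correction is triggered when confidence is below the threshold."""
    return combined_conf < threshold


def response_coefficient(correction_accuracy, combined_conf, threshold):
    """Claim 6: correction accuracy times the deviation from the threshold.

    Assumption: deviation is the relative shortfall (threshold - conf) / threshold.
    """
    deviation = (threshold - combined_conf) / threshold
    return correction_accuracy * deviation


# Hypothetical values for illustration only.
combined = fuse_confidence(graph_conf=0.4, ai_conf=0.8, completeness=0.5)
if needs_correction(combined, threshold=0.75):
    coefficient = response_coefficient(0.9, combined, threshold=0.75)
```

Under this assumed blend, a commodity with complete attributes leans on the graph, while one with sparse attributes falls back to the deep learning model.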
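The quantities named in claims 7 and 8 can likewise be written out as a short sketch. The claims fix the mislabeled sample proportion as a base proportion times an adjustment factor, the coverage rate as covered categories over total categories, and the system efficiency index as the effective rate times the misjudgment suppression rate. The specific complementary form `base * (1 - coverage)` for the missed-label proportion is an assumption (the claims require only a negative correlation with coverage), the suppression rate is taken as a given input because its similarity computation is not specified, and all function names are hypothetical.

```python
def coverage_rate(covered_categories, total_categories):
    """Claim 8: share of categories represented in the training set."""
    return covered_categories / total_categories


def mislabel_proportion(base_proportion, adjustment_factor):
    """Claim 7: base proportion multiplied by the adjustment factor."""
    return base_proportion * adjustment_factor


def missed_label_proportion(base_proportion, coverage):
    """Assumed complementary form: lower coverage yields more missed-label samples."""
    return base_proportion * (1.0 - coverage)


def system_efficiency_index(valid_samples, generated_samples, suppression_rate):
    """Claim 7: effective rate times misjudgment suppression rate.

    suppression_rate is supplied by the caller; how it is derived from
    similarity to historical misjudged samples is not specified in the claims.
    """
    effective_rate = valid_samples / generated_samples
    return effective_rate * suppression_rate


# Hypothetical values for illustration only.
coverage = coverage_rate(covered_categories=80, total_categories=100)
rule_parameters = {
    "mislabel_proportion": mislabel_proportion(0.2, 1.5),
    "missed_label_proportion": missed_label_proportion(0.2, coverage),
}
efficiency = system_efficiency_index(90, 100, suppression_rate=0.5)
```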

Description

Category knowledge graph construction and reasoning method for commodity classification

Technical Field

The invention relates to the technical field of data processing and artificial intelligence, and in particular to a category knowledge graph construction and reasoning method for commodity classification.

Background

With the rapid development of e-commerce and new retail, the number of commodities has grown exponentially and categories are continually subdivided, placing very high demands on the accuracy and granularity of commodity classification. Classification accuracy directly affects a platform's recommendation efficiency, search experience and supply chain management. For example, when a user searches for "red pure cotton children's clothing", a system that cannot accurately identify the association between commodity attributes and categories may return adult red cotton clothing or chemical-fiber children's clothing, noticeably degrading the user experience. Current commodity classification relies mainly on manual labeling and on traditional AI models (e.g. convolutional neural networks, recurrent neural networks), applied separately.
Manual labeling is costly, inefficient and easily influenced by subjective factors. Traditional AI models can automate classification to a certain extent but have three core defects. First, they lack structured knowledge of the association between commodity attributes and categories and classify from data features alone, so accuracy is low in fine-grained scenarios. Second, they are weak at recognizing abnormal cases such as mislabeling and missed labeling, so robustness is poor against the large volume of non-standard data found in real business. Third, sample library coverage is limited: classification of emerging commodity categories (such as smart wearable devices) or niche attribute combinations (such as bamboo fiber-infant socks) is inadequate, and sample updates depend on manual collection and labeling, which makes the iteration cycle long.
Some prior art attempts to introduce knowledge graphs to assist commodity classification, but obvious defects remain. Knowledge graph construction relies on domain experts manually defining association rules between attributes and categories, so construction is inefficient and struggles to keep pace with rapidly changing commodity categories. Coordination between the graph and the AI model is insufficient: the graph is mostly fed to the model as an auxiliary feature, and no closed loop is formed in which the graph guides reasoning and reasoning feedback optimizes the graph, so the graph's usefulness decays over time. The prior art designs no dedicated optimization mechanism for abnormal cases such as mislabeling and missed labeling in commodity classification; models are not exposed to enough abnormal samples during training, so their ability to identify and correct erroneous labels in practice is weak. Sample library expansion depends on manually collecting and labeling new commodity data, so samples for emerging categories arrive late and the coverage of the model and graph struggles to meet business needs. Therefore, a commodity classification method is needed that can automatically construct a commodity category knowledge graph, perform joint reasoning with the graph and an AI model, optimize for abnormal cases, and support automatic updating of the sample library.

Disclosure of Invention

To this end, the invention provides a category knowledge graph construction and reasoning method for commodity classification, which solves the above problems in the prior art.
To achieve the above object, the present invention provides a category knowledge graph construction and reasoning method for commodity classification, comprising: Step S1, collecting commodity data through multi-source channels and generating attribute-category structured data pairs after preprocessing; Step S2, automatically extracting color, material and functional attributes based on the structured data pairs, calculating support, confidence and lift with an association mining algorithm to construct a knowledge graph containing association weights, and synchronously generating an initial calibration reference library; Step S3, generating three types of abnormal samples (mislabeled, missed-label and counterexample-label) based on the association weights, constructing a training set from the screened abnormal samples together with normal samples, and calculating a strategy threshold parameter; Step S4, constructing a dual-drive reasoning mechanism, performing associa