CN-121764977-B - Food detection sample management method and system based on big data analysis
Abstract
The invention discloses a food detection sample management method and system based on big data analysis, in the technical field of food detection data processing. It addresses the insufficient intelligent retrieval and knowledge mining capability of existing sample management big data platforms, which is caused by semantic ambiguity in unstructured text and by the lack of deep semantic association between data. The method comprises: acquiring sample whole-flow data containing structured and unstructured data; judging semantic ambiguity of the unstructured data with a text similarity algorithm; performing sequence analysis and semantic completion on the ambiguous text with a hidden Markov model combined with space-time correlation features to generate a semantic clarification result; establishing and optimizing semantic association rules based on the semantic clarification result and the structured data; and constructing a food detection sample information knowledge graph according to the optimized rules, thereby enabling intelligent semantic retrieval and deep knowledge mining on sample information.
Inventors
- ZHU JUNPENG
- XIONG TIANLI
- ZHU PEIYUN
- CHEN XING
- ZHANG CHEN
- FU SHENG
- SHENG HONGYU
Assignees
- 贵州食科源信息科技有限公司
- 商丘市产品质量检验检测中心
Dates
- Publication Date: 20260508
- Application Date: 20260304
Claims (8)
- 1. A method for managing food detection samples based on big data analysis, comprising: S1, acquiring sample whole-flow data of food detection, wherein the sample whole-flow data comprises structured data and unstructured data; S2, judging whether the unstructured data has semantic ambiguity by using a text similarity algorithm; S3, when semantic ambiguity exists, performing sequence feature analysis and semantic completion on the semantically ambiguous unstructured data by using a hidden Markov model, from the perspective of the space-time correlation features of the sample whole-flow data, to generate a semantic clarification result, which comprises: extracting space-time correlation features of the sample from the sample whole-flow data corresponding to the semantically ambiguous unstructured data; constructing a state sequence and an observation sequence of the hidden Markov model based on the extracted space-time correlation features and the semantically ambiguous unstructured data; decoding the observation sequence with the hidden Markov model to obtain the hidden sequence features of the semantically ambiguous unstructured data; and performing semantic completion on the semantically ambiguous unstructured data according to the hidden sequence features to generate the semantic clarification result; S4, establishing semantic association rules based on the semantic clarification result and the structured data, the rules defining the correspondence between the semantic clarification result and the structured data; S5, performing, on the semantic association rules, logical self-consistency verification across sample management links and rule adaptation to sample risk levels, to generate optimized semantic association rules, which comprises: mapping the established semantic association rules to the sampling, storage, detection, and sample retention and disposal links of the whole sample management flow; verifying that the mapping logic of each semantic association rule is consistent across the different links, and identifying logic conflicts that arise when different semantic association rules are applied across links; analyzing and labeling the risk features and risk levels implied by each semantic association rule, based on the detection anomaly descriptions and disposal remarks associated with that rule in historical sample data; performing logic reconstruction and weight calibration on the semantic association rules according to the identified logic conflicts and the labeled risk features and risk levels; and outputting the semantic association rules for which logic reconstruction and weight calibration are complete as the optimized semantic association rules; and S6, constructing a food detection sample information knowledge graph according to the optimized semantic association rules, and performing intelligent retrieval and knowledge mining on the food detection sample information.
- 2. The method for managing food detection samples based on big data analysis according to claim 1, wherein S1 comprises: acquiring recorded sample whole-flow data from a food detection sample management big data platform; and confirming that the acquired sample whole-flow data comprises structured data of sample category, source, and batch, and unstructured data comprising sampling site descriptions, detection anomaly descriptions, and disposal remarks.
- 3. The method for managing food detection samples based on big data analysis according to claim 1, wherein S2 comprises: performing text word segmentation and denoising on the unstructured data of sampling site descriptions, detection anomaly descriptions, and disposal remarks obtained from the sample whole-flow data; converting the segmented and denoised unstructured data into text feature representations; calculating similarity values between the text feature representations; and judging whether the unstructured data has semantic ambiguity according to the result of comparing the similarity values with a preset similarity threshold.
- 4. The method for managing food detection samples based on big data analysis according to claim 1, wherein decoding the observation sequence with the hidden Markov model comprises applying the Viterbi algorithm: recursively calculating the most likely hidden state sequence path by dynamic programming based on the hidden Markov model parameters and the observation sequence, and backtracking to obtain the complete hidden state sequence as the hidden sequence features.
- 5. The method for managing food detection samples based on big data analysis according to claim 1, wherein S4 comprises: extracting key semantic elements from the semantic clarification result; extracting, from the structured data of sample category, source, and batch, the data elements associated with the key semantic elements; analyzing the logical correspondence between the key semantic elements and the data elements; and forming, based on the analyzed logical correspondence, semantic association rules that define the correspondence between the semantic clarification result and the structured data.
- 6. The method for managing food detection samples based on big data analysis according to claim 1, wherein performing logic reconstruction and weight calibration on the semantic association rules comprises: performing logic reconstruction by redefining or merging the entities and relation predicates in semantic association rules that have logic conflicts; and performing weight calibration by assigning differentiated confidence weights to the logical relations according to the risk features and risk levels labeled for the semantic association rules.
- 7. The method for managing food detection samples based on big data analysis according to claim 1, wherein S6 comprises: taking the optimized semantic association rules as graph construction rules, and determining the definitions of entities and relations in the knowledge graph; converting the semantic clarification result and the structured data of sample category, source, and batch into entity nodes and relation edges in the knowledge graph according to the correspondence defined by the optimized semantic association rules, so as to complete construction; receiving a retrieval request for food detection sample information, and parsing the natural language query in the retrieval request into a structured query based on knowledge graph entities and relations; and executing the structured query in the constructed food detection sample information knowledge graph, and performing knowledge mining by traversing and matching paths in the knowledge graph.
- 8. A food detection sample management system based on big data analysis, for implementing the method for managing food detection samples based on big data analysis of any one of claims 1 to 7, comprising: a data acquisition module, configured to acquire sample whole-flow data of food detection, the sample whole-flow data comprising structured data and unstructured data; an ambiguity judgment module, configured to judge whether the unstructured data has semantic ambiguity by using a text similarity algorithm; a clarification generating module, configured to, when semantic ambiguity exists, perform sequence feature analysis and semantic completion on the semantically ambiguous unstructured data by using a hidden Markov model, from the perspective of the space-time correlation features of the sample whole-flow data, to generate a semantic clarification result; a rule establishing module, configured to establish semantic association rules based on the semantic clarification result and the structured data, the rules defining the correspondence between the semantic clarification result and the structured data; a rule optimization module, configured to perform, on the semantic association rules, logical self-consistency verification across sample management links and rule adaptation to sample risk levels, to generate optimized semantic association rules; and a retrieval mining module, configured to construct a food detection sample information knowledge graph according to the optimized semantic association rules and to perform intelligent retrieval and knowledge mining on the food detection sample information.
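The Viterbi decoding named in claim 4 can be illustrated with a minimal Python sketch. The claims do not specify the hidden states, the observation vocabulary, or any model parameters, so the state names ("intact", "compromised"), the observation tokens, and every probability below are hypothetical values chosen only to make the example runnable.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi(
    observations: Sequence[str],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> Tuple[float, List[str]]:
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}]
    # back[t][s]: predecessor of s on that best path (unused at t = 0).
    back: List[Dict[str, str]] = [{}]

    # Recursion: extend the best paths one observation at a time.
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max((best[t - 1][p] * trans_p[p][s], p) for p in states)
            best[t][s] = prob * emit_p[s].get(observations[t], 0.0)
            back[t][s] = prev

    # Backtracking: start from the best final state and follow the pointers.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical toy model: hidden sample condition along the management flow.
states = ["intact", "compromised"]
start_p = {"intact": 0.8, "compromised": 0.2}
trans_p = {
    "intact": {"intact": 0.7, "compromised": 0.3},
    "compromised": {"intact": 0.1, "compromised": 0.9},
}
emit_p = {
    "intact": {"normal": 0.7, "odd": 0.2, "leak": 0.1},
    "compromised": {"normal": 0.1, "odd": 0.4, "leak": 0.5},
}

# Observation tokens, imagined as keywords taken from successive remarks.
prob, path = viterbi(["normal", "odd", "leak"], states, start_p, trans_p, emit_p)
print(path, prob)
```

In this toy model the vague middle remark ("odd") is resolved by its neighbours: the decoded path assigns it to the "compromised" state because the following observation makes that reading more likely. A production implementation would normally work with log-probabilities to avoid numerical underflow on long sequences.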
Description
Food detection sample management method and system based on big data analysis
Technical Field
The invention relates to the technical field of food detection data processing, and in particular to a food detection sample management method and system based on big data analysis.
Background
In sample management for the food detection industry, as digital transformation advances, sample management has increasingly come to rely on big data platforms for information management and control across the full sample life cycle. In the prior art, the big data platform is mainly used to collect and store information on the whole process of a food detection sample, from sampling, sample collection, storage, detection, and distribution through to sample retention and disposal. This information includes structured data such as category, source, and batch, as well as unstructured data such as sampling site descriptions, detection anomaly descriptions, and disposal remarks. The platform's core functions are concentrated on information entry, archiving, and basic query statistics; by simply integrating structured data, it provides basic data support for sample management and helps achieve traceability of sample information and standardized process control.
In existing big data application schemes for food detection sample management, the organization and management of sample information lacks a multi-dimensional semantic association design, so the intelligent retrieval and knowledge mining capabilities of the big data platform are clearly insufficient. The platform supports only exact keyword-based retrieval; it cannot effectively identify the semantic associations and logical relationships among different types of sample information, and it struggles to mine hidden management patterns and risk features from massive sample data. As a result, the value of that data cannot be fully realized, and accurate, effective data support cannot be provided for optimizing food detection sample management or for food safety supervision decisions.
Disclosure of Invention
To overcome the above drawbacks of the prior art, the present invention provides a food detection sample management method and system based on big data analysis that address the problems described above.
To achieve the above purpose, the present invention provides the following technical solution: a food detection sample management method based on big data analysis, comprising: S1, acquiring sample whole-flow data of food detection, wherein the sample whole-flow data comprises structured data and unstructured data; S2, judging whether the unstructured data has semantic ambiguity by using a text similarity algorithm; S3, when semantic ambiguity exists, performing sequence feature analysis and semantic completion on the semantically ambiguous unstructured data by using a hidden Markov model, from the perspective of the space-time correlation features of the sample whole-flow data, to generate a semantic clarification result; S4, establishing semantic association rules based on the semantic clarification result and the structured data, the rules defining the correspondence between the semantic clarification result and the structured data; S5, performing, on the semantic association rules, logical self-consistency verification across sample management links and rule adaptation to sample risk levels, to generate optimized semantic association rules; and S6, constructing a food detection sample information knowledge graph according to the optimized semantic association rules, and performing intelligent retrieval and knowledge mining on the food detection sample information.
Further, S1 includes: acquiring recorded sample whole-flow data from a food detection sample management big data platform; and confirming that the acquired sample whole-flow data comprises structured data of sample category, source, and batch, and unstructured data comprising sampling site descriptions, detection anomaly descriptions, and disposal remarks.
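As a rough illustration of step S2 (segmentation and denoising, feature representation, similarity calculation, threshold comparison), the following Python sketch uses bag-of-words vectors and cosine similarity. The source does not name the similarity algorithm, the reference texts, the threshold value, or the direction of the comparison. The sketch therefore assumes that a text counts as ambiguous when its best similarity to a set of known clear descriptions falls below the threshold. The stop-word list, the reference phrases, and the 0.5 threshold are likewise hypothetical, and real Chinese-language records would need a Chinese word segmenter in place of the regular expression used here.

```python
import math
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical stop-word list standing in for the "denoising" step.
STOP_WORDS = {"the", "a", "an", "of", "at", "in", "on", "and", "was", "is"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_ambiguous(text: str, references: Iterable[str], threshold: float = 0.5) -> bool:
    """Assumed criterion: ambiguous if no reference text is similar enough."""
    vec = Counter(tokenize(text))
    best = max(
        (cosine_similarity(vec, Counter(tokenize(r))) for r in references),
        default=0.0,
    )
    return best < threshold


# Hypothetical reference descriptions with a clear meaning.
references = [
    "packaging damaged at the sampling site",
    "sample stored at low temperature",
]

clear_note = "Packaging was damaged at sampling site"
vague_note = "something looked off"

print(is_ambiguous(clear_note, references))
print(is_ambiguous(vague_note, references))
```

Texts flagged by `is_ambiguous` would be the ones passed on to the hidden Markov model step (S3); everything else can be used as-is.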
Further, S2 includes: performing text word segmentation and denoising on the unstructured data of sampling site descriptions, detection anomaly descriptions, and disposal remarks obtained from the sample whole-flow data; converting the segmented and denoised unstructured data into text feature representations; calculating similarity values between the text feature representations; and judging whether the unstructured data has semantic ambiguity according to the result of comparing the similarity values with a preset similarity threshold.
Further, S3 includes: extracting space-time correlation features of the sample from the sample whole-flow data corresponding to the semantically ambiguous unstructured data; Based on the extr