CN-122025193-A - Rule engine driven abnormal mode automatic labeling system and method for dynamic blood sugar
Abstract
The invention discloses a rule-engine-driven system and method for automatically labeling abnormal patterns in dynamic blood glucose data. The method comprises: acquiring dynamic blood glucose monitoring data and associated multidimensional information and integrating them; preprocessing the integrated data to obtain standardized data; determining analysis windows based on a rule base containing four types of core parameters; identifying abnormal patterns through generalized logic that defines the window, identifies characteristic events, calculates association parameters, and applies multiple judgment conditions; locating each abnormality and its associated triggering factors; generating labeling information containing unique identifiers; and outputting structured data, clinical reports, and logs. The system correspondingly comprises a four-layer architecture: data input, rule engine, processing engine, and output. The invention enables automatic and accurate labeling of blood glucose abnormalities, supports batch processing, greatly reduces the labor and time cost of labeling, improves blood glucose management efficiency, and supports rapid iteration and optimized adaptation of large AI models.
Inventors
- WENG JIANPING
- ZHENG XUEYING
- ZENG XIFENG
- WANG HANXIAO
- ZHAO MINZHE
- GAO BEIBEI
Assignees
- 安徽省立医院(中国科学技术大学附属第一医院)
- 深圳市爱宝惟生物科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260414
Claims (10)
- 1. A rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data, characterized by comprising the following steps: S1, acquiring dynamic blood glucose monitoring data and associated multidimensional information, and performing format verification and preliminary integration, wherein the dynamic blood glucose monitoring data is a time series of blood glucose values, and the associated multidimensional information comprises insulin infusion records, carbohydrate intake records, basic patient information, and personalized glucose control parameters; S2, preprocessing the integrated data to obtain standardized data, wherein the preprocessing comprises data cleaning, missing value imputation, outlier correction, and time series alignment; S3, automatically determining the analysis time window corresponding to each abnormal pattern based on a preset abnormal blood glucose pattern recognition rule base, wherein the rule base comprises four core parameters: a blood glucose threshold, a time window, a frequency threshold, and a difference threshold; S4, identifying target abnormal blood glucose patterns within the analysis time window using generalized abnormality identification logic; S5, determining the start and end time of each abnormal pattern, associating the triggering factors of the abnormal pattern, and generating labeling information containing a unique identifier according to preset rules; and S6, outputting a structured labeling data file, a clinically readable labeling report, and a full-process processing log.
- 2. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 1, wherein step S4 specifically comprises: S41, defining an observation window for the target abnormal pattern, including a window start reference and a window duration, such that the window covers the complete period in which the abnormal pattern occurs; S42, identifying characteristic events within the window by setting a judgment criterion for each characteristic event, scanning the data, and recording the start and end time of each characteristic event; S43, calculating association parameters between the characteristic events and the observation window to quantify their degree of association and temporal relationship; S44, setting multiple judgment conditions, including the presence or absence of a characteristic event, association parameter thresholds, the state at the window end point, and event timing logic, and confirming that the abnormal pattern is established only if all conditions are met simultaneously.
- 3. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 2, wherein in step S2 the preprocessing specifically comprises: missing value imputation, which adopts a graded processing strategy: when data completeness is less than 80%, all data of the patient is discarded; when data completeness is greater than or equal to 80%, long-duration gaps are imputed with a machine learning logistic regression model and short isolated gaps are imputed by linear interpolation, wherein the linear interpolation formula is $g(t) = g_{prev} + \frac{(g_{next} - g_{prev})(t - t_{prev})}{t_{next} - t_{prev}}$, where $g_{prev}$ is the previous valid blood glucose value, $t_{prev}$ is its corresponding time, $g_{next}$ is the next valid blood glucose value, and $t_{next}$ is its corresponding time; outlier correction, which combines three detection methods (first-order difference smoothness detection, the standard deviation method, and the quantile method) for a joint judgment, corrects mild isolated outliers by interpolation, marks severe continuous abnormal segments as invalid and removes them, and records the outlier position, the detection method, and the handling applied; and time series alignment, which unifies all blood glucose data to a standard 5-minute sampling interval, downsamples high-frequency data by taking the mean or median, upsamples low-frequency data by linear interpolation, and matches associated information such as insulin infusion and carbohydrate intake to the nearest standard timestamp by time difference.
- 4. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 3, wherein in step S3 the analysis time window comprises three adaptive types: a 7-day long-term trend window for identifying long-term abnormal patterns such as frequent hypoglycemia and frequent hyperglycemia; a 24-hour intra-day fluctuation window for identifying intra-day abnormal patterns such as repeated intra-day hypoglycemia and the dawn phenomenon; and a 2-hour sliding window over specific time periods for identifying regular hypoglycemia and hyperglycemia in the periods 23:00-05:00 (overnight), 05:00-11:00 (morning), 11:00-18:00 (afternoon), and 18:00-23:00 (evening); and wherein window positioning adopts a parallel processing strategy that simultaneously positions the observation windows of multiple abnormal patterns for the same data source and traverses all abnormal pattern rules in sequence to perform multiple rounds of scanning.
- 5. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 4, wherein in step S3 the abnormal pattern rules in the abnormal blood glucose pattern recognition rule base comprise the following 8 major classes: a. frequent hypoglycemia; b. Somogyi phenomenon; c. sustained hyperglycemia throughout the day; d. frequent hyperglycemia; e. hypoglycemia after overcorrection of hyperglycemia; f. hyperglycemia after overcorrection of hypoglycemia; g. postprandial hypoglycemia; h. postprandial hyperglycemia; and wherein in step S44 the multiple judgment conditions comprise an event independence judgment: if the time interval between two consecutive abnormal events is greater than 15 minutes, the two events are determined to be independent events, and if the time interval is less than or equal to 15 minutes, the two events are merged into one event.
- 6. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 5, wherein in step S5 the trigger factor association analysis is specifically: hyperglycemia events are associated with carbohydrate intake records from the preceding 2 hours, including meal time, food type, and carbohydrate content; hypoglycemia events are associated with insulin infusion records from the preceding 5 hours, including basal rate, meal bolus, correction bolus, and active insulin data; and the unique identifier has the format "U_patient ID_pattern ID_start timestamp", wherein the start timestamp is accurate to the minute and the pattern ID corresponds one-to-one with the abnormal blood glucose patterns in the rule base.
- 7. The rule-engine-driven method for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 6, wherein in step S6 the structured labeling data file adopts a unified format and comprises the following key fields: patient ID, observation window period, abnormal blood glucose pattern classification, pattern recognition time interval, intervention priority, unique identifier, and associated events; the clinically readable labeling report presents the patient's blood glucose control status and the type, occurrence time, and occurrence frequency of abnormal patterns in charts and text; and the full-process processing log records the key data processing steps, exception handling, rule matching details, and data quality assessment results, and supports troubleshooting and result review.
- 8. A rule-engine-driven system for automatically labeling abnormal patterns in dynamic blood glucose data, characterized in that it adopts the method of any one of claims 1-7 and comprises: a data input layer for performing the operation of step S1, which receives dynamic blood glucose monitoring data and associated multidimensional information, adapts to the data formats of CGM devices, and performs data reading, parsing, format verification, and preliminary integration; a rule engine layer for storing the abnormal blood glucose pattern recognition rule base of step S3, wherein the rule base comprises four core parameters (a blood glucose threshold, a time window, a frequency threshold, and a difference threshold) and supports dynamic configuration, updating, and version management of rules; a processing engine layer comprising a data preprocessing module, an observation window period automatic identification module, an abnormal blood glucose pattern automatic matching module, and a data positioning and event matching module, which respectively perform the operations of steps S2-S5, wherein the abnormal blood glucose pattern automatic matching module integrates the generalized abnormality identification logic of step S4; and an output layer for performing the operation of step S6 and outputting the structured labeling data file, the clinically readable labeling report, and the full-process processing log.
- 9. The rule-engine-driven system for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 8, wherein the data preprocessing module of the processing engine layer adopts processing logic comprising graded missing value imputation, triple outlier detection and correction, and standardized time series alignment; the observation window period automatic identification module supports the three adaptive window types and adopts parallel positioning and traversal analysis; the data positioning and event matching module supports trigger factor association analysis and the unique identifier generation rules; in the associated multidimensional information received by the data input layer, the insulin infusion records comprise basal rate, pre-meal bolus, and correction bolus, the personalized glucose control parameters comprise the insulin sensitivity factor and the carbohydrate factor, and the basic patient information comprises age, gender, and diabetes type; the data input layer further provides heterogeneous data source compatibility and adapts to the data export formats of CGM devices, so that no additional format conversion tool is needed; the key fields of the structured labeling data file of the output layer further comprise the severity of the abnormal pattern and rule matching details; the clinically readable labeling report further comprises abnormal pattern trigger analysis and clinical reference suggestions; the full-process processing log further records the running state of each module, data processing time, and abnormality alarm information, and supports full-link tracing; and the processing engine layer adopts parallel computing, distributing the data of different patients to different processing units through a multithreading or distributed computing framework.
- 10. The rule-engine-driven system for automatically labeling abnormal patterns in dynamic blood glucose data according to claim 9, wherein the unique identifier is generated by the data positioning and event matching module in the format "U_patient ID_pattern ID_start timestamp", wherein the pattern ID is the unique code of the abnormal blood glucose pattern in the rule base and the start timestamp is accurate to the minute, ensuring the global uniqueness and traceability of each abnormal event; data exchange and function calls between the layers of the system are carried out through standardized interfaces; and the rule engine layer supports rule version management and records the history of rule additions, modifications, and deletions, facilitating rule iteration tracing and rollback.
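As an illustration of the rule base described in claims 1 and 4, the four core parameters (blood glucose threshold, time window, frequency threshold, difference threshold) can be pictured as one record per abnormal pattern. The following Python sketch is not the patented implementation; the class name `Rule`, the helper names, and every numeric value are hypothetical, since the claims do not state concrete thresholds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Rule:
    """One rule-base entry holding the four core parameters (names are hypothetical)."""
    pattern_id: str
    glucose_threshold: float      # mmol/L
    window: timedelta             # analysis time window, e.g. the 7-day trend window
    frequency_threshold: int      # minimum number of events inside the window
    difference_threshold: float   # mmol/L; not used by the frequency check below


def count_in_window(event_starts, window_start, window):
    """Count events whose start time falls in [window_start, window_start + window)."""
    window_end = window_start + window
    return sum(window_start <= t < window_end for t in event_starts)


def rule_matches(rule, event_starts, window_start):
    """A frequency-type rule holds when enough events occur inside its window."""
    return count_in_window(event_starts, window_start, rule.window) >= rule.frequency_threshold
```

Under these assumptions, a "frequent hypoglycemia" rule with a 7-day window and a frequency threshold of 3 would match a patient with hypoglycemic events on days 0, 2, and 5 of the window, while an event on day 9 would fall outside it.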
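The graded missing value strategy of claim 3 (discard below 80% completeness, linear interpolation for short isolated gaps on the 5-minute grid) can be sketched as follows. This is a minimal sketch under stated assumptions: `MAX_GAP_STEPS` is a hypothetical cutoff for what counts as a "short" gap, which the claims do not define, and long gaps are simply left unfilled here rather than passed to the logistic regression model the claim mentions.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=5)      # standard sampling interval from claim 3
MIN_COMPLETENESS = 0.80          # completeness cutoff from claim 3
MAX_GAP_STEPS = 3                # assumption: gaps of up to 3 samples count as "short"


def completeness(readings, start, end):
    """Fraction of expected 5-minute samples in [start, end] that are present."""
    expected = int((end - start) / STEP) + 1
    return len(readings) / expected


def lerp(t, t_prev, g_prev, t_next, g_next):
    """Linear interpolation between the neighbouring valid readings."""
    frac = (t - t_prev) / (t_next - t_prev)   # timedelta / timedelta gives a float
    return g_prev + (g_next - g_prev) * frac


def fill_short_gaps(readings):
    """readings: time-sorted list of (datetime, mmol/L) on a 5-minute grid with holes.

    Returns a new list in which short gaps are filled by linear interpolation.
    Longer gaps are left untouched for a separate imputation step.
    """
    if not readings:
        return []
    out = [readings[0]]
    for (t_prev, g_prev), (t_next, g_next) in zip(readings, readings[1:]):
        missing = round((t_next - t_prev) / STEP) - 1
        if 0 < missing <= MAX_GAP_STEPS:
            for k in range(1, missing + 1):
                t = t_prev + k * STEP
                out.append((t, round(lerp(t, t_prev, g_prev, t_next, g_next), 2)))
        out.append((t_next, g_next))
    return out
```

For example, with valid readings of 5.0 mmol/L at 00:00 and 8.0 mmol/L at 00:15, the sketch fills 00:05 and 00:10 with 6.0 and 7.0. A caller would first check `completeness(...) >= MIN_COMPLETENESS` and skip the patient otherwise.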
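The event independence rule of claim 5, the look-back association of claim 6, and the identifier format of claims 6 and 10 can be sketched together. The sketch assumes that the interval between two events is measured from the end of the first to the start of the second, and that the minute-precision timestamp is rendered as `YYYYMMDDHHMM`; the claims specify neither detail. The function names and the example threshold are hypothetical.

```python
from datetime import datetime, timedelta

MERGE_GAP = timedelta(minutes=15)   # claim 5: events at most 15 minutes apart are merged


def find_events(readings, is_abnormal):
    """Return (start, end) spans of consecutive abnormal readings."""
    events, start, last = [], None, None
    for t, g in readings:
        if is_abnormal(g):
            if start is None:
                start = t
            last = t
        elif start is not None:
            events.append((start, last))
            start = None
    if start is not None:
        events.append((start, last))
    return events


def merge_events(events):
    """Merge time-sorted events whose gap (previous end to next start) is <= 15 minutes."""
    merged = []
    for start, end in events:
        if merged and start - merged[-1][1] <= MERGE_GAP:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def preceding_records(event_start, records, lookback):
    """Records (datetime, payload) within the look-back period before the event start."""
    return [r for r in records if event_start - lookback <= r[0] <= event_start]


def make_identifier(patient_id, pattern_id, start):
    """Build "U_patient ID_pattern ID_start timestamp" with minute precision (format assumed)."""
    return f"U_{patient_id}_{pattern_id}_{start:%Y%m%d%H%M}"
```

A hypoglycemia event would be passed to `preceding_records` with `lookback=timedelta(hours=5)` against the insulin infusion records, and a hyperglycemia event with `lookback=timedelta(hours=2)` against the carbohydrate intake records, mirroring claim 6.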
Description
Rule engine driven abnormal mode automatic labeling system and method for dynamic blood sugar
Technical Field
The invention relates to the technical field of medical data processing, and in particular to a rule-engine-driven system and method for automatically labeling abnormal patterns in dynamic blood glucose data.
Background
In the field of diabetes management, dynamic blood glucose monitoring (Continuous Glucose Monitoring, CGM) produces vast amounts of time series data rich in physiological and pathological information. However, the raw monitoring data is unstructured and of limited value for direct use. A CGM device continuously and densely records changes in the patient's blood glucose level, typically generating one data point every 5 minutes, or 288 data points per day. These high-dimensional, high-density data streams contain not only the blood glucose values themselves but are often associated with patient events such as insulin injections, dietary intake, and exercise. This multidimensional, time-series nature makes dynamic blood glucose data extremely valuable for analysis, but it also presents a significant processing challenge. Traditional data processing relies on manual review and labeling, which is inefficient, susceptible to subjective factors, and makes consistent, accurate labeling difficult to ensure.
When training large AI models to recognize abnormal blood glucose patterns, acquiring a large amount of high-quality labeled data is a crucial first step. However, the manual labeling commonly used in the industry has a significant efficiency bottleneck. Annotators must review each patient's continuous glucose monitoring data one by one, which typically involves data records spanning days or even weeks.
For a study involving more than a thousand patients, the total amount of data may reach millions or even tens of millions of data points. Annotators must not only identify abnormal blood glucose values but also analyze them together with the patient's insulin and diet records, judge the type of abnormal pattern and its start and end time according to clinical standards, and record the basis for each judgment. This process is extremely time consuming: labeling the dynamic blood glucose data of a single patient may take hours. Such inefficient labeling severely constrains the development cycle and iteration speed of AI models, makes large-scale, multi-center research difficult to carry out, and limits the development of intelligent diabetes management technology.
Another core problem with manual labeling is that labeling standards are difficult to unify. Different annotators, even after the same training, may make different decisions when facing complex blood glucose fluctuation curves because of differences in their understanding of clinical guidelines and in personal experience. For example, when identifying the "dawn phenomenon", one annotator might consider a 1.0 mmol/L increase in blood glucose sufficient, while another might insist that the increase reach 2 mmol/L. Likewise, for the frequency judgment of "regular hypoglycemia", annotators may disagree on whether the criterion is "2 times per week" or "3 times per week". This subjectivity and inconsistency introduce considerable noise into the labeling results and seriously degrade the quality of training data. When AI models are trained on such noisy data, the patterns they learn are ambiguous and inaccurate, which significantly compromises the predictive performance and generalization ability of the model and prevents reliable clinical application.
The manual labeling process is not only inefficient but also extremely demanding in labor and concentration. Annotators must attend to complex charts and data for long periods, which readily causes visual fatigue and lapses of attention and leads to labeling errors. For example, a transient hypoglycemic event may be overlooked, or "hyperglycemia after overcorrection of hypoglycemia" may be misjudged as a postprandial blood glucose abnormality. In addition, annotators must consult several information sources at the same time, such as blood glucose curves, insulin injection records, and carbohydrate intake, and make a comprehensive judgment. Integrating this information by hand over a large volume of repetitive work is highly prone to omissions and mismatches. These human errors further reduce the reliability of the labeled data and increase the complexity of subsequent data cleaning and model training. Therefore, the traditional manual labeling method is high in cost, and the quality