CN-122021949-A - Collaborative prediction system for internal control defects based on federated causal forest and secure aggregation
Abstract
The invention discloses a collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, relating to the technical field of data intelligence. The system comprises a data standard module, in which each participant collects internal control running state data from an internal control environment and standardizes the data to obtain a standardized data set; a causal modeling module, which locally constructs a causal forest structure at each participant based on the standardized data set and iteratively optimizes causal path selection to obtain a local causal model; and a path feedback module, which feeds back adjustment instructions for the corresponding causal paths to each participant according to the prediction characterization information, so as to optimize the global causal forest model. By introducing a time-index-based, consistency-driven mechanism for dynamically selecting causal paths into the causal inference process, the invention keeps causal path selection continuous and consistent as the internal control running state evolves.
Inventors
- WU SHAOHUA
- ZHUANG XIAOMING
- LIN XIAODONG
- ZHENG HANG
- FAN SHENGJUN
Assignees
- 厦门美亚亿安信息科技有限公司
Dates
- Publication Date: 20260512
- Application Date: 20260416
Claims (10)
- 1. A collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, characterized by comprising: a data standard module, in which each participant collects internal control running state data from an internal control environment and standardizes the data to obtain a standardized data set; a causal modeling module for locally constructing a causal forest structure at each participant based on the standardized data set and iteratively optimizing causal path selection to obtain a local causal model; a homomorphic encryption module for encrypting the local causal model with Paillier homomorphic encryption to obtain an encrypted causal model; a privacy aggregation module for uploading the encrypted causal model to a central aggregation node and performing privacy-controlled collaborative aggregation on the encrypted causal models of multiple participants using a differential privacy aggregation method to form a global causal forest model; a causal inference module for performing causal inference on newly accessed internal control running state data based on the global causal forest model to obtain prediction characterization information of internal control defects; and a path feedback module for feeding back adjustment instructions for the corresponding causal paths to each participant according to the prediction characterization information, so as to optimize the global causal forest model.
- 2. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein the internal control running state data comprises control parameter values, control parameter variations, running state identifiers and corresponding time indexes.
- 3. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 2, wherein obtaining the standardized data set comprises the following steps: associating and arranging the internal control running state data according to the time index to form an original data sequence in chronological order; performing unit unification and numerical normalization on the control parameter values and control parameter variations in the original data sequence, and performing unified mapping on the running state identifiers, to form a standardized data sequence; and performing consistency alignment and missing-item processing on the standardized data sequence according to the time index, and packaging the result into the standardized data set.
- 4. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein constructing the causal forest structure comprises the following steps: organizing training samples by time index based on the standardized data set, and dividing them into covariates, treatment variables and target variables to form a training sample set; performing residual orthogonalization preprocessing on the training sample set to strip out the influence of the covariates on the treatment variables and the target variables, obtaining a preprocessed sample set; and partitioning the sample space layer by layer based on the preprocessed sample set to generate causal tree structures, forming a local treatment effect representation at each leaf node, and integrating the local treatment effect representations to construct the causal forest structure.
- 5. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein the local causal model is obtained by the following steps: based on the causal forest structure, extracting the splitting condition sequence from the root node to each leaf node of each causal tree, and storing each splitting condition sequence in association with the local treatment effect representation of the corresponding leaf node to form a candidate causal path library; and jointly packaging the causal forest structure, the candidate causal path library and metadata recording the split point information of each causal tree on the training set to obtain the local causal model.
- 6. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein the encrypted causal model is obtained by the following steps: based on the local causal model, reading the causal forest structure, the candidate causal path library and the path decision record to form a local causal model representation to be encrypted; and performing Paillier homomorphic encryption on the local causal model representation, and packaging the encrypted representation to form the encrypted causal model.
- 7. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein forming the global causal forest model comprises the following steps: receiving the encrypted causal models at the central aggregation node, and performing consistency alignment of the number of causal trees, the hierarchical order and the candidate causal path indexes according to the causal forest structure to form an encrypted causal model set; based on the encrypted causal model set, performing collaborative aggregation in the encrypted domain on the local treatment effect representations under each candidate causal path to form a path-level global treatment effect representation; based on the global treatment effect representation, performing consistency correction on the path effects of different participants and weakening path effects that deviate from the overall distribution interval, to obtain a path treatment effect representation; and writing the path treatment effect representation back to the corresponding position in the causal forest structure, forming the global causal forest model while keeping the splitting condition sequences unchanged.
- 8. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 7, wherein the central aggregation node is a logical processing entity that performs unified aggregation on the encrypted causal models uploaded by multiple participants without decrypting them.
- 9. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein obtaining the prediction characterization information of internal control defects comprises the following steps: based on the global causal forest model, reading the newly accessed internal control running state data item by item in time index order and inputting it into the corresponding splitting condition sequences of the global causal forest model; traversing each causal tree in the global causal forest layer by layer along the splitting condition sequence, locating the corresponding candidate causal path and reading the associated path treatment effect representation; and summarizing the path treatment effect representations output by the multiple causal trees under the same time index to form the prediction characterization information for that time index.
- 10. The collaborative prediction system for internal control defects based on federated causal forest and secure aggregation according to claim 1, wherein optimizing the global causal forest model comprises the following steps: based on the prediction characterization information, reading the path treatment effect representations of the corresponding candidate causal paths, and selecting as update objects those path treatment effect representations that deviate continuously over the time sequence; and locating the candidate causal path positions corresponding to the update objects in the global causal forest model, and writing the path treatment effect representations back to the corresponding leaf node positions while keeping the splitting condition sequences unchanged, so as to optimize the global causal forest model.
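Illustrative note (not part of the claims): the encrypted-domain aggregation described in claims 6 to 8 can be sketched with the additive property of Paillier encryption, under which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The sketch below is a toy under stated assumptions, not the patented implementation. The function names, the path identifiers, the fixed-point scale `SCALE` and the two small Mersenne primes are all hypothetical choices made for illustration; a real deployment would use large random primes and a vetted library. The source does not say which party holds the private key, so the decryption step is shown only to check the result, and the differential privacy step of claim 1 is omitted.

```python
import math
import secrets

SCALE = 1000  # assumed fixed-point scale for treatment effect values


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier keys from two known Mersenne primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of L(g^lam mod n^2) when g = n + 1
    return n, (lam, mu)


def encrypt(n, effect):
    """Encrypt one effect value as c = (1 + n)^m * r^n mod n^2."""
    n2 = n * n
    m = round(effect * SCALE) % n  # negative values wrap around modulo n
    r = secrets.randbelow(n - 2) + 2
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 2) + 2
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def aggregate(n, encrypted_models):
    """Central node: multiply ciphertexts per path index, which adds the plaintexts."""
    n2 = n * n
    total = {}
    for model in encrypted_models:
        for path_id, c in model.items():
            total[path_id] = total.get(path_id, 1) * c % n2
    return total


def decrypt(n, priv, c):
    """Recover the signed fixed-point value from a ciphertext."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = (u - 1) // n * mu % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


if __name__ == "__main__":
    n, priv = keygen()
    # Hypothetical local treatment effects per aligned candidate causal path.
    local_effects = [
        {"path_0": 0.25, "path_1": -0.10},
        {"path_0": 0.35, "path_1": 0.05},
        {"path_0": 0.30, "path_1": -0.25},
    ]
    encrypted = [{k: encrypt(n, v) for k, v in m.items()} for m in local_effects]
    summed = aggregate(n, encrypted)  # never sees plaintext effects
    means = {k: decrypt(n, priv, c) / len(local_effects) for k, c in summed.items()}
    print(means)
```

In this sketch the aggregating function only multiplies ciphertexts, which matches the claim 8 requirement that the central aggregation node operates without decrypting the uploaded models. The sketch also assumes that the path indexes have already been aligned across participants, as required by the first step of claim 7.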
Description
Collaborative prediction system for internal control defects based on federated causal forest and secure aggregation
Technical Field
The invention relates to the technical field of data intelligence, and in particular to a collaborative prediction system for internal control defects based on federated causal forest and secure aggregation.
Background
As enterprise governance structures grow more complex and business processes become highly informatized, internal control running state data has become multi-source, strongly time-dependent and strongly coupled. In recent years, data-driven methods have gradually been applied to internal control risk identification and defect prediction, where modeling the control parameters, running states and their evolution over time enables early detection of potential internal control defects. In this context, causal inference models have attracted attention because they can characterize causal relationships among variables, and federated learning frameworks have gradually been introduced into multi-party collaborative analysis scenarios to meet the need for cross-organization collaborative modeling. The prior art in internal control defect prediction usually relies on statistical correlation or static rule inference, and struggles to address both time-adaptive modeling of causal paths and data privacy protection in a multi-party environment. In particular, when the internal control running state evolves continuously over time, existing methods can hardly achieve stable aggregation and consistent inference of causal path effects without exposing local data or model details, which limits the reliability and interpretability of collaborative prediction results.
Disclosure of Invention
The present invention has been made in view of the above problems in the prior art. Accordingly, the invention provides a collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, which solves the problem that internal control defects are difficult to predict collaboratively when the internal control data of multiple participants cannot be shared.
To solve the above technical problem, the invention provides the following technical scheme:
The invention provides a collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, comprising a data standard module, a causal modeling module, a homomorphic encryption module, a privacy aggregation module, a causal inference module and a path feedback module. The data standard module is used for collecting internal control running state data from an internal control environment and standardizing it to obtain a standardized data set. The causal modeling module is used for locally constructing a causal forest structure at each participant based on the standardized data set and iteratively optimizing causal path selection to obtain a local causal model. The homomorphic encryption module is used for encrypting the local causal model with Paillier homomorphic encryption to obtain an encrypted causal model. The privacy aggregation module is used for uploading the encrypted causal model to a central aggregation node and performing privacy-controlled collaborative aggregation on the encrypted causal models of multiple participants using a differential privacy aggregation method to form a global causal forest model. The causal inference module is used for performing causal inference on newly accessed internal control running state data to obtain prediction characterization information of internal control defects. The path feedback module is used for feeding back adjustment instructions for the corresponding causal paths according to the prediction characterization information, so as to optimize the global causal forest model.
As a preferred scheme of the collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, the internal control running state data comprises control parameter values, control parameter variations, running state identifiers and corresponding time indexes.
As a preferred scheme of the collaborative prediction system for internal control defects based on federated causal forest and secure aggregation, obtaining the standardized data set comprises the following steps: associating and arranging the internal control running state data according to the time index to form an original data sequence in chronological order; performing unit unification and numerical normalization on the control parameter values and the control parameter variation in