CN-122022983-A - Multi-dimensional credit investigation data processing system oriented to cloud architecture
Abstract
The invention relates to the technical field of big data processing and information security, and discloses a cloud-architecture-oriented multidimensional credit investigation data processing system comprising: a multi-source data acquisition and standardization module, a multidimensional space-time topology mapping engine, a query intention analysis and topology preheating module, a cold-hot layered encryption storage scheduler, a path integrity verification and cache management module, and an event driving and increment calculation module. The system constructs a multidimensional space-time topological graph comprising entity nodes and field nodes to quantify the influence of spatial attributes on credit; performs cold-hot layered encrypted storage based on query pattern recognition; constructs a cache fingerprint from encrypted objects to verify path integrity; and, in response to data changes, performs dynamic incremental calculation using a credit potential energy conduction model in combination with the consistency verification result. On the premise of guaranteeing data privacy and consistency, the system improves the computational efficiency and evaluation accuracy of large-scale credit investigation graphs.
Inventors
- TIAN JUN
- WANG ZHI
- CHEN WENQING
Assignees
- 湖北省征信有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251229
Claims (10)
- 1. A cloud-architecture-oriented multidimensional credit investigation data processing system, comprising: a multi-source data acquisition and standardization module, configured to map original data into standardized feature vectors and to monitor changes in the standardized feature vectors to generate data change events; a multidimensional space-time topology mapping engine, configured to receive the standardized feature vectors and construct a multidimensional space-time topological graph; a query intention analysis and topology preheating module, configured to generate, according to a target query pattern, a preloading instruction for the corresponding sub-graph range; a cold-hot layered encryption storage scheduler module, configured to schedule node data between a memory database hot area and an object storage cold area according to the preloading instruction, and to encrypt node data settled to the object storage cold area to generate encrypted data objects; a path integrity verification and cache management module, configured to construct a cache fingerprint from the encrypted data objects corresponding to all nodes on a target conduction path in the multidimensional space-time topological graph, to read the encrypted data objects of those nodes to generate a real-time fingerprint before credit potential energy calculation is initiated, and to compare the real-time fingerprint with the cache fingerprint to generate a verification result; and an event driving and increment calculation module, configured to respond to the data change event by performing credit potential energy calculation according to the verification result based on a credit potential energy conduction model, obtaining a potential energy variation, and updating credit scores.
- 2. The system of claim 1, wherein the multi-source data acquisition and standardization module is specifically configured to convert the original data into standardized feature vectors through a preset standardization function; and the data change event includes a unique identifier of the credit investigation subject, a feature change amount, a timestamp, and an operation type.
- 3. The system of claim 1, wherein the multidimensional space-time topology mapping engine is further configured to calculate weights for the association edges and attribution edges between nodes; the multidimensional space-time topology mapping engine constructs a field node by parsing regional keywords from the address fields of a credit investigation subject and materializing the regional keywords as independent nodes in the graph; and the multidimensional space-time topology mapping engine establishes an attribution edge pointing from an entity node to the field node.
- 4. The system of claim 1, wherein the query intention analysis and topology preheating module calculates the frequency of the target query pattern using a sliding time window statistical model.
- 5. The system of claim 4, wherein the frequency of the target query pattern is calculated by taking a weighted sum of the requests matching the target query pattern within the sliding window and dividing the weighted sum by the window duration; and the preloading instruction is generated when the calculated frequency of the target query pattern exceeds a preset heat threshold.
- 6. The system of claim 1, wherein the cold-hot layered encryption storage scheduler module performs data encryption and decryption as follows: when node data needs to be persisted to the object storage cold area, the serialized node data is encrypted with a preset platform public key to generate the encrypted data object; and when node data needs to be loaded into the memory database hot area, the encrypted data object is decrypted and restored using a platform private key stored only in a memory isolation area.
- 7. The system of claim 1, wherein the path integrity verification and cache management module constructs a topological hash fingerprint chain as the cache fingerprint and stores the cache fingerprint in the memory database hot area, the construction comprising: acquiring, from the object storage cold area, the encrypted data object corresponding to each node on the target conduction path; performing hash digest calculation on each encrypted data object to generate a node-level fingerprint; concatenating the node-level fingerprints of all nodes on the target conduction path according to the topological order of the target conduction path; and hashing the concatenated data again to generate a fingerprint chain signature unique to the target conduction path.
- 8. The system of claim 3, wherein the path integrity verification and cache management module pre-calculates a composite conductivity coefficient for the target conduction path, the composite conductivity coefficient being equal to the product of all edge weights on the target conduction path.
- 9. The system of claim 3, wherein the credit potential energy conduction model defined in the event driving and increment calculation module comprises: multiplying the feature change amount in the data change event by a feature sensitivity matrix to obtain an initial credit potential energy; injecting the initial credit potential energy into the multidimensional space-time topological graph as the output energy of a source node, and calculating the accumulated credit potential energy of any affected node in the multidimensional space-time topological graph; and, when the accumulated credit potential energy exceeds a dynamic sensitivity threshold, triggering a credit score update for the node and conducting credit potential energy to downstream nodes.
- 10. The system of claim 1, further comprising an application service interface module for mapping the updated credit score to a credit rating.
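The fingerprint chain of claim 7 and the comparison step of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: SHA-256 is an assumed hash function (the claims only specify a hash digest), and the names `node_fingerprint`, `chain_signature`, and `verify_path` are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def node_fingerprint(encrypted_object: bytes) -> bytes:
    """Node-level fingerprint: hash digest of one node's encrypted data object.

    SHA-256 is an assumption; the claims do not name a hash function.
    """
    return hashlib.sha256(encrypted_object).digest()


def chain_signature(encrypted_objects: Sequence[bytes]) -> str:
    """Fingerprint chain signature for one conduction path.

    The caller passes the encrypted objects in the topological order of the
    path; node-level fingerprints are concatenated in that order and the
    concatenation is hashed again.
    """
    concatenated = b"".join(node_fingerprint(obj) for obj in encrypted_objects)
    return hashlib.sha256(concatenated).hexdigest()


def verify_path(cached_fingerprint: str, current_objects: Sequence[bytes]) -> bool:
    """Compare a real-time fingerprint with the cached fingerprint.

    Returns True when the cached path result can still be trusted.
    """
    realtime_fingerprint = chain_signature(current_objects)
    return hmac.compare_digest(cached_fingerprint, realtime_fingerprint)


# Usage: build the cache fingerprint once, then re-check before a calculation.
path_objects = [b"enc-node-A", b"enc-node-B", b"enc-node-C"]  # placeholder data
cached = chain_signature(path_objects)
still_valid = verify_path(cached, path_objects)
```

Because the fingerprints are concatenated in path order before the second hash, changing any node's encrypted object or reordering the nodes yields a different signature, so the cached result would be treated as invalid.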
Description
Multi-dimensional credit investigation data processing system oriented to cloud architecture
Technical Field
The invention relates to the technical field of big data processing and information security, in particular to a multidimensional credit investigation data processing system oriented to a cloud architecture.
Background
With the rapid development of the digital economy, the data sources of credit investigation systems have expanded from traditional credit records to multi-source heterogeneous data such as tax, business, and judicial data. To meet the storage and computing requirements of massive data, credit investigation systems are gradually migrating to cloud-native architectures, and using the elastic resources of cloud platforms to construct large-scale enterprise relationship graphs has become a mainstream industry trend. Existing graph-based credit investigation data processing technology, however, still has limitations in model dimension, calculation timeliness, and data security.
In terms of model dimension, traditional credit graph construction relies mainly on explicit business associations such as equity investment, guarantee chains, or supply chain transactions. Although this modeling approach can track risk along the flow of funds, it often ignores the implicit influence that the geospatial environment of a credit investigation subject has on the subject's operational stability. For example, field factors such as policy changes in an industrial park or fluctuations in the economic climate of the subject's region cannot be effectively expressed in the graph, so the risk assessment model has difficulty identifying systemic risk caused by regional factors.
In terms of calculation timeliness and data consistency, as the graph scale grows exponentially, credit conduction calculation based on full-graph traversal is often too time-consuming to meet the requirements of real-time credit inquiry. Although the prior art generally introduces a caching mechanism to improve response speed, it lacks an efficient verification means for complex topological structures when facing high-frequency concurrent data changes. When an underlying node attribute or association relationship changes slightly, the system often has difficulty accurately judging whether a cached path calculation result is still valid, which easily leads to lagging credit scores or erroneous evaluation results based on dirty data.
In terms of data storage and privacy protection under a cloud architecture, the prior art generally faces a trade-off between performance and security. Applying strong encryption to all stored data to protect sensitive credit investigation data significantly increases I/O overhead and causes query delay, whereas reducing encryption strength in pursuit of retrieval speed exposes the data to leakage risk in shared resource pools such as object storage. Existing storage scheduling schemes lack dynamic awareness of the access heat of graph data and have difficulty providing low-cost, high-strength privacy protection for cold data while also serving high-concurrency, low-latency queries.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a multidimensional credit investigation data processing system oriented to a cloud architecture, which solves the problems that existing cloud credit investigation systems do not consider the geospatial environment dimension and have difficulty balancing real-time computation, data consistency verification efficiency, and full-link privacy security when processing massive graph data.
To achieve this purpose, the invention is realized by the following technical scheme. The multidimensional credit investigation data processing system oriented to a cloud architecture comprises:
The multi-source data acquisition and standardization module, which is used for mapping the acquired original data into standardized feature vectors and monitoring changes in the standardized feature vectors using change data capture technology to generate data change events;
The multidimensional space-time topology mapping engine, which is connected with the multi-source data acquisition and standardization module and is used for receiving the standardized feature vectors, constructing a multidimensional space-time topological graph comprising entity nodes and field nodes, and calculating the weights of the association edges and attribution edges between nodes;
The query intention analysis and topology preheating module, which is used for analyzing a query log stream to identify a target query pattern and generating a preloading instruction for the corresponding sub-graph range according to the target query pattern;
The cold-hot