CN-122020216-A - Automatic modeling management method and system for data configuration items
Abstract
The invention provides a method and a system for automatic modeling and management of data configuration items, relating to the technical field of data processing. The method comprises: automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources; and performing structural topology analysis on the initial metadata to obtain initial metadata attributes, mapping the initial metadata attributes to spatial dimension parameters, clustering them with an axis-aligned minimum bounding rectangle clustering algorithm, and generating a corresponding axis-aligned minimum bounding rectangle for each cluster. The invention makes the entire process, from metadata extraction of configuration items through full-lifecycle management, automated, standardized and intelligent.
Inventors
- LI TIEYU
- SONG WEIWEI
- FAN GUANGZHOU
- SHI YUFEN
Assignees
- 北京仁和诚信科技有限公司
Dates
- Publication Date: 20260512
- Application Date: 20260126
Claims (10)
- 1. A method for automatic modeling and management of data configuration items, the method comprising: automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources; performing structural topology analysis on the initial metadata to obtain initial metadata attributes, mapping the initial metadata attributes to spatial dimension parameters, clustering them with an axis-aligned minimum bounding rectangle clustering algorithm, and generating a corresponding axis-aligned minimum bounding rectangle for each cluster; constructing a structural topology expression from the spatial positions, range boundaries and inter-cluster associations of the minimum bounding rectangles; in the structural topology expression, taking the overall rectangle enclosing all of the rectangles as a top-level reference and combining the core attribute weights of the clusters to establish a hierarchical framework of the overall reference and intra-cluster references; according to the metadata attribute distribution characteristics and the inter-cluster attribute correlation, partitioning the cluster region corresponding to each rectangle as an independent topological subdomain, and mapping the initial metadata to the corresponding subdomains according to attribute characteristics; calculating a topology correction factor based on the range size and attribute dispersion of each subdomain and on its degree of overlap with, and distance from, adjacent subdomains, adjusting the metadata topological associations and subdomain boundaries within the subdomains by means of the correction factor to correct attribute dispersion and overlap anomalies, and performing topology reconstruction on the initial metadata to obtain optimized metadata; based on the optimized metadata, analyzing the business logic dependencies between configuration items by means of a preset business rule base and natural language processing; and constructing a standardized configuration model from the optimized metadata and the business logic dependencies, and performing full-lifecycle management operations on the configuration items based on the standardized configuration model.
- 2. The method for automatic modeling and management of data configuration items according to claim 1, wherein automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources comprises: automatically identifying and connecting to a plurality of heterogeneous configuration sources, the heterogeneous configuration sources comprising databases, configuration files, cloud service interfaces and log files; extracting, from each connected configuration source, structured attribute information and unstructured description information of the configuration items to obtain raw metadata; and performing format unification and field mapping on the raw metadata to eliminate syntactic differences between sources, thereby obtaining the initial metadata.
- 3. The method for automatic modeling and management of data configuration items according to claim 2, wherein performing structural topology analysis on the initial metadata to obtain initial metadata attributes, mapping the initial metadata attributes to spatial dimension parameters, clustering them with an axis-aligned minimum bounding rectangle clustering algorithm, and generating a corresponding axis-aligned minimum bounding rectangle for each cluster comprises: performing multidimensional structural topology analysis on the initial metadata and extracting an attribute feature set of the configuration items to form an attribute feature analysis result; based on the attribute feature analysis result, normalizing the extracted attribute feature set to eliminate differences in units and scale between attributes, mapping the normalized attribute features of each configuration item to a feature point in a multidimensional space according to the topological association weights between attributes, and constructing a spatial topology mapping matrix of the initial metadata; based on the spatial topology mapping matrix, applying the axis-aligned minimum bounding rectangle clustering algorithm to cluster the feature points in the matrix according to their density distribution in the multidimensional space and a preset attribute similarity threshold, and automatically identifying and dividing a plurality of clusters, each cluster comprising a set of configuration item feature points that are spatially adjacent and whose attribute feature similarity reaches the threshold; and, for each cluster, calculating the minimum and maximum coordinate values in each dimension from the distribution range of all feature points in the cluster along each coordinate axis, and generating a minimum bounding rectangle strictly aligned with the coordinate axes.
- 4. The method for automatic modeling and management of data configuration items according to claim 3, wherein constructing a structural topology expression from the spatial positions, range boundaries and inter-cluster associations of the minimum bounding rectangles; in the structural topology expression, taking the overall rectangle enclosing all of the rectangles as a top-level reference and combining the core attribute weights of the clusters to establish a hierarchical framework of the overall reference and intra-cluster references; partitioning, according to the metadata attribute distribution characteristics and the inter-cluster attribute correlation, the cluster region corresponding to each rectangle as an independent topological subdomain; and mapping the initial metadata to the corresponding subdomains according to attribute characteristics, comprises: extracting, from the minimum bounding rectangles, the spatial position coordinates and range boundary parameters of each minimum bounding rectangle and the spatial associations between clusters, and integrating these parameters to form a structural topology expression data set in the multidimensional space; based on the structural topology expression data set, calculating the overall rectangle enclosing the minimum bounding rectangles in the multidimensional space as a top-level spatial reference and, in combination with the core attribute weights of the clusters, assigning a corresponding intra-cluster reference weight to each cluster, thereby establishing a hierarchical framework consisting of the top-level reference and a plurality of intra-cluster references; based on the hierarchical framework, and in combination with the attribute distribution characteristics of the initial metadata and the inter-cluster attribute correlation, defining the cluster region corresponding to each minimum bounding rectangle as a subdomain with independent topological characteristics, thereby completing the topological subdomain partitioning of the entire multidimensional space; and, based on the topological subdomain partitioning result, mapping each item of initial metadata to its corresponding topological subdomain according to the degree of match between its attribute characteristics and each topological subdomain.
- 5. The method for automatic modeling and management of data configuration items according to claim 4, wherein calculating a topology correction factor based on the range size and attribute dispersion of each subdomain and on its degree of overlap with, and distance from, adjacent subdomains, adjusting the metadata topological associations and subdomain boundaries within the subdomains by means of the correction factor to correct attribute dispersion and overlap anomalies, and performing topology reconstruction on the initial metadata to obtain optimized metadata comprises: extracting characteristic parameters of each independent topological subdomain by calculating the range size parameter of each subdomain, the metadata attribute dispersion index within the subdomain, the overlap coefficient with adjacent subdomains and the spatial distance relation between subdomains, to form a subdomain characteristic parameter set; based on the subdomain characteristic parameter set, performing a weighted fusion of the range size parameter, the attribute dispersion index, the overlap coefficient and the distance relation according to a preset weight allocation strategy to obtain a topology correction factor for each subdomain; dynamically adjusting the boundary of each topological subdomain based on the topology correction factor by expanding or contracting the subdomain range, and recalculating the topological association strength between metadata within the subdomain; and remapping the initial metadata and reconstructing its associations based on the adjusted subdomain boundaries and the corrected topological association strength, eliminating attribute distribution anomalies and subdomain overlap conflicts, to obtain metadata with an optimized topological structure.
- 6. The method for automatic modeling and management of data configuration items according to claim 5, wherein analyzing, based on the optimized metadata, the business logic dependencies between configuration items by means of a preset business rule base and natural language processing comprises: obtaining the optimized metadata, and separating and extracting the structured business attributes and the unstructured business description information in the optimized metadata to form a structured business data set and an unstructured business text set; based on the structured business data set, invoking business logic rule templates in the preset business rule base and performing matching analysis on the business logic dependencies between configuration items to generate a preliminary business logic dependency set; based on the unstructured business text set, applying natural language processing to perform semantic feature extraction and entity relationship recognition, interpreting the business semantic descriptions of the configuration items, and generating a set of configuration item semantic feature vectors; performing fusion analysis on the preliminary business logic dependency set and the set of configuration item semantic feature vectors, and identifying pairs of semantically associated configuration items that cross different business boundaries by means of semantic similarity calculation and business rule verification; and, based on the semantically associated configuration item pairs, establishing a cross-business semantic association network of configuration items, and assigning an association strength weight and a business logic type label to each semantic association, to form the complete business logic dependencies.
- 7. The method for automatic modeling and management of data configuration items according to claim 6, wherein constructing a standardized configuration model from the optimized metadata and the business logic dependencies, and performing full-lifecycle management operations on the configuration items based on the standardized configuration model comprises: receiving the optimized metadata and a complete business logic dependency map, and fusing the two to generate a fused data set containing configuration item attribute characteristics and business semantic associations; based on the fused data set, performing type abstraction, attribute standardization and relationship schema definition on the configuration items according to preset standardized modeling rules, and constructing a standardized configuration model with a unified data structure and unified business semantics; based on the standardized configuration model, defining an operation rule set for full-lifecycle management of the configuration items according to a preset lifecycle management policy; and, based on the operation rule set, performing full-lifecycle management operations on the configuration items in the standardized configuration model, the full-lifecycle management operations comprising automatically creating new configuration items based on business rules, dynamically updating configuration item attributes according to the dependencies, managing configuration item version history records, verifying the integrity of the business dependencies of the configuration items, and automatically cleaning up invalid configuration items according to aging rules, thereby realizing automated management of the whole process of a configuration item from creation through use to destruction.
- 8. A system for automatic modeling and management of data configuration items, implementing the method of any of claims 1 to 7, comprising: an extraction module for automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources; a computing module for performing structural topology analysis on the initial metadata to obtain initial metadata attributes, mapping the initial metadata attributes to spatial dimension parameters, clustering them with an axis-aligned minimum bounding rectangle clustering algorithm, and generating a corresponding axis-aligned minimum bounding rectangle for each cluster; a mapping module for constructing a structural topology expression from the spatial positions, range boundaries and inter-cluster associations of the minimum bounding rectangles; a correction module for calculating a topology correction factor based on the range size and attribute dispersion of each subdomain and on its degree of overlap with, and distance from, adjacent subdomains, adjusting the metadata topological associations and subdomain boundaries within the subdomains by means of the correction factor to correct attribute dispersion and overlap anomalies, and performing topology reconstruction on the initial metadata to obtain optimized metadata; an analysis module for analyzing, based on the optimized metadata, the business logic dependencies between configuration items by means of a preset business rule base and natural language processing; and a management module for constructing a standardized configuration model from the optimized metadata and the business logic dependencies, and performing full-lifecycle management operations on the configuration items based on the standardized configuration model.
- 9. A computing device, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 7.
- 10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a program which, when executed by a processor, implements the method according to any of claims 1 to 7.
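For illustration only, the geometric steps recited in claims 3 to 5 can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the clustering step is assumed to have already assigned feature points to clusters, the sample cluster names and coordinates are invented, and the "weighted fusion" is assumed to be a plain weighted sum with equal weights, since the source does not give a formula. All function names (`bounding_box`, `enclosing_box`, `overlap_volume`, `dispersion`, `correction_factor`) are hypothetical.

```python
from dataclasses import dataclass
from math import dist, prod
from typing import Sequence, Tuple

Point = Tuple[float, ...]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box stored as its per-dimension minimum and maximum corners."""
    lo: Point
    hi: Point

    def volume(self) -> float:
        return prod(h - l for l, h in zip(self.lo, self.hi))

    def center(self) -> Point:
        return tuple((l + h) / 2 for l, h in zip(self.lo, self.hi))


def bounding_box(points: Sequence[Point]) -> Box:
    """Minimum and maximum coordinate in every dimension (claim 3, last step)."""
    dims = range(len(points[0]))
    return Box(
        lo=tuple(min(p[d] for p in points) for d in dims),
        hi=tuple(max(p[d] for p in points) for d in dims),
    )


def enclosing_box(boxes: Sequence[Box]) -> Box:
    """Overall rectangle enclosing all cluster rectangles (top-level reference)."""
    return bounding_box([corner for b in boxes for corner in (b.lo, b.hi)])


def overlap_volume(a: Box, b: Box) -> float:
    """Volume of the intersection of two boxes, or 0.0 if they do not overlap."""
    sides = [min(ah, bh) - max(al, bl)
             for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi)]
    return prod(sides) if all(s > 0 for s in sides) else 0.0


def dispersion(points: Sequence[Point]) -> float:
    """Mean squared distance to the centroid (an assumed dispersion index)."""
    n = len(points)
    centroid = tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))
    return sum(dist(p, centroid) ** 2 for p in points) / n


def correction_factor(size: float, disp: float, overlap: float, distance: float,
                      weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Assumed weighted sum of the four subdomain parameters named in claim 5."""
    return sum(w * x for w, x in zip(weights, (size, disp, overlap, distance)))


# Invented sample data: two clusters of normalized 2-D feature points.
clusters = {
    "database": [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)],
    "cache": [(1.5, 2.0), (4.0, 4.0), (3.0, 2.5)],
}
boxes = {name: bounding_box(pts) for name, pts in clusters.items()}
top = enclosing_box(list(boxes.values()))
a, b = boxes["database"], boxes["cache"]
factor = correction_factor(
    size=a.volume() / top.volume(),
    disp=dispersion(clusters["database"]),
    overlap=overlap_volume(a, b) / a.volume() if a.volume() else 0.0,
    distance=dist(a.center(), b.center()),
)
```

In this sketch each subdomain is simply the region of its cluster's rectangle. How the correction factor then expands or contracts a subdomain boundary is not specified in the claims, so that step is deliberately left out.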
Description
Automatic modeling management method and system for data configuration items
Technical Field
The invention relates to the technical field of data processing, and in particular to a method and a system for automatic modeling and management of data configuration items.
Background
As enterprise digital transformation advances, the number of configuration items originating from multiple heterogeneous configuration sources has grown sharply, and problems such as inconsistent data formats, semantic heterogeneity and complex dependencies have become increasingly prominent. Traditional configuration management relies on manually sorting out metadata and manually building configuration models. It generally suffers from low operating efficiency, lagging model updates and incomplete identification of configuration items, and is poorly suited to the automated management needs of large-scale distributed systems. When the configuration items of a core business system are integrated, its configuration data is scattered across the structured parameters of a mainstream database, cache configuration, unstructured descriptions of cloud service interfaces, and log files. The operation and maintenance team organizes the metadata by manual entry and collation, so initial modeling takes a great deal of time. Because cross-source semantic conflicts go unresolved and implicit dependencies between configuration items in key business processes are missed, business data synchronization fails after configuration changes. This exposes the core technical defects of the traditional approach: it lacks the ability to integrate heterogeneous metadata automatically, it identifies business dependencies inaccurately, and its configuration model has no dynamic topology optimization mechanism. It therefore cannot meet the practical need for full-lifecycle dynamic management of configuration items.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a system for automatic modeling and management of data configuration items that make the entire process, from metadata extraction of configuration items through full-lifecycle management, automated, standardized and intelligent. To solve this technical problem, the technical solution of the invention is as follows. In a first aspect, a method for automatic modeling and management of data configuration items comprises: automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources; performing structural topology analysis on the initial metadata to obtain initial metadata attributes, mapping the initial metadata attributes to spatial dimension parameters, clustering them with an axis-aligned minimum bounding rectangle clustering algorithm, and generating a corresponding axis-aligned minimum bounding rectangle for each cluster; constructing a structural topology expression from the spatial positions, range boundaries and inter-cluster associations of the minimum bounding rectangles; in the structural topology expression, taking the overall rectangle enclosing all of the rectangles as a top-level reference and combining the core attribute weights of the clusters to establish a hierarchical framework of the overall reference and intra-cluster references; according to the metadata attribute distribution characteristics and the inter-cluster attribute correlation, partitioning the cluster region corresponding to each rectangle as an independent topological subdomain, and mapping the initial metadata to the corresponding subdomains according to attribute characteristics; calculating a topology correction factor based on the range size and attribute dispersion of each subdomain and on its degree of overlap with, and distance from, adjacent subdomains, adjusting the metadata topological associations and subdomain boundaries within the subdomains by means of the correction factor to correct attribute dispersion and overlap anomalies, and performing topology reconstruction on the initial metadata to obtain optimized metadata; based on the optimized metadata, analyzing the business logic dependencies between configuration items by means of a preset business rule base and natural language processing; and constructing a standardized configuration model from the optimized metadata and the business logic dependencies, and performing full-lifecycle management operations on the configuration items based on the standardized configuration model.
Further, automatically extracting initial metadata of configuration items from a plurality of heterogeneous configuration sources comprises: automatically identifying and connecting to a plurality of heterogeneous configuration sources, the heterogeneous configuration sources comprising databases, configuration files, cloud service interfaces and log files; extracting, from each connected configuration source, structured attribute information and unstructured description information of the configuration items to obtain raw metadata; and performing format unification and field mapping on the raw metadata, eliminating syntactic differences between sources, to obtain