CN-122018779-A - Data intelligent hierarchical storage method and system based on multidimensional quantitative evaluation
Abstract
The invention discloses a data intelligent hierarchical storage method and system based on multidimensional quantitative evaluation, and relates to the technical field of computer data storage. It addresses three problems of traditional hierarchical storage: lagging identification of hot and cold data, migration strategies that lack coordination, and insufficient control of system performance fluctuation. The method collects dynamic access features and static attribute features of data objects in real time, adaptively adjusts feature weights according to fluctuations in data activity, and calculates layered migration coefficients. It analyzes co-occurrence relations in business operation sequences to construct a data association graph, and combines density-based clustering with spatio-temporal locality to establish collaborative migration constraints, so that strongly associated objects are migrated synchronously. It uses time series analysis to extract trend, seasonal and residual terms, constructs a business load prediction function, identifies low-load migration windows, and schedules high-I/O tasks by priority. It also dynamically adjusts model parameters by monitoring performance jitter and response deviation, which makes layering decisions more intelligent and improves system stability.
Inventors
- LIU FENG
- XU WENCHAO
- YU NING
- AN BINGJIAN
- ZHUANG XIAOBIN
Assignees
- 山东省创新发展研究院(山东信息通信技术研究院管理中心)
- 广州博士信息技术研究院有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251208
Claims (9)
- 1. A data intelligent hierarchical storage method based on multidimensional quantitative evaluation, characterized by comprising the following steps: S1, collecting dynamic access features and static attribute features of a data object in real time, adaptively adjusting feature weights according to the fluctuation intensity of data activity, and calculating a layered migration coefficient of the data object; S2, taking the layered migration coefficient as input, constructing a data association graph by analyzing co-occurrence relations in business operation sequences, identifying sets of data objects with strong business association through a density-based clustering algorithm, establishing collaborative migration constraint rules based on the spatio-temporal locality of data access, and, when the layered migration coefficient of any object in an association set is detected to trigger a storage level change, synchronously adjusting the storage strategies of the related objects in the set according to the constraint rules; S3, inputting the layered migration coefficient and the collaborative migration constraint rules into a migration decision unit, performing periodic decomposition of historical access data through a time series analysis algorithm, extracting a trend term, a seasonal term and a residual term, constructing a business load prediction function, identifying periods in which system resource utilization is below a set threshold as migration windows, decomposing migration tasks with high I/O load into a plurality of subtasks, and scheduling and executing the subtasks within the identified migration windows; S4, monitoring the system performance jitter amplitude and the service response time deviation during migration, and, when the root mean square value of the performance jitter amplitude exceeds a first threshold or the service response time deviation exceeds a second threshold, reducing the weight share of data activity and adjusting the seasonal term weight of the business load prediction function.
- 2. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 1, wherein step S1 specifically comprises: collecting the access time series of the data object in real time, extracting access frequency, access time interval distribution and concurrent access count as the dynamic access features, and obtaining the storage capacity, data structure complexity and predefined business criticality level of the data object as the static attribute features; calculating the coefficient of variation of data activity, and dynamically adjusting the weight distribution between the dynamic access features and the static attribute features based on the magnitude of the coefficient of variation; and obtaining the layered migration coefficient of the data object as a linear combination of the weighted feature values, wherein the weight of the dynamic access features increases as the coefficient of variation increases.
- 3. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 1, wherein the process of constructing the data association graph by analyzing co-occurrence relations in business operation sequences is as follows: parsing the business operation sequence, extracting the data access sequence within each business operation, and calculating the co-occurrence frequency and temporal order of the data objects within business operations; constructing an association matrix of the data objects based on the co-occurrence frequency, wherein matrix element values represent the association strength between data objects; constructing a weighted data association graph from the association matrix, wherein nodes represent data objects and edge weights represent association strength; and mapping the data association graph to a low-dimensional vector space through a graph embedding algorithm while preserving the association relations between the data objects.
- 4. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 3, wherein the specific process of identifying sets of data objects with strong business association through a density-based clustering algorithm and establishing collaborative migration constraint rules based on the spatio-temporal locality of data access is as follows: applying the OPTICS clustering algorithm in the low-dimensional vector space of the data association graph, identifying density-connected sets of data objects, calculating the silhouette coefficient of each cluster, and screening out the sets of data objects with strong business association; analyzing the spatio-temporal access features of each set of data objects, extracting access time correlation and storage space adjacency, and establishing collaborative migration constraint rules based on the spatio-temporal locality features, comprising: a migration timing constraint, whereby associated data objects complete migration within the same time period; a storage location constraint, whereby associated data objects remain spatially contiguous in the target storage layer; and an access path constraint, whereby associated data objects retain their original access path optimization after migration.
- 5. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 4, wherein, when the layered migration coefficient of any object in an association set is detected to trigger a storage level change, the specific process of synchronously adjusting the storage strategies of the related objects in the set according to the constraint rules is as follows: monitoring real-time changes in the layered migration coefficients of all data objects in the association set, and triggering a collaborative migration evaluation of the association set when the migration coefficient of any data object reaches the storage level change threshold; recalculating the target storage level of each data object in the set according to the association strength weights in the collaborative migration constraint rules, wherein a data object directly and strongly associated with the trigger object is assigned the same storage level as the trigger object, and for a data object indirectly associated with the trigger object a recommended storage level is calculated based on the association strength weight; generating a collaborative migration sequence based on the recalculated target storage levels, ensuring that the associated data objects complete level switching according to the preset timing constraint; and, during migration execution, dynamically adjusting the execution order and concurrency of migration tasks according to system resource utilization.
- 6. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 1, wherein the specific process of performing periodic decomposition of historical access data through a time series analysis algorithm, extracting the trend term, seasonal term and residual term, and constructing the business load prediction function is as follows: obtaining data access records over a historical period, counting access volume at a fixed time granularity to form a time series, performing a stationarity test on the time series, and eliminating non-stationary components through differencing; extracting the trend term by the moving average method to reflect the long-term pattern of change in access volume, identifying the dominant seasonal periods through periodogram analysis, and extracting the seasonal term components using the Fourier transform; and calculating the residual term, testing the residual term for autocorrelation, establishing an autoregressive model to describe random fluctuation, and superimposing the trend term, seasonal term and residual term to form the business load prediction function.
- 7. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 1, wherein the specific process of identifying periods in which system resource utilization is below a set threshold as migration windows, decomposing migration tasks with high I/O load into a plurality of subtasks, and scheduling and executing the subtasks within the identified migration windows is as follows: calculating the expected value of system resource utilization based on the business load prediction function, identifying periods in which resource utilization remains continuously below the set threshold, and determining the available migration windows in combination with business operation characteristics; logically grouping high-I/O-load migration tasks according to the spatio-temporal constraints in the collaborative migration constraint rules, and decomposing each migration group into a plurality of subtasks according to data block size and the principle of I/O load balancing; and establishing a subtask priority evaluation mechanism based on the layered migration coefficient and business criticality, and scheduling and executing the subtasks in priority order within the determined migration windows.
- 8. The data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to claim 1, wherein step S4 comprises: continuously collecting system performance indicators during migration, including CPU utilization, memory occupancy and disk I/O response time, calculating the root mean square value of the performance jitter amplitude, and computing the deviation of the service response time from a reference value; establishing a performance evaluation matrix and comparing the performance indicators against preset thresholds; when the performance jitter amplitude exceeds the first threshold, reducing the share of data activity in the feature weights in proportion to the degree of deviation, while increasing the weight share of business criticality in corresponding proportion; and when the service response time deviation exceeds the second threshold, adjusting the amplitude parameter of the seasonal term in the business load prediction function according to the degree of deviation, correspondingly reducing the influence of periodic fluctuation.
- 9. A data intelligent hierarchical storage system based on multidimensional quantitative evaluation, applied to the data intelligent hierarchical storage method based on multidimensional quantitative evaluation according to any one of claims 1 to 8, characterized by comprising the following modules: a data monitoring weight module, configured to collect dynamic access features and static attribute features of a data object in real time, adaptively adjust feature weights according to the fluctuation intensity of data activity, and calculate a layered migration coefficient of the data object; an association analysis constraint module, configured to take the layered migration coefficient as input, construct a data association graph by analyzing co-occurrence relations in business operation sequences, identify sets of data objects with strong business association through a density-based clustering algorithm, establish collaborative migration constraint rules based on the spatio-temporal locality of data access, and, when the layered migration coefficient of any object in an association set is detected to trigger a storage level change, synchronously adjust the storage strategies of the related objects in the set according to the constraint rules; a migration decision scheduling module, configured to input the layered migration coefficient and the collaborative migration constraint rules into a migration decision unit, perform periodic decomposition of historical access data through a time series analysis algorithm, extract a trend term, a seasonal term and a residual term, construct a business load prediction function, identify periods in which system resource utilization is below a set threshold as migration windows, decompose migration tasks with high I/O load into a plurality of subtasks, and schedule and execute the subtasks within the identified migration windows; and a performance monitoring optimization module, configured to monitor the system performance jitter amplitude and the service response time deviation during migration and, when the root mean square value of the performance jitter amplitude exceeds a first threshold or the service response time deviation exceeds a second threshold, reduce the weight share of data activity and adjust the seasonal term weight of the business load prediction function.
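The adaptive weighting of step S1, as detailed in claim 2, can be illustrated with a minimal Python sketch. The claims do not state the weight-mapping formula or how features are normalized, so the saturating mapping `cv / (1 + cv)`, the bounds `w_min` and `w_max`, the assumption that features are pre-normalized to [0, 1], the equal weighting within each feature group, and all names below are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class ObjectFeatures:
    # Assumption: every feature value is already normalized to [0, 1].
    access_frequency: float
    interval_regularity: float   # hypothetical summary of the access-interval distribution
    concurrent_accesses: float
    storage_capacity: float
    structure_complexity: float
    business_criticality: float


def coefficient_of_variation(activity: list[float]) -> float:
    """Population standard deviation divided by the mean (0.0 for an idle series)."""
    mu = mean(activity)
    return pstdev(activity) / mu if mu > 0 else 0.0


def dynamic_weight(cv: float, w_min: float = 0.3, w_max: float = 0.9) -> float:
    """Weight of the dynamic features; increases monotonically with cv.

    The mapping and its bounds are illustrative assumptions, not taken from the claims.
    """
    return w_min + (w_max - w_min) * cv / (1.0 + cv)


def layered_migration_coefficient(f: ObjectFeatures, activity: list[float]) -> float:
    """Linear combination of the weighted dynamic and static feature values."""
    w_dyn = dynamic_weight(coefficient_of_variation(activity))
    dynamic = mean([f.access_frequency, f.interval_regularity, f.concurrent_accesses])
    static = mean([f.storage_capacity, f.structure_complexity, f.business_criticality])
    return w_dyn * dynamic + (1.0 - w_dyn) * static


if __name__ == "__main__":
    features = ObjectFeatures(0.9, 0.8, 0.7, 0.2, 0.3, 0.4)
    steady = [10, 10, 10, 10]   # constant activity, cv = 0
    bursty = [1, 30, 2, 40]     # strongly fluctuating activity
    print(layered_migration_coefficient(features, steady))
    print(layered_migration_coefficient(features, bursty))
```

With these sample values the dynamic features score higher than the static ones, so the bursty activity series yields a larger coefficient than the steady one: a larger coefficient of variation shifts weight toward the dynamic access features, as claim 2 requires.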
Description
Data intelligent hierarchical storage method and system based on multidimensional quantitative evaluation
Technical Field
The invention relates to the technical field of computer data storage, in particular to a data intelligent hierarchical storage method and system based on multidimensional quantitative evaluation.
Background
With the deepening of enterprise digital transformation and the spread of Internet of Things devices, global data volume is growing explosively. The data generated in daily enterprise operation is enormous in quantity, heterogeneous across multiple sources, and subject to dynamically changing access patterns. Data has become a core enterprise asset, and the efficiency of storage management directly affects business response speed, operating cost and market competitiveness. Against this background, how to build an intelligent data management system that meets high-performance access requirements while effectively controlling storage cost has become a key technical challenge in enterprise digital transformation.
Current hierarchical data storage methods have several substantial defects. First, static layering strategies based on fixed rules cannot accurately reflect real-time changes in the value of data: frequently accessed business-critical data is often kept at a storage level with insufficient performance, while rarely accessed historical data occupies high-performance storage resources for long periods. Second, existing methods typically handle individual data objects in isolation and neglect the inherent correlation between data in business operations, so that closely related data is scattered across different performance levels, which increases the access latency of business operations. In addition, traditional selection of data migration timing lacks intelligent prediction capability and often triggers large-scale data migration during business peak periods, which aggravates contention for system resources and affects normal business operation. These drawbacks make it difficult for existing storage systems to achieve an optimal balance of performance and cost in a dynamically changing business environment.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a data intelligent hierarchical storage method and system based on multidimensional quantitative evaluation, which solve the problems set out in the background. The method comprises the following steps:
S1, collecting dynamic access features and static attribute features of a data object in real time, adaptively adjusting feature weights according to the fluctuation intensity of data activity, and calculating a layered migration coefficient of the data object;
S2, constructing a data association graph by analyzing co-occurrence relations in business operation sequences, identifying sets of data objects with strong business association through a density-based clustering algorithm, establishing collaborative migration constraint rules based on the spatio-temporal locality of data access, and, when the layered migration coefficient of any object in an association set is detected to trigger a storage level change, synchronously adjusting the storage strategies of the related objects in the set according to the constraint rules;
S3, inputting the layered migration coefficient and the collaborative migration constraint rules into a migration decision unit, performing periodic decomposition of historical access data through a time series analysis algorithm, extracting a trend term, a seasonal term and a residual term, constructing a business load prediction function, identifying periods in which system resource utilization is below a set threshold as migration windows, decomposing migration tasks with high I/O load into a plurality of subtasks, and scheduling and executing the subtasks within the identified migration windows;
S4, monitoring the system performance jitter amplitude and the service response time deviation during migration, and, when the root mean square value of the performance jitter amplitude exceeds a first threshold or the service response time deviation exceeds a second threshold, reducing the weight share of data activity and adjusting the business load prediction function.
Further, step S1 specifically comprises: collecting the access time series of the data object in real time, extracting access frequency, access time interval distribution and concurrent access count as the dynamic access features, obtaining the storage capacity, data structure complexity and predefined business criticality level of the data object as the static attribute features, calculating the coefficient of variation of data activity, and dynamically adjusting the weight distribution between the dynamic access features and the static attribute features based on the magnitude o