CN-122019161-A - Big data platform service management system based on AI algorithm

Abstract

The invention discloses a big data platform service management system based on an AI algorithm, relating to the technical field of big data platform service management. The task priority evaluation module uses a machine learning algorithm and a multiple linear regression model to quantify task priority from features such as real-time requirements, data volume, and business importance, and refines the resulting score with correction parameters. Through intelligent priority evaluation and resource allocation, the invention significantly improves the resource utilization and task execution efficiency of the big data platform, enhances the reliability of data transmission, and provides an efficient and flexible solution for big data processing.
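
The multiple linear regression scoring named in the abstract can be illustrated with a minimal sketch. It assumes four numeric features per task (real-time requirement, data size, business importance, resource demand), min-max scaling into [0, 1], and coefficients fitted by batch gradient descent on the mean squared error. The function names, learning rate, and epoch count are hypothetical and do not come from the patent, and the correction-parameter step is left out because its formula is not reproduced in the source text.

```python
from typing import List, Sequence, Tuple


def min_max_scale(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every feature column into [0, 1] (min-max normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def fit_linear(
    xs: Sequence[Sequence[float]],
    ys: Sequence[float],
    lr: float = 0.1,       # hypothetical learning rate
    epochs: int = 5000,    # hypothetical iteration budget
) -> Tuple[float, List[float]]:
    """Fit y ~ b0 + sum(b_i * x_i) by batch gradient descent on the MSE."""
    n, k = len(xs), len(xs[0])
    b0, betas = 0.0, [0.0] * k
    for _ in range(epochs):
        # Residuals are computed once per epoch, before any coefficient moves.
        errs = [
            b0 + sum(b * v for b, v in zip(betas, x)) - y
            for x, y in zip(xs, ys)
        ]
        b0 -= lr * 2.0 * sum(errs) / n
        for j in range(k):
            grad_j = 2.0 * sum(e * x[j] for e, x in zip(errs, xs)) / n
            betas[j] -= lr * grad_j
    return b0, betas


def priority_score(b0: float, betas: Sequence[float],
                   features: Sequence[float]) -> float:
    """Uncorrected priority score of one task from its scaled features."""
    return b0 + sum(b * v for b, v in zip(betas, features))
```

In use, historical tasks would be scaled with `min_max_scale`, passed with their manually assigned priority labels to `fit_linear`, and new tasks scored with `priority_score`. Gradient descent is chosen here only because it needs no external library; a closed-form least-squares solve would serve the same purpose.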

Inventors

  • LUO JINHONG
  • Luo Qiaobo
  • LUO LUTANG

Assignees

  • 泽宇科技集团有限公司

Dates

Publication Date
20260512
Application Date
20260127

Claims (8)

  1. A big data platform service management system based on an AI algorithm, characterized by comprising:
     a data sensing module, configured to collect platform data in real time through distributed sensors and data acquisition interfaces, aggregate the data into a buffer pool, and monitor data quality at the same time;
     a task priority evaluation module, configured to extract key task features based on business rules and user demands, calculate a priority score using a machine learning algorithm, and refine it into a corrected priority score by applying correction parameters;
     a resource allocation module, configured to construct a resource allocation agent that uses a deep reinforcement learning algorithm to try different allocation actions according to the data sensing and task priority evaluation results, and to determine an optimal resource allocation scheme according to the reward fed back by the system; and
     a resource scheduling execution module, configured to allocate resources to tasks according to the optimal scheme from the resource allocation module, using a high-speed network and a distributed file system to secure data transmission and ensure smooth task execution.
  2. The AI algorithm-based big data platform service management system of claim 1, wherein the task priority evaluation module operates as follows:
     first, key features affecting task priority are extracted, including:
     real-time requirement $x_1$, quantified by the task's deadline and expected response time;
     data size $x_2$, the amount of data the task involves, measured in bytes and number of records;
     business importance $x_3$, the importance of the business the task serves, quantified by expert evaluation and business rule definitions;
     resource demand $x_4$, the computing, storage, and network resources required to complete the task, determined from the task's historical execution records and the resource demand statement in the task description;
     a multiple linear regression model is then used to calculate the priority score of the task, in the form:
     $P = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon$
     where $P$ is the priority score of the task; $\beta_0$ is the intercept term; $\beta_1, \beta_2, \beta_3, \beta_4$ are the regression coefficients of real-time requirement, data size, business importance, and resource demand respectively; and $\varepsilon$ is the error term;
     after the priority score $P$ of the task is determined, it is corrected by means of correction parameters.
  3. The AI algorithm-based big data platform service management system of claim 2, wherein the task priority evaluation module trains the model on historical task data when determining the regression coefficients, the training process being as follows:
     collecting information on different historical tasks in the big data platform, including each task's feature values and its priority label, the priority label being assigned manually according to business rules and historical experience;
     cleaning and normalizing the collected data to improve model training, specifically using min-max normalization to scale each feature value into the interval [0, 1];
     training the multiple linear regression model on the preprocessed data, solving for the regression coefficients by minimizing the sum of squared errors, specifically using gradient descent or a least-squares optimization algorithm.
  4. The AI algorithm-based big data platform service management system of claim 2, wherein the task priority evaluation module corrects the priority score by means of correction parameters in the following steps:
     the correction parameters include:
     task dependency: the degree to which the task depends on other tasks, expressed by a dependency coefficient with value range [range missing in source]; when the task has no predecessor dependency, [value missing in source];
     task urgency: how urgently the task needs to be processed at the current moment, quantified by manual evaluation or according to specific business rules, with value range [range missing in source], a larger value indicating greater urgency;
     resource availability: executing the task requires the corresponding computing, storage, and network resources, and when the required resources are not currently available the actual priority of the task needs to be adjusted; resource availability is measured as the ratio of currently available resources to the resources the task requires, with value range [range missing in source];
     historical execution record: the task's execution history reflects its stability and reliability; the historical execution record coefficient is calculated from the task's historical success rate and average execution time, with value range [range missing in source], a larger value indicating better historical execution;
     the four correction parameters are considered together to correct the priority score of the task, and the corrected priority score is calculated by the following formula: [formula missing in source];
     through this correction, the priority evaluation of the task becomes more comprehensive and accurate.
  5. The AI algorithm-based big data platform service management system of claim 1, wherein the resource allocation module operates as follows:
     determining the resource types and the task set in the big data platform, and randomly allocating resources to each task in combination with the tasks' priority scores, with tasks having higher priority scores tending to receive a larger share, until every task has been allocated each type of resource, thereby forming a random resource allocation scheme;
     analyzing the feedback reward of each scheme to obtain a reward evaluation value JPZ, which serves as the metric for comparing the feedback rewards of different schemes; sorting the schemes by their reward evaluation values JPZ and selecting the scheme with the largest JPZ as the current optimal scheme.
  6. The AI algorithm-based big data platform service management system of claim 5, wherein the resource allocation module obtains the reward evaluation value JPZ as follows:
     analyzing the feedback reward of each scheme, the feedback reward comprising:
     task completion time: calculating the average completion time of all tasks to obtain a time judgment value TP;
     resource utilization: measured as the average utilization of the various resources, denoted ZL;
     task priority satisfaction: calculating a reward from the tasks' priority scores and completion status, by multiplying each task's corrected priority score by its degree of completion and summing the products to obtain a completion value, denoted WCF;
     normalizing the time judgment value TP, the average utilization ZL, and the completion value WCF, and substituting them into the following formula: [formula missing in source], to obtain the reward evaluation value JPZ, where [symbols missing in source] are the preset weight coefficients of the time judgment value TP, the average utilization ZL, and the completion value WCF respectively.
  7. The AI algorithm-based big data platform service management system of claim 1, wherein the resource scheduling execution module operates as follows:
     after receiving the optimal resource allocation scheme determined by the resource allocation module, parsing the number of CPU cores, memory size, and storage capacity to be allocated to each task in the scheme, and converting the textual information in the scheme into a data structure the system can recognize;
     before allocating resources, determining the current availability of each type of resource in the system by obtaining the CPU utilization, remaining memory, and free storage space of each resource node; evaluating the feasibility of the optimal resource allocation scheme against the queried resource state by checking whether the current system has enough of the resources each task requires; and triggering a resource coordination mechanism when the resources required by a task are found to be insufficient;
     interacting with the operating system's task scheduler to allocate the corresponding number of CPU cores to each task according to the allocation scheme, and binding each task to its designated CPU cores by modifying its CPU affinity settings so that it runs on the assigned cores; reserving memory of the specified size for each task in system memory, with contiguous or non-contiguous memory blocks allocated through the operating system's memory management mechanism; allocating the corresponding storage capacity to each task according to the allocation scheme, and creating or mounting a dedicated storage volume for each task through the distributed file system so that each task reads and writes its allocated storage space independently;
     establishing a stable data transmission channel between resource nodes over a high-speed network interface, initializing and configuring the network connection through the network protocol stack, scheduling data transmission according to task demands and data importance, monitoring the transmission state in real time to detect transmission errors or packet loss, and starting a retransmission mechanism immediately when an error is found, so as to ensure data integrity and accuracy; and
     sending a start instruction to the task execution engine, passing it the resource information and data paths required by the task.
  8. The AI algorithm-based big data platform service management system of claim 7, wherein the resource coordination mechanism in the resource scheduling execution module comprises:
     first, determining the difference between the resources required by the task and the resources currently available in the system; setting a priority score threshold based on the corrected priority scores produced by the task priority evaluation module, and classifying a task as high-priority if its corrected priority score is greater than the threshold and as low-priority otherwise, with low-priority tasks serving as candidates for releasing resources;
     for each candidate, calculating the amount of releasable resources from the task's current resource usage and the minimum amount of resources needed to keep the task running; and
     when the resources released by low-priority tasks cannot meet the demand, requesting additional resources from a resource management center or an external resource provider.
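
The scheme generation and selection steps of claims 5 and 6 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the JPZ formula is not reproduced in the source text, so the sketch assumes a weighted sum in which the normalized time value TP counts negatively (shorter is better) and ZL and WCF count positively. The weights, the perturbation range, and all function names are hypothetical. The sketch covers only random sampling and selection, not the deep reinforcement learning agent named in claim 1.

```python
import random
from typing import Dict, List, Sequence


def random_scheme(priorities: Dict[str, float], capacity: float,
                  rng: random.Random) -> Dict[str, float]:
    """Randomly split one resource among tasks, biased toward high priority.

    Assumes every priority score is positive.
    """
    # Each task's weight is its priority score times a random factor,
    # so higher-priority tasks tend to receive a larger share.
    weights = {t: p * rng.uniform(0.5, 1.5) for t, p in priorities.items()}
    total = sum(weights.values())
    return {t: capacity * w / total for t, w in weights.items()}


def normalize(values: Sequence[float]) -> List[float]:
    """Min-max normalize the values observed across candidate schemes."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def reward_values(tp: Sequence[float], zl: Sequence[float],
                  wcf: Sequence[float],
                  a1: float = 0.4, a2: float = 0.3,
                  a3: float = 0.3) -> List[float]:
    """Assumed JPZ per scheme: -a1*TP + a2*ZL + a3*WCF on normalized inputs."""
    return [
        -a1 * t + a2 * z + a3 * w
        for t, z, w in zip(normalize(tp), normalize(zl), normalize(wcf))
    ]


def best_scheme(jpz: Sequence[float]) -> int:
    """Index of the scheme with the largest reward evaluation value."""
    return max(range(len(jpz)), key=lambda i: jpz[i])
```

Each candidate scheme would be produced by `random_scheme`, measured to obtain its TP, ZL, and WCF, and the winner picked with `best_scheme(reward_values(...))`.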
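
The resource coordination mechanism of claim 8 can likewise be sketched for a single resource type. The `RunningTask` structure and the `coordinate` function are hypothetical names. Reclaiming from the lowest-priority candidate first is an assumption of this sketch, since the claim does not specify an order.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RunningTask:
    name: str
    corrected_priority: float  # corrected priority score of the task
    in_use: float              # resource units the task currently holds
    min_needed: float          # minimum units needed to keep it running


def coordinate(deficit: float, tasks: List[RunningTask],
               threshold: float) -> Tuple[Dict[str, float], float]:
    """Return (units reclaimed per task, units to request externally)."""
    # Tasks whose score does not exceed the threshold are low-priority
    # and become candidates for releasing resources.
    candidates = sorted(
        (t for t in tasks if t.corrected_priority <= threshold),
        key=lambda t: t.corrected_priority,  # assumed order: lowest first
    )
    reclaimed: Dict[str, float] = {}
    remaining = deficit
    for task in candidates:
        if remaining <= 0:
            break
        # Releasable amount = current usage minus the minimum to keep running.
        releasable = max(task.in_use - task.min_needed, 0.0)
        take = min(releasable, remaining)
        if take > 0:
            reclaimed[task.name] = take
            remaining -= take
    # Whatever is still missing must come from a resource management
    # center or an external provider.
    return reclaimed, max(remaining, 0.0)
```

A non-zero second return value corresponds to the final step of the claim, where the released resources are insufficient and additional resources have to be requested.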

Description

Big data platform service management system based on AI algorithm

Technical Field

The invention relates to the technical field of big data platform service management, in particular to a big data platform service management system based on an AI algorithm.

Background

With the rapid development of information technology, big data platforms are widely used in various fields, and their data processing capacity and service efficiency are critical to the operation and decision-making of enterprises. However, conventional data platform service management faces a number of challenges. On one hand, data volume is growing explosively and data types are complex and varied, which greatly increases the difficulty of data acquisition, storage, and processing. On the other hand, the resource demands of different tasks vary greatly, and task priority evaluation lacks accuracy and dynamic adaptability, making it difficult to meet multi-dimensional business demands such as real-time performance and business importance. In addition, unreasonable resource allocation often leads to low utilization of system resources and low task execution efficiency, and even to task backlog or timeouts, seriously affecting the overall performance and service quality of the big data platform.

Existing big data platform service management systems mostly adopt a static resource allocation strategy and simple priority ordering rules. For example, some systems allocate resources only according to task submission time or data volume, ignoring the tasks' real-time requirements, business importance, and the diversity of their resource demands. Such a one-dimensional allocation approach cannot cope effectively with complex and changing business scenarios, and dynamic optimal allocation of resources is difficult to achieve.
Meanwhile, in task priority evaluation, the prior art lacks comprehensive consideration of key factors such as task dependencies, urgency, resource availability, and historical execution records, so the priority evaluation result is inaccurate and cannot provide a reliable basis for resource allocation. In addition, the lack of an effective monitoring and error correction mechanism during data transmission easily causes data loss or errors, which in turn affects task execution.

In order to solve the above-mentioned drawbacks, a technical solution is now provided.

Disclosure of Invention

The invention aims to solve the problems of inaccurate task priority evaluation and unreasonable resource allocation in conventional big data platform service management, and provides a big data platform service management system based on an AI algorithm.

The aim of the invention can be achieved by the following technical scheme:

A big data platform service management system based on an AI algorithm includes:

a data sensing module, configured to collect platform data in real time through distributed sensors and data acquisition interfaces, aggregate the data into a buffer pool, and monitor data quality at the same time;

a task priority evaluation module, configured to extract key task features based on business rules and user demands, calculate a priority score using a machine learning algorithm, and refine it into a corrected priority score by applying correction parameters;

a resource allocation module, configured to construct a resource allocation agent that uses a deep reinforcement learning algorithm to try different allocation actions according to the data sensing and task priority evaluation results, and to determine an optimal resource allocation scheme according to the reward fed back by the system; and

a resource scheduling execution module, configured to allocate resources to tasks according to the optimal scheme from the resource allocation module, using a high-speed network and a distributed file system to secure data transmission and ensure smooth task execution.

Further, the task priority evaluation module operates as follows. First, key features affecting task priority are extracted, including:

Real-time requirement: quantified by the task's deadline and expected response time;

Data size: the amount of data the task involves, measured in bytes and number of records;

Business importance: the importance of the business the task serves, quantified by expert evaluation and business rule definitions;

Resource demand: the computing, storage, and network resources required to complete the task, determined from the task's historical execution records and the resource demand statement information in ta