CN-121542135-B - Intelligent data lake server management method and system based on IPMI protocol

Abstract

The invention discloses an intelligent data lake server management method and system based on the IPMI protocol, belonging to the technical field of intelligent server management. In the method, raw data from server sensors, power consumption meters and event logs is continuously collected through the BMC hardware driver. After hierarchical cache storage and screening for significant changes, the data undergoes aggregation calculation, trend analysis and lightweight machine learning inference. Management strategies configured by the user are compiled and stored, an association mapping between the data and the strategies is established, and the processing results are pushed to a visual interactive interface while automatic execution of the strategies is triggered. The system relies on the IPMI protocol for cross-platform hardware interaction, gives rapid feedback on server state through real-time data pushing and visual display, reduces interference from invalid data, makes management more intelligent and efficient, lowers the cost of manual intervention, and supports stable server operation across the full workflow.
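
The hierarchical cache mentioned in the abstract can be pictured with a minimal Python sketch. It is not taken from the patent: the class name `TieredCache`, the block size, the capacities and the use of an arithmetic mean as the "preliminary aggregation" are all assumptions made for illustration.

```python
from collections import deque
from statistics import fmean


class TieredCache:
    """Hypothetical two-level cache: raw readings in L1, block averages in L2."""

    def __init__(self, block_size=4, l1_capacity=64, l2_capacity=1024):
        self.block_size = block_size            # assumed: raw samples per L2 entry
        self.l1 = deque(maxlen=l1_capacity)     # recent raw readings
        self.l2 = deque(maxlen=l2_capacity)     # preliminarily aggregated history
        self._pending = []                      # raw readings not yet rolled up

    def store(self, value):
        """Store a raw reading in L1 and roll each full block up into L2."""
        self.l1.append(value)
        self._pending.append(value)
        if len(self._pending) == self.block_size:
            # Assumed aggregation: the arithmetic mean of the block.
            self.l2.append(fmean(self._pending))
            self._pending.clear()
```

Keeping L1 as a bounded window of raw readings, rather than emptying it on every roll-up, is a design choice of this sketch: it leaves recent history available for the change comparison the method performs against L1.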

Inventors

  • ZHAO XUCHENG

Assignees

  • 四川华鲲振宇智能科技有限责任公司

Dates

Publication Date
20260508
Application Date
20260120

Claims (7)

  1. An intelligent data lake server management method based on the IPMI protocol, characterized by comprising the following steps:
     S1, continuously collecting raw data from sensors, a power consumption meter and an event log, and storing the data in a hierarchical cache structure;
     S2, processing the data in the cache, screening out significant-change data, and performing aggregation calculation, trend analysis and event association processing on the significant-change data to generate corresponding analysis data;
     S3, generating an aggregation result, trend data, a root cause analysis report and an optional prediction result based on the analysis data of step S2, and compiling and storing a management strategy configured by a user;
     S4, pushing the aggregation result, trend data, root cause analysis report and optional prediction result generated in step S3 to an interactive interface, the interactive interface parsing the results, and the management strategy monitoring the data in real time and triggering execution;
     wherein step S1 comprises the sub-steps of:
     S1.1, polling the sensors, power consumption meter and log module of the server hardware at a preset frequency, and collecting raw data;
     S1.2, first storing the collected raw data in an L1 cache and, after preliminary aggregation of the data in the L1 cache, storing the data in an L2 cache to form a hierarchical cache storage structure;
     wherein step S2 comprises the sub-steps of:
     S2.1, comparing newly collected data with historical data in the L1 cache, and screening out significant-change data whose difference exceeds a preset dead zone threshold;
     S2.2, performing rolling calculation on the significant-change data to generate aggregate data comprising the average, maximum, minimum and standard deviation;
     S2.3, generating trend lines and trend-related data through a fitting algorithm based on the L2 cache data;
     S2.4, retrieving data related to the period in which an event occurred, performing association analysis and forming a root cause analysis report;
     wherein step S2 further comprises starting the integrated lightweight machine learning processing, performing model training using historical data, performing inference on the real-time processed data with the trained model, marking abnormal data points and outputting an anomaly analysis result.
  2. The method according to claim 1, characterized in that step S3 comprises the sub-steps of: S3.1, receiving management strategy configuration information and compiling it according to an SQL-like rule format; S3.2, storing the compiled management strategy and forming an association mapping with the analysis data of step S2.
  3. The method according to claim 1, wherein in step S3 a graphical drag-and-drop configuration interface is provided for the user to configure a management strategy, the information configured by the user is compiled into a strategy definition file in a preset format, and the strategy definition file is issued through an API, received and stored.
  4. The method according to claim 1, wherein in step S4, after receiving the results generated in step S3, the interactive interface parses them, and the global health situation awareness area of the cockpit layout converts the aggregation result into comprehensive health status information, real-time power consumption data and a key event list and displays them.
  5. The method according to claim 1, wherein in step S4 the prediction result and the root cause analysis report of step S3 are extracted and parsed by the predictive insight area of the card layout to generate early-warning prompts and handling suggestions, which are displayed to the user.
  6. The method according to claim 1, wherein in step S4 the sensor-related analysis data of step S3 is acquired and mapped onto the physical layout topology of the server, the status of each component is marked by color coding, and detailed data and a historical trend graph of the corresponding component are displayed upon mouse interaction.
  7. An intelligent data lake server management system based on the IPMI protocol, characterized by comprising a back-end Intelligent Data Lake Engine (IDLE) and a front-end insight-driven interactive interface (GUI);
     the Intelligent Data Lake Engine (IDLE) comprises a data acquisition and cache layer, a data processing and intelligent analysis layer, an event and strategy engine and a communication gateway, connected in sequence;
     the data acquisition and cache layer comprises a device driver and a hierarchical cache structure and is used for collecting raw data and storing it hierarchically, wherein the collected raw data is first stored in an L1 cache and, after preliminary aggregation of the data in the L1 cache, the data is stored in an L2 cache to form a hierarchical cache storage structure;
     the data processing and intelligent analysis layer comprises a change detector, an aggregator, a trend calculator and an optional lightweight machine learning model, and is used for screening significant-change data and performing aggregation calculation, trend analysis, event association and anomaly inference; wherein newly collected data is compared with the historical data in the L1 cache and significant-change data whose difference exceeds a preset dead zone threshold is screened out, the change detection mechanism comparing the raw data newly collected and stored in the L1 cache with the historical data already stored in the cache item by item and calculating the difference between the two; rolling calculation is performed on the significant-change data to generate aggregate data comprising the average, maximum, minimum and standard deviation; and trend lines and trend-related data are generated through a fitting algorithm based on the L2 cache data;
     the event and strategy engine comprises a rule base and an interpreter, and is used for storing and executing the compiled management strategy;
     the communication gateway comprises a RESTful API server and a WebSocket server, and is used for establishing a connection with the front end and transmitting data;
     the front-end insight-driven interactive interface (GUI) comprises a global health situation awareness area, predictive insight and early-warning cards, an intelligent sensor fusion view and a one-click operation and strategy panel, and is used for parsing and processing back-end data and providing an entry point for strategy configuration.
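
Sub-steps S2.1 to S2.3 of claim 1 (dead zone screening, rolling aggregation and trend fitting) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The function names are hypothetical, each new reading is compared against the last value that passed the filter, the first reading always passes, and the "fitting algorithm" is taken to be an ordinary least-squares line over the sample index. The claims do not fix any of these details.

```python
from statistics import fmean, pstdev


def significant_changes(samples, dead_zone):
    """Keep readings that differ from the last kept reading by more than dead_zone."""
    kept = []
    last = None
    for value in samples:
        # Assumption: the first reading always passes the filter.
        if last is None or abs(value - last) > dead_zone:
            kept.append(value)
            last = value
    return kept


def rolling_stats(values, window):
    """Average, maximum, minimum and population standard deviation of the trailing window."""
    recent = values[-window:]
    return {
        "avg": fmean(recent),
        "max": max(recent),
        "min": min(recent),
        "std": pstdev(recent),
    }


def linear_trend(values):
    """Least-squares (slope, intercept) of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend line")
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean
```

In the claimed method the trend fit runs on L2 cache data, so `linear_trend` would be called on the aggregated series rather than on raw readings. The filter and `rolling_stats` correspond to the change detector and the aggregator named in claim 7.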

Description

Intelligent data lake server management method and system based on IPMI protocol

Technical Field

The invention relates to the technical field of intelligent server management, and in particular to an intelligent data lake server management method and system based on the IPMI protocol.

Background

As digital transformation advances, servers are the core hardware of information systems. Their deployment scale in data centers, cloud computing platforms and business scenarios keeps expanding, and their stable, efficient operation is crucial to the business as a whole. Server management is currently moving toward intelligence and automation. As a standardized hardware management interface, the IPMI protocol is widely used for server hardware state monitoring, remote operation and event recording, and it enables hardware management across platforms and operating systems through independent interaction between the BMC and the server hardware. At the same time, data lake technology is increasingly integrated with server management: the sensor data, power consumption data and event logs produced by large numbers of running servers are stored and given preliminary analysis with big data processing techniques, AI and machine learning are being incorporated into fault prediction and resource scheduling, and visual interactive interfaces and real-time monitoring have become common features of server management systems. Together these move management from traditional manual operation toward intelligent operation.

Despite this continued development, existing solutions still have many problems. Systematic hierarchical caching and screening mechanisms are lacking after data acquisition, so large amounts of invalid data without significant change pass directly into subsequent processing, occupying data processing resources and limiting overall throughput. Data processing is mostly limited to simple single-dimension calculation and lacks coordination among aggregation calculation, trend analysis and event association analysis, which makes it hard to accurately extract valuable state information and change patterns from massive data. The linkage between data processing results and management strategies is insufficient, and a mechanism for real-time monitoring and automatically triggered execution is lacking. Results are mostly displayed as simple data lists, so intuitive presentation and rapid feedback cannot be achieved through an efficient mode of interaction. In addition, some solutions do not effectively integrate lightweight machine learning capability, so the accuracy and timeliness of anomaly recognition are insufficient. These problems make it hard to meet the practical need for intelligent, efficient server management and degrade its overall quality and response speed.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an intelligent data lake server management method and system based on the IPMI protocol.
The aim of the invention is achieved by the following technical scheme: an intelligent data lake server management method based on the IPMI protocol, comprising the following steps:
S1, continuously collecting raw data from sensors, a power consumption meter and an event log, and storing the data in a hierarchical cache structure;
S2, processing the data in the cache, screening out significant-change data, and performing aggregation calculation, trend analysis and event association processing on the significant-change data to generate corresponding analysis data;
S3, generating an aggregation result, trend data, a root cause analysis report and an optional prediction result based on the analysis data of step S2, and compiling and storing a management strategy configured by a user;
S4, pushing the aggregation result, trend data, root cause analysis report and optional prediction result generated in step S3 to an interactive interface, the interactive interface parsing the results, and the management strategy monitoring the data in real time and triggering execution.
Further, step S1 comprises the sub-steps of:
S1.1, polling the sensors, power consumption meter and log module of the server hardware at a preset frequency, and collecting raw data;
S1.2, first storing the collected raw data in an L1 cache and, after preliminary aggregation of the data in the L1 cache, storing the data in an L2 cache to form a hierarchical cache storage structure.
Further, step S2 comprises the sub-steps of:
S2.1, comparing the newly acquired