KR-20260064863-A - METHOD AND SYSTEM MANAGING SERVERS
Abstract
A computer program stored on a computer-readable storage medium according to one embodiment of the present disclosure performs a server management method. The method may include: a step of collecting server status data; a step of obtaining failure prediction data of the server by inputting the collected server status data into a pre-generated analysis model; and a step of determining to perform a preliminary measure for the server based on the failure prediction data.
Inventors
- 김용철
Assignees
- (주) 케이티클라우드
Dates
- Publication Date: 2026-05-08
- Application Date: 2024-10-29
Claims (15)
- A computer program stored on a computer-readable storage medium, wherein the computer program performs a server management method, the method comprising: a step of collecting server status data; a step of obtaining failure prediction data of the server by inputting the collected server status data into a pre-generated analysis model; and a step of determining to perform a preliminary measure for the server based on the failure prediction data.
- The computer program of claim 1, wherein the server status data includes at least a portion of server sensing data and hypervisor OS log data.
- The computer program of claim 1, wherein the server status data is collected periodically at a preset time interval that takes into account the time required to extract the server status data from the server.
- The computer program of claim 1, wherein the server status data includes data regarding whether at least one of the resources of the server is normal.
- The computer program of claim 1, wherein the analysis model is a model pre-generated, based on big data constructed from accumulated server status data, to infer the failure prediction data from the collected server status data.
- The computer program of claim 5, wherein the big data is a labeled dataset including the accumulated server status data and accumulated server failure occurrence data corresponding to the accumulated server status data.
- The computer program of claim 5, wherein the analysis model is an artificial intelligence-based model pre-trained, using training data that is at least a part of the big data, to infer the failure prediction data from the collected server status data.
- The computer program of claim 1, wherein the failure prediction data includes at least one of: whether a failure of the server is predicted, the type of the predicted failure, the timing of the predicted failure, the impact if the predicted failure occurs, and recommended preliminary measures for the predicted failure.
- The computer program of claim 8, wherein the preliminary measures comprise at least one of migration of at least some of the virtual machines (VMs) of customers associated with the server and replacement of devices associated with the server.
- The computer program of claim 9, wherein the virtual machines targeted for the migration are selected from the virtual machines associated with the server based on at least one of the impact if the predicted failure occurs and the importance of the customer.
- The computer program of claim 1, wherein the step of determining to perform the preliminary measure comprises: a step of providing the failure prediction data to an integrated control system; and a step of receiving, from an operator of the integrated control system, a response for performing the preliminary measure.
- The computer program of claim 11, wherein the step of providing to the integrated control system comprises a step of visualizing the failure prediction data in a user-friendly web-based interface and providing it to the integrated control system.
- The computer program of claim 1, wherein the step of determining to perform the preliminary measure comprises a step of controlling at least one of the recommended preliminary measures for the predicted failure to be performed automatically.
- A server management method comprising: a step of collecting server status data; a step of obtaining failure prediction data of the server by inputting the collected server status data into a pre-generated analysis model; and a step of determining to perform a preliminary measure for the server based on the failure prediction data.
- A server management system comprising: a data collection unit that collects server status data; a data analysis unit that obtains failure prediction data of the server by inputting the collected server status data into a pre-generated analysis model; and a data output unit that determines to perform a preliminary measure for the server based on the failure prediction data.
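As an illustration only, the flow recited in the claims above (collect status data, obtain failure prediction data from an analysis model, determine a preliminary measure, and select virtual machines for migration) might be sketched in Python as follows. Every name, field, and threshold here is a hypothetical assumption made for the example: the claims do not specify a data schema, a scoring scale, or a selection rule, and the simple threshold function stands in for the pre-generated analysis model rather than reproducing it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FailurePrediction:
    """Failure prediction data; field names are illustrative assumptions."""
    failure_predicted: bool
    failure_type: Optional[str] = None
    impact: float = 0.0  # hypothetical severity score in the range 0..1
    recommended_measure: Optional[str] = None


@dataclass
class VirtualMachine:
    name: str
    customer_importance: int  # hypothetical ranking; higher means more important


def collection_interval(extraction_seconds: float, base_interval: float = 60.0) -> float:
    """Return a polling interval no shorter than the time needed to extract the data."""
    return max(base_interval, extraction_seconds)


def select_vms_for_migration(
    vms: List[VirtualMachine], prediction: FailurePrediction, max_count: int
) -> List[VirtualMachine]:
    """Choose migration targets, most important customers first."""
    if not prediction.failure_predicted:
        return []
    ranked = sorted(vms, key=lambda vm: vm.customer_importance, reverse=True)
    # Assumed rule: a high-impact prediction migrates every VM on the server.
    if prediction.impact >= 0.8:
        return ranked
    return ranked[:max_count]


def manage_server(
    collect: Callable[[], Dict[str, float]],
    model: Callable[[Dict[str, float]], FailurePrediction],
    vms: List[VirtualMachine],
    max_count: int = 1,
) -> dict:
    status = collect()          # step 1: collect server status data
    prediction = model(status)  # step 2: obtain failure prediction data
    # Step 3: determine whether to perform a preliminary measure.
    if not prediction.failure_predicted:
        return {"measure": None, "migrate": []}
    targets = select_vms_for_migration(vms, prediction, max_count)
    return {
        "measure": prediction.recommended_measure,
        "migrate": [vm.name for vm in targets],
    }


def threshold_model(status: Dict[str, float]) -> FailurePrediction:
    """Stand-in for the analysis model: a fixed threshold, not a trained model."""
    if status.get("disk_reallocated_sectors", 0) > 100:
        return FailurePrediction(True, "storage", 0.5, "migrate_vms")
    return FailurePrediction(False)
```

In this sketch a caller supplies its own collector and model, for example `manage_server(lambda: {"disk_reallocated_sectors": 250}, threshold_model, vms, max_count=2)`. Keeping the model behind a plain callable lets the threshold rule be swapped for a trained model without changing the decision step.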
Description
Method and System Managing Servers
The present disclosure relates to a method and system for managing one or more servers, and specifically to a method and system for predicting in advance the occurrence of a failure in a server constituting a cloud system.
With the technological advancement of computer networks, the traditional computing environment, which relied on the independent hardware performance of each terminal, has evolved into cloud computing, which utilizes all computing resources on the network to provide convenient and easy access to services in response to user terminal requests. Cloud computing refers to the provision of computing services, such as servers, storage, software, and analytics, over the "cloud," that is, the Internet. The cloud computing market is expanding rapidly because infrastructure such as storage and servers, previously used in physical form, can be allocated quickly within a virtual environment, making it easy and inexpensive to build a desired system.
Virtualization technology can be cited as a core foundational technology of cloud computing. Widely used open-source server virtualization technologies include virtual machine (VM) or hypervisor-based systems such as Xen, KVM, and VirtualBox. Virtual machine-based server virtualization involves installing an operating system (hereinafter Host OS) on a physical server, creating virtual machines by partitioning resources based on a hypervisor, installing a guest operating system (hereinafter Guest OS) on top of the Host OS, and running desired applications.
Meanwhile, when a failure occurs in the hardware and systems constituting the aforementioned cloud computing (for example, when a system failure such as the physical shutdown of virtual servers or poor data reception occurs due to a problem with the hardware resources constituting cloud computing, namely CPU, RAM, storage, and network), the administrator generally assesses the situation only after the failure has occurred and then takes measures for the affected server. However, under such manual, reactive failure response procedures, administrators must personally access the management server, which causes the inconvenience of having to come to work on holidays or at night to check for failures and consequently makes a prompt response impossible. Furthermore, failure occurrences are unpredictable, and it is difficult to predict and prepare in advance for new types of failures that have not occurred before.
Accordingly, there is a need for technology that can predict server failures in a cloud system in advance and rapidly perform appropriate measures before a failure occurs.
FIG. 1 is a drawing illustrating an example of a server management system according to some embodiments of the present invention.
FIG. 2 is a drawing illustrating an example of a server according to some embodiments of the present invention.
FIG. 3 is a diagram showing an artificial neural network according to some embodiments of the present invention.
FIG. 4 is a diagram illustrating an example of data output of a server management system according to some embodiments of the present invention.
FIG. 5 is a flowchart of a server management method according to some embodiments of the present invention.
FIG. 6 is a block diagram showing a computing device providing a server management method according to some embodiments of the present invention.
Various embodiments are now described with reference to the drawings. In this specification, various descriptions are given to aid understanding of the present disclosure. However, it is evident that these embodiments can be practiced without such specific descriptions. As used herein, terms such as “component,” “module,” and “system” refer to computer-related entities, hardware, firmware, software, combinations of software and hardware, or software in execution. For example, a component may be, but is not limited to, a process executed on a processor, a processor, an object, an execution thread, a program, and/or a computer. For example, both an application executed on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or an execution thread. A component may be localized within a single computer or distributed among two or more computers. Additionally, these components may be executed from various computer-readable media having various data structures stored therein. Components may communicate through local and/or remote processes, for example, according to signals having one or more data packets (e.g., data from one component interacting with another component in a local system or distributed system, and/or data transmitted through signals to other systems over networks such as the Internet). Furthermore,