JP-2026075748-A - Monitoring systems, monitoring methods, and programs

Abstract

[Challenge] To detect system anomalies more effectively. [Solution] The monitoring system obtains a plurality of first values, each corresponding to one of a plurality of target time points included in a monitoring period and each representing a representative value of the replication delay from the primary database to the secondary database during a first period that includes the corresponding target time point. It also obtains a plurality of second values, each corresponding to one of the target time points and each representing a representative value of the replication delay during a second period that includes the corresponding target time point and is longer than the first period. If the number of target time points at which the corresponding first value is greater than the corresponding second value satisfies the anomaly detection condition, the system outputs an alert regarding replication. [Selection Diagram] Figure 6

Inventors

  • ウマパシィ サラヴァナ クマール

Assignees

  • ラクテン アジア プライベート リミテッド (Rakuten Asia Private Limited)

Dates

Publication Date
2026-05-11
Application Date
2024-10-23

Claims (9)

  1. A monitoring system that: obtains a plurality of first values, each corresponding to one of a plurality of target time points included in a monitoring period and each representing a representative value of the replication delay from a primary database to a secondary database during a first period that includes the corresponding target time point; obtains a plurality of second values, each corresponding to one of the plurality of target time points and each representing a representative value of the replication delay during a second period that includes the corresponding target time point and is longer than the first period; and outputs an alert regarding replication if the number of target time points, among the plurality of target time points, at which the corresponding first value is greater than the corresponding second value satisfies an anomaly detection condition.
  2. The monitoring system according to claim 1, wherein each of the plurality of first values represents a moving average of the replication delay during the first period that includes the corresponding target time point, and each of the plurality of second values represents a moving average of the replication delay during the second period that includes the corresponding target time point.
  3. The monitoring system according to claim 1, wherein the replication delay is the time from when information is written to the primary database until that information is written to the secondary database.
  4. The monitoring system according to claim 1, wherein each of the plurality of target time points is closer to the end than to the start of the corresponding second period.
  5. The monitoring system according to claim 1, wherein the alert regarding replication is output if the number counted among the target time points is greater than a threshold corresponding to the number of target time points.
  6. The monitoring system according to claim 5, wherein the threshold is determined by multiplying the number of target time points by a predetermined ratio.
  7. The monitoring system according to claim 1, wherein, when the anomaly detection condition is satisfied, the alert regarding replication is sent to an administrator.
  8. A monitoring method comprising: a step of obtaining a plurality of first values, each corresponding to one of a plurality of target time points included in a monitoring period and each representing a representative value of the replication delay from a primary database to a secondary database during a first period that includes the corresponding target time point; a step of obtaining a plurality of second values, each corresponding to one of the plurality of target time points and each representing a representative value of the replication delay during a second period that includes the corresponding target time point and is longer than the first period; and a step of outputting an alert regarding replication if the number of target time points, among the plurality of target time points, at which the corresponding first value is greater than the corresponding second value satisfies an anomaly detection condition.
  9. A program that causes a computer to perform a process of: obtaining a plurality of first values, each corresponding to one of a plurality of target time points included in a monitoring period and each representing a representative value of the replication delay from a primary database to a secondary database during a first period that includes the corresponding target time point; obtaining a plurality of second values, each corresponding to one of the plurality of target time points and each representing a representative value of the replication delay during a second period that includes the corresponding target time point and is longer than the first period; and outputting an alert regarding replication if the number of target time points, among the plurality of target time points, at which the corresponding first value is greater than the corresponding second value satisfies an anomaly detection condition.
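
The detection logic recited in the claims can be illustrated with a minimal Python sketch. It is not taken from the patent text: the function names, the use of trailing (backward-looking) windows, the choice of the most recent samples as the target time points, and the sample data are all assumptions made for illustration. A trailing window places each target time point at the end of its period, which is one way to satisfy claim 4, and the threshold follows claims 5 and 6 (number of target time points multiplied by a ratio).

```python
from typing import Sequence


def trailing_mean(samples: Sequence[float], end: int, window: int) -> float:
    """Mean of the (up to) `window` samples ending at index `end`, inclusive.

    Near the start of the data the window is shortened to what is available.
    """
    start = max(0, end - window + 1)
    chunk = samples[start:end + 1]
    return sum(chunk) / len(chunk)


def count_exceedances(delays: Sequence[float], short_window: int,
                      long_window: int, num_targets: int) -> int:
    """Count target time points where the short-period moving average
    (first value) is greater than the long-period one (second value).

    Assumption: the target time points are the last `num_targets` samples.
    """
    if not 0 < short_window < long_window:
        raise ValueError("short_window must be positive and shorter than long_window")
    if not 0 < num_targets <= len(delays):
        raise ValueError("num_targets must be between 1 and len(delays)")
    first_target = len(delays) - num_targets
    return sum(
        1
        for t in range(first_target, len(delays))
        if trailing_mean(delays, t, short_window) > trailing_mean(delays, t, long_window)
    )


def should_alert(delays: Sequence[float], short_window: int, long_window: int,
                 num_targets: int, ratio: float) -> bool:
    """Anomaly detection condition: count > num_targets * ratio."""
    threshold = num_targets * ratio
    return count_exceedances(delays, short_window, long_window, num_targets) > threshold


# Hypothetical replication-delay samples (seconds), one per sampling interval.
steady = [1.0] * 20                      # delay is flat
rising = [float(i) for i in range(20)]   # delay keeps growing

print(should_alert(steady, short_window=3, long_window=10, num_targets=10, ratio=0.5))
print(should_alert(rising, short_window=3, long_window=10, num_targets=10, ratio=0.5))
```

With a flat delay the two averages are equal and no target time point is counted, whereas a steadily growing delay keeps the short-period average above the long-period average at every target time point, so only the second call reports an alert. Comparing the two averages rather than using a fixed delay limit means the check responds to a sustained upward trend relative to the system's own recent baseline.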

Description

This invention relates to a monitoring system, a monitoring method, and a program.
Some systems synchronize data between multiple databases (a process also known as replication) and use the synchronized databases. When replication is performed from a primary database to a secondary database, the writing of data is delayed by the replication. If this delay becomes significant, it may cause problems with service provision. Furthermore, database synchronization can affect data consistency, for example when data is read before it has been synchronized. Technologies exist for monitoring the operational status of systems so that such problems can be addressed promptly.
The drawings show the following:
  • A diagram showing elements related to an information processing system according to an embodiment of the present invention.
  • A diagram illustrating database replication.
  • A block diagram showing the functions implemented by the information processing system.
  • A flowchart illustrating an example of a process for collecting monitoring data.
  • A diagram showing an example of data stored in a metrics database.
  • A flowchart illustrating an example of a process for detecting anomalies.
  • A diagram showing an example of how the moving average of the delay changes over time.
The embodiments of the present invention will be described below with reference to the drawings. Redundant descriptions of components denoted by the same reference numerals are omitted.
Figure 1 shows elements related to an information processing system according to an embodiment of the present invention. The information processing system includes a primary database server 1, a secondary database server 2, one or more monitoring servers 3, and one or more application servers 4. The primary database server 1, secondary database server 2, monitoring server 3, and application servers 4 are so-called server computers. They communicate with each other via a network.
Primary database server 1 and secondary database server 2 provide database services for storing various types of data. Hereafter, when they need not be distinguished, they are simply referred to as "database servers." Replication is performed between primary database server 1 and secondary database server 2. This synchronizes the data on secondary database server 2 with that of primary database server 1. In the example in Figure 1, primary database server 1 accepts both writes and reads, while secondary database server 2 accepts only reads. The information processing system may include multiple primary database servers 1 and multiple secondary database servers 2 that cooperate to provide database services.
The monitoring server 3 includes one or more processors 31, one or more storage devices 32, and one or more communication units 33. The primary database server 1, secondary database server 2, and application server 4 also include one or more processors 31, one or more storage devices 32, and one or more communication units 33. These servers may be implemented on one or more virtual servers or container infrastructures.
The processor 31 operates according to the program (also called instruction code) stored in the storage device 32. The processor 31 also controls the communication unit 33. The processor 31 may include, for example, a CPU (Central Processing Unit), and may further include a GPU (Graphics Processing Unit) and an NPU (Neural Processing Unit). The program may be provided via the internet or other means, or it may be provided stored on a computer-readable storage medium such as flash memory or DVD-ROM.
The storage device 32 is composed of memory elements such as RAM and flash memory, and an external storage device such as a hard disk drive (HDD) or solid-state drive (SSD). The storage device 32 stores the program mentioned above. It also stores information and calculation results input from the processor 31 and the communication unit 33.
The communication unit 33 is a communication interface, such as a network interface card, that communicates with other devices. The communication unit 33 includes, for example, integrated circuits, antennas, and communication terminals that implement wireless LAN or wired LAN. Under the control of the processor 31, the communication unit 33 passes information received from other devices via the network to the processor 31 and the storage device 32, and transmits information to other devices.
Note that the hardware configuration of monitoring server 3 and the other servers is not limited to the example above. For example, monitoring server 3 may include devices for reading computer-readable information storage media (e.g., optical disc drives or memory card slots) and devices for data input/output with external devices (e.g., USB ports). The external devices may be input devices or output devices.
The replication process between primary database server 1 and secondary database server 2 is described further below. Figure 2 illustrates database replication. Figure 2 illustra