US-12626579-B2 - Remotely identifying potential system instability based on jitter in scheduled interactions

Abstract

Techniques for remotely identifying potential system instability based on jitter in scheduled interactions are described. One example method includes identifying a scheduled interaction between a computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining that the jitter metric exceeds a pre-configured jitter threshold; and generating an alert indicating that a potential system instability condition exists on the remote host.

Inventors

  • Michael Emery Brown
  • SHRINIDHI KATTE
  • Ravishankar Kanakapura N

Assignees

  • DELL PRODUCTS L.P.

Dates

Publication Date: 2026-05-12
Application Date: 2024-04-26

Claims (17)

  1. A method comprising: identifying, by a computing device, a scheduled interaction between the computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying, by the computing device, arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating, by the computing device, a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining, by the computing device, that the jitter metric exceeds a jitter threshold, wherein the jitter threshold represents a value of the jitter metric that is indicative of potential system instability; in response to determining that the jitter metric exceeds the jitter threshold, generating, by the computing device, an alert indicating that a potential system instability condition exists on the remote host; and in response to generating the alert indicating that the potential system instability condition exists on the remote host, instructing, by the computing device, the remote host to mark and save a copy of a log file.
  2. The method of claim 1, wherein the scheduled interaction includes a watchdog message received from the remote host at regular time intervals.
  3. The method of claim 1, wherein calculating the jitter metric includes identifying an elapsed time between each of the plurality of messages and a previous message received in the previous cycle of the scheduled interaction.
  4. The method of claim 3, wherein calculating the jitter metric includes calculating an average time variation for the plurality of messages based on the identified elapsed times.
  5. The method of claim 4, wherein calculating the jitter metric includes determining a number of the identified elapsed times that exceed a threshold time.
  6. The method of claim 1, further comprising: providing, by the computing device, a telemetry data stream including the jitter metric to a data analytics system.
  7. A system comprising: a computing device including at least one processor and a memory, and configured to perform operations including: identifying a scheduled interaction between the computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining that the jitter metric exceeds a jitter threshold, wherein the jitter threshold represents a value of the jitter metric that is indicative of potential system instability; in response to determining that the jitter metric exceeds the jitter threshold, generating an alert indicating that a potential system instability condition exists on the remote host; and in response to generating the alert indicating that the potential system instability condition exists on the remote host, instructing the remote host to mark and save a copy of a log file.
  8. The system of claim 7, wherein the scheduled interaction includes a watchdog message received from the remote host at regular time intervals.
  9. The system of claim 7, wherein calculating the jitter metric includes identifying an elapsed time between each of the plurality of messages and a previous message received in the previous cycle of the scheduled interaction.
  10. The system of claim 9, wherein calculating the jitter metric includes calculating an average time variation for the plurality of messages based on the identified elapsed times.
  11. The system of claim 10, wherein calculating the jitter metric includes determining a number of the identified elapsed times that exceed a threshold time.
  12. The system of claim 7, further comprising: providing a telemetry data stream including the jitter metric to a data analytics system.
  13. An article of manufacture comprising a non-transitory, computer-readable medium having computer-executable instructions thereon that are executable by a processor of a computing device to perform operations comprising: identifying a scheduled interaction between the computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining that the jitter metric exceeds a jitter threshold, wherein the jitter threshold represents a value of the jitter metric that is indicative of potential system instability; in response to determining that the jitter metric exceeds the jitter threshold, generating an alert indicating that a potential system instability condition exists on the remote host; and in response to generating the alert indicating that the potential system instability condition exists on the remote host, instructing the remote host to mark and save a copy of a log file.
  14. The article of claim 13, wherein the scheduled interaction includes a watchdog message received from the remote host at regular time intervals.
  15. The article of claim 13, wherein calculating the jitter metric includes identifying an elapsed time between each of the plurality of messages and a previous message received in the previous cycle of the scheduled interaction.
  16. The article of claim 15, wherein calculating the jitter metric includes calculating an average time variation for the plurality of messages based on the identified elapsed times.
  17. The article of claim 16, wherein calculating the jitter metric includes determining a number of the identified elapsed times that exceed a threshold time.
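
The jitter calculations recited in the dependent claims (elapsed times between consecutive messages, an average time variation, and a count of elapsed times above a threshold time) can be illustrated with a short sketch. The claims do not fix a formula, so this example assumes that "average time variation" means the mean absolute deviation of the elapsed times from the nominal interval. The function names, the 60-second interval, the 62-second threshold time, and the sample arrival times are all hypothetical.

```python
from statistics import fmean


def elapsed_times(arrival_times):
    """Elapsed time between each message and the message of the previous cycle."""
    return [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]


def average_time_variation(elapsed, expected_interval):
    """Mean absolute deviation from the nominal interval (an assumed definition)."""
    return fmean([abs(e - expected_interval) for e in elapsed])


def count_exceeding(elapsed, threshold_time):
    """Number of elapsed times that exceed the threshold time."""
    return sum(1 for e in elapsed if e > threshold_time)


# Hypothetical arrival times, in seconds, for a 60-second schedule.
arrivals = [0.0, 60.0, 121.0, 180.0, 245.0]
gaps = elapsed_times(arrivals)
variation = average_time_variation(gaps, expected_interval=60.0)
late = count_exceeding(gaps, threshold_time=62.0)
print(gaps, variation, late)
```

Either value (the average variation or the count of late cycles) could serve as the jitter metric that is compared against the jitter threshold; which one a given implementation uses is not specified here.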

Description

TECHNICAL FIELD

The present disclosure relates in general to information handling systems, and more particularly to techniques for remotely identifying potential system instability based on jitter in scheduled interactions in information handling systems.

BACKGROUND OF THE INVENTION

In distributed systems (e.g., information handling systems), individual computing devices may be monitored in order to identify and troubleshoot potential issues. Such potential issues may include, for example, hardware failures, software issues such as bugs or crashes, network connectivity issues, and the like. In some cases, this monitoring may be performed automatically by a server remote from the computing device based on identifying and analyzing network activity of the computing device.

For example, the computing device may be configured to send a heartbeat or watchdog message to the remote server at regular intervals, such as every 60 seconds. If the remote server receives these messages from the device, it may infer that the device is still operational. But if the remote server fails to receive a heartbeat message from the device within a certain amount of time from when the message is expected, the remote server may infer that the device has failed and take corrective action, such as removing the device from the system, notifying system administrators of the issue, and the like.
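
The conventional heartbeat check described in the background can be sketched as a single comparison. This is a minimal illustration only: the function name and the 30-second grace period are hypothetical, and the 60-second interval is the example value from the text.

```python
def is_presumed_failed(last_heartbeat, now, interval=60.0, grace=30.0):
    """Return True if no heartbeat arrived within `grace` seconds of its expected time.

    All times are in seconds on the monitoring server's clock. The grace
    period is an assumed value, not one taken from the disclosure.
    """
    expected_next = last_heartbeat + interval
    return now > expected_next + grace


# A heartbeat last seen at t=0 is expected again at t=60.
print(is_presumed_failed(last_heartbeat=0.0, now=85.0))  # within the grace period
print(is_presumed_failed(last_heartbeat=0.0, now=91.0))  # past the grace period
```

A check of this form yields only a binary result: in this sketch, a message that arrives 20 seconds late is treated the same as one that arrives on time.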

SUMMARY OF THE INVENTION

In accordance with embodiments of the present disclosure, a method for remotely identifying potential system instability based on jitter in scheduled interactions includes identifying a scheduled interaction between a computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining that the jitter metric exceeds a pre-configured jitter threshold, wherein the jitter threshold represents a value of the jitter metric that is indicative of potential system instability; and in response to determining that the jitter metric exceeds the jitter threshold, generating an alert indicating that a potential system instability condition exists on the remote host.

In some cases, the scheduled interaction includes a watchdog message received from the remote host at regular time intervals. In some implementations, calculating the jitter metric includes identifying an elapsed time between each of the plurality of messages and a previous message received in the previous cycle of the scheduled interaction. In some implementations, calculating the jitter metric includes calculating an average time variation for the plurality of messages based on the identified elapsed times. In some cases, calculating the jitter metric includes determining a number of the identified elapsed times that exceed a threshold time.
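
The summarized method (record arrival times over several cycles, compute a jitter metric, compare it to a pre-configured threshold, and raise an alert) can be sketched as a small stateful monitor. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the `JitterMonitor` and `Alert` names, the sliding-window size, the mean-absolute-deviation metric, and the sample values are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    host: str
    jitter: float
    message: str


class JitterMonitor:
    """Tracks arrival times for one remote host over a sliding window of cycles."""

    def __init__(self, host, expected_interval, jitter_threshold, window=10):
        self.host = host
        self.expected_interval = expected_interval
        self.jitter_threshold = jitter_threshold
        # `window` elapsed times require `window + 1` arrival times.
        self.arrivals = deque(maxlen=window + 1)

    def record(self, arrival_time) -> Optional[Alert]:
        """Record one arrival; return an Alert if the jitter metric exceeds the threshold."""
        self.arrivals.append(arrival_time)
        if len(self.arrivals) < self.arrivals.maxlen:
            return None  # not enough cycles observed yet
        times = list(self.arrivals)
        gaps = [b - a for a, b in zip(times, times[1:])]
        # Assumed metric: mean absolute deviation from the nominal interval.
        jitter = sum(abs(g - self.expected_interval) for g in gaps) / len(gaps)
        if jitter > self.jitter_threshold:
            return Alert(self.host, jitter, "potential system instability")
        return None


# Hypothetical usage: a 60-second schedule where the last message is 10 s late.
monitor = JitterMonitor("host-a", expected_interval=60.0,
                        jitter_threshold=2.0, window=3)
alerts = [monitor.record(t) for t in (0.0, 60.0, 120.0, 180.0, 250.0)]
print(alerts[-1])
```

The sliding window is one way to realize "a plurality of cycles"; a fixed batch of cycles or an exponentially weighted average would be alternative design choices.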

In some implementations, the method further includes, in response to generating the alert indicating that the potential system instability condition exists on the remote host, instructing the remote host to mark and save a copy of its log file. In some cases, the method further includes providing a telemetry data stream including the jitter metric to a data analytics system.

In accordance with embodiments of the present disclosure, a system for remotely identifying potential system instability based on jitter in scheduled interactions includes a computing device including at least one processor and a memory, and configured to perform operations including identifying a scheduled interaction between the computing device and a remote host, wherein the computing device and the remote host are communicatively coupled over a network, and wherein the scheduled interaction includes the computing device repeatedly receiving a message over the network from the remote host at a regular time interval; identifying arrival times of messages received from the remote host over a plurality of cycles of the scheduled interaction; calculating a jitter metric for the scheduled interaction over the plurality of cycles based on the arrival times of the messages; determining that the jitter metric exceeds a pre-configured jitter threshold, wherein the jitter threshold represents a value of the jitter metric that is indicative of potential system instability; and in response to determining that the jitter metric exceeds the jitter threshold, generating an alert indicating that a potential system instability condition exists on the remote host.

In accordance with embodiments of the present disclosure, an article of manufacture includes a non-transitory, computer-readable medium