US-12619892-B2 - System and method for managing inference model performance through proactive communication system analysis

US 12619892 B2

Abstract

Methods and systems for managing execution of inference models hosted by data processing systems are disclosed. To manage execution of inference models hosted by data processing systems, a system may include an inference model manager and any number of data processing systems. The inference model manager may obtain communication system data for the communication system linking the data processing systems. The inference model manager may use the communication system data to determine whether the communication system meets inference generation requirements of a downstream consumer. If the communication system does not meet the inference generation requirements of the downstream consumer, the inference model manager may obtain an inference generation plan to return to compliance with the inference generation requirements of the downstream consumer.

Inventors

  • Ofir Ezrielev
  • Jehuda Shemer
  • Tomer Kushnir

Assignees

  • DELL PRODUCTS L.P.

Dates

Publication Date
2026-05-05
Application Date
2022-11-30

Claims (20)

  1. A method of managing execution of an inference model hosted by data processing systems, the method comprising: obtaining an execution plan for inference generation according to inference generation requirements of a downstream consumer, the execution plan designating a time interval to each of the data processing systems for transmitting operational capability data; obtaining communication system information for a communication system connecting the data processing systems; making a determination regarding whether the communication system information meets the inference generation requirements; in an instance of the determination in which the communication system information does not meet the inference generation requirements of the downstream consumer: obtaining an inference generation path for the inference model based on the inference generation requirements of the downstream consumer and the communication system information; and modifying a deployment of the inference model to the data processing systems based on the inference generation path.
  2. The method of claim 1, further comprising: prior to obtaining the communication system information: obtaining the inference model; obtaining characteristics of the inference model and characteristics of the data processing systems; obtaining portions of the inference model based on the characteristics of the data processing systems and the characteristics of the inference model; and distributing the portions of the inference model to the data processing systems based on the execution plan, wherein the execution plan is obtained based on the portions of the inference model, the characteristics of the data processing systems, and the inference generation requirements of the downstream consumer.
  3. The method of claim 2, wherein the communication system information comprises: a quantity of available communication system bandwidth between each data processing system of the data processing systems; and a reliability of transmission between each data processing system of the data processing systems.
  4. The method of claim 3, wherein the reliability of transmission is based on: historical data indicating a likelihood of successful transmission of data between each data processing system of the data processing systems; or a distance between each data processing system of the data processing systems.
  5. The method of claim 4, wherein the inference generation requirements of the downstream consumer are based on: an inference generation speed threshold, and an inference generation reliability threshold.
  6. The method of claim 5, wherein the inference generation speed threshold indicates a minimum quantity of communication bandwidth between each data processing system of the data processing systems to meet the inference generation requirements of the downstream consumer.
  7. The method of claim 6, wherein the inference generation reliability threshold indicates a minimum likelihood of successful transmission of data between each data processing system of the data processing systems to meet the inference generation requirements of the downstream consumer.
  8. The method of claim 7, wherein the inference generation path comprises: a listing of instances of each of the portions of the inference model usable to generate an inference model result in compliance with the inference generation requirements of the downstream consumer; and an ordering of the listing of the instances.
  9. The method of claim 8, wherein modifying the deployment of the inference model comprises: generating an updated execution plan based on the inference generation path; and distributing the updated execution plan to the data processing systems to implement the updated execution plan.
  10. The method of claim 9, wherein the communication system comprises: multiple point-to-point wireless connections between the data processing systems, each point-to-point wireless connection of the multiple point-to-point wireless connections having distinct characteristics.
  11. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations for managing execution of an inference model hosted by data processing systems, the operations comprising: obtaining an execution plan for inference generation according to inference generation requirements of a downstream consumer, the execution plan designating a time interval to each of the data processing systems for transmitting operational capability data; obtaining communication system information for a communication system connecting the data processing systems; making a determination regarding whether the communication system information meets the inference generation requirements; in an instance of the determination in which the communication system information does not meet the inference generation requirements of the downstream consumer: obtaining an inference generation path for the inference model based on the inference generation requirements of the downstream consumer and the communication system information; and modifying a deployment of the inference model to the data processing systems based on the inference generation path.
  12. The non-transitory machine-readable medium of claim 11, the operations further comprising: prior to obtaining the communication system information: obtaining the inference model; obtaining characteristics of the inference model and characteristics of the data processing systems; obtaining portions of the inference model based on the characteristics of the data processing systems and the characteristics of the inference model; and distributing the portions of the inference model to the data processing systems based on the execution plan, wherein the execution plan is obtained based on the portions of the inference model, the characteristics of the data processing systems, and the inference generation requirements of the downstream consumer.
  13. The non-transitory machine-readable medium of claim 12, wherein the communication system information comprises: a quantity of available communication system bandwidth between each data processing system of the data processing systems; and a reliability of transmission between each data processing system of the data processing systems.
  14. The non-transitory machine-readable medium of claim 13, wherein the reliability of transmission is based on: historical data indicating a likelihood of successful transmission of data between each data processing system of the data processing systems; or a distance between each data processing system of the data processing systems.
  15. The non-transitory machine-readable medium of claim 14, wherein the inference generation requirements of the downstream consumer are based on: an inference generation speed threshold, and an inference generation reliability threshold.
  16. A data processing system, comprising: a processor; and a memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform operations for managing execution of an inference model hosted by data processing systems, the operations comprising: obtaining an execution plan for inference generation according to inference generation requirements of a downstream consumer, the execution plan designating a time interval to each of the data processing systems for transmitting operational capability data; obtaining communication system information for a communication system connecting the data processing systems; making a determination regarding whether the communication system information meets the inference generation requirements; in an instance of the determination in which the communication system information does not meet the inference generation requirements: obtaining an inference generation path for the inference model based on the inference generation requirements of the downstream consumer and the communication system information; and modifying a deployment of the inference model to the data processing systems based on the inference generation path.
  17. The data processing system of claim 16, the operations further comprising: prior to obtaining the communication system information: obtaining the inference model; obtaining characteristics of the inference model and characteristics of the data processing systems; obtaining portions of the inference model based on the characteristics of the data processing systems and the characteristics of the inference model; and distributing the portions of the inference model to the data processing systems based on the execution plan, wherein the execution plan is obtained based on the portions of the inference model, the characteristics of the data processing systems, and the inference generation requirements of the downstream consumer.
  18. The data processing system of claim 17, wherein the communication system information comprises: a quantity of available communication system bandwidth between each data processing system of the data processing systems; and a reliability of transmission between each data processing system of the data processing systems.
  19. The data processing system of claim 18, wherein the reliability of transmission is based on: historical data indicating a likelihood of successful transmission of data between each data processing system of the data processing systems; or a distance between each data processing system of the data processing systems.
  20. The data processing system of claim 19, wherein the inference generation requirements of the downstream consumer are based on: an inference generation speed threshold, and an inference generation reliability threshold.
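
Read as an algorithm, claim 1 describes a compliance-check-and-replan loop: gather per-link communication data, test it against the downstream consumer's thresholds, and re-path and redeploy only on a violation. The following is a minimal Python sketch of that loop; the type names and the manager's four methods are hypothetical stand-ins for illustration, not interfaces defined by the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkInfo:
    bandwidth_mbps: float  # available bandwidth on this point-to-point link
    reliability: float     # likelihood of successful transmission, in [0, 1]


@dataclass(frozen=True)
class Requirements:
    min_bandwidth_mbps: float  # inference generation speed threshold
    min_reliability: float     # inference generation reliability threshold


def meets_requirements(links: dict[tuple[str, str], LinkInfo],
                       req: Requirements) -> bool:
    """The claimed 'determination': every link between data processing
    systems must satisfy both thresholds."""
    return all(
        link.bandwidth_mbps >= req.min_bandwidth_mbps
        and link.reliability >= req.min_reliability
        for link in links.values()
    )


def management_cycle(manager, systems, req: Requirements):
    """One cycle of the claimed method; `manager` stands in for an
    inference model manager exposing the four hypothetical calls below."""
    plan = manager.obtain_execution_plan(systems, req)
    links = manager.obtain_communication_info(systems)  # per-link data
    if not meets_requirements(links, req):
        # Out of compliance: compute a new path and redeploy along it.
        path = manager.obtain_inference_generation_path(links, req)
        plan = manager.modify_deployment(plan, path)
    return plan
```

Note the asymmetry the claim implies: the (presumably expensive) re-pathing and redeployment steps run only when the determination fails, so a healthy communication system incurs only the cost of monitoring.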
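Claims 3-4 characterize each link by available bandwidth and a transmission reliability derived from historical data or inter-system distance. Below is a hedged sketch of one way such a per-link reliability might be computed; the exponential distance model and the 500 m decay length are assumptions for illustration, not taken from the patent.

```python
import math


def link_reliability(history, distance_m: float,
                     decay_length_m: float = 500.0) -> float:
    """Estimate the likelihood of successful transmission on one link.

    Prefers the empirical success rate when transmission logs exist
    (claim 4's historical data), and otherwise falls back to a
    distance-based estimate (claim 4's alternative)."""
    if history:
        # Empirical success rate over logged transmission attempts.
        return sum(1 for ok in history if ok) / len(history)
    # Assumed model: reliability decays exponentially with distance.
    return math.exp(-distance_m / decay_length_m)
```

For example, `link_reliability([True, True, False], 120.0)` returns about 0.67 from the logs, while `link_reliability(None, 120.0)` falls back to the distance estimate.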
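Claim 8 defines the inference generation path as an ordered listing of hosted instances of the model portions that complies with the thresholds of claims 6-7. The sketch below, reusing the illustrative LinkInfo and Requirements types from the first sketch, builds such a listing greedily; the patent does not specify a selection algorithm, so the greedy strategy is an assumption.

```python
def build_inference_generation_path(portion_instances, links, req):
    """Pick one hosted instance per model portion, in execution order,
    so that the link into each next hop meets both thresholds.

    portion_instances: list, in execution order, of candidate system ids
        hosting each portion (e.g., [["a"], ["b", "c"], ["d"]]).
    links: dict mapping (src, dst) -> LinkInfo.
    Returns the ordered listing of instances, i.e., the path."""
    path = [portion_instances[0][0]]  # start at an instance of portion 0
    for candidates in portion_instances[1:]:
        src = path[-1]
        for dst in candidates:
            link = links.get((src, dst))
            if (link is not None
                    and link.bandwidth_mbps >= req.min_bandwidth_mbps
                    and link.reliability >= req.min_reliability):
                path.append(dst)
                break
        else:
            raise RuntimeError(f"no compliant instance reachable from {src}")
    return path
```

A greedy pass can fail where a backtracking search would succeed (an early choice may strand a later portion), so a production planner would likely search over alternatives rather than commit hop by hop.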
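Claims 10 and 16 place the data processing systems on multiple point-to-point wireless links with distinct characteristics, which is naturally modeled as a weighted graph. The illustrative sketch below (not the patent's algorithm) selects the most reliable route meeting a bandwidth floor by running Dijkstra's algorithm over -log(reliability), so that additive path costs correspond to multiplicative link reliabilities.

```python
import heapq
import math


def most_reliable_route(links, src, dst, min_bandwidth_mbps):
    """Find the route from src to dst with the highest end-to-end
    reliability, using only links meeting the bandwidth floor.

    links: dict mapping (a, b) -> (bandwidth_mbps, reliability);
    connections are assumed symmetric. Returns (route, reliability)."""
    graph = {}
    for (a, b), (bw, rel) in links.items():
        if bw >= min_bandwidth_mbps and rel > 0.0:
            weight = -math.log(rel)  # costs add; reliabilities multiply
            graph.setdefault(a, []).append((b, weight))
            graph.setdefault(b, []).append((a, weight))
    queue, seen = [(0.0, src, [src])], set()
    while queue:
        cost, node, route = heapq.heappop(queue)
        if node == dst:
            return route, math.exp(-cost)  # end-to-end reliability
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + weight, nxt, route + [nxt]))
    return None, 0.0  # no compliant route exists
```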

Description

FIELD

Embodiments disclosed herein relate generally to inference generation. More particularly, embodiments disclosed herein relate to systems and methods to manage inference generation based on inference consumer expectations.

BACKGROUND

Computing devices may provide computer-implemented services. The computer-implemented services may be used by users of the computing devices and/or devices operably connected to the computing devices. The computer-implemented services may be performed with hardware components such as processors, memory modules, storage devices, and communication devices. The operation of these components may impact the performance of the computer-implemented services.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments disclosed herein are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements.

FIG. 1 shows a block diagram illustrating a system in accordance with an embodiment.
FIG. 2A shows a block diagram illustrating an inference model manager and multiple data processing systems over time in accordance with an embodiment.
FIG. 2B shows a block diagram illustrating multiple data processing systems over time in accordance with an embodiment.
FIG. 3A shows a flow diagram illustrating a method of managing inference models hosted by data processing systems to complete timely execution of the inference models in accordance with an embodiment.
FIG. 3B shows a flow diagram illustrating a method of preparing to distribute inference model portions to data processing systems in accordance with an embodiment.
FIG. 3C shows a flow diagram illustrating a method of obtaining an execution plan in accordance with an embodiment.
FIG. 3D shows a flow diagram illustrating a method of managing the execution of the inference models in accordance with an embodiment.
FIGS. 4A-4C show diagrams illustrating a method of executing inference models across multiple data processing systems over time in accordance with an embodiment.
FIG. 5 shows a block diagram illustrating a data processing system in accordance with an embodiment.

DETAILED DESCRIPTION

Various embodiments will be described with reference to details discussed below, and the accompanying drawings will illustrate the various embodiments. The following description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding of various embodiments. However, in certain instances, well-known or conventional details are not described in order to provide a concise discussion of embodiments disclosed herein.

Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in conjunction with the embodiment can be included in at least one embodiment. The appearances of the phrases "in one embodiment" and "an embodiment" in various places in the specification do not necessarily all refer to the same embodiment.

In general, embodiments disclosed herein relate to methods and systems for managing execution of an inference model throughout a distributed environment. To manage execution of the inference model, the system may include an inference model manager and any number of data processing systems.
Hosting and executing an inference model by one data processing system may consume undesirable quantities of computing resources of the data processing system (and/or the data processing system may include insufficient resources to host the inference model). To reduce computing resource consumption, the inference model manager may partition the inference model into portions and distribute the portions across multiple data processing systems. The inference model hosted by the multiple data processing systems may generate inferences usable by a downstream consumer.

The speed and reliability of inference generation by the multiple data processing systems may affect the downstream consumer's ability to utilize the inferences to make decisions (and/or other uses). The speed and reliability of inference generation may depend, at least in part, on a communication system (e.g., a wireless point-to-point connection) between the multiple data processing systems. However, disruptions to the communication system may (temporarily or permanently) affect the speed and/or reliability of inference generation by the multiple data processing systems, thereby making it difficult for the downstream consumer to depend on the inferences generated by the multiple data processing systems.

To meet inference generation requirements of the downstream consumer, the inference model manager may dynamically modify an inference generation path (e.g., an ordered selection of the data processing systems involved in inference generation) to adjust the speed and/or reliability of inference generation throughout the distributed environment. To do so, the inference m
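
The description above notes that a single system may lack the resources to host the model whole, so the manager may partition it into portions matched to system characteristics. Below is a minimal sketch under strong assumptions: a purely sequential (layer-wise) model partitioned greedily by per-system memory budgets; the patent does not prescribe this strategy, and the function and parameter names are hypothetical.

```python
def partition_model(layer_sizes_mb, system_budgets_mb):
    """Assign consecutive layers of a sequential model to data processing
    systems without exceeding any system's memory budget.

    Returns a list, indexed by system, of the layer indices it hosts."""
    portions = [[] for _ in system_budgets_mb]
    sys_idx, used = 0, 0.0
    for layer, size in enumerate(layer_sizes_mb):
        # Advance to the next system whenever the current one cannot fit
        # this layer alongside what it already holds.
        while used + size > system_budgets_mb[sys_idx]:
            sys_idx += 1
            used = 0.0
            if sys_idx >= len(system_budgets_mb):
                raise RuntimeError("insufficient aggregate capacity")
        portions[sys_idx].append(layer)
        used += size
    return portions
```

For example, `partition_model([4, 4, 6], [8, 8])` yields `[[0, 1], [2]]`: the first two layers fill the first system's 8 MB budget, and the third layer spills to the second system.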