
US-20260127034-A1 - DATA PROCESSING PIPELINE

US 20260127034 A1

Abstract

A data processing device includes a plurality of hardware accelerators, a scheduler circuit, and a blocking circuit. The scheduler circuit is coupled to the plurality of hardware accelerators, and includes a plurality of hardware task schedulers. Each hardware task scheduler is coupled to a corresponding hardware accelerator, and is configured to control execution of a task by the hardware accelerator. The blocking circuit is coupled to the plurality of hardware accelerators and configured to inhibit communication between a first hardware accelerator and a second hardware accelerator of the plurality of hardware accelerators.

Inventors

  • Mihir Mody
  • Niraj NANDAN
  • Rajasekhar ALLU
  • Ankur Ankur

Assignees

  • TEXAS INSTRUMENTS INCORPORATED

Dates

Publication Date
May 7, 2026
Application Date
December 31, 2025

Claims (19)

  1. A system comprising: a camera; and a processing system coupled to the camera, the processing system comprising: a camera capture component coupled to the camera; a display component; at least one processor coupled to the camera capture component and to the display component via an interconnect; and a vision accelerator coupled to the at least one processor via the interconnect, the vision accelerator comprising: a first hardware accelerator; a second hardware accelerator; a third hardware accelerator; and a scheduler circuit coupled to the first hardware accelerator, to the second hardware accelerator, and to the third hardware accelerator, wherein the scheduler circuit comprises: a first task scheduler circuit comprising: an output configurable to provide a signal that indicates that a set of data produced by the first hardware accelerator is available; and a blocking circuit configurable to inhibit communication of the signal when the set of data is available based on a status of the third hardware accelerator; and a second task scheduler circuit coupled to the second hardware accelerator, the second task scheduler circuit comprising an input coupled to the output of the first task scheduler circuit, the second task scheduler circuit configurable to cause the second hardware accelerator to start execution of a corresponding task on the set of data based on the signal.
  2. The system of claim 1, further comprising a memory mapped register including a field, wherein the blocking circuit is configurable to inhibit communication of the signal based on a value stored in the field.
  3. The system of claim 2, wherein the memory mapped register includes a second field that indicates whether the blocking circuit is currently inhibiting communication of the signal.
  4. The system of claim 1, further comprising a memory coupled to the first hardware accelerator and to the second hardware accelerator, wherein the signal indicates that the set of data is available in the memory.
  5. The system of claim 1, wherein the third hardware accelerator is subsequent to the first and second hardware accelerators in a pipeline.
  6. The system of claim 1, wherein: the first task scheduler includes: a set of inputs each configured to receive a respective signal indicating that a respective set of data is available; and a set of outputs that includes the output, each configured to provide a respective signal that indicates that a respective set of data is available; the second task scheduler includes: a set of inputs that includes the input, each configured to receive a respective signal that indicates that a respective set of data is available; and a set of outputs each configured to provide a respective signal indicating that a respective set of data is available; and the scheduler circuit further comprises a crossbar coupled to the set of inputs and the set of outputs of the first task scheduler and coupled to the set of inputs and the set of outputs of the second task scheduler.
  7. The system of claim 1, wherein the first task scheduler further includes a clear block pending circuit configurable to instruct the blocking circuit to stop inhibiting communication of the signal based at least in part on completion of a corresponding task by the third hardware accelerator.
  8. The system of claim 1, wherein: the signal is a first pending signal; the first task scheduler includes an input configured to receive a second pending signal; and the first task scheduler is configured to cause the first hardware accelerator to start execution of a corresponding task based on the second pending signal.
  9. A device comprising: a first task scheduler circuit comprising: an output configurable to provide a signal that indicates that a set of data produced by a first hardware accelerator is available; and a blocking circuit configurable to inhibit communication of the signal when the set of data is available based on a status of a second hardware accelerator; and a second task scheduler circuit coupled to a third hardware accelerator, the second task scheduler circuit comprising an input coupled to the output of the first task scheduler circuit, the second task scheduler circuit configurable to cause the third hardware accelerator to start execution of a corresponding task on the set of data based on the signal.
  10. The device of claim 9, further comprising a memory mapped register including a field, wherein the blocking circuit is configurable to inhibit communication of the signal based on a value stored in the field.
  11. The device of claim 10, wherein the memory mapped register includes a second field that indicates whether the blocking circuit is currently inhibiting communication of the signal.
  12. The device of claim 9, wherein: the first task scheduler includes: a set of inputs each configured to receive a respective signal indicating that a respective set of data is available; and a set of outputs that includes the output, each configured to provide a respective signal that indicates that a respective set of data is available; the second task scheduler includes: a set of inputs that includes the input, each configured to receive a respective signal that indicates that a respective set of data is available; and a set of outputs each configured to provide a respective signal indicating that a respective set of data is available; and the device further comprises a crossbar coupled to the set of inputs and the set of outputs of the first task scheduler and coupled to the set of inputs and the set of outputs of the second task scheduler.
  13. The device of claim 9, wherein the first task scheduler further includes a clear block pending circuit configurable to instruct the blocking circuit to stop inhibiting communication of the signal based at least in part on completion of a corresponding task by the third hardware accelerator.
  14. The device of claim 9, wherein: the signal is a first pending signal; the first task scheduler includes an input configured to receive a second pending signal; and the first task scheduler is configured to cause the first hardware accelerator to start execution of a corresponding task based on the second pending signal.
  15. The device of claim 9, further comprising a channel mapping circuit coupled to a direct memory access (DMA).
  16. A method comprising: configuring a first thread on a first task scheduler for a first hardware accelerator; configuring a second thread on a second task scheduler for a second hardware accelerator; initiating execution of the first thread by the first task scheduler; concurrently initiating execution of the second thread by the second task scheduler; determining that a set of data produced by the first hardware accelerator is available for use by the second hardware accelerator; and using a blocking circuit of the first task scheduler, inhibiting communication of a signal that indicates that the set of data is available until completion of a third hardware accelerator.
  17. The method of claim 16, wherein the third hardware accelerator is subsequent to the first and second hardware accelerators in a pipeline.
  18. The method of claim 16, further comprising: using a clear block pending circuit, causing the blocking circuit to stop inhibiting the communication of the signal based at least in part on completion of the third hardware accelerator.
  19. The method of claim 16, wherein one or more of the first, second, and third hardware accelerators is a processor and a respective one of the first thread or the second thread comprises executing software instructions.
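
The gating behavior recited in claims 1-3 and 7 can be sketched as a software model. The sketch below is illustrative only; the class and field names (TaskScheduler, block_enable, block_pending, etc.) are invented for this example and do not appear in the application.

```python
# Hypothetical software model of the claimed scheduler: a first task scheduler
# raises a "data available" pending signal, and a blocking circuit withholds
# that signal while a third accelerator (later in the pipeline) is busy.

class MemoryMappedRegister:
    """Models the register of claims 2-3: a control field that enables
    blocking and a status field reporting whether blocking is active."""
    def __init__(self):
        self.block_enable = 0   # control field (claim 2), software-written
        self.block_pending = 0  # status field (claim 3), hardware-set

class TaskScheduler:
    """Models the first task scheduler circuit of claim 1."""
    def __init__(self, reg):
        self.reg = reg
        self.out_pending = False     # output signal to the second scheduler
        self.data_available = False

    def produce(self, third_accel_busy):
        # First accelerator finishes a task; its output data set is ready.
        self.data_available = True
        self._update_output(third_accel_busy)

    def _update_output(self, third_accel_busy):
        if self.data_available and self.reg.block_enable and third_accel_busy:
            # Blocking circuit inhibits the signal; record that in the status field.
            self.reg.block_pending = 1
            self.out_pending = False
        elif self.data_available:
            self.reg.block_pending = 0
            self.out_pending = True  # second scheduler may now start its task

    def clear_block(self):
        # Clear block pending circuit (claim 7): third accelerator completed.
        self._update_output(third_accel_busy=False)

reg = MemoryMappedRegister()
reg.block_enable = 1
ts = TaskScheduler(reg)
ts.produce(third_accel_busy=True)
assert ts.out_pending is False and reg.block_pending == 1  # signal inhibited
ts.clear_block()
assert ts.out_pending is True and reg.block_pending == 0   # signal released
```

In this model, as in claim 3, software can poll the status field to observe whether the blocking circuit is currently withholding the pending signal.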

Description

RELATED APPLICATIONS

This application is a continuation of and claims priority to U.S. patent application Ser. No. 18/175,333, filed Feb. 27, 2023, and claims the benefit of and priority to U.S. Provisional Patent Application No. 63/345,937, titled "FLEXCONNECT: SUPER PIPELINE," filed on May 26, 2022, which applications are hereby incorporated herein by reference in their entireties. This application claims priority to U.S. Application, entitled "HARDWARE EVENT TRIGGERED PIPELINE CONTROL," filed herewith on Feb. 27, 2023, Attorney Docket No. T101779US02, which claims the benefit of U.S. Provisional Patent Application No. 63/345,940, entitled "HARDWARE EVENT TRIGGERED PIPELINE CONTROL," filed May 26, 2022, both of which are hereby incorporated by reference in their entirety for all purposes.

TECHNICAL BACKGROUND

A new class of embedded safety systems, referred to as advanced driver assistance systems (ADAS), has been introduced into automobiles to reduce human operation error. Such systems may provide functionality such as rear-view facing cameras, electronic stability control, and vision-based pedestrian detection systems. Many of these systems rely on computer vision processing of images captured by one or more cameras to detect objects in the field of view of the one or more cameras. The vision processing may include, for example, image processing, lens distortion correction, noise filtering, edge detection, motion detection, image scaling, etc. Tasks implementing various parts of the vision processing of the images may be performed on hardware accelerators and/or by software executing on programmable processors, e.g., digital signal processors and general-purpose processors. Current hardware thread schedulers provide for scheduling of a single thread of tasks to be executed on hardware accelerators but do not provide the flexibility needed for image and vision processing in ADAS.
OVERVIEW

In an implementation, a data processing device includes a plurality of hardware accelerators, a scheduler circuit, and a blocking circuit. The scheduler circuit is coupled to the plurality of hardware accelerators, and includes a plurality of hardware task schedulers. Each hardware task scheduler is coupled to a corresponding hardware accelerator, and is configured to control execution of the task by the hardware accelerator. The blocking circuit is coupled to the plurality of hardware accelerators and configured to inhibit communication between a first hardware accelerator and a second hardware accelerator of the plurality of hardware accelerators.

In another implementation, a scheduler circuit for a data processing device includes a plurality of hardware accelerators, each hardware accelerator configured to execute a task. The scheduler circuit also includes a plurality of hardware task schedulers. Each hardware task scheduler of the plurality of hardware task schedulers is coupled to a corresponding hardware accelerator and is configured to control execution of the task by the hardware accelerator. The scheduler circuit further includes a blocking circuit coupled to the plurality of hardware accelerators that is configured to inhibit communication between a first hardware accelerator and a second hardware accelerator of the plurality of hardware accelerators. The scheduler circuit is configured to concurrently control a first hardware accelerator to execute a task from a first thread of tasks requiring a first configuration of the first hardware accelerator, and a second hardware accelerator to execute a task from a second thread of tasks requiring a second configuration of the second hardware accelerator different from the first configuration.
In a further implementation, a method for executing concurrent threads on a scheduler circuit (comprising a plurality of hardware task schedulers, included in a data processing device including a plurality of hardware accelerators) includes configuring a first thread on a first hardware task scheduler included in the scheduler circuit. The first thread includes tasks requiring a first configuration of a first hardware accelerator comprised in the data processing device. The first hardware task scheduler is coupled with, and configured to control, the first hardware accelerator. The method also includes configuring a second thread on a second hardware task scheduler included in the scheduler circuit. The second thread includes tasks requiring a second configuration of a second hardware accelerator comprised in the data processing device. The second configuration of the second hardware accelerator is different from the first configuration of the first hardware accelerator. The second hardware task scheduler is coupled with, and configured to control, the second hardware accelerator. The method further includes initiating execution of the first thread by the first hardware accelerator, and concurrently initiating execution of the second thread by the second hardware accelerator.
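
The concurrent-thread method above (and the gating step of claim 16) can be sketched with ordinary software threads standing in for hardware task schedulers. This is an illustrative analogy only: Event objects model the pending signals, and all names (first_accelerator, third_done, data_ready, etc.) are invented for this example.

```python
# Two "threads" run concurrently, as in the claimed method; the first withholds
# its "data available" signal until the third accelerator reports completion,
# and the second starts its task only once that signal is communicated.
import threading

third_done = threading.Event()  # completion status of the third accelerator
data_ready = threading.Event()  # gated "set of data available" signal

def first_accelerator():
    # First thread: the data set is produced, but the blocking circuit
    # inhibits communication of the signal until the third accelerator is done.
    third_done.wait()
    data_ready.set()  # now propagate "data available" to the second scheduler

def second_accelerator(results):
    # Second thread: starts its task on the data set only after the signal.
    data_ready.wait()
    results.append("second task ran on the data set")

results = []
t1 = threading.Thread(target=first_accelerator)
t2 = threading.Thread(target=second_accelerator, args=(results,))
t1.start(); t2.start()  # concurrently initiate execution of both threads
third_done.set()        # third accelerator completes its corresponding task
t1.join(); t2.join()
assert results == ["second task ran on the data set"]
```

Setting `third_done` here plays the role of the clear block pending circuit of claim 18: it releases the inhibited signal once the downstream accelerator has finished.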