US-12621182-B2 - Functional safety for system-on-chip arrangements
Abstract
A sensor data input chiplet obtains sensor data from a sensor system. A central chiplet executes a functional safety program to dynamically compare and verify output of workloads being executed by a set of workload processing chiplets, where the workloads are executed across the set of workload processing chiplets based on the sensor data.
Inventors
- Francois Piednoel
- Oliver Pöllny
Assignees
- Mercedes-Benz Group AG
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-06-21
Claims (18)
- 1. A computing system, comprising: a sensor data input chiplet to obtain sensor data from a sensor system; a set of workload processing chiplets; and a first central chiplet comprising a shared memory including a functional safety (FuSa) program that causes one or more processors of the first central chiplet to: dynamically compare and verify output of workloads executed by the set of workload processing chiplets, the workloads being executed across the set of workload processing chiplets based on the sensor data; a first system-on-chip (SoC) that includes the first central chiplet; and a second SoC that includes a second central chiplet, the first SoC and the second SoC being communicatively coupled by an interconnect; wherein the FuSa program is further included in the second central chiplet of the second SoC, the FuSa program included in the second central chiplet of the second SoC causes one or more processors of the second SoC to: monitor the shared memory of the first central chiplet of the first SoC to dynamically determine whether the first SoC is functioning within nominal operating parameters.
- 2. The computing system of claim 1, wherein the sensor data input chiplet, the first central chiplet, and the set of workload processing chiplets communicate over a performance network comprising a plurality of network hubs, and wherein the FuSa program monitors communications through the plurality of network hubs between the sensor data input chiplet, the first central chiplet, and the one or more workload processing chiplets.
- 3. The computing system of claim 2, wherein the FuSa program monitors communications through the plurality of network hubs using a set of FuSa accounting hubs that communicate over a high-reliability FuSa network.
- 4. The computing system of claim 3, wherein the first central chiplet comprises a dedicated FuSa CPU executing the FuSa program to (i) communicate over the performance network via a performance network-on-chip (NoC), and (ii) communicate over the high-reliability FuSa network via a FuSa NoC.
- 5. The computing system of claim 4, wherein the set of workload processing chiplets transmit (i) processed data to a cache memory of the first central chiplet over the performance network, and (ii) a first error correction code (ECC) along the high-reliability FuSa network to the first central chiplet based on the processed data.
- 6. The computing system of claim 5, wherein upon receiving the processed data, the first central chiplet generates a second ECC using the processed data, and wherein the FuSa CPU performs a functional safety call in the first central chiplet to verify that the first ECC and the second ECC match to ensure that the processed data was transmitted correctly.
- 7. The computing system of claim 1, wherein the computing system is included on a vehicle, and wherein the workloads comprise inference tasks based on the sensor data for autonomously operating the vehicle.
- 8. The computing system of claim 1, wherein the workloads are executed by the set of workload processing chiplets in independent pipelines, and wherein the FuSa program dynamically compares and verifies output of the independent pipelines by executing a set of FuSa workloads in a FuSa pipeline.
- 9. The computing system of claim 1, wherein the FuSa program in the second central chiplet of the second SoC further causes the one or more processors of the second SoC to: in response to determining that the first SoC is not functioning within nominal operating parameters, cause a second set of workload processing chiplets of the second SoC to take over execution of the workloads.
- 10. The computing system of claim 1, wherein determining that the first SoC is not operating within nominal operating parameters corresponds to one or more of the first SoC overheating, a power surge, or an error in the first SoC.
- 11. The computing system of claim 9, wherein for each respective sensor data item generated by each respective sensor data component of the sensor system, the first SoC generates a cipher associated with the respective sensor data component at which the respective sensor data item originates, and transmits the cipher to the second SoC.
- 12. The computing system of claim 11, wherein, upon determining that the first SoC is not functioning within the nominal operating parameters, the second SoC decrypts the cipher to verify the respective sensor data item to take over execution of the workloads.
- 13. A non-transitory computer readable medium storing instructions that, when executed by one or more processors of a computing system, cause the computing system to: obtain, by a sensor data input chiplet of the computing system, sensor data from a sensor system; on a first central chiplet of a first system-on-chip (SoC) of the computing system, execute a functional safety program (FuSa) to dynamically compare and verify output of workloads executed by a set of workload processing chiplets, the workloads being executed across the set of workload processing chiplets based on the sensor data; and on a second central chiplet of a second SoC of the computing system on which the FuSa program is included, monitor a shared memory of the first central chiplet of the first SoC to dynamically determine whether the first SoC is functioning within nominal operating parameters.
- 14. The non-transitory computer readable medium of claim 13, wherein the sensor data input chiplet, the first central chiplet, and the set of workload processing chiplets communicate over a performance network comprising a plurality of network hubs, and wherein the FuSa program monitors communications through the plurality of network hubs between the sensor data input chiplet, the first central chiplet, and the one or more workload processing chiplets.
- 15. The non-transitory computer readable medium of claim 14, wherein the FuSa program monitors communications through the plurality of network hubs using a set of FuSa accounting hubs that communicate over a high-reliability FuSa network.
- 16. The non-transitory computer readable medium of claim 15, wherein the first central chiplet comprises a dedicated FuSa CPU executing the FuSa program to (i) communicate over the performance network via a performance network-on-chip (NoC), and (ii) communicate over the high-reliability FuSa network via a FuSa NoC.
- 17. The non-transitory computer readable medium of claim 16, wherein the set of workload processing chiplets transmit (i) processed data to a cache memory of the first central chiplet over the performance network, and (ii) a first error correction code (ECC) along the high-reliability FuSa network to the first central chiplet based on the processed data.
- 18. A computer-implemented method of implementing functional safety on a computing system, the method being performed by one or more processors and comprising: obtaining, by a sensor data input chiplet of the computing system, sensor data from a sensor system; on a first central chiplet of a first system-on-chip (SoC) of the computing system, executing a functional safety (FuSa) program to dynamically compare and verify output of workloads executed by a set of workload processing chiplets, the workloads being executed across the set of workload processing chiplets based on the sensor data; and on a second central chiplet of a second SoC of the computing system on which the FuSa program is included, monitoring a shared memory of the first central chiplet of the first SoC to dynamically determine whether the first SoC is functioning within nominal operating parameters.
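The transfer check recited in claims 5 and 6 (and in claim 17) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claims do not name a particular code, so CRC-32 is used here as an assumed stand-in that detects but does not correct errors, and the names `compute_code`, `Transfer`, `send_from_workload_chiplet`, and `fusa_verify` are hypothetical. The two networks are modeled only as two fields of one record.

```python
import zlib
from dataclasses import dataclass


def compute_code(payload: bytes) -> int:
    """Assumed stand-in for the ECC: a CRC-32 checksum over the payload."""
    return zlib.crc32(payload)


@dataclass
class Transfer:
    processed_data: bytes  # modeled as travelling over the performance network
    first_code: int        # modeled as travelling over the FuSa network


def send_from_workload_chiplet(processed_data: bytes) -> Transfer:
    """Workload chiplet side: emit the data plus a code derived from it."""
    return Transfer(processed_data, compute_code(processed_data))


def fusa_verify(received_data: bytes, first_code: int) -> bool:
    """Central chiplet side: recompute a second code and compare with the first."""
    second_code = compute_code(received_data)
    return first_code == second_code


if __name__ == "__main__":
    transfer = send_from_workload_chiplet(b"lane-detections")
    print(fusa_verify(transfer.processed_data, transfer.first_code))
    print(fusa_verify(b"lane-detectionz", transfer.first_code))
```

Sending the code over a separate path from the data means a fault on the performance network alone cannot corrupt both the payload and its check value in a matching way, which is the property the comparison relies on.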
Description
BACKGROUND
Universal Chiplet Interconnect Express (UCIe) provides an open specification for an interconnect and serial bus between chiplets, which enables the production of large system-on-chip (SoC) packages with intermixed components from different silicon manufacturers. Autonomous vehicle computing systems may operate using chiplet arrangements that follow the UCIe specification. One goal of creating such computing systems is to achieve the robust safety integrity levels of other important electrical and electronic (E/E) automotive components of the vehicle.
SUMMARY
A computing system can include a sensor data input chiplet to obtain sensor data from a sensor system of a vehicle, and one or more workload processing chiplets that execute workloads based on the sensor data. The computing system can further include a first central chiplet comprising a shared memory including a functional safety (FuSa) program that causes the first central chiplet to dynamically compare and verify output of workloads being executed by the set of workload processing chiplets.
In various examples, the computing system can be included on a vehicle, and the workloads can comprise inference tasks based on the sensor data for autonomously operating the vehicle. In certain implementations, the workloads can be executed by the set of workload processing chiplets in independent pipelines, and the FuSa program can dynamically compare and verify output of the independent pipelines by executing a set of FuSa workloads in a FuSa pipeline.
In some embodiments, the computing system can comprise a first system-on-chip (SoC) that includes the first central chiplet and a second SoC that includes a second central chiplet. The first SoC and the second SoC can be communicatively coupled by an interconnect, and the FuSa program can also be included in the second central chiplet of the second SoC.
In such embodiments, the FuSa program included in the second central chiplet of the second SoC causes the second SoC to monitor the shared memory of the first central chiplet of the first SoC to dynamically determine whether the first SoC is functioning within nominal operating parameters. In response to determining that the first SoC is not functioning within nominal operating parameters, the FuSa program of the second SoC can cause a second set of workload processing chiplets of the second SoC to take over execution of the workloads. As provided herein, the nominal operating parameters can correspond to nominal temperature ranges, voltage ranges, or any faults, failures, or errors on the first SoC.
In various implementations, the sensor data input chiplet, the central chiplet, and the one or more workload processing chiplets can communicate over a performance network comprising a plurality of network hubs. The performance network can comprise a high-bandwidth network for the transmission of raw sensor data, processed sensor data, and messages. In further implementations, the FuSa program can monitor communications through the plurality of network hubs between the sensor data input chiplet, the central chiplet, and the one or more workload processing chiplets. For example, the FuSa program can monitor communications through the plurality of network hubs using a set of FuSa accounting hubs that communicate over a high-reliability FuSa network.
In certain examples, the central chiplet can comprise a dedicated FuSa CPU executing the FuSa program to communicate over the performance network via a performance network-on-chip (NoC), and communicate over the high-reliability FuSa network via a FuSa NoC. The one or more workload processing chiplets can transmit processed sensor data corresponding to the execution of workloads to a cache memory of the central chiplet over the performance network.
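The cross-SoC monitoring described earlier in this summary, in which the second SoC reads the shared memory of the first central chiplet and takes over the workloads when the first SoC leaves its nominal operating parameters, can be illustrated with a minimal Python sketch. The record layout, the threshold values, and the names `HealthRecord`, `NominalLimits`, `is_nominal`, and `monitor_step` are hypothetical and are not taken from the disclosure; the sketch only shows the decision logic, not how shared memory is read over the interconnect.

```python
from dataclasses import dataclass


@dataclass
class HealthRecord:
    """Hypothetical snapshot the first SoC publishes in its shared memory."""
    temperature_c: float
    voltage_v: float
    fault_count: int


@dataclass(frozen=True)
class NominalLimits:
    """Hypothetical nominal ranges; real limits would come from the SoC design."""
    max_temperature_c: float = 105.0
    min_voltage_v: float = 0.72
    max_voltage_v: float = 0.88


def is_nominal(record: HealthRecord, limits: NominalLimits) -> bool:
    """True when temperature, voltage, and fault state are all within limits."""
    return (
        record.temperature_c <= limits.max_temperature_c
        and limits.min_voltage_v <= record.voltage_v <= limits.max_voltage_v
        and record.fault_count == 0
    )


def monitor_step(record: HealthRecord, limits: NominalLimits,
                 backup_active: bool) -> bool:
    """Return True if the second SoC should be running the workloads.

    Once the second SoC has taken over, this sketch keeps it in control
    (an assumption; the source does not describe handing control back).
    """
    if backup_active:
        return True
    return not is_nominal(record, limits)


if __name__ == "__main__":
    limits = NominalLimits()
    active = False
    for snapshot in (
        HealthRecord(70.0, 0.80, 0),   # nominal
        HealthRecord(120.0, 0.80, 0),  # overheating
        HealthRecord(70.0, 0.80, 0),   # nominal again
    ):
        active = monitor_step(snapshot, limits, active)
        print(active)
```

Keeping the nominal check as a pure function of the snapshot makes it straightforward to run the same logic on either SoC, which matches the statement that the FuSa program is included in both central chiplets.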
The workload processing chiplets can further transmit a first error correction code (ECC) along the high-reliability FuSa network to the central chiplet based on the processed sensor data. Upon receiving the processed sensor data, the central chiplet can generate a second ECC using the processed sensor data. The FuSa CPU of the central chiplet may then perform a functional safety call in the central chiplet to verify that the first ECC and the second ECC match to ensure that the processed data was transmitted correctly.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure herein is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements, and in which:
FIG. 1 is a block diagram depicting an example computing system in which embodiments described herein may be implemented, in accordance with examples described herein;
FIG. 2 is a block diagram depicting a system-on-chip (SoC) in which examples described herein may be implemented, in accordance with examples described herein;
FIG. 3 is a block diagram illustrating an example central chiplet of an SoC arrangement for executing w