CN-121979720-A - Satellite load FPGA configuration data error correction method and system for resisting single event upset

Abstract

The invention belongs to the technical field of aerospace electronic integrated circuits and relates in particular to a single-event-upset-resistant method and system for correcting satellite payload FPGA configuration data. On top of two-out-of-three majority voting logic, the method cascades a two-layer (bit-level and symbol-level) error correction scheme that uses an RS decoding operator based on finite field algebra, so that both independent single event upsets and multi-copy common-mode failures can be corrected, which solves the error-locking problem of conventional TMR. In addition, when configuration data is written into the nonvolatile memory, a cross-sector interleaved storage mechanism places the symbols of the same RS codeword as far apart as possible physically, so that clustered physical damage is converted into scattered symbol-level errors.
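The effect of the cross-sector storage mechanism can be illustrated with a minimal sketch. The geometry here is an assumption made for illustration only: the source does not give sector counts or the actual remapping function, so `CODEWORDS_PER_GROUP`, `remap`, and `symbols_hit_per_codeword` are hypothetical names, and the one-sector-per-symbol layout is just one simple way to keep symbols of the same codeword apart.

```python
# Hypothetical geometry: the source does not specify sector counts or layout.
CODEWORD_LEN = 255          # bytes per RS codeword (value taken from the claims)
CODEWORDS_PER_GROUP = 64    # assumed number of codewords interleaved together


def remap(logical_addr: int) -> tuple[int, int]:
    """Map a logical byte address to a (sector, offset) pair.

    Symbol i of codeword c is placed in sector i at offset c, so no two
    symbols of the same codeword ever share a sector.
    """
    codeword, symbol = divmod(logical_addr, CODEWORD_LEN)
    if not 0 <= codeword < CODEWORDS_PER_GROUP:
        raise ValueError("address outside the interleave group")
    return symbol, codeword


def symbols_hit_per_codeword(damaged_sector: int) -> dict[int, int]:
    """Count how many symbols of each codeword lie in one damaged sector."""
    hits: dict[int, int] = {}
    for addr in range(CODEWORDS_PER_GROUP * CODEWORD_LEN):
        sector, _ = remap(addr)
        if sector == damaged_sector:
            codeword = addr // CODEWORD_LEN
            hits[codeword] = hits.get(codeword, 0) + 1
    return hits


# Losing a whole sector costs every codeword exactly one symbol.
hits = symbols_hit_per_codeword(damaged_sector=3)
worst_case = max(hits.values())
```

Under this assumed layout, damage confined to one sector appears to the RS decoder as a single symbol error in each affected codeword rather than as a long burst inside one codeword.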

Inventors

NING YAFEI
LI ZIRUI
WANG YONG

Assignees

Shandong University (山东大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-04-07

Claims (10)

1. A single-event-upset-resistant method for correcting satellite payload FPGA configuration data, characterized by comprising the following steps: S1, grouping the original configuration bitstream data of the satellite payload FPGA into information segments of a preset number of bytes, performing an encoding operation on each information segment to generate a first number of bytes of check symbols, and constructing RS-encoded configuration data of a second number of bytes; S2, constructing a mapping between logical addresses and physical addresses through address remapping, and writing consecutive RS-encoded configuration data dispersedly into different physical sectors; S3, synchronously reading, in parallel, the data at the same address of three NOR Flash chips, and performing a triple modular redundancy operation on each bit of the data to obtain a first data stream; S4, performing an RS decoding operation on the first data stream in units of the second number of bytes to obtain a second data stream; and S5, writing back repaired data based on the second data stream, and reporting the single event upset event and the error correction status.
2. The single-event-upset-resistant method for correcting satellite payload FPGA configuration data according to claim 1, wherein S1 specifically comprises: grouping the original configuration bitstream data of the satellite payload FPGA into information segments of 239 bytes, performing an encoding operation on each information segment to generate 16 bytes of check symbols, and constructing 255 bytes of RS-encoded configuration data.
3. The single-event-upset-resistant method for correcting satellite payload FPGA configuration data according to claim 1, wherein S2 specifically comprises: for the three NOR Flash chips, dividing each NOR Flash chip into an upper half area and a lower half area; based on the RS-encoded configuration data, splicing several groups of consecutive RS codewords into standard storage words according to byte index; and, after the mapping between logical addresses and physical addresses has been constructed through address remapping, writing the standard storage words sequentially into the corresponding half-area addresses of the three NOR Flash chips according to the mapped addresses.
4. The single-event-upset-resistant method for correcting satellite payload FPGA configuration data according to claim 1, wherein S4 specifically comprises: when an error exists, constructing an error locator polynomial, solving for the error positions and error values, and performing byte-level correction provided that the number of erroneous symbols is less than 8.
5. The single-event-upset-resistant method for correcting satellite payload FPGA configuration data according to claim 1, wherein in S3, when performing the triple modular redundancy operation on each bit of the data, if the three data paths are found to be inconsistent, the majority value is output to complete single event upset error correction, and a single event upset event count is recorded.
6. The single-event-upset-resistant method for correcting satellite payload FPGA configuration data according to claim 1, wherein writing back repaired data based on the second data stream in S5 specifically comprises: comparing the data read from the three NOR Flash chips with the second data stream, performing write-back correction on the upper half area of the NOR Flash, and keeping the lower half area read-only.
7. A single-event-upset-resistant system for correcting satellite payload FPGA configuration data, characterized by comprising: an RS encoding module, configured to group the original configuration bitstream data of the satellite payload FPGA into information segments of a preset number of bytes, perform an encoding operation on each information segment to generate a first number of bytes of check symbols, and construct RS-encoded configuration data of a second number of bytes; a cross-sector interleaved storage module, configured to construct a mapping between logical addresses and physical addresses through address remapping and to write consecutive RS-encoded configuration data dispersedly into different physical sectors; a first error correction module, configured to synchronously read, in parallel, the data at the same address of three NOR Flash chips and to perform a triple modular redundancy operation on each bit of the data to obtain a first data stream; a second error correction module, configured to perform an RS decoding operation on the first data stream in units of the second number of bytes to obtain a second data stream; and a repair module, configured to write back repaired data based on the second data stream and to report the single event upset event and the error correction status.
8. The single-event-upset-resistant system for correcting satellite payload FPGA configuration data according to claim 7, wherein the RS encoding module is specifically configured to: group the original configuration bitstream data of the satellite payload FPGA into information segments of 239 bytes, perform an encoding operation on each information segment to generate 16 bytes of check symbols, and construct 255 bytes of RS-encoded configuration data.
9. The single-event-upset-resistant system for correcting satellite payload FPGA configuration data according to claim 7, wherein the cross-sector interleaved storage module is specifically configured to: for the three NOR Flash chips, divide each NOR Flash chip into an upper half area and a lower half area; based on the RS-encoded configuration data, splice several groups of consecutive RS codewords into standard storage words according to byte index; and, after the mapping between logical addresses and physical addresses has been constructed through address remapping, write the standard storage words sequentially into the corresponding half-area addresses of the three NOR Flash chips according to the mapped addresses.
10. The single-event-upset-resistant system for correcting satellite payload FPGA configuration data according to claim 7, wherein the second error correction module is specifically configured to: when an error exists, construct an error locator polynomial, solve for the error positions and error values, and perform byte-level correction provided that the number of erroneous symbols is less than 8.
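The segment sizes recited in claims 2 and 8 can be checked with a short sketch of the grouping step. It covers only the bookkeeping around the encoder: the RS encoder itself is omitted because the source does not give the field polynomial or generator roots. The helper names `split_segments` and `encoded_size`, and the `0xFF` padding of the last segment, are assumptions rather than details from the source.

```python
K = 239              # information bytes per segment (claims 2 and 8)
N = 255              # bytes per RS codeword (claims 2 and 8)
PARITY = N - K       # 16 check symbols per codeword
T = PARITY // 2      # standard bound for a code with 16 check symbols


def split_segments(bitstream: bytes, k: int = K, pad: int = 0xFF) -> list[bytes]:
    """Split a bitstream into k-byte information segments.

    The last segment is padded to k bytes. The pad value is an assumption;
    the source does not say how a short final segment is handled.
    """
    segments = []
    for start in range(0, len(bitstream), k):
        chunk = bitstream[start:start + k]
        segments.append(chunk + bytes([pad]) * (k - len(chunk)))
    return segments


def encoded_size(bitstream_len: int) -> int:
    """Bytes stored per copy once every segment carries its check symbols."""
    num_segments = -(-bitstream_len // K)   # ceiling division
    return num_segments * N


segments = split_segments(bytes(500))       # 500 zero bytes as dummy input
stored = encoded_size(500)
```

For a 500-byte input this gives three segments and 765 stored bytes, a storage overhead of 16/239 (about 6.7%) per full segment. The constant `T` reflects the standard Reed-Solomon bound of floor((n - k) / 2) correctable symbol errors per codeword; the claims state their own condition on the number of erroneous symbols.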

Description

Satellite load FPGA configuration data error correction method and system for resisting single event upset

Technical Field

The invention belongs to the technical field of aerospace electronic integrated circuits and relates in particular to a single-event-upset-resistant method and system for correcting satellite payload FPGA configuration data.

Background

In aerospace electronic systems, SRAM-based FPGAs are widely used in satellite control and various payload systems because they are reconfigurable, and they are key programmable logic devices for executing core functions. Because the configuration data of an SRAM-based FPGA is stored in volatile memory cells, its configuration bitstream is lost on power-down, and the system must rely on an external nonvolatile memory to reload it at power-up. The safe storage and reliable loading of configuration data in the nonvolatile memory chip is therefore an important precondition for stable operation of the whole satellite system.

In conventional non-aerospace engineering, an SPI Flash memory chip is usually connected directly to the FPGA, and the FPGA actively reads the configuration data after power-up to complete loading. This architecture, however, has significant limitations in high-reliability aerospace applications. On the one hand, high-energy particles in the space environment may induce single event upsets (Single Event Upset, SEU) that randomly flip bits of the configuration bitstream stored in Flash. The conventional SPI direct-connection architecture lacks an effective data detection and error correction mechanism, so once configuration data in the memory chip is flipped or damaged, the system can hardly identify and recover the erroneous data, which leads to loading verification failure or abnormal function.
On the other hand, a single event upset can also act directly on the configuration network or logic cells of a running FPGA and cause functional interruption or state failure of the digital integrated circuit. In an architecture without an independent configuration management chip, recovering from such an abnormal state usually requires power-cycling the whole board and reloading, so on-orbit autonomous online reconfiguration and fault self-healing of the satellite payload FPGA cannot be achieved, which seriously affects the continuity and safety of the space mission.

Existing aerospace systems generally use an independent configuration management chip with three external NOR Flash memory chips as the storage area for configuration data (that is, the bitstream the FPGA runs). Each Flash is logically divided into an upper and a lower independent half area, forming six mutually independent configuration image areas that store six identical copies of the configuration data. During configuration loading, single-copy errors are corrected by triple modular redundancy (Triple Modular Redundancy, TMR) two-out-of-three voting, and erroneous copies can be written back and repaired. However, in the complex space radiation environment, factors such as high-energy particle beams, secondary particle showers, and the physical adjacency of integrated circuit device layouts may cause errors in multiple memory chips that are correlated in time and space, the so-called "common-mode failure". In this case, the same logical bit may flip simultaneously in two or more redundant copies. When the same bit position flips in two copies, TMR majority voting outputs an erroneous result.
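The voting behavior described above, including the common-mode case, can be reproduced with a minimal sketch. The function names `tmr_vote` and `mismatch_mask` and the 8-bit example word are illustrative assumptions; the source does not specify word widths or an implementation.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority of three words."""
    return (a & b) | (a & c) | (b & c)


def mismatch_mask(a: int, b: int, c: int) -> int:
    """Bits at which the three copies are not all identical."""
    return (a ^ b) | (a ^ c)


correct = 0b1011_0010   # hypothetical configuration byte
flip = 0b0000_0001      # a single-bit upset at bit 0

# Independent upset: one copy is hit, and the vote recovers the correct word.
single = tmr_vote(correct ^ flip, correct, correct)

# Common-mode upset: two copies are hit at the same bit position,
# and the vote returns the flipped (wrong) word.
common = tmr_vote(correct ^ flip, correct ^ flip, correct)
```

In both cases `mismatch_mask` flags bit 0 as inconsistent, but the voter alone cannot tell which side of the disagreement is correct. In the common-mode case a write-back of `common` would overwrite the one remaining good copy.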
Because TMR is essentially a memoryless combinational decision mechanism, once an erroneous value has been voted and output, the subsequent three-copy write-back repair writes the erroneous result back into the originally correct copy, so the three copies become consistent, but consistent on the wrong value, forming an "error locking" phenomenon. Once such a common-mode flip occurs and the write-back has propagated, the system loses the ability to recover the correct data through the redundant structure, resulting in permanent configuration corruption. For satellite payload systems that lack on-orbit reconfiguration capability, such damage often leads directly to mission failure. Therefore, although the conventional TMR architecture can effectively suppress independently distributed single-copy SEUs, its decision mechanism has a structural defect when facing same-position flips in multiple copies. The defect arises because the majority voting mechanism lacks any measure of the reliability of its decision and lacks redundancy checks across symbols or across data blocks; reliability is improved only by relying on the number of physical copies without introducing hi