CN-122019464-A - Satellite-borne system reliability protection method, equipment and storage medium based on software and hardware cooperation

Abstract

The invention relates to the technical field of satellite-borne computers, and in particular to a method, equipment, and a storage medium for protecting the reliability of a satellite-borne system based on software and hardware cooperation. The method comprises the following steps: S1, in response to a configuration instruction from a satellite-borne processor, an FPGA (field-programmable gate array) is dynamically reconfigured between a collaborative reinforcement mode and an independent operation mode; S2, in the collaborative reinforcement mode, the FPGA accesses a plurality of storage copies in parallel, applies a multi-bit dynamic voting mechanism with a dynamically configurable number of voting bits, performs real-time hardware-level voting on the read data or state feedback signals to obtain a voting result, and submits the voting result to the satellite-borne processor; and S3, in the independent operation mode, the satellite-borne processor schedules a background task and performs data inspection, verification, and error correction on a single designated storage copy through a single-path operation interface provided by the FPGA. Through dynamic mode switching and hardware-level synchronous voting, the method resolves the timing asynchrony of redundant devices and improves on-orbit autonomous maintenance and adaptability.
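A minimal software sketch of the S2 voting step may help make the mechanism concrete. It follows the per-bit threshold rule stated in claim 4 and the state-feedback rule in claim 5, but it is only an illustration under stated assumptions: the names `vote_word`, `vote_status`, and `VoteResult` are hypothetical, the threshold is assumed to be a strict majority derived from the configured voting parameter, and real hardware would evaluate all bits in parallel rather than in a loop.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class VoteResult:
    value: int                 # voted word; undecided bits are left as 0
    undecided_bits: List[int]  # bit positions where no value reached the threshold
    diff_masks: List[int]      # per copy: decided bits that differ from the voted word


def vote_word(copies: Sequence[int], threshold: int, width: int = 8) -> VoteResult:
    """Vote bit by bit over the words read from each storage copy."""
    n = len(copies)
    # Assumption: a strict-majority threshold, so 0 and 1 cannot both win a bit.
    if not n / 2 < threshold <= n:
        raise ValueError("threshold must be a strict majority of the copies")
    value, decided_mask, undecided = 0, 0, []
    for bit in range(width):
        ones = sum((word >> bit) & 1 for word in copies)
        if ones >= threshold:          # enough copies agree on 1
            value |= 1 << bit
            decided_mask |= 1 << bit
        elif n - ones >= threshold:    # enough copies agree on 0
            decided_mask |= 1 << bit
        else:                          # no value reached the threshold
            undecided.append(bit)
    # Record where each copy differs from the voted result (decided bits only).
    diffs = [(word ^ value) & decided_mask for word in copies]
    return VoteResult(value, undecided, diffs)


def vote_status(statuses: Sequence[str], threshold: int) -> Tuple[Optional[str], List[int]]:
    """Vote on state feedback; return (agreed state or None, indices of outlier copies)."""
    winner, count = Counter(statuses).most_common(1)[0]
    outliers = [i for i, s in enumerate(statuses) if s != winner]
    return (winner if count >= threshold else None), outliers
```

With three copies and a threshold of 2, a single flipped bit in one copy is outvoted and shows up in that copy's difference mask. Raising the threshold to 3 leaves the disputed bit undecided instead, which is the kind of trade-off a configurable voting parameter exposes.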

Inventors

  • SU JIAWEI
  • SHUANG XIAOCHUAN
  • WEI JIE
  • ZHU WENXING
  • WANG ZHIGUO
  • FENG SHUYI
  • TIAN WENBO

Assignees

  • Shanghai Aerospace Electronic Communication Equipment Research Institute (上海航天电子通讯设备研究所)

Dates

Publication Date
2026-05-12
Application Date
2026-02-03

Claims (10)

  1. A satellite-borne system reliability protection method based on software and hardware cooperation, characterized by comprising the following steps: S1, in response to a configuration instruction from a satellite-borne processor, dynamically reconfiguring a programmable logic device (FPGA) between a collaborative reinforcement mode and an independent operation mode; S2, in the collaborative reinforcement mode, the FPGA accesses a plurality of storage copies in a multi-copy storage system in parallel, applies a multi-bit dynamic voting mechanism with a dynamically configurable number of voting bits, performs real-time hardware-level voting on the read data or state feedback signals from the plurality of storage copies to obtain a voting result, and submits the voting result to the satellite-borne processor; and S3, in the independent operation mode, the satellite-borne processor schedules a background task and performs data inspection, verification, and error correction on a single designated storage copy in the multi-copy storage system through a single-path operation interface provided by the FPGA.
  2. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 1, wherein in step S1, dynamically reconfiguring the FPGA between the collaborative reinforcement mode and the independent operation mode comprises: S11, the FPGA safely terminates all ongoing memory access operations; S12, the FPGA receives and parses the configuration instruction from the satellite-borne processor and extracts a target mode identifier and mode parameters; S13, the FPGA reconstructs its internal data path and control logic according to the target mode identifier and, if switching to the collaborative reinforcement mode, activates the parallel paths connected to all storage copies, configures a hardware voter according to the number of voting bits in the mode parameters, and enables a synchronous timing engine; and S14, after the reconstruction is completed, the FPGA sends a mode-switch completion confirmation to the satellite-borne processor.
  3. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 1, wherein in step S2, performing real-time hardware-level voting on the read data from the plurality of storage copies to obtain a voting result comprises: S21, the FPGA dynamically sets a decision rule of a hardware voter according to the number of voting bits in the configuration instruction; S22, when a storage access request from the satellite-borne processor is received, the synchronous timing engine of the FPGA generates strictly synchronized control signals and sends them to all storage copies in parallel; S23, the FPGA collects the response of each storage copy in parallel and, for a read operation, collects the read data of each storage copy; and S24, the hardware voter performs real-time voting on the read data or the state feedback signals according to the decision rule and generates the voting result.
  4. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 3, wherein in step S24, performing real-time voting on the read data comprises: starting the hardware voter when a preset timeout is reached or when the data of all storage copies are ready; the hardware voter compares the read data of the storage copies in parallel, bit by bit or word by word; each data bit is voted on according to the voting threshold determined by the decision rule, wherein if, for a given data bit, the number of copies holding the same value reaches the voting threshold, that value is taken as the valid value of the bit; and after the bitwise voting, the voting result is generated and the difference between each copy's data and the voting result is recorded.
  5. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 3, wherein in step S24, performing real-time voting on the state feedback signals comprises: the FPGA continuously acquires the state feedback signals returned by all storage copies; the hardware voter votes on the operation state feedback signals according to the voting threshold determined by the decision rule; if the number of copies with a consistent state reaches the voting threshold, the operation is judged to be complete and the consistent state is output as the voting result; and if the number of copies with a consistent state does not reach the voting threshold, the operation is judged not to have completed synchronously, and the copies with inconsistent states are marked or an error-handling flow is triggered.
  6. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 1, wherein in step S3, performing data inspection, verification, and error correction on a single designated storage copy in the multi-copy storage system comprises: S31, the satellite-borne processor schedules a low-priority background inspection task, sends an instruction to the FPGA to switch to the independent operation mode, and designates the single storage copy to be inspected; S32, the satellite-borne processor reads the data of the designated storage copy into a memory buffer block by block through the single-path operation interface of the FPGA; S33, the satellite-borne processor verifies the check code and decodes the error correction code of the read data at the software layer and judges the state of the data block; S34, if the data block has a correctable error, corrected data are generated and written back to the original storage location through the single-path interface of the FPGA; and S35, the health status record of the storage copy and the system log are updated.
  7. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 6, wherein in step S33, verifying the check code and decoding the error correction code of the read data at the software layer and judging the state of the data block comprises: S331, computing a check code over the body of the data block to generate a real-time checksum, and comparing it with the originally stored checksum extracted from the data block to verify data integrity; S332, performing a decoding operation based on the error correction code associated with the data block, and judging from the decoding result whether the data block is in an error-free state, a correctable error state, or an uncorrectable error state; S333, combining the check code verification result and the error correction code decoding result to judge the final state of the data block; and S334, performing state processing according to the final state.
  8. The satellite-borne system reliability protection method based on software and hardware cooperation according to claim 7, wherein in step S334, performing state processing according to the final state comprises: if the data block is in a correctable error state, correcting the erroneous bits in the memory buffer according to the position and correct-value information provided by the error correction code decoding result, and generating a corrected data block; and if the data block is in an uncorrectable error state, triggering an alarm and either attempting to recover the data from the other redundant copies or marking the corresponding memory block as a bad block.
  9. A computer device comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the satellite-borne system reliability protection method based on software and hardware cooperation as claimed in any one of claims 1 to 8.
  10. A storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the satellite-borne system reliability protection method based on software and hardware cooperation as claimed in any one of claims 1 to 8.
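The software-side inspection of claims 6 to 8 (steps S331 to S334) can be sketched as follows. This is a hedged illustration, not the patented implementation: `zlib.crc32` stands in for the unspecified check code, a toy position-XOR code with one overall parity bit stands in for the unspecified error correction code, the stored check bits are assumed to be intact, and the rule used to combine the two results in S333 is an assumption. The names `BlockState`, `ecc_encode`, `ecc_decode`, and `scrub_block` are hypothetical.

```python
import zlib
from enum import Enum
from typing import Optional, Tuple


class BlockState(Enum):
    OK = "error-free"
    CORRECTABLE = "correctable error"
    UNCORRECTABLE = "uncorrectable error"


def ecc_encode(data: bytes) -> Tuple[int, int]:
    """Toy ECC: XOR of the 1-based positions of all set bits, plus overall parity."""
    syndrome, parity = 0, 0
    for pos in range(len(data) * 8):
        if (data[pos // 8] >> (pos % 8)) & 1:
            syndrome ^= pos + 1
            parity ^= 1
    return syndrome, parity


def ecc_decode(data: bytes, stored: Tuple[int, int]) -> Tuple[BlockState, Optional[int]]:
    """S332: classify the block; for a single-bit error, also return its position."""
    syndrome, parity = ecc_encode(data)
    syndrome ^= stored[0]
    parity ^= stored[1]
    if syndrome == 0 and parity == 0:
        return BlockState.OK, None
    if parity == 1 and 1 <= syndrome <= len(data) * 8:
        return BlockState.CORRECTABLE, syndrome - 1
    return BlockState.UNCORRECTABLE, None


def scrub_block(data: bytes, stored_crc: int,
                stored_ecc: Tuple[int, int]) -> Tuple[BlockState, bytes]:
    """S331 to S334 for one block: return the final state and the data to keep."""
    crc_ok = zlib.crc32(data) == stored_crc        # S331: integrity check
    state, bit = ecc_decode(data, stored_ecc)      # S332: ECC decoding
    # S333 (assumed rule): a result is trusted only if the CRC confirms it.
    if state is BlockState.OK:
        return (BlockState.OK if crc_ok else BlockState.UNCORRECTABLE), data
    if state is BlockState.CORRECTABLE:
        fixed = bytearray(data)
        fixed[bit // 8] ^= 1 << (bit % 8)          # S334: flip the bad bit in the buffer
        if zlib.crc32(bytes(fixed)) == stored_crc:
            return BlockState.CORRECTABLE, bytes(fixed)
    return BlockState.UNCORRECTABLE, data
```

A caller would write the returned data back through the single-path interface when the state is `CORRECTABLE` (S34), and raise an alarm, try another redundant copy, or mark the block bad when it is `UNCORRECTABLE`. Re-checking the CRC after correction guards against miscorrection when more than one bit has flipped.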

Description

Satellite-borne system reliability protection method, equipment and storage medium based on software and hardware cooperation

Technical Field

The invention relates to the technical field of satellite-borne computers, and in particular to a method, equipment, and a storage medium for protecting the reliability of a satellite-borne system based on software and hardware cooperation.

Background

To support space networking and routing exchange, requirements have emerged for space-earth integrated situation calling, inter-satellite interconnection, intelligent aerospace, and the like. Conventional on-board computer systems usually combine commercial processors (such as PowerPC and ARM) with radiation-hardening measures, but because of the particular conditions of the space environment (such as single-event effects and total-dose radiation), the reliability of commercial chips can hardly meet the requirements of long-term on-orbit missions. In addition, conventional on-board systems face the following technical bottlenecks in storage management, task scheduling, and fault-tolerant design:

1) Insufficient storage system reliability: Conventional on-board storage systems typically use triple modular redundancy (TMR) or software verification mechanisms (e.g., EDAC) to improve data reliability. However, hardware TMR schemes suffer from inconsistent device response times, which leads to asynchronous memory operations and may cause erroneous state transitions or data inconsistencies. Software verification (such as cyclic redundancy check (CRC) and Hamming codes) consumes a large amount of CPU resources, reducing read-write efficiency (usually to 25%-50% of the original performance).
2) Insufficient protection against single-event effects (SEE): High-energy particles in space may cause a single-event upset (SEU) or single-event latch-up (SEL) in a memory cell. Conventional schemes rely on hardware redundancy (such as antifuse FPGAs) or software error correction mechanisms, but cannot adapt dynamically to different space radiation environments. For example, during peaks of solar activity the probability of single-event effects rises sharply, while a conventional fixed redundancy mode cannot optimize resource usage.
3) Localization and autonomous controllability requirements: In recent years, the domestic substitution of aerospace electronic systems has become an inevitable trend. However, domestic high-performance aerospace-grade processors (such as MPSoC devices) still need improvement in radiation resistance, multi-core scheduling optimization, and other aspects, and conventional schemes can hardly meet the requirements of long-life missions such as future deep space exploration and high-orbit satellites.
4) Insufficient on-orbit reconfigurability: Conventional on-board systems typically use a fixed architecture that cannot adjust storage management policies or fault-tolerant modes on orbit. For example, when the performance of a memory chip degrades because of radiation damage, the system cannot switch dynamically to a more reliable redundancy mode and must rely on ground intervention, which reduces its autonomous survivability.

To solve these problems, the invention provides a highly reliable, efficient, and dynamically adaptable satellite-borne system reliability protection scheme. By combining a domestic hardware platform with software-hardware co-optimization, it improves the long-term stable operation of the satellite-borne computer in a complex space environment.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art and provides a satellite-borne system reliability protection method based on software and hardware cooperation, comprising the following steps:

S1, in response to a configuration instruction from a satellite-borne processor, dynamically reconfiguring a programmable logic device (FPGA) between a collaborative reinforcement mode and an independent operation mode;
S2, in the collaborative reinforcement mode, the FPGA accesses a plurality of storage copies in a multi-copy storage system in parallel, applies a multi-bit dynamic voting mechanism with a dynamically configurable number of voting bits, performs real-time hardware-level voting on the read data or state feedback signals from the plurality of storage copies to obtain a voting result, and submits the voting result to the satellite-borne processor; and
S3, in the independent operation mode, the satellite-borne processor schedules a background task and performs data inspection, verification, and error correction on a single designated storage copy in the multi-copy storage system through a single-path operation interface provided by the FPGA.

Preferably, in step S1, the progr