
CN-121996469-A - Memory dynamic error correction method and system supporting hybrid check mode


Abstract

The invention discloses a memory dynamic error correction method and system supporting a hybrid check mode. The method comprises: allocating a predefined number of check structures to form an initial linked list; writing, for each item of protected data, the memory start address, the data length, the error correction code array, the cyclic redundancy check (CRC) code and the self-check CRC code, together with a linked-list pointer to the next check structure of the same functional block, into the initial linked list to obtain a final linked list; checking in real time, in each program interrupt thread and using a preset error correction code check mode, whether the protected data corresponding to that interrupt thread or the data of the check structure itself is abnormal, and setting an error flag when an abnormality occurs; and, in the program main-loop thread, using a preset CRC check mode to check the data based on the final linked list and correct the protected data marked with the error flag. The invention offers controllable cost, efficient resource utilization, a wide range of application, and high safety and stability.
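The two-tier flow described in the abstract can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: memory is modelled as a `bytearray`, only the direct storage mode is shown, and the names `CheckStruct`, `State`, `register`, `interrupt_check` and `main_loop_correct` are hypothetical. The layout covered by the self-check CRC is likewise an assumption.

```python
import zlib
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    NORMAL = 0            # checking enabled
    NEEDS_CORRECTION = 1  # suspended until the main loop repairs the data
    RECALCULATING = 2     # error correction code being recalculated (unused here)


@dataclass
class CheckStruct:
    start: int                            # start offset of the protected data
    length: int                           # length of the protected data in bytes
    backup: bytes                         # direct-storage "error correction code array"
    data_crc: int                         # CRC of the protected data
    self_crc: int = 0                     # CRC over this structure's own fields
    state: State = State.NORMAL
    next: Optional["CheckStruct"] = None  # next structure of the same functional block

    def compute_self_crc(self) -> int:
        # Assumed layout: the self-check CRC covers address, length, data CRC and backup.
        header = f"{self.start}:{self.length}:{self.data_crc}".encode()
        return zlib.crc32(header + self.backup)


def register(memory: bytearray, start: int, length: int,
             head: Optional[CheckStruct] = None) -> CheckStruct:
    """Create a check structure for memory[start:start+length] and prepend it to the list."""
    data = bytes(memory[start:start + length])
    node = CheckStruct(start, length, backup=data, data_crc=zlib.crc32(data), next=head)
    node.self_crc = node.compute_self_crc()
    return node


def interrupt_check(memory: bytearray, head: Optional[CheckStruct]) -> None:
    """Fast pass for an interrupt thread: flag abnormal entries, never repair them."""
    node = head
    while node is not None:
        if node.state is State.NORMAL:
            current = bytes(memory[node.start:node.start + node.length])
            if node.compute_self_crc() != node.self_crc or current != node.backup:
                node.state = State.NEEDS_CORRECTION
        node = node.next


def main_loop_correct(memory: bytearray, head: Optional[CheckStruct]) -> int:
    """Slow pass for the main loop: repair flagged entries and confirm each repair by CRC."""
    repaired = 0
    node = head
    while node is not None:
        struct_ok = node.compute_self_crc() == node.self_crc
        if node.state is State.NEEDS_CORRECTION and struct_ok:
            end = node.start + node.length
            memory[node.start:end] = node.backup
            if zlib.crc32(memory[node.start:end]) == node.data_crc:
                node.state = State.NORMAL
                repaired += 1
        node = node.next
    return repaired
```

In this sketch the interrupt pass does only cheap comparisons and sets a flag, while the repair and CRC confirmation are deferred to the main loop, which mirrors the division of work the abstract describes. Repair of a corrupted check structure through a self-check structure is not modelled.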

Inventors

  • SUN BO
  • BAI SHIQI
  • ZHANG KAIXUAN
  • YUAN HAITAO
  • XIAO ZHENGQIANG
  • SHEN WEI

Assignees

  • 北京四方继保工程技术有限公司
  • 北京四方继保自动化股份有限公司

Dates

Publication Date
20260508
Application Date
20251231

Claims (17)

  1. A memory dynamic error correction method supporting a hybrid check mode, characterized by comprising the following steps: Step 1, allocating a predefined number of check structures to form an initial linked list, wherein each check structure comprises a memory start address of the protected data, a data length, an error correction code array, a cyclic redundancy check code, a self-check cyclic redundancy check code and a linked-list pointer to the next check structure of the same functional block, and the functional blocks comprise the program main loop, GOOSE publishing, GOOSE reception and SV reception; Step 2, generating, for each item of protected data, a corresponding error correction code array, a cyclic redundancy check code of the protected data and a self-check cyclic redundancy check code, writing the memory start address, the data length, the error correction code array, the cyclic redundancy check code and the self-check cyclic redundancy check code of each item of protected data into the initial linked list, and writing the linked-list pointer to the next check structure of the same functional block into the initial linked list to obtain a final linked list; Step 3, in each interrupt thread of the program, checking in real time, using a preset error correction code check mode, whether the protected data corresponding to the interrupt thread or the data of the check structure itself is abnormal, and setting an error flag when an abnormality occurs; and, in the main-loop thread of the program, using a preset CRC check mode to check the data based on the final linked list and correct the protected data marked with the error flag, wherein the interrupt threads comprise GOOSE publishing, GOOSE reception and SV reception.
  2. The memory dynamic error correction method according to claim 1, wherein: in Step 1, the check structure further comprises a check mode and a check state; the check mode comprises a direct storage mode and a 32-bit unsigned integer error correction code mode; the check state comprises normally enabled, suspended pending error correction, and error correction code to be recalculated.
  3. The memory dynamic error correction method according to claim 1, wherein: in Step 1, the check structures are divided into data protection check structures and self-check structures; preset requested protection data serves as the protected data corresponding to a data protection check structure, and the data protection check structure is used to check and correct the requested protection data; the data protection check structures serve as the protected data corresponding to the self-check structures, and a self-check structure is used to check and correct the data protection check structures; the number of self-check structures is calculated based on the predefined number of check structures and the maximum memory space that one structure can check and correct.
  4. The memory dynamic error correction method according to claim 3, wherein: after the number of self-check structures is calculated, the memory space occupied by the data protection check structures is divided into as many segments as there are self-check structures, each segment serves as the protected data corresponding to one self-check structure, and the memory start address and data length of each segment are filled into the corresponding self-check structure.
  5. The memory dynamic error correction method according to claim 3, wherein: in Step 2, the final linked list to which the data protection check structure corresponding to the requested protection data belongs is determined based on the functional block to which the requested protection data belongs, each functional block corresponding to one final linked list.
  6. The memory dynamic error correction method according to claim 1, 2, 3 or 4, wherein: in Step 2, generating a corresponding error correction code array for each item of protected data comprises: when the memory space occupied by an item of protected data is less than or equal to a preset memory space size threshold, the check mode of that protected data is the direct storage mode, and the protected data is backed up into the corresponding check structure as the error correction code array; when the memory space occupied by an item of protected data is greater than the memory space size threshold, the check mode of that protected data is the 32-bit unsigned integer error correction code mode, an error correction code array is calculated from the protected data, and the error correction code array is stored in the corresponding check structure.
  7. The memory dynamic error correction method according to claim 6, wherein: when the check mode corresponding to an item of requested protection data is the 32-bit unsigned integer error correction code mode, the number of data protection check structures used is determined based on the memory space occupied by the requested protection data and the memory space that one structure can check and correct.
  8. The memory dynamic error correction method according to claim 2 or 7, wherein: when the check mode of a check structure is the 32-bit unsigned integer error correction code mode, checking whether the start address and the end address of the protected data corresponding to the check structure are four-byte aligned; if they are not, extracting the unaligned data and aligning it into four-byte-aligned data; dividing the remaining part of the protected data into four-byte-aligned data; and performing bitwise exclusive OR on the four-byte-aligned data obtained by division and the four-byte-aligned data obtained from the unaligned data, according to a preset error correction code array generation rule, to generate the error correction code array of the check structure; the preset error correction code array generation rule is: the four-byte-aligned data obtained by division and the four-byte-aligned data obtained from the unaligned data are ordered and assigned serial numbers; each serial number is binary-encoded; and, for each bit of the binary encoding, all four-byte-aligned data whose serial number has the value 1 at that bit are combined by bitwise exclusive OR, the result serving as the error correction code corresponding to that bit in the error correction code array.
  9. The memory dynamic error correction method according to claim 1, 2 or 5, wherein: in Step 3, checking in real time in an interrupt thread of the program, using the preset error correction code check mode and the final linked list, whether the protected data corresponding to the interrupt thread or the data of the check structure itself is abnormal, and setting an error flag when an abnormality occurs, comprises: cyclically using each data protection check structure in the final linked list corresponding to the current interrupt thread to check whether the corresponding requested protection data is abnormal, and, before checking based on each data protection check structure, judging whether the check state of that data protection check structure is normally enabled; if it is normally enabled, recalculating the self-check cyclic redundancy check code of the data protection check structure, and, if the recalculated self-check cyclic redundancy check code is not equal to the original self-check cyclic redundancy check code, judging that the data of the data protection check structure itself is abnormal and setting its check state to suspended pending error correction; if it is not normally enabled, continuing to judge whether the check state of the next data protection check structure corresponding to the interrupt thread is normally enabled.
  10. The memory dynamic error correction method according to claim 9, wherein checking, based on each data protection check structure, whether the corresponding requested protection data is abnormal comprises: when the check mode in the data protection check structure is the 32-bit unsigned integer error correction code mode, setting the check state in the data protection check structure to error correction code to be recalculated, then recalculating the error correction code array from the protected data corresponding to the data protection check structure and judging whether the recalculated error correction code array is equal to the original error correction code array in the data protection check structure; if so, setting the check state of the data protection check structure to normally enabled and continuing to judge whether the check state of the next data protection check structure corresponding to the interrupt thread is normally enabled; when the check mode in the data protection check structure is the direct storage mode, judging whether the protected data corresponding to the data protection check structure is consistent with the protected data backed up in the error correction code array of the data protection check structure; if so, setting the check state of the data protection check structure to normally enabled and continuing to judge whether the check state of the next data protection check structure corresponding to the interrupt thread is normally enabled; if not, judging that the protected data corresponding to the data protection check structure is abnormal and setting the check state of the data protection check structure to suspended pending error correction.
  11. The memory dynamic error correction method according to claim 1, 2, 5 or 10, wherein: in Step 3, using the preset CRC check mode in the main-loop thread of the program to check the data based on the final linked list and correct the protected data marked with the error flag comprises: using the preset CRC check mode, checking, based on each data protection check structure in the final linked list corresponding to the main-loop thread, whether the corresponding requested protection data is abnormal, and correcting the abnormal requested protection data; using the preset CRC check mode, checking, based on each data protection check structure whose check state is normally enabled in the final linked list corresponding to each interrupt thread, whether the corresponding requested protection data is abnormal, and correcting the abnormal requested protection data; and, when the check state in a data protection check structure corresponding to an interrupt thread is suspended pending error correction, correcting the abnormal requested protection data based on the error correction code array in that data protection check structure.
  12. The memory dynamic error correction method according to claim 11, wherein, before checking whether the corresponding requested protection data is abnormal based on each data protection check structure in the final linked list corresponding to the main-loop thread and/or an interrupt thread and correcting the abnormal requested protection data, and/or before correcting the abnormal requested protection data based on the error correction code array in a data protection check structure whose check state is suspended pending error correction, the method further comprises: recalculating the self-check cyclic redundancy check code of the data protection check structure and judging whether the recalculated self-check cyclic redundancy check code is equal to the original self-check cyclic redundancy check code in the data protection check structure; if so, checking whether the corresponding requested protection data is abnormal based on the data protection check structure and correcting the abnormal requested protection data, or correcting the abnormal requested protection data based on the error correction code array in the data protection check structure; if not, judging whether the recalculated self-check cyclic redundancy check code of the self-check structure is equal to its original self-check cyclic redundancy check code; if not, returning a restart-required result; if so, recalculating the cyclic redundancy check code of the protected data of the self-check structure; judging whether the recalculated cyclic redundancy check code of the protected data of the self-check structure is equal to the original cyclic redundancy check code of the protected data; if so, checking whether the corresponding protected data is abnormal based on the data protection check structure and correcting the abnormal protected data, or correcting the abnormal protected data based on the error correction code array in the data protection check structure; if not, recalculating the error correction code array of the self-check structure and performing exclusive OR on the recalculated error correction code array and the original error correction code array; if the exclusive OR result is 0, returning a restart-required result; if the exclusive OR result is not 0, determining the abnormal position in the data protection check structure based on the exclusive OR result and flipping the value at that position, then recalculating the cyclic redundancy check code of the protected data of the self-check structure and judging whether it is equal to the original cyclic redundancy check code of the protected data; if not, returning a restart-required result; if so, checking and correcting the abnormal protected data based on the data protection check structure.
  13. The memory dynamic error correction method according to claim 11, wherein using the preset CRC check mode to check, based on a data protection check structure, whether the corresponding requested protection data is abnormal and to correct the abnormal requested protection data comprises: recalculating the cyclic redundancy check code of the protected data of the data protection check structure and judging whether the recalculated cyclic redundancy check code is equal to the original cyclic redundancy check code of the protected data; if so, judging that no error correction is needed; if not, setting the check state of the data protection check structure to suspended pending error correction, recalculating the error correction code array in the data protection check structure and performing exclusive OR on the recalculated error correction code array and the original error correction code array; if the exclusive OR result is 0, returning a restart-required result; after the value at the abnormal position has been flipped, recalculating the cyclic redundancy check code of the protected data of the data protection check structure once more and judging whether it is equal to the original cyclic redundancy check code of the protected data; if so, judging that error correction has succeeded and setting the check state in the data protection check structure to normally enabled; if not, judging that error correction has failed and returning a restart-required result.
  14. The memory dynamic error correction method according to claim 11, wherein, when the check state in a data protection check structure corresponding to an interrupt thread is suspended pending error correction, correcting the abnormal requested protection data based on the error correction code array in the data protection check structure comprises: when the check mode in the data protection check structure is the direct storage mode, comparing the abnormal protected data byte by byte with the protected data backed up in the error correction code array of the data protection check structure and restoring it; when the check mode in the data protection check structure is the 32-bit unsigned integer error correction code mode, recalculating the error correction code array of the data protection check structure and performing exclusive OR on the recalculated error correction code array and the original error correction code array; if the exclusive OR result is 0, returning a restart-required result; if the exclusive OR result is not 0, determining the abnormal position in the protected data corresponding to the data protection check structure based on the exclusive OR result and flipping the value at that position; after the abnormal requested protection data has been corrected based on the error correction code array in the data protection check structure, the method further comprises: recalculating the cyclic redundancy check code of the protected data of the data protection check structure and judging whether the recalculated cyclic redundancy check code is equal to the original cyclic redundancy check code of the protected data; if so, judging that error correction has succeeded and setting the check state in the data protection check structure to normally enabled; if not, judging that error correction has failed and returning a restart-required result.
  15. A memory dynamic error correction system using the memory dynamic error correction method according to any one of claims 1 to 14, comprising an initial linked list generation module, a memory registration module, and a memory check and error correction module, wherein: the initial linked list generation module is used to allocate a predefined number of check structures to form an initial linked list; the memory registration module is used to generate, for each item of protected data, a corresponding error correction code array, a cyclic redundancy check code of the protected data and a self-check cyclic redundancy check code, and to write the memory start address, the data length, the error correction code array, the cyclic redundancy check code and the self-check cyclic redundancy check code of each item of protected data, together with a linked-list pointer to the next check structure of the same functional block, into the initial linked list to obtain a final linked list; the memory check and error correction module is used to check in real time, in each interrupt thread of the program and using a preset error correction code check mode, whether the protected data corresponding to the interrupt thread or the data of the check structure itself is abnormal and to set an error flag when an abnormality occurs, and, in the main-loop thread of the program, to use a preset CRC check mode to check the data based on the final linked list and correct the protected data marked with the error flag.
  16. A terminal comprising a processor and a storage medium, characterized in that: the storage medium is used to store instructions; and the processor is configured to operate according to the instructions to perform the steps of the memory dynamic error correction method according to any one of claims 1 to 14.
  17. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the memory dynamic error correction method according to any one of claims 1 to 14.
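The error correction code array rule of claim 8 and the locate-and-flip correction of claims 13 and 14 can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the claimed implementation: serial numbers are assumed to start at 1 (so that every word takes part in at least one exclusive OR), words are read as little-endian, unaligned data is modelled simply by zero-padding the tail of a `bytes` object, and the helper names `split_words`, `ecc_array` and `correct_single_bit` are hypothetical.

```python
import zlib


def split_words(data: bytes) -> list[int]:
    """Split data into 32-bit little-endian words, zero-padding the tail (assumption)."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [int.from_bytes(padded[i:i + 4], "little") for i in range(0, len(padded), 4)]


def ecc_array(words: list[int]) -> list[int]:
    """Entry k is the XOR of every word whose 1-based serial number has bit k set."""
    n_bits = len(words).bit_length()
    ecc = [0] * n_bits
    for serial, word in enumerate(words, start=1):
        for k in range(n_bits):
            if (serial >> k) & 1:
                ecc[k] ^= word
    return ecc


def correct_single_bit(data: bytearray, stored_ecc: list[int], stored_crc: int) -> bool:
    """Try to repair one flipped bit in place; False stands for 'restart required'."""
    if zlib.crc32(data) == stored_crc:
        return True  # CRC matches, nothing to correct
    recalculated = ecc_array(split_words(bytes(data)))
    diff = [a ^ b for a, b in zip(recalculated, stored_ecc)]
    if not any(diff):
        return False  # XOR result is 0: the error cannot be located
    mask = next(d for d in diff if d)
    # A single flipped bit leaves the same one-bit mask in every non-zero entry.
    if mask & (mask - 1) or any(d not in (0, mask) for d in diff):
        return False
    # The non-zero entries spell out the serial number of the damaged word in binary.
    serial = sum(1 << k for k, d in enumerate(diff) if d)
    bit = mask.bit_length() - 1
    byte_index = (serial - 1) * 4 + bit // 8
    if byte_index >= len(data):
        return False
    data[byte_index] ^= 1 << (bit % 8)
    return zlib.crc32(data) == stored_crc  # confirm the repair with the CRC
```

The array needs only about log2(N) words to cover N protected words, which is the storage advantage over keeping a full backup. The CRC comparison after the flip plays the same confirming role as in claims 13 and 14; multi-bit damage is rejected rather than corrected in this sketch.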

Description

Memory dynamic error correction method and system supporting hybrid check mode

Technical Field

The invention relates to the technical field of power system relay protection, and in particular to a memory dynamic error correction method and system supporting a hybrid check mode.

Background

A soft error is a temporary error in memory data within an integrated circuit, caused by high-energy particles, electromagnetic interference and the like. Such errors do not permanently damage the hardware and can be cleared by restarting the system. A single event upset (SEU), the flipping of one bit of a memory cell (1 to 0, or 0 to 1), is a typical manifestation of a soft error. Research on soft errors was formerly concentrated in the aerospace field, where radiation is strong and reliability requirements are high, but as integration density rises and the chips in embedded systems grow in scale, the probability of soft errors in embedded systems has increased greatly. As China's power system develops toward intelligent, digital operation with a high proportion of renewable energy, reliability requirements on the power system are rising, and the relay protection system is an important component in guaranteeing stable operation of the smart grid. To meet smart-grid requirements and support higher-level, intelligent analysis and decision algorithms, relay protection devices have become markedly more digital and chip-based; at the same time, as more devices operate in high-altitude new-energy power stations, the probability of soft errors in these devices has risen greatly.

Soft errors may cause a device to maloperate or fail to operate. In mild cases this causes local fluctuation of the power grid; in severe cases it damages key equipment, leading to major safety accidents and economic losses. Preventing soft errors in relay protection devices therefore provides a firmer guarantee for the high-quality development of the new power system. Methods for protecting memory against soft errors currently exist at the hardware, system and software levels. In hardware, new radiation-hardening processes can improve a device's tolerance to single particles. At the system level, the device's error correcting code (ECC) functions, such as parity bits and Hamming codes, can be used to detect and correct errors. In software, triple modular redundancy (TMR) can be used: the data is stored in three copies and errors are masked by majority voting. Protection at the hardware and system levels greatly increases production and operating costs, while triple modular redundancy, although easy to implement, wastes considerable resources because every item of data is stored three times.
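For reference, the majority voting used by software triple modular redundancy can be written as a bitwise vote over three copies. The Python sketch below is a generic illustration of that background technique, not part of the invention; the function name `tmr_vote` is hypothetical.

```python
def tmr_vote(a: bytes, b: bytes, c: bytes) -> bytes:
    """Return the bitwise majority of three equally long copies."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("the three copies must have the same length")
    # Each output bit is 1 exactly when at least two of the three input bits are 1.
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))
```

A single upset in any one copy is outvoted by the other two, but the protected data occupies three times its own size, which is the resource cost the background section refers to.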
CN117234789A discloses a check and error correction method, device, electronic equipment and storage medium. The method comprises: burning check data and error correction data into a flash memory; performing software error injection on the check data to obtain second check data and generating an error correction test set based on the second check data and the error correction data, or performing software error injection on the error correction data to obtain second error correction data and generating the error correction test set based on the check data and the second error correction data; performing an error correction test on the memory under test based on the error correction test set; performing error correction verification on firmware data when the error correction test of the memory under test succeeds; and transferring the firmware data to memory when the firmware data passes the error correction verification. However, that application is directed at memory error correction hardware and does not address error correction algorithms. CN118733312A discloses a memory error correction method, a memory module, a memory controller and a processor, in which the memory controller, when error correction of data obtained from the memory module fails, determines the faulty memory chips in the memory module and then corrects the data again based on the faulty memory chips. That application is likewise directed at memory error correction hardware and does not address error correction algorithms.

Disclosure of Invention

The invention provides a memory dynamic error correction method and system supporting a hybrid check mode, which aims to solve the defects that the cost controllab