CN-121996045-A - Power-down protection method and system for storage server equipment and computer device
Abstract
The invention relates to a power-down protection method, system and computer device for storage server equipment. The method comprises: monitoring an output signal of a power supply unit through a CPLD, and generating a non-maskable interrupt signal when the output signal of the power supply unit is identified as abnormal; controlling the power supply mode to switch from the power supply unit to a battery backup unit; controlling, through an ACPI and hardware cooperation module, the relevant necessary hardware to enter a core-off mode and cutting off power to unnecessary hardware; controlling a dirty data identification unit to read a dirty data bitmap marked by the storage server equipment, and locating the dirty data to be protected in the memory of the storage server equipment; checking the dirty data stored in a preset storage unit through a cyclic redundancy check unit; controlling the storage server equipment to execute a shutdown process when the dirty data passes the check; and, when a power-on signal is received again, reading the dirty data from the preset storage unit and restoring it to the memory cache.
Inventors
- SHENG LI
- ZHU GUOSHENG
Assignees
- 邦彦技术股份有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260120
Claims (10)
- 1. A power-down protection method for a storage server device, characterized by comprising the following steps: monitoring an output signal of a power supply unit in real time through a CPLD, and, when an abnormality of the output signal of the power supply unit is identified, determining that the power supply unit is in an abnormal power-down state and generating a non-maskable interrupt signal; in response to the non-maskable interrupt signal, controlling, by a power supply switching control unit, the power supply mode to switch from the power supply unit to a battery backup unit, and controlling, by an ACPI and hardware cooperation module, the relevant necessary hardware to enter a core-off mode and cutting off power to unnecessary hardware; when an indication signal that the power consumption of the storage server device has dropped to a preset threshold is received, controlling a dirty data identification unit to read a dirty data bitmap marked by the storage server device, and locating the dirty data to be protected in the memory of the storage server device; transferring the dirty data in a targeted manner to a preset storage unit through a PCIe channel, and checking the dirty data transferred to the preset storage unit through a cyclic redundancy check unit so as to verify the integrity of the transferred dirty data; and, when the dirty data passes the check, controlling the storage server device to execute a shutdown process and controlling the battery backup unit to stop supplying power, and, when a power-on signal is received again, reading the dirty data from the preset storage unit and restoring it to the memory cache so as to return the storage server device to a normal running state.
- 2. The power-down protection method of a storage server device according to claim 1, wherein monitoring the output signal of the power supply unit in real time through the CPLD, and determining that the power supply unit is in an abnormal power-down state when an abnormality of the output signal is identified, comprises: monitoring, in real time through the CPLD, the POK signal, the output voltage and the AC-OK signal output by the power supply unit; when the POK signal output by the power supply unit is at a low level, determining that the output signal of the power supply unit is abnormal and that the power supply unit is in an abnormal power-down state; when the output voltage of the power supply unit is lower than a preset value, determining that the output signal of the power supply unit is abnormal and that the power supply unit is in an abnormal power-down state; and when the AC-OK signal output by the power supply unit is lost or at a low level, determining that the output signal of the power supply unit is abnormal and that the power supply unit is in an abnormal power-down state.
- 3. The power-down protection method of a storage server device according to claim 1, wherein controlling the relevant necessary hardware to enter a core-off mode through the ACPI and hardware cooperation module and cutting off power to the unnecessary hardware comprises: accurately identifying the relevant necessary hardware and the unnecessary hardware through a preset hardware priority list, wherein the relevant necessary hardware comprises a first CPU and a second CPU, and the unnecessary hardware comprises the memory associated with the first CPU, PCIe peripherals and a non-critical hard disk; configuring ACPI parameters through the BIOS of the storage server device, triggering, by the ACPI and hardware cooperation module and based on the ACPI parameters, the first CPU to enter the core-off mode, and controlling the second CPU to clock down to its lowest operating frequency; and cutting off, through the ACPI and hardware cooperation module, power to the memory associated with the first CPU, the PCIe peripherals and the non-critical hard disk.
- 4. The power-down protection method of a storage server device according to claim 3, wherein the necessary hardware further includes a core memory of the storage server device, a storage unit, a redundant hard disk built in a controller, and a fan set, the method further comprising: The battery backup unit is used for supplying power to the second CPU, the core memory of the storage server device, the storage unit and the redundant hard disk arranged in the controller; And the BMC of the storage server equipment is used for performing speed reduction on the fan group or only keeping the minimum heat dissipation on the second CPU in parallel, so that the power consumption of the storage server equipment is reduced to a preset threshold value within a preset time.
- 5. The power-down protection method of a storage server device according to claim 1, wherein, when the indication signal that the power consumption of the storage server device has dropped to the preset threshold is received, controlling the dirty data identification unit to read the dirty data bitmap marked by the storage server device and locating the dirty data to be protected in the memory of the storage server device comprises: when an indication signal that the power consumption of the storage server device has dropped to the preset threshold within a preset time is received, controlling the dirty data identification unit to read a dirty data bitmap marked by distributed storage software in the storage server device; precisely locating non-persisted dirty data in the memory of the storage server device, and taking the non-persisted dirty data as the dirty data to be protected; and calculating and storing an original redundancy value of the dirty data to be protected.
- 6. The power-down protection method of a storage server device according to claim 5, wherein verifying the integrity of the transferred dirty data stored in the preset storage unit by the cyclic redundancy check unit comprises: checking the transferred dirty data through a cyclic redundancy check unit to obtain a redundancy check value of the dirty data; Carrying out consistency comparison processing on the original redundancy value of the dirty data to be protected and the redundancy check value of the dirty data; When the original redundancy value of the dirty data is consistent with the redundancy check value of the dirty data, determining that the integrity check of the transferred dirty data passes; And when the original redundancy value of the dirty data is inconsistent with the redundancy check value of the dirty data, determining that the integrity check of the transferred dirty data fails.
- 7. The power-down protection method of a storage server device according to claim 6, further comprising: when the power supply unit is in an abnormal power-down state, protecting, by the distributed storage software of the storage server device, the cluster synchronization nodes and notifying other nodes to pause writing data to this node, so as to pause data interaction; when the power supply unit is detected to be powered on again, reading the dirty data and its redundancy check value from the storage unit; if the dirty data transferred to the preset storage unit passes the check, restoring the dirty data to the memory cache, asynchronously writing the dirty data into the storage pool, restoring the state of the cluster synchronization nodes so as to resume data interaction, and at the same time restoring, by the BMC, power to the fans, the non-critical hard disk and the PCIe peripherals, waking up the first CPU and restoring the operating frequency, so that the storage server device returns to a normal running state; and if the dirty data transferred to the preset storage unit fails the check, triggering a cluster data reconstruction process and synchronizing the dirty data from a replica node to ensure the consistency of the storage pool.
- 8. The power down protection method of a storage server device according to claim 1, further comprising: After the power supply switching control unit controls the power supply mode to be switched from the power supply unit to the battery backup unit, the electric quantity of the battery backup unit is monitored in real time; When the electric quantity of the battery backup unit is lower than a safety threshold value, sending out alarm information; When the electric quantity of the battery backup unit is lower than a first preset electric quantity value, charging the battery backup unit; And stopping charging the battery backup unit when the electric quantity of the battery backup unit reaches a second preset electric quantity value.
- 9. A power-down protection system of a storage server device, for performing the power-down protection method of the storage server device according to any one of claims 1 to 8, comprising: a CPLD, used for monitoring the output signal of the power supply unit in real time, determining that the power supply unit is in an abnormal power-down state when an abnormality of the output signal of the power supply unit is identified, and generating a non-maskable interrupt signal; a power supply switching control unit, used for responding to the non-maskable interrupt signal and controlling the power supply mode to switch from the power supply unit to the battery backup unit; an ACPI and hardware cooperation module, used for controlling the relevant necessary hardware to enter a core-off mode and cutting off power to unnecessary hardware; a dirty data identification unit, used for reading a dirty data bitmap marked by the storage server device when an indication signal that the power consumption of the storage server device has dropped to a preset threshold is received, and locating the dirty data to be protected in the memory of the storage server device; a cyclic redundancy check unit, used for transferring the dirty data to be protected in a targeted manner to a preset storage unit through a PCIe channel, and checking the dirty data transferred to the preset storage unit so as to verify the integrity of the transferred dirty data; and a storage coordination module, used for controlling the storage server device to execute a shutdown process when the dirty data passes the check, controlling the battery backup unit to stop supplying power, and, when a power-on signal is received again, reading the dirty data from the preset storage unit and restoring it to the memory cache so as to return the storage server device to a normal running state.
- 10. A computer device comprising a memory and a processor, wherein the processor implements the method of any of claims 1 to 8 when executing a computer program stored in the memory.
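The detection conditions of claim 2 can be sketched in a few lines of Python. This is only an illustration, not the patent's CPLD logic, which would be implemented in hardware. The `PsuSample` type, the function names and the 11.4 V threshold are assumptions made for the example, since the claim only speaks of "a preset value".

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; the claim only says "a preset value".
MIN_OUTPUT_VOLTS = 11.4


@dataclass(frozen=True)
class PsuSample:
    """One snapshot of the three PSU signals named in claim 2 (illustrative type)."""
    pok_high: bool               # level of the POK line
    output_volts: float          # measured output voltage
    ac_ok_high: Optional[bool]   # level of AC-OK; None models a lost signal


def psu_fault_reasons(sample: PsuSample,
                      min_volts: float = MIN_OUTPUT_VOLTS) -> list[str]:
    """Return every condition from claim 2 that the sample violates."""
    reasons = []
    if not sample.pok_high:
        reasons.append("POK low")
    if sample.output_volts < min_volts:
        reasons.append("output voltage below preset value")
    if sample.ac_ok_high is None or not sample.ac_ok_high:
        reasons.append("AC-OK lost or low")
    return reasons


def is_abnormal_power_down(sample: PsuSample) -> bool:
    """Any single violated condition counts as an abnormal power-down."""
    return bool(psu_fault_reasons(sample))
```

In this sketch a single failing signal is enough to report an abnormal power-down, which matches the claim's three independent "when ..." branches. A real design would presumably also debounce the inputs before raising the non-maskable interrupt, which the claim does not address.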
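Claims 5 and 6 describe locating dirty data through a bitmap, computing an original redundancy value, and comparing it with the value computed over the transferred copy. A minimal Python sketch of that idea follows. It assumes one bitmap bit per 4 KiB page (least significant bit first) and uses CRC-32 from the standard `zlib` module; the patent fixes neither the bitmap layout nor the CRC polynomial, and all function names are hypothetical.

```python
from __future__ import annotations

import zlib

# Assumed granularity: one bitmap bit per 4 KiB page (not stated in the patent).
PAGE_SIZE = 4096


def dirty_pages(bitmap: bytes, page_count: int) -> list[int]:
    """Return the indices of pages whose bit is set (bit i, LSB first, = page i)."""
    return [i for i in range(page_count) if (bitmap[i // 8] >> (i % 8)) & 1]


def collect_dirty(memory: bytes, bitmap: bytes) -> tuple[bytes, int]:
    """Gather the dirty pages and compute their original redundancy value."""
    page_count = len(memory) // PAGE_SIZE
    data = b"".join(
        memory[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]
        for i in dirty_pages(bitmap, page_count)
    )
    return data, zlib.crc32(data)


def verify_transfer(stored: bytes, original_crc: int) -> bool:
    """Claim 6: the check passes only if both redundancy values are equal."""
    return zlib.crc32(stored) == original_crc
```

For example, with three pages of memory and a bitmap of `0b101`, `collect_dirty` gathers pages 0 and 2, and `verify_transfer` fails if any byte of the stored copy differs from what was read out of memory.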
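The battery monitoring of claim 8 amounts to an alarm threshold plus a charge window with hysteresis. The sketch below illustrates this under stated assumptions: the percentages are invented for the example, the first preset value is assumed to lie below the second, and `BbuPolicy` and `BbuMonitor` are hypothetical names that do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BbuPolicy:
    """Illustrative thresholds in percent; the patent gives no concrete values."""
    safety_threshold: float = 20.0  # below this, raise an alarm
    charge_start: float = 40.0      # "first preset electric quantity value"
    charge_stop: float = 95.0       # "second preset electric quantity value"


class BbuMonitor:
    """Tracks the charging state across successive battery level readings."""

    def __init__(self, policy: BbuPolicy) -> None:
        self.policy = policy
        self.charging = False

    def update(self, level_pct: float) -> dict:
        """Apply the rules of claim 8 to one reading and report the outcome."""
        alarm = level_pct < self.policy.safety_threshold
        if level_pct < self.policy.charge_start:
            self.charging = True       # start charging below the first value
        elif level_pct >= self.policy.charge_stop:
            self.charging = False      # stop once the second value is reached
        # Between the two values the previous charging state is kept.
        return {"alarm": alarm, "charging": self.charging}
```

Keeping the previous state between the two values avoids rapid toggling of the charger when the level hovers around a single threshold, which is one plausible reason for the claim to name two distinct preset values.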
Description
Power-down protection method and system for storage server equipment and computer device
Technical Field
The present invention relates to the field of storage server technologies, and in particular to a power-down protection method, a system, and a computer device for storage server equipment.
Background
With the rapid development of information technology, storage servers play a vital role in data centers and various types of computing environments. During actual operation, however, storage server equipment may face an abnormal power-down of the power supply unit (PSU), which may lead to serious consequences such as data loss, system crashes, and service interruption.
In the related art, when an abnormality occurs in the power supply unit, the storage server equipment may not switch rapidly to a standby power supply mode, or may be unable to effectively control the power consumption and operating state of the hardware during the switch. This may make the system unstable during the power-down and may even prevent it from resuming normal operation. When the power supply unit is abnormally powered down, the data transfer and protection mechanism is often inadequate at the moment of power failure, so that not all dirty data can be stored completely and in time. After the power supply unit is abnormally powered down, storage server equipment in the prior art cannot accurately control the power consumption of the hardware, so the power consumption is too high while the battery backup unit supplies power, which shortens the battery's endurance and increases the risk of data loss.
Moreover, the prior art lacks an effective checking mechanism in the data transfer process and cannot ensure that the data remains intact and consistent during the transfer. If the transfer fails or an error occurs in it, the data in the storage pool may become inconsistent, affecting the normal operation of the service.
Disclosure of Invention
The invention provides a power-down protection method, a power-down protection system and a computer device for storage server equipment, and aims to solve at least one of the technical problems in the prior art. The technical scheme of the invention is a power-down protection method for storage server equipment, which comprises the following steps: monitoring an output signal of a power supply unit in real time through a CPLD, and, when an abnormality of the output signal of the power supply unit is identified, determining that the power supply unit is in an abnormal power-down state and generating a non-maskable interrupt signal; in response to the non-maskable interrupt signal, controlling, by a power supply switching control unit, the power supply mode to switch from the power supply unit to a battery backup unit, and controlling, by an ACPI and hardware cooperation module, the relevant necessary hardware to enter a core-off mode and cutting off power to unnecessary hardware; when an indication signal that the power consumption of the storage server equipment has dropped to a preset threshold is received, controlling a dirty data identification unit to read a dirty data bitmap marked by the storage server equipment, and locating the dirty data to be protected in the memory of the storage server equipment; transferring the dirty data in a targeted manner to a preset storage unit through a PCIe channel, and checking the dirty data transferred to the preset storage unit through a cyclic redundancy check unit so as to verify the integrity of the transferred dirty data; and, when the dirty data passes the check, controlling the storage server equipment to execute a shutdown process and controlling the battery backup unit to stop supplying power, and, when a power-on signal is received again, reading the dirty data from the preset storage unit and restoring it to the memory cache so as to restore the storage server equipment
to a normal running state. According to some embodiments of the present invention, the monitoring, by the CPLD, the output signal of the power supply unit in real time, and when identifying that the output signal of the power supply unit is abnormal, determining that the power supply unit is in an abnormal power-down state includes: the POK signal, the output voltage and the AC-OK signal output by the power supply unit are monitored in real time through the CPLD; When the POK signal output by the power supply unit is in a low level, determining that the output signal of the power supply unit is abnormal, and judging that the power supply unit is in an abnormal power-down state; when the output voltage of the power supply unit is lower than a preset value, determining that the output signal of the power supply unit is abnormal, and judging that the power supply unit is in an abnormal power-down state; when the AC-OK signal output by the power supply unit is lost or is in a low level, determining that the output