
US-20260127271-A1 - DATA PROTECTION METHOD AND APPARATUS


Abstract

A data protection method is provided in which a storage device: obtains an IO sequence for accessing the storage device; determines a suspicious IO sequence from the IO sequence, where the suspicious IO sequence is initiated by a suspicious host in response to the suspicious host accessing first data in the storage device, the suspicious IO sequence indicates that the suspicious host has a possibility of damaging data stored in the storage device, and the data stored in the storage device includes the first data; and generates a defense operation and a snapshot in response to determining the suspicious IO sequence, where the snapshot is used to restore the data stored in the storage device, and the defense operation is used to prevent the suspicious host from accessing the data stored in the storage device before generation of the snapshot is completed.

Inventors

  • Yangchao Wang
  • Mengyu HUANG
  • Qizhang Zhou
  • Gang Hu

Assignees

  • HUAWEI TECHNOLOGIES CO., LTD.

Dates

Publication Date
20260507
Application Date
20260102
Priority Date
20230704

Claims (20)

  1. A data protection method, comprising: obtaining an input/output (IO) sequence for accessing a storage device; determining a suspicious IO sequence from the IO sequence, wherein the suspicious IO sequence is initiated by a suspicious host in response to the suspicious host accessing first data in the storage device, the suspicious IO sequence indicates that the suspicious host has a possibility of damaging data stored in the storage device, and the data stored in the storage device comprises the first data; and generating a defense operation and a snapshot in response to determining the suspicious IO sequence, wherein the snapshot is used to restore the data stored in the storage device, and the defense operation is used to prevent, before generating the snapshot is completed, the suspicious host from accessing the data stored in the storage device.
  2. The method according to claim 1, wherein the defense operation is used to prevent, before generating the snapshot is completed, the suspicious host from accessing second data, and the second data is data other than the first data stored in the storage device.
  3. The method according to claim 1, wherein the defense operation comprises at least one of A or B: A: before generating the snapshot is completed, rejecting an access request of the suspicious host to the storage device; and B: adding preset data to the storage device, so that the suspicious host accesses the preset data before generating the snapshot is completed.
  4. The method according to claim 1, wherein after determining the suspicious IO sequence, the method further comprises: determining whether the first data is damaged by the suspicious IO sequence; and in response to determining that the first data is damaged by the suspicious IO sequence, determining that the suspicious host is an attacker host; or, in response to determining that the first data is not damaged by the suspicious IO sequence, determining that the suspicious host is a normal host.
  5. The method according to claim 4, wherein determining whether the first data is damaged by the suspicious IO sequence comprises: performing entropy value detection on the first data, to obtain an entropy value of the first data; and in response to the entropy value being greater than a preset value, determining that the first data is damaged by the suspicious IO sequence; or in response to the entropy value being less than or equal to the preset value, determining that the first data is not damaged by the suspicious IO sequence.
  6. The method according to claim 5, wherein performing entropy value detection on the first data, to obtain the entropy value of the first data comprises: performing sampling on the first data, to obtain sampled data; and performing entropy value detection on the sampled data, to obtain the entropy value of the first data.
  7. The method according to claim 4, wherein after determining that the suspicious host is the attacker host, the method further comprises: displaying alarm information, wherein the alarm information is used by an administrator to determine whether to disconnect a connection between the attacker host and the storage device.
  8. The method according to claim 1, wherein after generating the snapshot is completed, the method further comprises: disabling the defense operation, so that the data stored in the storage device is normally accessible by the suspicious host.
  9. The method according to claim 1, wherein determining the suspicious IO sequence from the IO sequence comprises: determining, from the IO sequence, a first IO sequence used to access a first file, wherein the first data is the first file; and in response to determining that a degree of matching between the first IO sequence and a preset IO sequence is greater than a threshold, determining that the first IO sequence is the suspicious IO sequence, wherein the preset IO sequence is a pre-learned IO sequence that has a possibility of damaging the data stored in the storage device.
  10. The method according to claim 1, wherein each IO in the IO sequence carries access information, the access information comprises an access address, and the access information further comprises at least one of an access type or access time; and determining the suspicious IO sequence from the IO sequence comprises: determining a statistical feature of the IO sequence based on the access information carried in each IO in the IO sequence; and determining the suspicious IO sequence from the IO sequence based on the statistical feature of the IO sequence, wherein data at access addresses of IOs in the suspicious IO sequence forms the first data.
  11. A computing device, comprising at least one processor, wherein the processor is connected to a non-transitory memory, the non-transitory memory is configured to store a computer program, and when the computer program is executed by the at least one processor, the computer program causes the computing device to: obtain an input/output (IO) sequence for accessing a storage device; determine a suspicious IO sequence from the IO sequence, wherein the suspicious IO sequence is initiated by a suspicious host in response to the suspicious host accessing first data in the storage device, the suspicious IO sequence indicates that the suspicious host has a possibility of damaging data stored in the storage device, and the data stored in the storage device comprises the first data; and generate a defense operation and a snapshot in response to determining the suspicious IO sequence, wherein the snapshot is used to restore the data stored in the storage device, and the defense operation is used to prevent, before generating the snapshot is completed, the suspicious host from accessing the data stored in the storage device.
  12. The computing device according to claim 11, wherein the defense operation is used to prevent, before generating the snapshot is completed, the suspicious host from accessing second data, and the second data is data other than the first data stored in the storage device.
  13. The computing device according to claim 11, wherein the defense operation comprises at least one of A or B: A: before generating the snapshot is completed, rejecting an access request of the suspicious host to the storage device; and B: adding preset data to the storage device, so that the suspicious host accesses the preset data before generating the snapshot is completed.
  14. The computing device according to claim 11, wherein after causing the computing device to determine the suspicious IO sequence, the computer program further causes the computing device to: determine whether the first data is damaged by the suspicious IO sequence; and in response to determining that the first data is damaged by the suspicious IO sequence, determine that the suspicious host is an attacker host; or, in response to determining that the first data is not damaged by the suspicious IO sequence, determine that the suspicious host is a normal host.
  15. The computing device according to claim 14, wherein determining whether the first data is damaged by the suspicious IO sequence comprises: performing entropy value detection on the first data, to obtain an entropy value of the first data; and in response to the entropy value being greater than a preset value, determining that the first data is damaged by the suspicious IO sequence; or in response to the entropy value being less than or equal to the preset value, determining that the first data is not damaged by the suspicious IO sequence.
  16. The computing device according to claim 15, wherein performing entropy value detection on the first data, to obtain the entropy value of the first data comprises: performing sampling on the first data, to obtain sampled data; and performing entropy value detection on the sampled data, to obtain the entropy value of the first data.
  17. The computing device according to claim 14, wherein after causing the computing device to determine that the suspicious host is the attacker host, the computer program further causes the computing device to: display alarm information, wherein the alarm information is used by an administrator to determine whether to disconnect a connection between the attacker host and the storage device.
  18. The computing device according to claim 11, wherein after generation of the snapshot is completed, the computer program further causes the computing device to: disable the defense operation, so that the data stored in the storage device is normally accessible by the suspicious host.
  19. The computing device according to claim 11, wherein determining the suspicious IO sequence from the IO sequence comprises: determining, from the IO sequence, a first IO sequence used to access a first file, wherein the first data is the first file; and in response to determining that a degree of matching between the first IO sequence and a preset IO sequence is greater than a threshold, determining that the first IO sequence is the suspicious IO sequence, wherein the preset IO sequence is a pre-learned IO sequence that has a possibility of damaging the data stored in the storage device.
  20. A storage system, comprising a computing device and a storage device, wherein the computing device is connected to the storage device, the storage device is configured to store data, and the computing device is configured to: obtain an input/output (IO) sequence for accessing the storage device; determine a suspicious IO sequence from the IO sequence, wherein the suspicious IO sequence is initiated by a suspicious host in response to the suspicious host accessing first data in the storage device, the suspicious IO sequence indicates that the suspicious host has a possibility of damaging data stored in the storage device, and the data stored in the storage device comprises the first data; and generate a defense operation and a snapshot in response to determining the suspicious IO sequence, wherein the snapshot is used to restore the data stored in the storage device, and the defense operation is used to prevent, before generating the snapshot is completed, the suspicious host from accessing the data stored in the storage device.
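
The entropy check recited in claims 5 and 6 (and mirrored in claims 15 and 16) can be illustrated with a minimal sketch. The claims do not specify an entropy formula, a sampling policy, or a preset value, so everything concrete here is an assumption: Shannon entropy over byte values, a sampling policy that keeps one 4096-byte block out of every four, a preset value of 7.5 bits per byte, and the hypothetical function names `shannon_entropy`, `sample_blocks`, and `is_damaged`.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sample_blocks(data: bytes, block_size: int = 4096, stride: int = 4) -> bytes:
    """Keep one block out of every `stride` blocks (assumed sampling policy)."""
    step = block_size * stride
    return b"".join(data[i:i + block_size] for i in range(0, len(data), step))


def is_damaged(data: bytes, preset_value: float = 7.5) -> bool:
    """Treat the first data as damaged when sampled entropy exceeds the preset value."""
    return shannon_entropy(sample_blocks(data)) > preset_value


if __name__ == "__main__":
    plain = b"plain text record\n" * 50000   # repetitive, low-entropy content
    scrambled = os.urandom(1 << 20)          # stands in for encrypted content
    print("plain damaged:", is_damaged(plain))
    print("scrambled damaged:", is_damaged(scrambled))
```

Sampling keeps the check cheap on large files at the cost of possibly missing partial damage. Legitimately compressed or encrypted files also score close to 8 bits per byte, so a real preset value would need tuning per workload.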

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application No. PCT/CN2024/083858, filed on Mar. 26, 2024, which claims priority to Chinese Patent Application No. 202310811989.3, filed on Jul. 4, 2023. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.

TECHNICAL FIELD

This application relates to the field of storage technologies, and in particular, to a data protection method and apparatus.

BACKGROUND

With the rapid development of information technologies, data has become one of the most important assets of enterprises and individuals. The security of storage devices, which serve as the infrastructure supporting data storage and protection, is very important. With the popularization of the Internet, network security threats such as ransomware are increasingly rampant, and data security has become a common global challenge. Once storage devices are attacked by viruses, data in the storage devices may be damaged or stolen, and extortion may also occur. In addition, some regulatory and compliance requirements impose high requirements on the security of storage devices. Therefore, ensuring the security of storage devices and of the data stored in them has become a new requirement of enterprises.

Generally, a storage device periodically generates a snapshot of data in the storage device, to restore the data when the storage device is attacked by viruses. However, in this manner, the data restored based on the snapshot may be only a part of the data in the storage device. For example, a snapshot is generated at an interval of 10 minutes, the storage device generates a snapshot at 14:00, and viruses attack the storage device at 14:09. In this case, the storage device can restore data only based on the snapshot generated at 14:00, and cannot restore, based on that snapshot, data that changed in the storage device in the time period from 14:00 to 14:09.
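
The arithmetic behind the 14:00 and 14:09 example can be worked through in a short sketch. The calendar date and the helper name `last_periodic_snapshot` are arbitrary assumptions made for illustration; only the 10-minute interval and the two clock times come from the text.

```python
from datetime import datetime, timedelta


def last_periodic_snapshot(attack_time: datetime,
                           first_snapshot: datetime,
                           interval: timedelta) -> datetime:
    """Return the time of the most recent periodic snapshot at or before the attack."""
    completed_intervals = (attack_time - first_snapshot) // interval  # whole intervals
    return first_snapshot + completed_intervals * interval


# Arbitrary date; only the clock times and the interval are from the example.
first = datetime(2024, 1, 1, 14, 0)
attack = datetime(2024, 1, 1, 14, 9)
interval = timedelta(minutes=10)

snapshot_time = last_periodic_snapshot(attack, first, interval)
unrecoverable_window = attack - snapshot_time

print("restore point:", snapshot_time.time())
print("changes lost since restore point:", unrecoverable_window)
```

With purely periodic snapshots, the unrecoverable window can approach the full interval. Taking a snapshot at the moment suspicious activity is detected is what shrinks that window.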
How to restore data in a storage device as completely as possible is an urgent technical problem to be resolved.

SUMMARY

This application provides a data protection method and apparatus, for restoring user data in a storage device as completely as possible.

According to a first aspect, this application provides a data protection method. The method may be performed by a storage device, or may be performed by a protection module in the storage device; or the method may be performed by a protection device connected to the storage device. An example in which the storage device performs the method is used. The method includes: The storage device obtains an input/output (IO) sequence for accessing the storage device; determines a suspicious IO sequence from the IO sequence, where the suspicious IO sequence is initiated by a suspicious host when the suspicious host accesses first data in the storage device, the suspicious IO sequence indicates that the suspicious host has a possibility of damaging user data stored in the storage device, and the user data stored in the storage device includes the first data; and further, generates a defense mechanism and a snapshot in response to determining the suspicious IO sequence, where the snapshot is used to restore the user data stored in the storage device, and the defense mechanism is used to prevent, before generating the snapshot is completed, the suspicious host from accessing the user data stored in the storage device.

In the foregoing technical solution, after determining the suspicious IO sequence, the storage device generates the defense mechanism and the snapshot, so that in the time period from determining the suspicious IO sequence to completing the snapshot, the suspicious host that initiated the suspicious IO sequence cannot continue to access the user data in the storage device. Further, the time point to which the snapshot corresponds is the time point at which the suspicious IO sequence is found.
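
The ordering described in the first aspect (block the suspicious host, take the snapshot, then lift the block as in claim 8) can be sketched as a toy in-memory model. The class `ProtectedStore`, its method names, and the `"REJECTED"` return value are hypothetical and are not taken from the application; the sketch models only option A of the defense (rejecting access requests) and uses a dictionary copy to stand in for a real snapshot.

```python
class ProtectedStore:
    """Toy model: reject a suspicious host's IO until the snapshot is complete."""

    def __init__(self, data):
        self.data = dict(data)
        self.blocked_hosts = set()
        self.snapshots = []

    def handle_io(self, host, op, key, value=None):
        # Defense: requests from a blocked host are rejected outright.
        if host in self.blocked_hosts:
            return "REJECTED"
        if op == "write":
            self.data[key] = value
            return "OK"
        return self.data.get(key)

    def begin_protection(self, host):
        # Called when a suspicious IO sequence from `host` is determined.
        self.blocked_hosts.add(host)

    def complete_snapshot(self, host):
        # The snapshot captures the data as it stood when the host was blocked.
        self.snapshots.append(dict(self.data))
        # Defense is disabled once the snapshot is complete.
        self.blocked_hosts.discard(host)


store = ProtectedStore({"report.doc": "quarterly numbers"})
store.begin_protection("host-1")
print(store.handle_io("host-1", "write", "report.doc", "garbage"))  # blocked
print(store.handle_io("host-2", "read", "report.doc"))              # other hosts unaffected
store.complete_snapshot("host-1")
print(store.handle_io("host-1", "read", "report.doc"))              # access restored
```

Because the block is in place for the whole snapshot window, nothing the suspicious host attempts in that window can reach the data the snapshot captures.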
Assuming that the suspicious host is a host configured to damage the user data stored in the storage device, that is, the suspicious host is an attacker host, a degree of damage caused by the attacker host to the user data at the time point is relatively low. In this way, the storage device can restore the user data in the storage device as completely as possible based on the snapshot.

In a possible implementation, the defense mechanism is specifically used to prevent, before generating the snapshot is completed, the suspicious host from accessing second data, and the second data is user data other than the first data stored in the storage device.

In the foregoing technical solution, the storage device protects, by using the defense mechanism, the second data stored in the storage device, to generate a snapshot of the second data, so as to completely restore the second data based on the snapshot.

In a possible implementation, the defense mechanism includes one or two of A and B: A: before ge