
CN-121996619-A - Method and system for detecting permission consistency of massive files based on hash digests


Abstract

The application discloses a method and system for detecting permission consistency of massive files based on hash digests, and relates to the technical field of file management. The method comprises: obtaining DACL permission configuration data of a plurality of directories to be checked in a target file system; normalizing the DACL permission configuration data of each directory by removing the non-permission portion that contains the directory path identifier and retaining only the content that reflects access control rules; calculating a hash digest value for each directory from the normalized permission data; and aggregating and comparing the hash digest values, identifying the normal permission pattern shared by directories with the same hash value, and marking directories whose hash values deviate from the normal permission pattern as permission anomaly objects. The application improves the efficiency of comparing DACL permissions across massive numbers of file directories.

Inventors

  • LIANG RENYOU
  • QIU FENG
  • LONG JIAJUN

Assignees

  • 赛姆科技(广东)有限公司

Dates

Publication Date
20260508
Application Date
20260410

Claims (10)

  1. A method for detecting permission consistency of massive files based on hash digests, characterized by comprising the following steps: obtaining DACL permission configuration data of a plurality of directories to be checked in a target file system; normalizing the DACL permission configuration data of each directory by removing the non-permission portion that contains the directory path identifier and retaining only the content that reflects access control rules; calculating, based on the normalized permission data, a hash digest value for each directory; and aggregating and comparing the hash digest values, identifying a normal permission pattern shared by directories with the same hash value, and marking any directory whose hash value deviates from the normal permission pattern as a permission anomaly object.
  2. The method for detecting permission consistency of massive files based on hash digests according to claim 1, wherein the DACL permission configuration data is exported by a tool built into the operating system, the hash digest value is generated using the MD5 algorithm, and hash calculation and result aggregation are executed automatically by batch scripts to form a mapping list of directory identifiers and their corresponding hash values, the mapping list being used to quickly locate differences in permission configuration.
  3. The method for detecting permission consistency of massive files based on hash digests according to claim 1, wherein the normalization comprises deleting the directory path information on the first line of the exported DACL file using a text processing tool, and a directory is judged to be a permission anomaly object if the frequency with which its hash digest value occurs across all directories to be checked is significantly lower than a preset threshold, or if its hash digest value does not match the hash value of a known-correct permission template.
  4. The method for detecting permission consistency of massive files based on hash digests according to claim 1, further comprising, after marking the permission anomaly object: constructing a hierarchical hash tree with the directory corresponding to the permission anomaly object as the root node, wherein leaf nodes correspond to the normalized permission metadata of the files under the directory and each non-leaf node holds an aggregated hash of its child nodes' hash values; generating leaf-node hashes by parallel computation over the permission metadata under the directory, and aggregating bottom-up to generate the root hash of the directory, which serves as the permission digest of the current permission state; comparing the permission digest with a reference permission digest of the directory in a historical version snapshot and, if they are inconsistent, locating the differing subtree; resolving suspected conflicting nodes in the differing subtree using a two-stage verification mechanism of fuzzy-hash preliminary screening followed by exact comparison of permission fields; performing incremental hash recalculation only for the files whose permissions have changed and their ancestor path nodes in the hierarchical hash tree, updating the permission digest of the corresponding directory, and generating a new version snapshot; and calculating a permission risk score for the permission anomaly object based on a plurality of preset permission risk evaluation factors, so as to detect risk.
  5. The method for detecting permission consistency of massive files based on hash digests according to claim 4, wherein constructing the hierarchical hash tree comprises: clustering the massive files by file system directory level or by logical group, each group of files forming a subtree of the hash tree; and generating the hash value of each non-leaf node by concatenating the hash values of its direct child nodes in a preset order and then applying a cryptographic hash operation; and wherein the parallel computation adopts a task-sharding strategy that divides the file set into a plurality of subsets, distributes the subsets to a plurality of computing units that independently generate local hash trees, and then merges the local hash trees into a global hierarchical hash tree.
  6. The method for detecting permission consistency of massive files based on hash digests according to claim 4, wherein the fuzzy-hash preliminary screening comprises: extracting key fields from the permission metadata of a suspected conflicting node to generate a simplified hash; if the simplified hashes are the same, determining that no substantive conflict exists and skipping the exact comparison; and if the simplified hashes differ, triggering the exact comparison process; and wherein the incremental update comprises: monitoring file system events or periodically scanning permission change logs to identify target files whose permissions have changed; and recalculating the leaf-node hash only for the changed target files, propagating the updated hash values level by level along the parent path up to the root node, and reusing the hash values from the historical snapshot data for the remaining unchanged branches.
  7. The method for detecting permission consistency of massive files based on hash digests according to claim 4, wherein calculating the permission risk score of the permission anomaly object based on a plurality of preset permission risk evaluation factors comprises: presetting a plurality of permission risk evaluation factors, the factors comprising file sensitivity level, permission change type, identity credibility of the subject making the change, access control list complexity, and whether a privileged user or a critical system directory is involved; assigning a quantitative value to each permission risk evaluation factor according to the file attributes and change context of each file whose permissions have changed, to obtain the corresponding risk factor scores; and calculating the permission risk score of each file whose permissions have changed using a weighted composite index method based on the risk factor scores and preset weights.
  8. The method for detecting permission consistency of massive files based on hash digests according to claim 7, wherein assigning a quantitative value to each permission risk evaluation factor comprises: mapping the directory path or business label of the file to be evaluated to a predefined sensitivity level table to determine the file sensitivity level score; matching the type of difference before and after the permission change against a risk rule base and outputting a change type score, the difference types comprising newly added write permission, opening of global read access, and removal of the audit group; combining the identity of the operator who initiated the permission modification with the credibility of that operator's historical behavior to generate a subject credibility score; and dynamically calculating the access control list complexity score from the number of ACL entries, the depth of nested groups, and the result of permission conflict detection.
  9. A system for detecting permission consistency of massive files based on hash digests, which executes the method according to any one of claims 1 to 8, the system comprising: a data acquisition module for obtaining DACL permission configuration data of a plurality of directories to be checked in a target file system; a normalization module for normalizing the DACL permission configuration data of each directory by removing the non-permission portion that contains the directory path identifier and retaining only the content that reflects access control rules; a hash calculation module for calculating, based on the normalized permission data, a hash digest value for each directory; and an anomaly identification module for aggregating and comparing the hash digest values, identifying a normal permission pattern shared by directories with the same hash value, and marking any directory whose hash value deviates from the normal permission pattern as a permission anomaly object.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for detecting permission consistency of massive files based on hash digests according to any one of claims 1 to 8.
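
The hierarchical hash tree and incremental recalculation described in claims 4 to 6 can be illustrated with a short sketch. It is not the applicant's implementation. It assumes SHA-256 as the cryptographic hash, child-name sort order as the "preset order", and a nested dict as the directory model, none of which the claims specify. The function names and the permission strings such as `BA:F;SY:F` are hypothetical.

```python
import hashlib


def leaf_hash(metadata):
    """Hash the normalized permission metadata of one file."""
    return hashlib.sha256(metadata.encode("utf-8")).hexdigest()


def node_hash(child_hashes):
    """Concatenate child hashes in sorted child-name order, then hash the result."""
    joined = "".join(child_hashes[name] for name in sorted(child_hashes))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def build(tree, cache, path=""):
    """Compute hashes bottom-up; cache maps each node path to its hash."""
    if isinstance(tree, str):  # leaf: permission metadata of one file
        cache[path] = leaf_hash(tree)
    else:  # directory: aggregate the hashes of its direct children
        children = {name: build(sub, cache, f"{path}/{name}") for name, sub in tree.items()}
        cache[path] = node_hash(children)
    return cache[path]


def update_leaf(tree, cache, file_path, new_metadata):
    """Rehash one changed file and its ancestors; other branches reuse cached hashes."""
    parts = file_path.strip("/").split("/")
    node = tree
    for name in parts[:-1]:
        node = node[name]
    node[parts[-1]] = new_metadata
    cache["/" + "/".join(parts)] = leaf_hash(new_metadata)
    # Walk from the file's parent directory up to the root (path "").
    for depth in range(len(parts) - 1, -1, -1):
        dir_path = "".join("/" + p for p in parts[:depth])
        node = tree
        for name in parts[:depth]:
            node = node[name]
        cache[dir_path] = node_hash({name: cache[f"{dir_path}/{name}"] for name in node})
    return cache[""]


# Hypothetical directory with three files that share one permission string.
tree = {"a": {"x.dcm": "BA:F;SY:F", "y.dcm": "BA:F;SY:F"}, "b": {"z.dcm": "BA:F;SY:F"}}
cache = {}
root_before = build(tree, cache)
snapshot = dict(cache)  # stands in for the historical version snapshot

root_after = update_leaf(tree, cache, "/a/y.dcm", "BA:F;SY:F;WD:R")
changed = sorted(p for p in cache if cache[p] != snapshot[p])
print(changed)  # nodes on the path from the root to the changed file
```

Comparing the cached hashes against the snapshot locates the differing subtree without reading the untouched branch `/b` again.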
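
The weighted composite score of claim 7 can be sketched as a normalized weighted sum. This is one reading of "weighted composite index method", not a formula given in the claims. The factor keys, weights, and scores below are hypothetical values chosen for illustration, and each score is assumed to already express risk on a 0 to 1 scale (so a low-credibility operator receives a high score).

```python
# Hypothetical factor names mirroring the five factors listed in claim 7.
FACTORS = (
    "file_sensitivity",
    "change_type",
    "subject_credibility",
    "acl_complexity",
    "privileged_or_critical",
)


def risk_score(scores, weights):
    """Weighted sum of factor scores, divided by the total weight."""
    total_weight = sum(weights[f] for f in FACTORS)
    return sum(weights[f] * scores[f] for f in FACTORS) / total_weight


# Illustrative weights and scores only; real values would come from policy.
weights = {
    "file_sensitivity": 0.30,
    "change_type": 0.25,
    "subject_credibility": 0.20,
    "acl_complexity": 0.10,
    "privileged_or_critical": 0.15,
}
scores = {
    "file_sensitivity": 0.9,
    "change_type": 1.0,
    "subject_credibility": 0.4,
    "acl_complexity": 0.2,
    "privileged_or_critical": 1.0,
}
score = risk_score(scores, weights)
print(round(score, 2))
```

Dividing by the total weight keeps the result on the same 0 to 1 scale as the factor scores even when the weights do not sum to one.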

Description

Method and system for detecting permission consistency of massive files based on hash digests

Technical Field

The application relates to the technical field of file management, and in particular to a method and system for detecting permission consistency of massive files based on hash digests.

Background

In enterprise Windows server environments, particularly business systems built on the NTFS file system such as medical image archiving and document management platforms, the DACL (discretionary access control list) permission configuration of files and directories is critical to data security and compliance. When the underlying storage (such as a SAN) fails, a volume is migrated, or the system restarts abnormally, the DACL metadata of some directories may be corrupted or their permissions reset, so that legitimate users cannot access critical resources.

At present, operations staff mainly check permission state manually through the operating system's graphical interface: right-click the target directory, open Properties, switch to the Security tab, and compare the permission entries of each user or group by eye. Faced with thousands to tens of thousands of business directories, this approach is extremely inefficient (3-5 seconds for a single operation) and highly prone to misjudgment caused by visual fatigue or missed steps. Although some scripting tools (e.g., icacls) can export DACL content in batches, the exported result includes unstructured fields such as path information, so direct text comparison still requires substantial manual intervention and automatic difference detection is hard to achieve. The prior art lacks an efficient, accurate, and scalable mechanism for quickly locating directories with abnormal permission configuration among a massive number of directories.
Disclosure of Invention

To overcome the shortcomings of the prior art and improve the efficiency of comparing DACL permissions across massive numbers of file directories, the application provides a method and system for detecting permission consistency of massive files based on hash digests. In a first aspect, the object of the application is achieved by the following technical solution:

A method for detecting permission consistency of massive files based on hash digests, comprising the following steps: obtaining DACL permission configuration data of a plurality of directories to be checked in a target file system; normalizing the DACL permission configuration data of each directory by removing the non-permission portion that contains the directory path identifier and retaining only the content that reflects access control rules; calculating, based on the normalized permission data, a hash digest value for each directory; and aggregating and comparing the hash digest values, identifying a normal permission pattern shared by directories with the same hash value, and marking any directory whose hash value deviates from the normal permission pattern as a permission anomaly object.

By adopting this technical solution, automatic consistency detection of massive DACL permission configurations is achieved by introducing a hash digest mechanism. Because a permission configuration is in essence a set of access control rules, two directories with identical permission policies necessarily have identical normalized rule content and therefore produce the same hash value.
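
The four steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation. It assumes each directory's DACL has been exported as text with the directory path on the first line and the access control entries on the remaining lines, and it treats a digest as abnormal when its share of all directories falls below a hypothetical `min_share` threshold. The function names, paths, and SDDL-style strings are invented for the example.

```python
import hashlib
from collections import Counter


def normalize_dacl(exported):
    """Drop the first line (assumed to hold the directory path) and tidy whitespace."""
    lines = exported.strip().splitlines()[1:]
    return "\n".join(line.strip() for line in lines if line.strip())


def digest(exported):
    """MD5 digest of the normalized DACL text (MD5 is named in the preferred example)."""
    return hashlib.md5(normalize_dacl(exported).encode("utf-8")).hexdigest()


def find_anomalies(exports, min_share=0.05):
    """Return directories whose digest is shared by fewer than min_share of all directories."""
    digests = {path: digest(text) for path, text in exports.items()}
    counts = Counter(digests.values())
    total = len(digests)
    return sorted(p for p, d in digests.items() if counts[d] / total < min_share)


# Hypothetical exports: three directories share one policy, one deviates.
GOOD = "D:PAI(A;OICI;FA;;;BA)(A;OICI;FA;;;SY)"
BAD = "D:PAI(A;OICI;FA;;;WD)"
policies = {
    r"D:\img\0001": GOOD,
    r"D:\img\0002": GOOD,
    r"D:\img\0003": GOOD,
    r"D:\img\0004": BAD,
}
exports = {path: f"{path}\n{policy}" for path, policy in policies.items()}

anomalies = find_anomalies(exports, min_share=0.5)
print(anomalies)  # the one directory whose digest is in the minority
```

Because the path line is removed before hashing, the first three directories produce the same digest even though their exported text differs.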
The method first strips non-permission variables such as the path identifier from the DACL data, ensuring that the hash calculation reflects only the actual access control semantics. It then performs cluster analysis on the distribution of hash values, treating hash patterns that occur at high frequency as the normal baseline, while low-frequency or isolated hash values correspond to abnormal configurations. The process requires no preset permission template and no complex parser; large-scale comparison can be completed with a general-purpose hash algorithm alone, which significantly improves detection efficiency and accuracy, eliminates manual misjudgment, offers good portability and tool compatibility, and is suitable for permission health checks in various NTFS storage scenarios.

In a preferred example, the DACL permission configuration data is exported by a tool built into the operating system, the hash digest value is generated using the MD5 algorithm, and hash calculation and result aggregation are executed automatically by batch scripts to form a mapping list of directory identifiers and their corresponding hash values, which is used to quickly locate differences in permission configuration. By adopting this technical solution, DACL data is exported using a tool built into the operating system, and hash calculation and result aggregation are executed automatically by combining the MD5 algorithm with batch scripts, so that the implementation threshold and the manual in