CN-121979846-A - Large-scale directory incremental scanning method, device, storage medium and computer equipment

Abstract

The invention discloses a large-scale directory incremental scanning method, a device, a storage medium and computer equipment. The method comprises: responding to an incremental scanning signal of a target directory by starting scanning from a current file in the target directory; judging whether the current scanning task of the target directory meets a task suspension condition and, if so, suspending the current scanning task to release the thread resources of directory scanning, determining a current directory context of the directory being scanned under the current scanning task and an access handle of the target directory, and storing the access handle and the current directory context into a continuous scanning task in a task queue; and judging whether the continuous scanning task meets a scheduling condition and, if so, determining the breakpoint position between scanned files and unscanned files in the target directory based on the access handle and the current directory context, and continuing to scan the unscanned files in the target directory based on the breakpoint position.

Inventors

  • GUO PENGWEI
  • HE YIQIAN
  • PAN MING
  • HE SHIWEI
  • TIAN YE

Assignees

  • 成都鲁易科技有限公司

Dates

Publication Date
20260505
Application Date
20251216

Claims (10)

  1. A method for incremental scanning of a large-scale directory, comprising: responding to an incremental scanning signal of a target directory, and starting scanning from a current file in the target directory; judging whether a current scanning task of the target directory meets a task suspension condition, and if so, suspending the current scanning task to release thread resources of directory scanning, determining a current directory context of the directory being scanned under the current scanning task and an access handle of the target directory, and storing the access handle and the current directory context into a continuous scanning task in a task queue; and judging whether the continuous scanning task meets a scheduling condition, and if so, determining a breakpoint position between scanned files and unscanned files in the target directory based on the access handle and the current directory context, and continuing to scan the unscanned files in the target directory based on the breakpoint position.
  2. The method of claim 1, wherein judging whether the current scanning task of the target directory meets a task suspension condition comprises: during execution of the current scanning task, judging whether the number of scanned files in the target directory is greater than a preset number threshold, and/or judging whether a high-priority task needs to be scheduled for execution; and if the number of scanned files is greater than the preset number threshold and/or a high-priority task needs to be scheduled for execution, determining that the current scanning task of the target directory meets the task suspension condition.
  3. The method of claim 1, wherein judging whether the continuous scanning task meets a scheduling condition comprises: determining thread pool state information of a thread pool corresponding to the continuous scanning task, system memory state information, and task priority information in the task queue; determining a thread pool evaluation parameter based on the thread pool state information, determining a memory evaluation parameter based on the system memory state information, and determining a priority evaluation parameter based on the task priority information; determining weight coefficients corresponding respectively to the thread pool evaluation parameter, the memory evaluation parameter and the priority evaluation parameter, performing a weighted summation of the thread pool evaluation parameter, the memory evaluation parameter and the priority evaluation parameter based on the weight coefficients, and taking the weighted summation result as a scheduling feasibility index; and judging whether the scheduling feasibility index is greater than a preset index threshold, and if so, determining that the continuous scanning task meets the scheduling condition, otherwise determining that the continuous scanning task does not meet the scheduling condition.
  4. The method of claim 3, wherein the thread pool state information comprises a number of idle threads and a total thread pool capacity, the system memory state information comprises a remaining memory capacity, and the task priority information comprises a priority of the task at the head of the task queue and a priority of the continuous scanning task; determining the thread pool evaluation parameter based on the thread pool state information comprises: taking the ratio of the number of idle threads to the total thread pool capacity as the thread pool evaluation parameter; determining the memory evaluation parameter based on the system memory state information comprises: determining a memory requirement threshold required for executing the continuous scanning task, and taking the ratio of the remaining memory capacity to the memory requirement threshold as the memory evaluation parameter; and determining the priority evaluation parameter based on the task priority information comprises: taking the difference between the priority of the task at the head of the task queue and the priority of the continuous scanning task as a relative priority difference, and determining the priority evaluation parameter based on the relative priority difference.
  5. The method of claim 1, wherein after continuing to scan the unscanned files in the target directory based on the breakpoint position, the method further comprises: when the last file in the target directory has been scanned, configuring a skip-file rule list; matching all scanned files in the target directory against the skip-file rule list, and determining the valid scanned files among all scanned files based on the matching result; and generating a structured report in a preset format based on the valid scanned files.
  6. The method of claim 1, wherein after continuing to scan the unscanned files in the target directory based on the breakpoint position, the method further comprises: when the last file in the target directory has been scanned, if an incremental scanning signal of a next target directory is received, transferring ownership of the access handle of the target directory to the next target directory through move semantics, calling the destructor of the ScanTask struct to close the access handle of the target directory, and deleting the copy constructor corresponding to the access handle to prohibit copy construction of the access handle.
  7. The method of claim 1, wherein determining the access handle of the target directory comprises: acquiring a resource identifier, a timestamp and a random number of the target directory; concatenating the resource identifier, the timestamp and the random number into a character string according to a preset format; and performing a hash operation on the character string to obtain a hash value, extracting a preset number of target byte values from the hash value, converting the target byte values, and taking the converted target byte values as the access handle.
  8. A large-scale directory incremental scanning apparatus, comprising: an initial scanning unit, configured to respond to an incremental scanning signal of a target directory and start scanning from a current file in the target directory; a judging unit, configured to judge whether the current scanning task of the target directory meets a task suspension condition, and if so, suspend the current scanning task to release the thread resources of directory scanning, determine the current directory context of the directory being scanned under the current scanning task and the access handle of the target directory, and store the access handle and the current directory context into a continuous scanning task in a task queue; and a continuous scanning unit, configured to judge whether the continuous scanning task meets a scheduling condition, and if so, determine a breakpoint position between scanned files and unscanned files in the target directory based on the access handle and the current directory context, and continue to scan the unscanned files in the target directory based on the breakpoint position.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
  10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when executed by the processor, implements the steps of the method of any one of claims 1 to 7.
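
The handle hand-over described in claim 6 can be pictured with a minimal C++ sketch. Only the name `ScanTask` comes from the claim; the integer handle type `DirHandle`, the sentinel `kInvalidHandle`, the function `close_handle` and the counter `g_close_calls` are hypothetical stand-ins for a real operating-system directory handle and its close call. The sketch shows one common way to get the claimed behaviour: a move-only type whose destructor closes the handle and whose copy constructor is deleted.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical handle type; a real implementation would wrap an OS handle.
using DirHandle = int;
constexpr DirHandle kInvalidHandle = -1;

// Counts close calls so the sketch can be checked without a real file system.
static int g_close_calls = 0;

// Hypothetical stand-in for the OS call that closes a directory handle.
void close_handle(DirHandle /*handle*/) { ++g_close_calls; }

struct ScanTask {
    DirHandle handle;

    explicit ScanTask(DirHandle h) : handle(h) {}

    // Copying is prohibited so that two tasks can never own the same handle.
    ScanTask(const ScanTask&) = delete;
    ScanTask& operator=(const ScanTask&) = delete;

    // Moving transfers ownership and leaves the source without a handle.
    ScanTask(ScanTask&& other) noexcept
        : handle(std::exchange(other.handle, kInvalidHandle)) {}

    ScanTask& operator=(ScanTask&& other) noexcept {
        if (this != &other) {
            release();  // close whatever this task owned before
            handle = std::exchange(other.handle, kInvalidHandle);
        }
        return *this;
    }

    // The destructor closes the handle if this task still owns one.
    ~ScanTask() { release(); }

private:
    void release() {
        if (handle != kInvalidHandle) {
            close_handle(handle);
            handle = kInvalidHandle;
        }
    }
};

static_assert(!std::is_copy_constructible<ScanTask>::value,
              "ScanTask must not be copyable");
```

With this shape, `ScanTask next(std::move(current));` hands the handle to `next`, and the handle is closed exactly once, when its final owner is destroyed.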

Description

Large-scale directory incremental scanning method, device, storage medium and computer equipment

Technical Field

The present invention relates to the field of information processing technologies, and in particular to a method and apparatus for incremental scanning of a large-scale directory, a storage medium, and a computer device.

Background

In file system operations, scanning large-scale directories (e.g., 100,000+ files) is a common requirement. Currently, such a directory is typically scanned continuously until the scan completes. However, continuous scanning occupies a thread for a long time, which blocks the task queue and drastically reduces system responsiveness; in severe cases the system itself fails, which disrupts the directory scan in progress.

Disclosure of Invention

The invention provides a large-scale directory incremental scanning method, a device, a storage medium and computer equipment, mainly aiming to improve system responsiveness and to ensure stable execution of the task queue and stable operation of the system, thereby ensuring the stability and accuracy of directory scanning.
According to a first aspect of the present invention, there is provided a large-scale directory incremental scanning method comprising: responding to an incremental scanning signal of a target directory, and starting scanning from a current file in the target directory; judging whether a current scanning task of the target directory meets a task suspension condition, and if so, suspending the current scanning task to release thread resources of directory scanning, determining a current directory context of the directory being scanned under the current scanning task and an access handle of the target directory, and storing the access handle and the current directory context into a continuous scanning task in a task queue; and judging whether the continuous scanning task meets a scheduling condition, and if so, determining a breakpoint position between scanned files and unscanned files in the target directory based on the access handle and the current directory context, and continuing to scan the unscanned files in the target directory based on the breakpoint position.
Optionally, judging whether the current scanning task of the target directory meets a task suspension condition includes: during execution of the current scanning task, judging whether the number of scanned files in the target directory is greater than a preset number threshold, and/or judging whether a high-priority task needs to be scheduled for execution; and if the number of scanned files is greater than the preset number threshold and/or a high-priority task needs to be scheduled for execution, determining that the current scanning task of the target directory meets the task suspension condition.
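
The suspend-and-resume flow above can be sketched in C++ under some loudly simplifying assumptions. The directory is modelled as an in-memory list of file names, the access handle as a pointer to that list, and the directory context as the index of the next unscanned entry (the breakpoint position). All names (`Directory`, `ContinuationTask`, `ScanState`, `scan_slice`, `resume_next`) are hypothetical, the file count is taken per scanning slice, and the scheduling check before resumption is left out for brevity.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a real directory: an in-memory list of names.
using Directory = std::vector<std::string>;

// Hypothetical continuation task stored in the task queue.
struct ContinuationTask {
    const Directory* handle;   // stands in for the access handle
    std::size_t next_index;    // directory context: first unscanned entry
};

struct ScanState {
    std::vector<std::string> scanned;    // files visited so far
    std::deque<ContinuationTask> queue;  // pending continuation tasks
};

// Scans from `start` until the directory ends or the suspension condition
// holds. Returns true once the whole directory has been scanned.
bool scan_slice(const Directory& dir, std::size_t start,
                std::size_t count_threshold, bool high_priority_pending,
                ScanState& state) {
    std::size_t scanned_in_slice = 0;
    for (std::size_t i = start; i < dir.size(); ++i) {
        // Suspension condition: count above the threshold and/or a
        // high-priority task is waiting to be scheduled.
        if (scanned_in_slice > count_threshold || high_priority_pending) {
            state.queue.push_back({&dir, i});  // remember the breakpoint
            return false;                      // give the thread back
        }
        state.scanned.push_back(dir[i]);
        ++scanned_in_slice;
    }
    return true;
}

// Takes the next continuation task and resumes from its breakpoint.
bool resume_next(std::size_t count_threshold, bool high_priority_pending,
                 ScanState& state) {
    if (state.queue.empty()) return true;
    ContinuationTask task = state.queue.front();
    state.queue.pop_front();
    return scan_slice(*task.handle, task.next_index, count_threshold,
                      high_priority_pending, state);
}
```

A caller would invoke `scan_slice` once and then call `resume_next` whenever the scheduler allows, until it returns true. Each slice scans only files that earlier slices did not reach.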
Optionally, judging whether the continuous scanning task meets a scheduling condition includes: determining thread pool state information of a thread pool corresponding to the continuous scanning task, system memory state information, and task priority information in the task queue; determining a thread pool evaluation parameter based on the thread pool state information, determining a memory evaluation parameter based on the system memory state information, and determining a priority evaluation parameter based on the task priority information; determining weight coefficients corresponding respectively to the thread pool evaluation parameter, the memory evaluation parameter and the priority evaluation parameter, performing a weighted summation of the three parameters based on the weight coefficients, and taking the weighted summation result as a scheduling feasibility index; and judging whether the scheduling feasibility index is greater than a preset index threshold, and if so, determining that the continuous scanning task meets the scheduling condition, otherwise determining that the continuous scanning task does not meet the scheduling condition.
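
The scheduling check above reduces to a weighted sum compared against a threshold, which a short C++ sketch can make concrete. The type and function names (`EvaluationParams`, `Weights`, `feasibility_index`, `meets_scheduling_condition`) are hypothetical, and the text does not specify weight or threshold values, so any numbers passed in are illustrative only.

```cpp
// Hypothetical container for the three evaluation parameters.
struct EvaluationParams {
    double thread_pool;  // thread pool evaluation parameter
    double memory;       // memory evaluation parameter
    double priority;     // priority evaluation parameter
};

// Hypothetical container for the corresponding weight coefficients.
struct Weights {
    double thread_pool;
    double memory;
    double priority;
};

// Weighted summation of the three parameters: the scheduling feasibility index.
double feasibility_index(const EvaluationParams& p, const Weights& w) {
    return w.thread_pool * p.thread_pool
         + w.memory * p.memory
         + w.priority * p.priority;
}

// The task is schedulable only if the index is strictly greater than the
// preset index threshold.
bool meets_scheduling_condition(const EvaluationParams& p, const Weights& w,
                                double index_threshold) {
    return feasibility_index(p, w) > index_threshold;
}
```

For example, parameters of 0.5, 1.0 and 0.25 with weights of 0.5, 0.25 and 0.25 give an index of 0.25 + 0.25 + 0.0625 = 0.5625, which passes a threshold of 0.5 but not one of 0.6.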
Optionally, the thread pool state information includes the number of idle threads and the total capacity of the thread pool, the system memory state information includes the remaining memory capacity, and the task priority information includes the priority of the task at the head of the task queue and the priority of the continuous scanning task. Determining the thread pool evaluation parameter based on the thread pool state information includes: taking the ratio of the number of idle threads to the total capacity of the thread pool as the thread pool evaluation parameter. Determining the memory evaluation parameter based on the system memory state information includes: determining a memory requirement threshold value required for executing the continuous scanning task, and taking the ratio between the residual me