CN-122020722-A - File sensitive information processing method, system, computer and storage medium
Abstract
The invention relates to the technical field of information processing and provides a file sensitive information processing method, system, computer and storage medium. The method comprises: acquiring a first task data set and converting a file to be processed into a plurality of file pictures to be processed; judging, based on the second task data set, whether to start header and footer detection and, if so, detecting and generating a plurality of area detection results; judging, based on the third task data set, whether to start seal detection and, if so, detecting and generating a plurality of seal detection results; judging, based on the fourth task data set, whether to start text detection and, if so, detecting and generating a plurality of sensitive text detection results; and judging, based on the fifth task data set, whether to start fuzzy (blurring) processing and, if so, generating a fuzzy file. The method avoids the problems of low blinding efficiency, inaccurate blinding and the risk of human leakage.
Inventors
- ZHANG HUA
- ZENG JIN
- XIE QIANQIAN
- HU CHAO
- Xing Minggang
- LI XIN
Assignees
- 江西博微新技术有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260410
Claims (9)
- 1. A file sensitive information processing method, characterized by comprising the following steps: acquiring a first task data set, acquiring a file to be processed according to the first task data set, converting the file to be processed into a plurality of file pictures to be processed, and updating the first task data set into a second task data set; judging whether to start header and footer detection based on the second task data set, and if so, detecting a plurality of header and footer areas in the plurality of file pictures to be processed, generating a plurality of area detection results, and updating the second task data set into a third task data set; judging whether to start seal detection based on the third task data set, and if so, detecting a plurality of seals in the plurality of file pictures to be processed, generating a plurality of seal detection results, and updating the third task data set into a fourth task data set; judging whether to start text detection based on the fourth task data set, and if so, detecting a plurality of sensitive texts in the plurality of file pictures to be processed, generating a plurality of sensitive text detection results, and updating the fourth task data set into a fifth task data set; judging whether to start fuzzy processing based on the fifth task data set, and if so, selecting a plurality of to-be-fuzzy areas from the plurality of file pictures to be processed based on the plurality of area detection results, the plurality of seal detection results and the plurality of sensitive text detection results, converting the plurality of to-be-fuzzy areas into a plurality of fuzzy areas so as to convert the plurality of file pictures to be processed into a plurality of fuzzy file pictures, and merging the plurality of fuzzy file pictures to generate a fuzzy file.
- 2. The method for processing file sensitive information according to claim 1, wherein the first task data set includes a bid information subset, a first current page number and a first current execution time, the bid information subset includes a file path, a file page number, a blind evaluation identifier and bid sensitive information, and the step of acquiring a file to be processed according to the first task data set, converting the file to be processed into a plurality of file pictures to be processed, and updating the first task data set into a second task data set includes: judging whether to perform blind evaluation according to the blind evaluation identifier, and if so, acquiring the file to be processed according to the file path; dividing the file to be processed into a plurality of file pages to be processed according to the file page number; converting each file page to be processed into a file picture to be processed, recording the converted page number and the conversion time, updating the first current page number to a second current page number based on the converted page number, updating the first current execution time to a second current execution time based on the conversion time, and displaying the second current page number and the second current execution time on a user interface; and forming the second task data set from the bid information subset, the second current page number and the second current execution time.
- 3. The method for processing file sensitive information according to claim 2, wherein the step of judging whether to start header and footer detection based on the second task data set, and if so, detecting a plurality of header and footer areas in the plurality of file pictures to be processed, generating a plurality of area detection results, and updating the second task data set into a third task data set includes: comparing the second current page number with the file page number, and if the second current page number is equal to the file page number, starting header and footer detection; detecting a header and footer area in the file picture to be processed, generating an area detection result, recording an area detection page number and an area detection time, updating the second current page number to a third current page number based on the area detection page number, updating the second current execution time to a third current execution time based on the area detection time, and displaying the third current page number and the third current execution time on the user interface; and forming the third task data set from the bid information subset, the third current page number and the third current execution time.
- 4. The method for processing file sensitive information according to claim 3, wherein the step of judging whether to start seal detection based on the third task data set, and if so, detecting a plurality of seals in the plurality of file pictures to be processed, generating a plurality of seal detection results, and updating the third task data set into a fourth task data set includes: comparing the third current page number with the file page number, and if the third current page number is equal to the file page number, starting seal detection; invoking an image recognition algorithm, detecting a seal in the file picture to be processed, generating a seal detection result, recording a seal detection page number and a seal detection time, updating the third current page number to a fourth current page number based on the seal detection page number, updating the third current execution time to a fourth current execution time based on the seal detection time, and displaying the fourth current page number and the fourth current execution time on the user interface; and forming the fourth task data set from the bid information subset, the fourth current page number and the fourth current execution time.
- 5. The method for processing file sensitive information according to claim 4, wherein the step of judging whether to start text detection based on the fourth task data set, and if so, detecting a plurality of sensitive texts in the plurality of file pictures to be processed, generating a plurality of sensitive text detection results, and updating the fourth task data set into a fifth task data set includes: comparing the fourth current page number with the file page number, and if the fourth current page number is equal to the file page number, starting text detection; detecting sensitive text in the file picture to be processed based on the bid sensitive information in the bid information subset, generating a sensitive text detection result, recording a text detection page number and a text detection time, updating the fourth current page number to a fifth current page number based on the text detection page number, updating the fourth current execution time to a fifth current execution time based on the text detection time, and displaying the fifth current page number and the fifth current execution time on the user interface; and forming the fifth task data set from the bid information subset, the fifth current page number and the fifth current execution time.
- 6. The method for processing file sensitive information according to claim 5, wherein the step of judging whether to start fuzzy processing based on the fifth task data set is specifically: comparing the fifth current page number with the file page number, and if the fifth current page number is equal to the file page number, starting fuzzy processing.
- 7. A file sensitive information processing system, applied to the file sensitive information processing method according to any one of claims 1 to 6, characterized in that the system comprises: an acquisition module, used for acquiring a first task data set, acquiring a file to be processed according to the first task data set, converting the file to be processed into a plurality of file pictures to be processed, and updating the first task data set into a second task data set; a first detection module, used for judging whether to start header and footer detection based on the second task data set, and if so, detecting a plurality of header and footer areas in the plurality of file pictures to be processed, generating a plurality of area detection results, and updating the second task data set into a third task data set; a second detection module, used for judging whether to start seal detection based on the third task data set, and if so, detecting a plurality of seals in the plurality of file pictures to be processed, generating a plurality of seal detection results, and updating the third task data set into a fourth task data set; a third detection module, used for judging whether to start text detection based on the fourth task data set, and if so, detecting a plurality of sensitive texts in the plurality of file pictures to be processed, generating a plurality of sensitive text detection results, and updating the fourth task data set into a fifth task data set; and a fuzzy processing module, used for judging whether to start fuzzy processing based on the fifth task data set, and if so, selecting a plurality of to-be-fuzzy areas from the plurality of file pictures to be processed based on the plurality of area detection results, the plurality of seal detection results and the plurality of sensitive text detection results, converting the plurality of to-be-fuzzy areas into a plurality of fuzzy areas so as to convert the plurality of file pictures to be processed into a plurality of fuzzy file pictures, and merging the plurality of fuzzy file pictures to generate a fuzzy file.
- 8. A computer comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of processing file sensitive information according to any one of claims 1 to 6 when executing the computer program.
- 9. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the file sensitive information processing method according to any one of claims 1 to 6.
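The staged flow recited in claims 1 to 6 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: `TaskData`, `stage_may_start`, `run_stage`, `run_pipeline` and `find_terms` are hypothetical names, pages are plain strings standing in for page pictures, and the detector is a simple substring match rather than any recognition algorithm named in the source. What it does show is the gating idea: each stage starts only when the current page number equals the file page number, and each stage produces an updated task data set.

```python
import time
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class TaskData:
    """Hypothetical task data set: bid information plus progress fields."""
    file_path: str
    file_page_count: int              # "file page number" in the claims
    blind_review: bool                # blind evaluation identifier
    sensitive_terms: Tuple[str, ...]  # bid sensitive information
    current_page: int = 0             # pages completed by the latest stage
    elapsed_seconds: float = 0.0      # cumulative execution time


Detector = Callable[[str, TaskData], list]


def stage_may_start(task: TaskData) -> bool:
    """Gate from claims 3 to 6: previous stage must have covered every page."""
    return task.current_page == task.file_page_count


def run_stage(task: TaskData, pages: Sequence[str],
              detect: Detector) -> Tuple[TaskData, List[list]]:
    """Run one detector over all pages and return the updated task data set."""
    started = time.perf_counter()
    results = [detect(page, task) for page in pages]
    elapsed = time.perf_counter() - started
    updated = replace(task, current_page=len(results),
                      elapsed_seconds=task.elapsed_seconds + elapsed)
    return updated, results


def run_pipeline(task: TaskData, pages: Sequence[str],
                 stages: Sequence[Tuple[str, Detector]]
                 ) -> Tuple[TaskData, Dict[str, List[list]]]:
    """Chain the stages; stop as soon as a gate check fails."""
    results: Dict[str, List[list]] = {}
    if not task.blind_review:
        return task, results
    # Assumption: page conversion already happened, so record its page count.
    task = replace(task, current_page=len(pages))
    for name, detect in stages:
        if not stage_may_start(task):
            break
        task, results[name] = run_stage(task, pages, detect)
    return task, results


def find_terms(page: str, task: TaskData) -> List[str]:
    """Toy text detector: report which sensitive terms occur on the page."""
    return [term for term in task.sensitive_terms if term in page]
```

Calling `run_pipeline(TaskData("bid.pdf", 2, True, ("ACME",)), ["ACME Corp tender", "technical plan"], [("text", find_terms)])` runs the single text stage over both pages. Using an immutable data set with `replace` mirrors the claims' wording, in which each stage "updates" one task data set into the next rather than mutating it in place.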
Description
File sensitive information processing method, system, computer and storage medium
Technical Field
The present invention relates to the field of information processing technologies, and in particular to a method, a system, a computer, and a storage medium for processing file sensitive information.
Background
Tendering and bidding business places increasingly high demands on standardized, lean management and on the fairness and transparency of the bid evaluation process. The traditional bid evaluation mode is easily limited by expert resources and geography, so that behaviors such as human suggestion and bid manipulation can influence the evaluation result; more and more bid evaluations therefore adopt a blind evaluation mode. In blind evaluation, the experts assess blinded bidding files without knowing the enterprise information, which cuts off ties of interest, avoids subjective bias and eliminates interference from non-technical factors. The bid evaluation experts can concentrate more energy and attention on the quality and technical strength of the bidding scheme and can analyze in depth the feasibility, innovation and advancement of the technical scheme and the reasonableness of the business terms, so that a more professional and accurate evaluation can be made. Fair competition among bidding enterprises is guaranteed, evaluation quality is improved, and public confidence in bidding activities is enhanced.
The existing blinding of bidding documents depends heavily on manual work. Locating sensitive information in a single bidding document takes a long time, because enterprise names, qualification numbers, project leader information and the like must be screened page by page, so the processing time per document is long. Different types of bidding documents have their own specifications and requirements; when the rules are not thoroughly understood or the classification standards are applied inaccurately during manual classification, the professionalism and standardization of the processed documents suffer, and large-scale blind evaluation is difficult to support. Manual desensitization of documents also carries the risk of incomplete identity hiding and human leakage. Existing blinding of documents therefore suffers from low efficiency, inaccurate blinding and the risk of human leakage.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method, a system, a computer and a storage medium for processing file sensitive information. By performing multistage, comprehensive detection and batch blurring of the sensitive information in a bidding document, the invention can automatically blind sensitive identifiers, text and other information in the document. The invention aims to solve the technical problems of low efficiency, inaccurate blinding and the risk of human leakage in bid file processing in the prior art.
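The batch blurring idea can be illustrated with a small Python sketch. It rests on several assumptions that are not in the source: a page picture is modeled as a grayscale grid (a list of rows of integers), detection results are rectangles `(left, top, right, bottom)` with exclusive right and bottom edges, and the "blur" is a crude mean fill that replaces every pixel in a rectangle with the rectangle's average. The names `collect_boxes`, `mean_fill` and `blur_page` are hypothetical, and the source does not state which blurring algorithm is actually used.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
Image = List[List[int]]          # grayscale rows, one int per pixel


def collect_boxes(*detections: List[Box]) -> List[Box]:
    """Merge boxes from several detectors, dropping exact duplicates."""
    merged: List[Box] = []
    for group in detections:
        for box in group:
            if box not in merged:
                merged.append(box)
    return merged


def mean_fill(image: Image, box: Box) -> None:
    """Overwrite the boxed area in place with its average intensity."""
    height = len(image)
    width = len(image[0]) if image else 0
    left, top, right, bottom = box
    # Clamp the box to the image so out-of-range detections are harmless.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        return
    total = sum(image[y][x] for y in range(top, bottom)
                for x in range(left, right))
    mean = total // ((right - left) * (bottom - top))
    for y in range(top, bottom):
        for x in range(left, right):
            image[y][x] = mean


def blur_page(image: Image, boxes: List[Box]) -> Image:
    """Return a copy of the page with every listed area mean-filled."""
    blurred = [row[:] for row in image]
    for box in boxes:
        mean_fill(blurred, box)
    return blurred
```

A real pipeline would more likely apply a Gaussian blur or an opaque mask through an imaging library; the mean fill is used here only because it needs no dependencies and makes the effect easy to check by hand.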
In order to achieve the above object, the present invention is achieved by the following technical solution: a file sensitive information processing method comprising the following steps: acquiring a first task data set, acquiring a file to be processed according to the first task data set, converting the file to be processed into a plurality of file pictures to be processed, and updating the first task data set into a second task data set; judging whether to start header and footer detection based on the second task data set, and if so, detecting a plurality of header and footer areas in the plurality of file pictures to be processed, generating a plurality of area detection results, and updating the second task data set into a third task data set; judging whether to start seal detection based on the third task data set, and if so, detecting a plurality of seals in the plurality of file pictures to be processed, generating a plurality of seal detection results, and updating the third task data set into a fourth task data set; judging whether to start text detection based on the fourth task data set, and if so, detecting a plurality of sensitive texts in the plurality of file pictures to be processed, generating a plurality of sensitive text detection results, and updating the fourth task data set into a fifth task data set; judging whether to start fuzzy processing based on the fifth task data set, and if so, selecting a plurality of to-be-fuzzy areas from the plurality of file pictures to be processed based on the plurality of area detection results, the plurality of seal detection results and the plurality of sensitive text detection results, converting the plurality of to-be-fuzzy areas into a plurality of fuzzy areas so as to convert the plurality of file pictures to be processed into a plurality of fuzzy fil