
CN-121997896-A - Audit report automatic generation method based on natural language processing


Abstract

The invention discloses a method for automatically generating audit reports based on natural language processing, in the technical field of electronic data processing. The method comprises: collecting multi-source heterogeneous data and, according to approval records, time information and document identifiers, establishing correspondences among audit items, audit evidence, regulatory basis and approval states to generate audit association data; constructing an audit evidence constraint graph from the audit association data, calculating the evidence support degree, conclusion conflict degree and risk propagation strength of each audit item, and generating a core audit item set; extracting the evidence nodes, regulatory basis nodes and risk analysis results corresponding to each audit item in the core audit item set, and combining and organizing them to generate audit report paragraph data; and performing consistency matching between the audit report paragraph data and the corresponding audit evidence and regulatory basis to generate an initial draft of the audit report. The invention improves the evidence correspondence, regulatory suitability and conclusion consistency of paragraph content during audit report generation.
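
The scoring and screening step can be illustrated with a minimal Python sketch. The text gives no formulas or thresholds, so the `ItemScores` fields, the assumed [0, 1] score range, the threshold values and the function name `select_core_items` are all hypothetical. Only the two-stage order (first keep well-supported, low-conflict items, then keep the high-risk ones among them) follows the screening described in claim 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ItemScores:
    item_id: str
    support: float   # evidence support degree (assumed to lie in [0, 1])
    conflict: float  # conclusion conflict degree (assumed to lie in [0, 1])
    risk: float      # risk propagation strength (assumed to lie in [0, 1])


def select_core_items(
    scores: List[ItemScores],
    min_support: float = 0.7,   # hypothetical threshold
    max_conflict: float = 0.3,  # hypothetical threshold
    min_risk: float = 0.5,      # hypothetical threshold
) -> List[ItemScores]:
    # Stage 1: keep items with high evidence support and low conclusion conflict.
    reliable = [
        s for s in scores
        if s.support >= min_support and s.conflict <= max_conflict
    ]
    # Stage 2: among those, keep items with high risk propagation strength.
    core = [s for s in reliable if s.risk >= min_risk]
    # Present the riskiest items first.
    return sorted(core, key=lambda s: s.risk, reverse=True)


scores = [
    ItemScores("A-01", support=0.90, conflict=0.10, risk=0.8),
    ItemScores("A-02", support=0.90, conflict=0.60, risk=0.9),  # too much conflict
    ItemScores("A-03", support=0.80, conflict=0.20, risk=0.2),  # low risk
    ItemScores("A-04", support=0.75, conflict=0.25, risk=0.6),
]
core_ids = [s.item_id for s in select_core_items(scores)]
print(core_ids)  # ['A-01', 'A-04']
```

Filtering on support and conflict before risk means that a high-risk item with contradictory evidence (such as `A-02` here) is held back rather than written into the report.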

Inventors

  • SONG ZHENGCHEN
  • WU SHAOHUA
  • LI TONGWANG
  • GUO JIAYU
  • BAI JINXING

Assignees

  • Xiamen Meiya Yian Information Technology Co., Ltd. (厦门美亚亿安信息科技有限公司)

Dates

Publication Date
20260508
Application Date
20260407

Claims (9)

  1. A method for automatically generating an audit report based on natural language processing, characterized by comprising the following steps:
     collecting multi-source heterogeneous data, and establishing correspondences among audit items, audit evidence, regulatory basis and approval states according to approval records, time information and document identifiers, to generate audit association data;
     constructing an audit evidence constraint graph from the audit association data, calculating the evidence support degree, conclusion conflict degree and risk propagation strength of each audit item, and generating a core audit item set, which specifically comprises:
       identifying the different types of nodes and edges in the audit evidence constraint graph, and configuring corresponding information transfer weights and constraint propagation weights for them to obtain a propagation parameter set;
       according to the propagation parameter set, propagating approval constraint information and regulatory constraint information through the audit evidence constraint graph to obtain a constraint input, and propagating audit evidence information to obtain a support input;
       according to the constraint input and the support input, calculating the evidence support degree, conclusion conflict degree and risk propagation strength of each audit item, and screening, from the audit items with high evidence support degree and low conclusion conflict degree, those with high risk propagation strength to generate the core audit item set;
     extracting the evidence nodes, regulatory basis nodes and risk analysis results corresponding to each audit item from the core audit item set, and combining and organizing them to generate audit report paragraph data;
     performing consistency matching between the audit report paragraph data and the corresponding audit evidence and regulatory basis to generate an initial draft of the audit report; and
     according to a preset audit rule base, performing an evidence chain integrity check, a regulatory suitability check and a conclusion consistency check on the initial draft and, when a check fails, performing backtracking correction and re-checking on the core audit item set and the audit report paragraph data, repeating this while the number of backtracking corrections is below a preset count threshold, to generate a verified audit report.
  2. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein the multi-source heterogeneous data comprises financial voucher data containing approval records, time information and document identifiers, business contract data, approval flow data, audit manuscript data, regulatory text data and historical audit report data.
  3. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein generating the audit association data comprises the following steps:
     segmenting the content of the financial voucher data, business contract data, approval flow data, audit manuscript data, regulatory text data and historical audit report data to obtain fine-grained audit evidence segments;
     aligning the fine-grained audit evidence segments across documents, and identifying associated evidence segments that point to the same audit item;
     extracting the corresponding page number position, table coordinate position, approval level position and clause reference position from the associated evidence segments to generate evidence positioning information; and
     packaging the associated evidence segments together with the evidence positioning information to generate the audit association data.
  4. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein constructing the audit evidence constraint graph comprises the following steps:
     extracting audit items, abnormal indicators, business behaviors, audit evidence, regulatory basis, approval states and audit conclusions from the audit association data, and establishing a node of the corresponding type for each to obtain a node set;
     identifying amount verification relations, time sequence relations, approval dependency relations, regulatory applicability relations and conclusion citation relations from the audit association data, and establishing an edge of the corresponding type for each to obtain a relation edge set; and
     according to the node set and the relation edge set, establishing evidence edges between audit evidence items that point to the same abnormal indicator but come from different sources, establishing evidence conflict edges between audit evidence items that describe opposite facts, and thereby constructing the audit evidence constraint graph.
  5. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein generating the audit report paragraph data comprises the following steps:
     extracting the audit subject information, evidence basis content, regulatory constraint content and risk analysis results corresponding to each audit item from the core audit item set to obtain a paragraph element set;
     combining and organizing the paragraph element set into a paragraph content set, and supplementing it with pending-verification notes and conflict notices; and
     establishing a mapping between the paragraph content set and the corresponding audit evidence and regulatory basis to generate the audit report paragraph data.
  6. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein generating the initial draft of the audit report comprises the following steps:
     extracting, from the audit report paragraph data, the descriptive information of each paragraph and the identification information in the corresponding audit evidence to obtain a matching element set;
     according to the matching element set, performing consistency matching between the descriptive content in the audit report paragraph data and the fact semantics, key fields and citation markers in the corresponding audit evidence to obtain a consistency matching result; and
     writing the consistency matching result into the corresponding citation markers, appendix table markers and clause markers to generate the initial draft of the audit report.
  7. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein performing backtracking correction and re-checking on the core audit item set and the audit report paragraph data, repeated while the number of backtracking corrections is below the preset count threshold, comprises the following steps:
     identifying the target paragraphs that failed the check in the initial draft of the audit report, and generating a backtracking association result according to the mapping between the target paragraphs and their audit evidence and regulatory basis and the mapping between audit items and their audit evidence and regulatory basis;
     according to the backtracking association result, recalculating the evidence support degree, conclusion conflict degree and evidence gap degree of the associated audit items, and correcting the core audit item set according to the recalculated values to generate an item correction result; and
     regenerating the audit report paragraph data for the target paragraphs according to the item correction result, replacing the target paragraphs in the initial draft of the audit report, and re-checking the updated draft to obtain a re-check result.
  8. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein the preset audit rule base is a set of rule entries, each rule entry comprising an audit item type, an applicability condition, a verification condition and a correction action, and being generated by analyzing the correspondence among historical finalized audit reports, the corresponding audit manuscripts and historical returned-for-revision records.
  9. The method for automatically generating an audit report based on natural language processing according to claim 1, wherein generating the verified audit report comprises the following steps:
     performing an evidence chain integrity check on each audit conclusion statement in the initial draft of the audit report, analyzing the correspondence among each audit conclusion statement, the audit evidence nodes and the regulatory basis nodes, and generating an evidence chain integrity check result;
     performing a regulatory suitability check on each audit conclusion statement in the initial draft, analyzing how well the cited regulatory basis node fits the audited entity type, the audit period and the abnormal fact type, and generating a regulatory suitability check result;
     performing a conclusion consistency check on each audit conclusion statement in the initial draft, analyzing the semantic relations among the multiple audit conclusion statements that correspond to the same audit item, and generating a conclusion consistency check result;
     identifying abnormal audit conclusion statements according to the evidence chain integrity check result, the regulatory suitability check result and the conclusion consistency check result, and performing backtracking correction on the related audit items to obtain backtracking-corrected audit items;
     updating the initial draft according to the backtracking-corrected audit items, and generating the verified audit report when the re-check passes;
     when the re-check fails and the number of backtracking corrections has not reached the preset count threshold, continuing to perform backtracking correction and re-checking on the core audit item set and the audit report paragraph data; and
     when the number of backtracking corrections reaches the preset count threshold, outputting the current initial draft as the verified audit report.
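
The bounded check-and-correct loop of claims 1, 7 and 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: a draft is assumed to be a list of paragraph strings, the three checks are reduced to toy string predicates, and `verify_with_backtracking`, `toy_regenerate` and the `[E:...]` / `[R:...]` markers are hypothetical names. Only the control flow (re-check after each correction, stop on success or at the count threshold, output the current draft at the threshold) follows the claims.

```python
from typing import Callable, List, Tuple

Check = Callable[[str], bool]


def verify_with_backtracking(
    draft: List[str],
    checks: List[Check],
    regenerate: Callable[[str], str],
    max_corrections: int = 3,  # the preset count threshold (value is illustrative)
) -> Tuple[List[str], bool, int]:
    """Return (report, passed, number of backtracking corrections performed)."""
    report = list(draft)
    corrections = 0
    while True:
        # Target paragraphs are those failing at least one check.
        failing = [i for i, p in enumerate(report) if not all(c(p) for c in checks)]
        if not failing:
            return report, True, corrections
        if corrections >= max_corrections:
            # Threshold reached: output the current draft as it stands.
            return report, False, corrections
        for i in failing:
            report[i] = regenerate(report[i])
        corrections += 1


# Toy stand-ins for the three checks named in the claims.
def has_evidence(p: str) -> bool:   # evidence chain integrity
    return "[E:" in p

def has_rule(p: str) -> bool:       # regulatory suitability
    return "[R:" in p

def no_conflict(p: str) -> bool:    # conclusion consistency
    return "CONFLICT" not in p

def toy_regenerate(p: str) -> str:
    # Hypothetical correction that repairs one defect per round.
    if "[E:" not in p:
        return p + " [E:pending]"
    if "[R:" not in p:
        return p + " [R:pending]"
    return p.replace("CONFLICT", "resolved")


draft = [
    "Item A overstated revenue. [E:v-17] [R:art-12]",
    "Item B lacks approval.",
]
checks = [has_evidence, has_rule, no_conflict]
final, passed, rounds = verify_with_backtracking(draft, checks, toy_regenerate)
print(passed, rounds)  # True 2
```

Returning the `passed` flag alongside the report lets a caller tell a report that passed every check apart from one that was emitted only because the correction budget ran out.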

Description

Audit report automatic generation method based on natural language processing

Technical Field

The invention relates to the technical field of electronic data processing, and in particular to a method for automatically generating audit reports based on natural language processing.

Background

With the continuing development of financial informatization, electronic working-paper management and digital compliance review, audit work now takes place in a multi-source data environment covering financial vouchers, business contracts, approval flows, regulatory texts and historical reports, and natural language processing, information extraction and structured association analysis are increasingly applied to audit data integration and assisted report generation. In existing automatic audit report generation technology, the evidence, regulatory basis and approval states are scattered, so it is difficult to build a stable constraint association around a single audit item. As a result, the chain of supporting evidence is incomplete when audit conclusions are distilled, which in turn harms the logical consistency, regulatory suitability and traceability of the report content.

Disclosure of Invention

The present invention has been made in view of the above problems in the prior art. It provides a method for automatically generating audit reports based on natural language processing, which solves the problem that multi-source audit evidence and regulatory constraints are difficult to assemble into a consistent support chain around a single audit item.

In order to solve the above technical problems, the invention provides the following technical scheme:

The invention provides a method for automatically generating an audit report based on natural language processing, comprising: collecting multi-source heterogeneous data, and establishing correspondences among audit items, audit evidence, regulatory basis and approval states according to approval records, time information and document identifiers, to generate audit association data; constructing an audit evidence constraint graph from the audit association data, calculating the evidence support degree, conclusion conflict degree and risk propagation strength of each audit item, and generating a core audit item set; extracting the evidence nodes, regulatory basis nodes and risk analysis results corresponding to each audit item from the core audit item set, and combining and organizing them to generate audit report paragraph data; performing consistency matching between the audit report paragraph data and the corresponding audit evidence and regulatory basis to generate an initial draft of the audit report; and, according to a preset audit rule base, performing an evidence chain integrity check, a regulatory suitability check and a conclusion consistency check on the initial draft, performing backtracking correction and re-checking when a check fails, repeating this while the number of backtracking corrections is below a preset count threshold, and generating a verified audit report.

As a preferable scheme of the method, the multi-source heterogeneous data comprises financial voucher data containing approval records, time information and document identifiers, business contract data, approval flow data, audit manuscript data, regulatory text data and historical audit report data.
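
The association step summarized above can be sketched in Python. The record layout is an assumption made for illustration: the `SourceRecord` fields, the source labels, and the rule that the most recent approval record determines the approval state are hypothetical and are not specified in the text. The sketch only shows how a shared document identifier and time information can link evidence from different sources to an approval state.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SourceRecord:
    source: str        # hypothetical label: "voucher", "contract", "approval_flow"
    doc_id: str        # document identifier shared across sources
    recorded_on: date  # time information
    text: str          # extracted content
    approval_state: Optional[str] = None  # set only on approval-flow records


def associate(records: List[SourceRecord]) -> Dict[str, dict]:
    # Group records from all sources by their document identifier.
    grouped = defaultdict(list)
    for record in records:
        grouped[record.doc_id].append(record)

    associated = {}
    for doc_id, group in grouped.items():
        ordered = sorted(group, key=lambda r: r.recorded_on)
        states = [r.approval_state for r in ordered if r.approval_state is not None]
        associated[doc_id] = {
            # Non-approval records are treated as evidence, in time order.
            "evidence": [(r.source, r.text) for r in ordered if r.approval_state is None],
            # Assumption: the most recent approval record gives the current state.
            "approval_state": states[-1] if states else None,
        }
    return associated


records = [
    SourceRecord("voucher", "V-1", date(2026, 1, 5), "Payment of 12,000 to supplier X"),
    SourceRecord("approval_flow", "V-1", date(2026, 1, 3), "Submitted by clerk", "submitted"),
    SourceRecord("approval_flow", "V-1", date(2026, 1, 6), "Approved by manager", "approved"),
    SourceRecord("contract", "C-9", date(2026, 1, 2), "Supply contract with supplier X"),
]
linked = associate(records)
print(linked["V-1"]["approval_state"])  # approved
```

A document with no approval record, such as `C-9` here, keeps an approval state of `None`, which a later check could treat as an evidence gap.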
As a preferable scheme of the method, generating the audit association data comprises the following steps: segmenting the content of the financial voucher data, business contract data, approval flow data, audit manuscript data, regulatory text data and historical audit report data to obtain fine-grained audit evidence segments; aligning the fine-grained audit evidence segments across documents, and identifying associated evidence segments that point to the same audit item; extracting the corresponding page number position, table coordinate position, approval level position and clause reference position from the associated evidence segments to generate evidence positioning information; and packaging the associated evidence segments together with the evidence positioning information to generate the audit association data.

As a preferable scheme of the method, constructing the audit evidence constraint graph comprises the following steps: extracting audit items, abnormal indicators, business behaviors, audit evidence, regulatory basis, approval states and audit conclusions from the audit association data, and establishing a node of the corresponding type for each to obtain a node set; identifying amount verification relations, time sequence relations, approval dependency relations, regulatory applicability relations and conclusion citation relations from the audit association data, and establishing corresponding type edges t