CN-122019639-A - Analysis report generation method and device, electronic equipment and storage medium

Abstract

The application discloses an analysis report generation method and device, electronic equipment, and a storage medium. By establishing a closed-loop feedback mechanism between the business scheme and data quality, the analysis system can adaptively adjust its processing logic in complex scenarios where the original data contains heterogeneous conflicts or quality defects, thereby ensuring the continuity of the automated analysis flow and the robustness of the conclusions in the resulting report.

Inventors

  • XU XIN
  • Shao Ximeng
  • LIU YONGPENG
  • WANG SHIYI
  • LIU XIN
  • WAN GUOHUI
  • ZHANG RONGHUI

Assignees

  • Taikang Insurance Group Co., Ltd. (泰康保险集团股份有限公司)
  • Taikang Asset Management Co., Ltd. (泰康资产管理有限责任公司)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-23

Claims (10)

  1. A method of generating an analysis report, comprising: in response to receiving a natural language task sent by a user, performing semantic parsing on the natural language task to determine an analysis dimension, a business caliber and an expected target of the natural language task; packaging the analysis dimension, the business caliber and the expected target into a structured task scheme; acquiring, through a mapping relation of the structured task scheme, original structured index data for executing the natural language task; eliminating data-source heterogeneity in the original structured index data and outputting a data set to be evaluated from which semantic ambiguity has been removed; acquiring data quality features characterizing the health condition of the data set to be evaluated, and correcting the structured task scheme based on the data quality features to generate a final data processing scheme; and generating a complete analysis report through the final data processing scheme.
  2. The method of claim 1, wherein the step of acquiring, through the mapping relation of the structured task scheme, original structured index data for executing the natural language task comprises: reading the structured task scheme, and generating a data tag and a field requirement representing the analysis dimension, the business caliber and the expected target by retrieving a preset semantic dictionary or a metadata mapping table; and accessing, based on the data tag and the field requirement, an enterprise internal database and/or a metadata center corresponding to the data tag to acquire the original structured index data.
  3. The method of claim 2, wherein the step of eliminating data-source heterogeneity in the original structured index data and outputting the data set to be evaluated from which semantic ambiguity has been removed comprises: acquiring, from the original structured index data, measurement values representing the business attributes and the financial performance of a target analysis object; and performing standardized conversion and caliber alignment on the measurement values using semantic rules, and outputting the data set to be evaluated with semantic ambiguity removed.
  4. The method of claim 3, wherein the step of acquiring data quality features characterizing the health condition of the data set to be evaluated comprises: performing integrity detection on the data set to be evaluated to determine missing fields; and performing quality detection on the data set to be evaluated to identify abnormal values.
  5. The method of claim 4, wherein the step of correcting the structured task scheme based on the data quality features to generate a final data processing scheme comprises: adjusting the priority of the analysis dimension and the expected target of the structured task scheme according to the missing proportion of the missing fields, or logically correcting the structured task scheme according to the severity of the abnormal values, and outputting a verified final data processing scheme.
  6. The method of claim 5, further comprising, prior to the step of generating a complete analysis report through the final data processing scheme: determining a corrected data set associated with the final data processing scheme and a task adjustment record for the corrected data set; and automatically generating and running feature analysis code in an isolated sandbox environment to identify the spatial distribution of abnormality marks in the corrected data set and to generate a profiling report reflecting the health condition of the data set to be evaluated.
  7. The method of claim 6, wherein the step of generating a complete analysis report through the final data processing scheme comprises: taking the profiling report as a decision basis, filling in values for fields whose missing rate is lower than a preset threshold using a preset industry median or a regression estimation method, removing or smoothing abnormal values that exhibit logical conflicts, and outputting cleaned, high-quality analysis sample data that is statistically meaningful; generating analysis code for the high-quality analysis sample data, performing statistical calculation, completing quantitative analysis of the financial analysis information of the target analysis object, and generating a structured calculation result reflecting the actual business status of the target analysis object; and outputting, based on the structured calculation result and the task adjustment record, a complete analysis report containing a description of data limitations, a derivation of business conclusions and multidimensional statistical charts.
  8. An analysis report generating apparatus, comprising: a semantic parsing module, configured to, in response to receiving a natural language task sent by a user, perform semantic parsing on the natural language task to determine an analysis dimension, a business caliber and an expected target of the natural language task; a structured task scheme packaging module, configured to package the analysis dimension, the business caliber and the expected target into a structured task scheme; an original structured index data acquisition module, configured to acquire, through a mapping relation of the structured task scheme, original structured index data for executing the natural language task; a semantic ambiguity elimination module, configured to eliminate data-source heterogeneity in the original structured index data and output a data set to be evaluated from which semantic ambiguity has been removed; a final data processing scheme generation module, configured to acquire data quality features characterizing the health condition of the data set to be evaluated, and to correct the structured task scheme based on the data quality features to generate a final data processing scheme; and a complete analysis report generation module, configured to generate a complete analysis report through the final data processing scheme.
  9. An electronic device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the method of any one of claims 1-7.
  10. A readable storage medium, storing a program or instructions which, when executed by a processor, implement the method of any one of claims 1-7.
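
The quality detection, scheme correction and cleaning steps recited in claims 4, 5 and 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claims do not say how abnormal values are identified, so the interquartile-range rule, the 30% missing-ratio threshold, the field names and the use of the sample median in place of a "preset industry median" are all assumptions made for illustration.

```python
from statistics import median, quantiles


def quality_features(rows, fields):
    """Per-field missing ratio and outlier row indices (IQR rule: an assumption)."""
    features = {}
    for name in fields:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        missing_ratio = 1 - len(present) / len(values)
        outliers = []
        if len(present) >= 4:  # with fewer points, quartiles say little
            q1, _, q3 = quantiles(present, n=4, method="inclusive")
            spread = 1.5 * (q3 - q1)
            outliers = [i for i, v in enumerate(values)
                        if v is not None and not (q1 - spread <= v <= q3 + spread)]
        features[name] = {"missing_ratio": missing_ratio, "outliers": outliers}
    return features


def correct_scheme(scheme, features, max_missing=0.3):
    """Deprioritize dimensions whose missing ratio exceeds max_missing (hypothetical threshold)."""
    kept, deprioritized, record = [], [], []
    for dim in scheme["dimensions"]:
        ratio = features[dim]["missing_ratio"]
        if ratio > max_missing:
            deprioritized.append(dim)
            record.append(f"{dim}: deprioritized, missing ratio {ratio:.0%}")
        else:
            kept.append(dim)
    final = {**scheme, "dimensions": kept, "deprioritized": deprioritized}
    return final, record  # record plays the role of a task adjustment record


def clean_field(rows, name, features):
    """Fill missing values and smooth outliers with the median of the remaining values."""
    bad = set(features[name]["outliers"])
    good = [row[name] for i, row in enumerate(rows)
            if row.get(name) is not None and i not in bad]
    fill = median(good)  # stand-in for the "preset industry median"
    return [fill if (row.get(name) is None or i in bad) else row[name]
            for i, row in enumerate(rows)]


# Hypothetical data set to be evaluated
rows = [
    {"debt_ratio": 0.40, "revenue": 100.0},
    {"debt_ratio": 0.50, "revenue": None},
    {"debt_ratio": 0.45, "revenue": None},
    {"debt_ratio": 0.55, "revenue": 120.0},
    {"debt_ratio": 9.00, "revenue": None},
    {"debt_ratio": None, "revenue": 110.0},
]
scheme = {"dimensions": ["debt_ratio", "revenue"], "target": "credit risk overview"}

features = quality_features(rows, scheme["dimensions"])
final_scheme, record = correct_scheme(scheme, features)
cleaned = clean_field(rows, "debt_ratio", features)
```

In this sketch the quality features feed back into the scheme before any statistics are computed, which is the closed loop the abstract describes: a dimension that is half empty is deprioritized and the reason is recorded, while a mostly complete field is cleaned and kept.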

Description

Analysis report generation method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the technical field of analysis report generation, and in particular to an analysis report generation method, an analysis report generation device, an electronic device, and a readable storage medium.

Background

As financial markets grow more complex, financial analysis (e.g., credit risk assessment, industry research, etc.) increasingly depends on multi-source heterogeneous data. However, two core technical challenges remain in implementing automated data analysis.

First, the semantic heterogeneity of multi-source data makes it difficult to connect the automated processing chain end to end. Financial data is distributed across enterprise internal databases, third-party financial terminals and various external interfaces. Different data sources often differ significantly in the definition, calculation caliber, unit of measurement, or even currency of the same index (e.g., "credit rating" or "liability ratio"), and a large number of business labels exist only as enumerated codes. Existing automated analysis tools often have difficulty resolving this semantic ambiguity accurately, so that after the system acquires multi-source data, logical operations cannot be performed directly because of caliber conflicts.

Second, quality defects in the original data easily cause the automated process to be interrupted or its conclusions to be distorted. In real business scenarios, because of missing disclosures, data entry delays or systematic errors, the collected original index data often contains missing fields or logically abnormal values to varying degrees.
The automated analysis flows of the related art are mostly executed linearly and lack the ability to sense data quality in real time and adjust adaptively. As a result, the analysis conclusions they finally produce often deviate seriously from the actual business situation, creating a significant risk of misleading users when financial analysis conclusions are generated.

Disclosure of Invention

Embodiments of the present invention provide an analysis report generation method, apparatus, electronic device, and readable storage medium to overcome or at least partially solve the above-described problems.

In order to solve the above technical problems, the application is implemented as follows.

In a first aspect, an embodiment of the present application provides an analysis report generation method, including: in response to receiving a natural language task sent by a user, performing semantic parsing on the natural language task to determine an analysis dimension, a business caliber and an expected target of the natural language task; packaging the analysis dimension, the business caliber and the expected target into a structured task scheme; acquiring, through a mapping relation of the structured task scheme, original structured index data for executing the natural language task; eliminating data-source heterogeneity in the original structured index data and outputting a data set to be evaluated from which semantic ambiguity has been removed; acquiring data quality features characterizing the health condition of the data set to be evaluated, and correcting the structured task scheme based on the data quality features to generate a final data processing scheme; and generating a complete analysis report through the final data processing scheme.
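
The early stages of the method described above (packaging a structured task scheme, then aligning calibers across data sources) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the `TaskScheme` fields, the source names, the per-source field names, the unit divisors and the rating code table are all hypothetical, chosen to mirror the "liability ratio" and enumerated-code examples from the Background.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskScheme:
    """Structured task scheme: the parsed result of a natural language task."""
    dimensions: List[str]
    caliber: str
    target: str
    adjustments: List[str] = field(default_factory=list)  # later corrections


# Hypothetical semantic rules: where each source keeps the index and in what unit.
SEMANTIC_RULES = {
    "internal_db": {"field": "liab_ratio_pct", "divisor": 100.0},  # percent
    "vendor_feed": {"field": "debt_to_assets", "divisor": 1.0},    # fraction
}

# Hypothetical enumerated codes used by one source for credit ratings.
RATING_CODES = {"01": "AAA", "02": "AA", "03": "A"}


def align(record: dict) -> dict:
    """Convert one source record to a common field name, unit and label set."""
    rule = SEMANTIC_RULES[record["source"]]
    raw = record.get(rule["field"])
    rating = record.get("rating")
    return {
        "entity": record["entity"],
        "debt_ratio": None if raw is None else raw / rule["divisor"],
        # Codes are decoded; values that are already labels pass through.
        "rating": RATING_CODES.get(rating, rating),
    }


scheme = TaskScheme(
    dimensions=["debt_ratio", "rating"],
    caliber="consolidated statements",
    target="credit risk assessment",
)

raw_records = [
    {"source": "internal_db", "entity": "A", "liab_ratio_pct": 45.0, "rating": "02"},
    {"source": "vendor_feed", "entity": "B", "debt_to_assets": 0.6, "rating": "AA"},
    {"source": "vendor_feed", "entity": "C", "rating": "01"},
]

# Data set to be evaluated: same field names and units regardless of source.
dataset = [align(r) for r in raw_records]
```

Missing measurements are kept as `None` rather than dropped, so that a later quality step can still count them when computing missing ratios.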
Optionally, the step of acquiring, through the mapping relation of the structured task scheme, original structured index data for executing the natural language task includes: reading the structured task scheme, and generating a data tag and a field requirement representing the analysis dimension, the business caliber and the expected target by retrieving a preset semantic dictionary or a metadata mapping table; and accessing, based on the data tag and the field requirement, an enterprise internal database and/or a metadata center corresponding to the data tag to acquire the original structured index data.

Optionally, the step of eliminating data-source heterogeneity in the original structured index data and outputting the data set to be evaluated from which semantic ambiguity has been removed includes: acquiring, from the original structured index data, measurement values representing the business attributes and the financial performance of a target analysis object; and performing standardized conversion and caliber alignment on the measurement values using semantic rules, and outputting the data set to be evaluated with semantic ambiguity removed.

Optionally, the step of acquiring data quality features characterizing the health condition of the data set to be evaluated includes: performing integrity detection on the data set to be evaluated to determine missing fields; and performing quality detection on the data to be evalu