
CN-122020557-A - Intelligent fusion analysis method for multi-source heterogeneous data


Abstract

The invention provides an intelligent fusion analysis method for multi-source heterogeneous data, relating to the technical field of industrial multi-source heterogeneous data processing. The method comprises five steps: data access and processing, dynamic target analysis and fusion demand generation, data fusion strategy generation, fusion and analysis execution, and fusion strategy optimization. While ensuring data quality, the invention can automatically and flexibly adjust the fusion strategy according to real-time data demands and the business target, reduce redundant computation, improve computational efficiency, and continuously optimize the fusion process through a real-time feedback mechanism, giving it high flexibility and extensibility.

Inventors

  • WANG WEI

Assignees

  • 江苏城乡建设职业学院 (Jiangsu Urban and Rural Construction Vocational College)

Dates

Publication Date
20260512
Application Date
20260206

Claims (8)

  1. An intelligent fusion analysis method for multi-source heterogeneous data, characterized by comprising the following steps:
     Step one, data access and processing: accessing various industrial data sources based on compatible industrial protocols, acquiring equipment industrial data in real time, analyzing unstructured data, preprocessing the data to obtain preprocessed data, and storing the preprocessed data in a layered storage architecture;
     Step two, dynamic target analysis and fusion demand generation: acquiring a business target input by a user, analyzing the business target with a natural language processing method and converting it into a structured data demand, mapping the structured data demand onto the data in the layered storage architecture to confirm the required data sources, and generating a fusion demand from the mapping result and the business target using a decision tree algorithm;
     Step three, data fusion strategy generation: according to the fusion demand of step two, combined with the preprocessed data and the historical strategy effect, dynamically generating an executable data fusion strategy using a reinforcement learning model;
     Step four, fusion and analysis execution: extracting target data from the layered storage architecture according to the fusion demand of step two, executing the data fusion strategy generated in step three to produce fused data oriented to the business target, inputting the fused data into a random forest learning model adapted to the business target for analysis, and outputting a structured analysis result and feature importance scores corresponding to the user's target;
     Step five, fusion strategy optimization: acquiring the user's judgment of the validity of the structured analysis result, generating a quantized value of the historical strategy effect by combining that judgment with the feature importance scores obtained in step four, and inputting the quantized value as a negative feedback sample to step three to complete the optimization of the fusion strategy.
  2. The intelligent fusion analysis method for multi-source heterogeneous data according to claim 1, wherein in step one the preprocessing comprises data desensitization and data cleaning, followed by the addition of metadata tags, the metadata tags comprising time tags, source tags and quality tags.
  3. The method of claim 1, wherein in step one the layered storage architecture comprises a distributed file layer for storing raw data, a time-series data layer for storing tagged sensor data, and a graph data layer for storing device topology.
  4. The method of claim 1, wherein in step two the structured data demand comprises key business entities, analysis indexes and analysis dimensions.
  5. The method of claim 1, wherein in step two the fusion demand comprises a data range, key features and a time window.
  6. The method of claim 1, wherein in step three the data fusion strategy comprises data association logic, a fusion operation sequence and a resource scheduling scheme.
  7. The intelligent fusion analysis method for multi-source heterogeneous data according to claim 1, wherein in step four the random forest learning model is a pre-trained machine learning model.
  8. The method of claim 1, wherein in step five the validity of the structured analysis result is classified as either valid or invalid.
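
Illustrative note (not part of the claims): the records named in claims 4 to 6 and the feedback value of step five can be sketched in Python as below. This is a minimal sketch under stated assumptions. The claims name the fields but not their types, so every type, every field encoding, and the helper name quantize_strategy_effect are hypothetical. The claims also do not say how the valid/invalid judgment and the feature importance scores are combined; the scoring rule here (share of total importance that falls on the requested key features, shifted below zero when the user marks the result invalid) is one assumed possibility only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StructuredDataDemand:
    """Claim 4: what the analyzed business target asks for."""
    key_business_entities: List[str]
    analysis_indexes: List[str]
    analysis_dimensions: List[str]


@dataclass
class FusionDemand:
    """Claim 5: which data the fusion step needs."""
    data_range: List[str]          # assumption: names of the required data sources
    key_features: List[str]
    time_window: Tuple[str, str]   # assumption: ISO-8601 start and end timestamps


@dataclass
class FusionStrategy:
    """Claim 6: how the fusion is to be executed."""
    association_logic: Dict[str, str]  # assumption: data source -> join key
    operation_sequence: List[str]
    resource_schedule: Dict[str, int]  # assumption: operation -> worker count


def quantize_strategy_effect(is_valid: bool,
                             feature_importance: Dict[str, float],
                             key_features: List[str]) -> float:
    """Assumed scoring rule, not taken from the claims.

    coverage is the share of total feature importance carried by the
    requested key features. A valid result scores coverage (0 to 1);
    an invalid result scores coverage - 1 (-1 to 0).
    """
    total = sum(feature_importance.values())
    if total <= 0:
        coverage = 0.0
    else:
        covered = sum(feature_importance.get(name, 0.0) for name in key_features)
        coverage = covered / total
    return coverage if is_valid else coverage - 1.0


# Hypothetical example values, chosen only to exercise the sketch.
demand = FusionDemand(
    data_range=["vibration_sensor", "maintenance_log"],
    key_features=["vibration_rms", "bearing_temp"],
    time_window=("2026-01-01T00:00:00", "2026-01-07T00:00:00"),
)
importance = {"vibration_rms": 0.5, "bearing_temp": 0.25, "ambient_humidity": 0.25}

valid_score = quantize_strategy_effect(True, importance, demand.key_features)
invalid_score = quantize_strategy_effect(False, importance, demand.key_features)
print(valid_score, invalid_score)
```

In this sketch the resulting number would be the value passed back to the strategy generator of step three. The binary valid/invalid input follows claim 8; a real system might weight the judgment and the importance scores differently.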

Description

Intelligent fusion analysis method for multi-source heterogeneous data

Technical Field

The invention relates to the technical field of industrial multi-source heterogeneous data processing, and in particular to an intelligent fusion analysis method for multi-source heterogeneous data.

Background

In modern industry, with the rapid development of technology and increasing production automation, the types and sources of industrial data are becoming increasingly diverse. Multi-source heterogeneous data refers to data that comes from multiple different sources and differs in structure, format and semantics; it is widely found in manufacturing, energy, transportation, petrochemical and other industries. Such data typically includes, but is not limited to, sensor data, device operation data, management system data and external data. As industrial automation deepens, enterprises are bringing more equipment and production links under digital and intelligent management, so efficiently processing, fusing and analyzing data of different sources and forms has become a key issue.

In past industrial data processing, conventional data fusion methods have mostly relied on manually defined data mapping and transformation rules: explicit conversion and matching rules are defined in advance, and data from different sources is then integrated into a unified format for analysis.
This approach is suitable when data types are relatively simple and sources are limited, but it falls short when faced with large-scale, multi-source heterogeneous data, mainly in the following respects:

(1) The traditional method requires a large amount of data preprocessing and standardization before fusion, adapts poorly to changes in data sources and the introduction of new data formats, and struggles to cope with changeable industrial scenarios;

(2) The traditional framework generally builds a static data structure from historical data, which limits the system's ability to process real-time data; in particular, dynamically adjusted data fusion demands often require complex manual intervention;

(3) Conventional data fusion generally follows a fuse-then-analyze process in which all data must be fully fused before analysis. With real-time data streams and large-scale data, this fixed flow makes computational redundancy and performance bottlenecks hard to avoid.

The invention therefore provides an intelligent fusion analysis method for multi-source heterogeneous data to solve these problems in the prior art.

Disclosure of Invention

In view of the above problems, the invention aims to provide an intelligent fusion analysis method for multi-source heterogeneous data that is flexible and intelligent and can solve the problems in the prior art.
To achieve this aim, the invention adopts the following technical scheme: an intelligent fusion analysis method for multi-source heterogeneous data, comprising the following steps:

Step one, data access and processing: accessing various industrial data sources based on compatible industrial protocols, acquiring equipment industrial data in real time, analyzing unstructured data, preprocessing the data to obtain preprocessed data, and storing the preprocessed data in a layered storage architecture;

Step two, dynamic target analysis and fusion demand generation: acquiring a business target input by a user, analyzing the business target with a natural language processing method and converting it into a structured data demand, mapping the structured data demand onto the data in the layered storage architecture to confirm the required data sources, and generating a fusion demand from the mapping result and the business target using a decision tree algorithm;

Step three, data fusion strategy generation: according to the fusion demand of step two, combined with the preprocessed data and the historical strategy effect, dynamically generating an executable data fusion strategy using a reinforcement learning model;

Step four, fusion and analysis execution: extracting target data from the layered storage architecture according to the fusion demand of step two, executing the data fusion strategy generated in step three to produce fused data oriented to the business target, inputting the fused data into a random forest learning model adapted to the business target for analysis, and outputting a structured analysis result and feature importance scores corresponding to the user's target;

Step five, fusion strategy optimization: acquiring the user's judgment of the validity of the structured analysis result, generating a q