CN-121979872-A - Clean energy multi-source heterogeneous data automatic arrangement method and device

Abstract

The application discloses a clean energy multi-source heterogeneous data automatic arrangement method and device, and relates to the technical field of data processing. The method comprises: performing sensitive data screening on core parameters of a power station based on preset field-level screening rules to obtain the corresponding screened power station parameters, and constructing an original data pool; constructing a multi-level tag association rule based on a preset subdivision field; adding tag information to the screened power station parameters of the original data pool based on a preset large model in combination with the corresponding multi-level tag association rule; and performing data abnormality judgment on the screened power station parameters of the original data pool based on the preset large model. The method addresses the defects of the prior art by arranging clean energy multi-source heterogeneous data according to preset rules, which is intended to make data arrangement convenient and fast and to meet practical data arrangement needs.
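The first step of the abstract (field-level screening followed by construction of the original data pool) can be illustrated with a minimal sketch. The rule actions `KEEP`, `MASK`, and `DROP`, the field names, and the default of dropping unknown fields are all assumptions made for illustration; the application only states that preset field-level screening rules are applied.

```python
from typing import Any, Dict, List

# Hypothetical rule actions; the source does not enumerate them.
KEEP, MASK, DROP = "keep", "mask", "drop"

# Assumed field-level screening rules: field name -> action.
# All field names here are illustrative, not taken from the source.
SCREENING_RULES: Dict[str, str] = {
    "station_name": KEEP,
    "installed_capacity_mw": KEEP,
    "gps_coordinates": DROP,
    "operator_contact": MASK,
}


def screen_record(record: Dict[str, Any], rules: Dict[str, str],
                  default: str = DROP) -> Dict[str, Any]:
    """Apply one rule per field; fields without a rule get the default action."""
    screened: Dict[str, Any] = {}
    for field, value in record.items():
        action = rules.get(field, default)
        if action == KEEP:
            screened[field] = value
        elif action == MASK:
            screened[field] = "***"  # placeholder for a desensitized value
        # DROP: the field is omitted from the screened record
    return screened


def build_data_pool(records: List[Dict[str, Any]],
                    rules: Dict[str, str]) -> List[Dict[str, Any]]:
    """The 'original data pool' here is simply the list of screened records."""
    return [screen_record(r, rules) for r in records]
```

Dropping fields that have no rule is a conservative choice for this sketch: a parameter that was never reviewed does not enter the pool by accident.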

Inventors

  • LI DONGFANG
  • SHENG HUAYING
  • ZHANG BAOPING
  • YU ZHANGTAO
  • WANG WENYONG
  • CHEN MINGXUAN
  • QIAN DESONG

Assignees

  • 三峡科技有限责任公司

Dates

Publication Date
20260505
Application Date
20260109

Claims (10)

  1. A clean energy multi-source heterogeneous data automatic arrangement method, characterized by comprising the following steps: based on a preset field-level screening rule, performing sensitive data screening on core parameters of a power station to obtain corresponding screened power station parameters, and constructing an original data pool; constructing a multi-level tag association rule based on a preset subdivision field; based on a preset large model and in combination with the corresponding multi-level tag association rule, adding tag information to the screened power station parameters of the original data pool; and based on the preset large model, performing data abnormality judgment on the screened power station parameters of the original data pool.
  2. The clean energy multi-source heterogeneous data automatic arrangement method according to claim 1, further comprising the following step: based on the preset large model and in combination with a preset data format requirement, performing data format adjustment on the screened power station parameters of the original data pool.
  3. The clean energy multi-source heterogeneous data automatic arrangement method according to claim 1, further comprising the following step: based on a preset classification dimension, performing dimension classification on the screened power station parameters of the original data pool that are in a normal data state.
  4. The clean energy multi-source heterogeneous data automatic arrangement method according to claim 1, wherein adding tag information to the screened power station parameters of the original data pool based on the preset large model in combination with the corresponding multi-level tag association rule comprises the following steps: based on the preset large model, identifying key parameters of the screened power station parameters of the original data pool; and adding tag information to the screened power station parameters of the original data pool based on those key parameters in combination with the corresponding multi-level tag association rule.
  5. The clean energy multi-source heterogeneous data automatic arrangement method according to claim 1, wherein performing data abnormality judgment on the screened power station parameters of the original data pool based on the preset large model comprises the following step: based on the preset large model and in combination with knowledge of each field, identifying abnormal data in the screened power station parameters of the original data pool, and generating a corresponding quality report.
  6. A clean energy multi-source heterogeneous data automatic arrangement device, the device comprising: a data pool construction module, used for performing sensitive data screening on core parameters of a power station based on a preset field-level screening rule, obtaining corresponding screened power station parameters, and constructing an original data pool; a tag rule construction module, used for constructing a multi-level tag association rule based on a preset subdivision field; a tag adding module, used for adding tag information to the screened power station parameters of the original data pool based on a preset large model in combination with the corresponding multi-level tag association rule; and an abnormality judgment module, used for performing data abnormality judgment on the screened power station parameters of the original data pool based on the preset large model.
  7. The clean energy multi-source heterogeneous data automatic arrangement device of claim 6, further comprising: a format adjustment module, used for performing data format adjustment on the screened power station parameters of the original data pool based on the preset large model in combination with a preset data format requirement.
  8. The clean energy multi-source heterogeneous data automatic arrangement device of claim 6, further comprising: a dimension classification module, used for performing dimension classification, based on a preset classification dimension, on the screened power station parameters of the original data pool that are in a normal data state.
  9. The clean energy multi-source heterogeneous data automatic arrangement device of claim 6, wherein: the tag adding module is further used for identifying key parameters of the screened power station parameters of the original data pool based on the preset large model; and the tag adding module is further used for adding tag information to the screened power station parameters of the original data pool based on those key parameters in combination with the corresponding multi-level tag association rule.
  10. The clean energy multi-source heterogeneous data automatic arrangement device of claim 6, wherein: the abnormality judgment module is further used for identifying abnormal data in the screened power station parameters of the original data pool based on the preset large model in combination with knowledge of each field, and generating a corresponding quality report.
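The tag adding and abnormality judgment steps of the claims can be sketched as follows, purely as an illustration. The `LargeModel` interface, the `StubModel` heuristics (numeric fields count as key parameters, negative values count as abnormal), the `domain` field, and the contents of `TAG_RULES` are hypothetical. The sketch also reads "subdivision field" as an energy sub-domain such as hydropower, which is an interpretation rather than something the claims state.

```python
from typing import Any, Dict, List, Protocol, Tuple

TagPath = Tuple[str, ...]

# Assumed multi-level tag association rules:
# sub-domain -> key parameter -> tag path (one element per tag level).
TAG_RULES: Dict[str, Dict[str, TagPath]] = {
    "hydropower": {"head_m": ("hydropower", "turbine", "head")},
    "pumped_storage": {"pump_power_mw": ("pumped_storage", "pump", "power")},
}


class LargeModel(Protocol):
    """Hypothetical interface standing in for the 'preset large model'."""
    def identify_key_parameters(self, record: Dict[str, Any]) -> List[str]: ...
    def find_anomalies(self, record: Dict[str, Any]) -> List[str]: ...


class StubModel:
    """Deterministic stand-in so the sketch runs without a real model."""
    def identify_key_parameters(self, record: Dict[str, Any]) -> List[str]:
        return [k for k, v in record.items() if isinstance(v, (int, float))]

    def find_anomalies(self, record: Dict[str, Any]) -> List[str]:
        return [f"{k} is negative" for k, v in record.items()
                if isinstance(v, (int, float)) and v < 0]


def add_tags(record: Dict[str, Any], model: LargeModel,
             rules: Dict[str, Dict[str, TagPath]]) -> Dict[str, Any]:
    """Identify key parameters, then map them to tag paths via the rules."""
    domain_rules = rules.get(record.get("domain"), {})
    tags = [domain_rules[k] for k in model.identify_key_parameters(record)
            if k in domain_rules]
    return {**record, "tags": tags}


def quality_report(pool: List[Dict[str, Any]], model: LargeModel) -> Dict[str, Any]:
    """Collect per-record anomaly messages into a simple report."""
    issues = {i: model.find_anomalies(r) for i, r in enumerate(pool)}
    issues = {i: msgs for i, msgs in issues.items() if msgs}
    return {"total": len(pool), "abnormal": len(issues), "issues": issues}
```

Keeping the model behind a small interface means the rule lookup and the report assembly can be exercised with the stub, independently of whichever model is eventually used.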

Description

Clean energy multi-source heterogeneous data automatic arrangement method and device

Technical Field

The application relates to the technical field of data processing, in particular to a clean energy multi-source heterogeneous data automatic arrangement method and device.

Background

At present, a rapid and effective means of arranging clean energy multi-source heterogeneous data is lacking, and the prior art has the following technical defects:

Poor field suitability: general data management technology has no dedicated tags for clean energy and cannot distinguish among hydropower, pumped storage, and new energy data; the cross-field data association error rate is high, and multi-dimensional data calling for result conversion cannot be supported.

Weak unstructured data processing: energy data processing technology covers only structured equipment operation data and still relies on manual analysis for unstructured data such as patent documents and project reports, so processing efficiency is low and the mass data processing requirement of result conversion cannot be met.

Insufficient technical semantic understanding: general large models identify terms in the energy field with low accuracy, so the deviation is large when unstructured data is converted into structured result data, and subsequent supply and demand matching cannot be supported.

Insufficient graded security protection: existing energy data security schemes mostly adopt a unified encryption strategy without graded desensitization, and do not meet the security requirements of internal and external data interaction during result conversion.

To meet these practical demands, a clean energy multi-source heterogeneous data automatic arrangement technology is provided.

Disclosure of Invention

In view of the defects in the prior art, the application aims to provide a clean energy multi-source heterogeneous data automatic arrangement method and device that arrange clean energy multi-source heterogeneous data according to preset rules, conveniently and quickly, and meet practical data arrangement needs.

To achieve the above purpose, the application adopts the following technical scheme:

In a first aspect, the application provides a clean energy multi-source heterogeneous data automatic arrangement method, which comprises the following steps:

  • based on a preset field-level screening rule, performing sensitive data screening on core parameters of a power station to obtain corresponding screened power station parameters, and constructing an original data pool;
  • constructing a multi-level tag association rule based on a preset subdivision field;
  • based on a preset large model and in combination with the corresponding multi-level tag association rule, adding tag information to the screened power station parameters of the original data pool;
  • based on the preset large model, performing data abnormality judgment on the screened power station parameters of the original data pool.

On the basis of the above technical scheme, the method further comprises the following step: based on the preset large model and in combination with a preset data format requirement, performing data format adjustment on the screened power station parameters of the original data pool.

On the basis of the above technical scheme, the method further comprises the following step: based on a preset classification dimension, performing dimension classification on the screened power station parameters of the original data pool that are in a normal data state.
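The two optional steps above (data format adjustment and dimension classification of normal-state parameters) might look like the following sketch. The application assigns format adjustment to the preset large model; the version here swaps in a deterministic rule table so that it is runnable, and that substitution is an assumption. The field names, the `FORMAT_REQUIREMENTS` table, and the `is_normal` predicate are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# Assumed format requirement: source field -> (canonical field, divisor).
# A rule table stands in for the large-model step described in the text.
FORMAT_REQUIREMENTS: Dict[str, Tuple[str, float]] = {
    "capacity_kw": ("capacity_mw", 1000.0),
    "capacity_mw": ("capacity_mw", 1.0),
}


def adjust_format(record: Dict[str, Any],
                  requirements: Dict[str, Tuple[str, float]]) -> Dict[str, Any]:
    """Rename fields to canonical names and convert their units."""
    adjusted: Dict[str, Any] = {}
    for field, value in record.items():
        if field in requirements:
            name, divisor = requirements[field]
            adjusted[name] = float(value) / divisor
        else:
            adjusted[field] = value
    return adjusted


def classify_by_dimension(pool: List[Dict[str, Any]], dimension: str,
                          is_normal: Callable[[Dict[str, Any]], bool]
                          ) -> Dict[Any, List[Dict[str, Any]]]:
    """Group only the records judged normal by the chosen dimension."""
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for record in pool:
        if is_normal(record):
            groups[record.get(dimension, "unknown")].append(record)
    return dict(groups)
```

Filtering on `is_normal` before grouping mirrors the wording of the step, in which only parameters in a normal data state are classified.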

On the basis of the above technical scheme, adding tag information to the screened power station parameters of the original data pool based on the preset large model in combination with the corresponding multi-level tag association rule comprises the following steps:

  • based on the preset large model, identifying key parameters of the screened power station parameters of the original data pool;
  • adding tag information to the screened power station parameters of the original data pool based on those key parameters in combination with the corresponding multi-level tag association rule.

On the basis of the above technical scheme, performing data abnormality judgment on the screened power station parameters of the original data pool based on the preset large model comprises the following step: based on the preset large model and in combination with knowledge of each field, identifying abnormal data in the screened power station parameters of the original data pool, and generating a corresponding quality report.

In a second aspect, the present application provides a clean energy multi-source heterogeneous data finishing devic