
CN-122019596-A - Enterprise data management method and system based on artificial intelligence


Abstract

The invention belongs to the technical field of enterprise data management and provides an enterprise data management method and system based on artificial intelligence. While incremental data and full data are being fused after enterprise big data acquisition, the method compares and analyzes the incremental data and the full data to identify whether a Cartesian product occurs. When a Cartesian product occurs during data fusion processing, the method evaluates the impact amplitude of the Cartesian product on the data fusion processing and on the fused data coincidence degree. Evaluating the impact amplitude helps ensure that key information is not missed during data fusion. When the impact amplitude is within a controllable range, the full data and the incremental data are integrated effectively, which gives the enterprise a complete data view, reduces the pressure that data fusion places on system resources, avoids system crashes or performance degradation caused by excessive data volume, and provides the enterprise with fast data query and processing capability.

Inventors

  • XUE FENG
  • HAN LING
  • CAI JIN
  • XU TONGTONG
  • GONG ZHIHAO

Assignees

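  • Jiangsu Sixiang Software Co., Ltd. (江苏四象软件有限公司)
  • Yangzhou Small and Medium-sized Enterprise Development Service Center (扬州市中小企业发展服务中心)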

Dates

Publication Date
2026-05-12
Application Date
2026-02-09

Claims (10)

  1. An enterprise data management method based on artificial intelligence, characterized by comprising the following steps: during data fusion processing of incremental data and full data after enterprise big data acquisition, comparing and analyzing the incremental data and the full data to identify whether a Cartesian product occurs; when a Cartesian product occurs during the data fusion processing, evaluating the impact amplitude of the Cartesian product on the fused data expansion degree and the fused data coincidence degree; when the Cartesian product has a high impact amplitude on the data fusion processing, acquiring incremental repeated data in real time, performing an incremental data acquisition repetition analysis on the Cartesian product according to the incremental repeated data, and judging whether the Cartesian product is caused by repeated incremental data acquisition; and when the Cartesian product is caused by repeated incremental data acquisition, analyzing the acquisition time intervals of the incremental coincidence data to obtain incremental coincidence acquisition interval values, obtaining a current incremental data acquisition interval adjustment amount from the incremental coincidence acquisition interval values, and adjusting and managing the incremental data acquisition interval.
  2. The enterprise data management method based on artificial intelligence according to claim 1, wherein the incremental data and the full data are compared and analyzed as follows: within the same business scenario or associated-entity scenario, setting a data fusion processing period, extracting the timestamps of the incremental data and of the full data acquired within the data fusion processing period, obtaining the time difference between the incremental data timestamp and the full data timestamp, taking the ratio of the time difference to the length of the data fusion processing period as a data timestamp difference, and averaging the data timestamp differences to obtain a data time-error analysis value; and, within the same business scenario or associated-entity scenario, extracting the incremental data corresponding to each of a plurality of pieces of full data, and averaging the incremental-to-full ratios of the pieces of full data to obtain an incremental-to-full analysis value.
  3. The enterprise data management method based on artificial intelligence according to claim 2, wherein the Cartesian product is recognized as follows: summing the data time-error analysis value and the incremental-to-full analysis value to obtain a Cartesian product recognition value, and, if the Cartesian product recognition value is greater than a Cartesian product recognition threshold, generating a Cartesian product signal.
  4. The enterprise data management method based on artificial intelligence according to claim 1, wherein the impact amplitude of the Cartesian product on the data fusion processing is evaluated as follows: within the same business scenario or associated-entity scenario, obtaining the actual data volume after fusion, subtracting the reasonable business data volume from it, and dividing the difference by the actual data volume after fusion to obtain a fused data expansion value; within the same business scenario or associated-entity scenario, extracting the core fields of each piece of fused actual data, comparing the core fields of the pieces of fused actual data for coincidence, and taking the fused actual data whose core fields coincide as repeated fused data; counting the repeated fused data and taking the proportion of the invalid data volume after fusion as a fused data coincidence value; and summing the fused data expansion value and the fused data coincidence value to obtain a Cartesian product impact value, and, if the Cartesian product impact value is greater than a Cartesian product impact threshold, generating a Cartesian product strong impact signal.
  5. The enterprise data management method based on artificial intelligence according to claim 1, wherein the incremental repeated data is obtained as follows: within the same business scenario or associated-entity scenario, extracting the core fields of the incremental data acquired in real time within the data fusion processing period as incremental analysis data; and comparing the core fields of any two pieces of incremental analysis data for coincidence, and marking the incremental analysis data whose core fields coincide as incremental coincidence data.
  6. The enterprise data management method based on artificial intelligence according to claim 5, wherein the incremental data acquisition repetition analysis of the Cartesian product is performed as follows: extracting the repetition count of the incremental coincidence data and, ordered by data acquisition time, constructing an incremental coincidence count curve with time on the X axis and the repetition count of the incremental coincidence data on the Y axis; extracting the invalid data count after fusion and, ordered by data acquisition time, constructing an invalid data count curve with time on the X axis and the invalid data count after fusion on the Y axis; taking the local curve between each pair of adjacent coordinate points on the incremental coincidence count curve as an incremental coincidence count analysis curve, thereby obtaining a plurality of incremental coincidence count analysis curves; taking the local curve between each pair of adjacent coordinate points on the invalid data count curve as an invalid data count analysis curve, thereby obtaining a plurality of invalid data count analysis curves; applying the slope formula to each incremental coincidence count analysis curve and each invalid data count analysis curve to obtain incremental coincidence count analysis slopes and invalid data count analysis slopes; calculating the ratio of the incremental coincidence count analysis slope to the invalid data count analysis slope along the time dimension to obtain local trend analysis values; and calculating the standard deviation of the local trend analysis values to obtain a quantity change correlation value.
  7. The enterprise data management method based on artificial intelligence according to claim 5, wherein whether the Cartesian product is caused by repeated incremental data acquisition is judged as follows: extracting the core fields of the incremental coincidence data and comparing them for coincidence with the core fields of the repeated fused data, taking the repeated fused data whose core fields coincide with those of the incremental coincidence data as coincidence-associated data, and dividing the amount of coincidence-associated data by the total amount of repeated fused data to obtain a field repeated data ratio; and calculating the ratio of the field repeated data ratio to the quantity change correlation value to obtain a repeated acquisition analysis value, and, if the repeated acquisition analysis value is greater than or equal to a repeated acquisition analysis threshold, judging that the Cartesian product is caused by repeated incremental data acquisition.
  8. The enterprise data management method based on artificial intelligence according to claim 1, wherein the incremental coincidence acquisition interval values are obtained as follows: extracting the incremental coincidence data within a plurality of data fusion processing periods, extracting the data acquisition node corresponding to each piece of incremental coincidence data within each data fusion processing period, sorting the data acquisition nodes chronologically within the data fusion processing period, and taking the acquisition time intervals between the data acquisition nodes of adjacent pieces of incremental coincidence data after sorting as the incremental coincidence acquisition interval values.
  9. The enterprise data management method based on artificial intelligence according to claim 8, wherein the current incremental data acquisition interval adjustment amount is obtained as follows: comparing all incremental coincidence acquisition interval values, selecting the maximum and the minimum incremental coincidence acquisition interval values, and averaging the two to obtain an incremental coincidence acquisition interval mean; and taking the absolute value of the difference between the incremental coincidence acquisition interval mean and the current incremental data acquisition interval value as the current incremental data acquisition interval adjustment amount.
  10. An enterprise data management system based on artificial intelligence, characterized by comprising the following modules: a Cartesian product recognition module, configured to compare and analyze incremental data and full data during data fusion processing of the incremental data and the full data after enterprise big data acquisition, and to identify whether a Cartesian product occurs; an impact amplitude evaluation module, configured to evaluate the impact amplitude of the Cartesian product on the fused data expansion degree and the fused data coincidence degree when a Cartesian product occurs during the data fusion processing; a Cartesian product cause analysis module, configured to acquire incremental repeated data in real time when the Cartesian product has a high impact amplitude on the data fusion processing, to perform an incremental data acquisition repetition analysis on the Cartesian product according to the incremental repeated data, and to judge whether the Cartesian product is caused by repeated incremental data acquisition; and an acquisition adjustment management module, configured to analyze the acquisition time intervals of the incremental coincidence data when the Cartesian product is caused by repeated incremental data acquisition, to obtain incremental coincidence acquisition interval values, to obtain a current incremental data acquisition interval adjustment amount from the incremental coincidence acquisition interval values, and to adjust and manage the incremental data acquisition interval.
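
The scoring steps recited in claims 2 to 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the function names, the sample inputs and both threshold values are hypothetical. The claims do not define the "incremental-to-full ratio", so it is assumed here to be the number of incremental records matched to each full record. The fused data coincidence value is likewise assumed to be the share of fused records whose core-field key occurs more than once.

```python
from collections import Counter
from statistics import mean


def cartesian_recognition_value(time_diffs, period_length, increments_per_full):
    """Claims 2-3: data time-error analysis value + incremental-to-full analysis value."""
    # Each timestamp difference is normalised by the fusion period length, then averaged.
    time_error = mean(d / period_length for d in time_diffs)
    # ASSUMPTION: the ratio is the count of incremental records per full record.
    incremental_to_full = mean(increments_per_full)
    return time_error + incremental_to_full


def cartesian_impact_value(fused_core_keys, reasonable_volume):
    """Claim 4: fused data expansion value + fused data coincidence value."""
    actual_volume = len(fused_core_keys)
    if actual_volume == 0:
        return 0.0  # nothing was fused, so there is nothing to measure
    expansion = (actual_volume - reasonable_volume) / actual_volume
    # ASSUMPTION: coincidence = share of fused records whose core key is repeated.
    counts = Counter(fused_core_keys)
    repeated = sum(n for n in counts.values() if n > 1)
    coincidence = repeated / actual_volume
    return expansion + coincidence


RECOGNITION_THRESHOLD = 1.5  # hypothetical value, not given in the source
IMPACT_THRESHOLD = 1.0       # hypothetical value, not given in the source

recognition = cartesian_recognition_value([30, 60, 90], 600, [1, 3, 2])
impact = cartesian_impact_value(["a", "a", "a", "b", "c", "c"], reasonable_volume=3)

cartesian_signal = recognition > RECOGNITION_THRESHOLD
strong_impact_signal = cartesian_signal and impact > IMPACT_THRESHOLD
```

Both values are plain sums of two components, as the claims state. Because the components are unweighted, the thresholds would have to be chosen with the typical range of each component in mind.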

Description

Enterprise data management method and system based on artificial intelligence

Technical Field

The invention belongs to the technical field of enterprise data management, and particularly relates to an enterprise data management method and system based on artificial intelligence.

Background

In enterprise big data management, data fusion processing is a core task. Data fusion aims to combine incremental data, which comes from different data sources and is acquired at different times, with full data in order to build a comprehensive, accurate and consistent enterprise data view. Full data is the complete data set of an enterprise at a particular point in time, while incremental data is the data newly acquired after that point in time. By fusing the incremental data with the full data, the enterprise can update its data view in time and reflect the latest business changes.

In practice, data fusion faces several pressing technical problems. First, when incremental data and full data are compared and analyzed, Cartesian products occur easily because of the diversity of data sources, inconsistent data acquisition times and complex data structures. A Cartesian product makes the data volume expand rapidly, so that the fused data far exceeds the expected size. This increases the data storage burden and seriously degrades system processing performance: the system may respond slowly or even crash, which in turn harms the real-time query and analysis of enterprise data. Second, when a Cartesian product occurs during data fusion, there is currently no effective means of accurately evaluating its impact amplitude on the data fusion process.
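
To make the expansion effect concrete, the sketch below joins a small full data set with incremental data that was collected more than once. It is a generic illustration of join fan-out, not code from the patent. The `fuse` helper, the field names and the sample rows are all hypothetical.

```python
from collections import defaultdict


def fuse(full_rows, incremental_rows, key):
    """Naive hash join of full and incremental rows on `key`."""
    by_key = defaultdict(list)
    for row in incremental_rows:
        by_key[row[key]].append(row)
    # Every full row pairs with every incremental row that shares its key,
    # so duplicates on both sides multiply instead of adding up.
    return [{**f, **i} for f in full_rows for i in by_key.get(f[key], [])]


full = [
    {"id": 1, "name": "Acme"},
    {"id": 1, "name": "Acme"},   # full record stored twice
    {"id": 2, "name": "Bolt"},
]
incremental = [
    {"id": 1, "amount": 10},
    {"id": 1, "amount": 10},     # same increment collected three times
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 5},
]

fused = fuse(full, incremental, "id")
expected_rows = len({row["id"] for row in full})  # one fused row per entity
print(len(fused), "fused rows for", expected_rows, "entities")
```

With two copies of the full record and three copies of the increment, the entity with `id` 1 alone yields six fused rows, so the join returns seven rows for two entities.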
If the impact amplitude cannot be accurately evaluated, it is difficult for the enterprise to judge whether data fusion can proceed successfully or whether key information has been missed. An excessively high impact amplitude may indicate a serious problem in the data fusion process, one that prevents the full data and the incremental data from being integrated effectively, so that the enterprise cannot obtain a complete and accurate data view and its data-based decisions suffer.

Third, an unreasonable acquisition interval may cause too much similar data to be acquired within a short period. This increases duplicate and redundant content in the data, raises system load, and wastes storage and network resources. At present, however, enterprises often lack an effective means of dynamically adjusting the incremental data acquisition interval according to actual conditions. As a result, data acquisition cannot be managed optimally, and it is difficult to improve data quality and system performance while keeping the data timely.

Therefore, the invention provides an enterprise data management method and system based on artificial intelligence.

Disclosure of Invention

To overcome the deficiencies of the prior art, the invention solves at least one of the technical problems presented in the background.
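
The acquisition interval adjustment that the background calls for is recited in claims 8 and 9. A minimal sketch of that calculation follows, with hypothetical function names and sample timestamps in seconds. It assumes the acquisition times of the incremental coincidence data within one data fusion processing period are already known.

```python
def coincidence_interval_values(acquisition_times):
    """Claim 8: gaps between chronologically adjacent coincidence acquisitions."""
    ordered = sorted(acquisition_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def interval_adjustment(interval_values, current_interval):
    """Claim 9: |mean of the largest and smallest gap - current interval|."""
    if not interval_values:
        raise ValueError("need at least two acquisition times to form an interval")
    interval_mean = (max(interval_values) + min(interval_values)) / 2
    return abs(interval_mean - current_interval)


# Hypothetical acquisition times of incremental coincidence data (seconds).
times = [50, 0, 110, 20]
intervals = coincidence_interval_values(times)
adjustment = interval_adjustment(intervals, current_interval=30)
```

The claims define only the magnitude of the adjustment. Whether the interval should be lengthened or shortened by that amount is not stated in this part of the text, so the sketch returns the absolute value and leaves the direction to the caller.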
The technical solution adopted to solve the technical problem is as follows: an enterprise data management method based on artificial intelligence, comprising: during data fusion processing of incremental data and full data after enterprise big data acquisition, comparing and analyzing the incremental data and the full data to identify whether a Cartesian product occurs; when a Cartesian product occurs during the data fusion processing, evaluating the impact amplitude of the Cartesian product on the fused data expansion degree and the fused data coincidence degree; when the Cartesian product has a high impact amplitude on the data fusion processing, acquiring incremental repeated data in real time, performing an incremental data acquisition repetition analysis on the Cartesian product according to the incremental repeated data, and judging whether the Cartesian product is caused by repeated incremental data acquisition; and when the Cartesian product is caused by repeated incremental data acquisition, analyzing the acquisition time intervals of the incremental coincidence data to obtain incremental coincidence acquisition interval values, obtaining a current incremental data acquisition interval adjustment amount from the incremental coincidence acquisition interval values, and adjusting and managing the incremental data acquisition interval. In a further scheme of the invention, the incremental data and the full data are compared and analyzed as follows: within the same business scenario or associated-entity scenario, setting a data fusion processing period, extracting the timestamps of the acquired incremental data and