CN-121996992-A - Cloud collaboration-based intelligent address data identification system

Abstract

The invention relates to the technical field of data processing and discloses an intelligent address data identification system based on cloud collaboration. The system acquires original address data from a plurality of terminals to form data streams; performs value decay simulation in the time dimension and semantic dispersion analysis in the content dimension on each terminal's data, generating a time sequence urgency characteristic value and a content credibility characteristic value respectively; maps the characteristic values to a unified space for correlation calibration to construct a dynamic collaborative coefficient; and performs differentiated asynchronous scheduling of the data streams at the cloud based on that coefficient, with high-coefficient data immediately triggering an update of the identification core and low-coefficient data being buffered and iteratively purified. The differentiated results are then integrated, density-based clustering identifies the mainstream change directions, the address analysis rule is adaptively corrected, and the optimized rule is applied to output structured addresses.
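The differentiated scheduling described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the names `Batch` and `Scheduler`, the cut-off value 0.7, and the trivial "purification" step (dropping blank records) are all assumptions, since the source does not specify them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Batch:
    terminal_id: str
    records: list
    coefficient: float  # dynamic collaborative coefficient, assumed to lie in [0, 1]


@dataclass
class Scheduler:
    high_threshold: float = 0.7  # hypothetical cut-off, not given in the source
    buffer: deque = field(default_factory=deque)
    updates: list = field(default_factory=list)

    def submit(self, batch: Batch) -> str:
        """Route a batch: immediate core update or buffered purification."""
        if batch.coefficient >= self.high_threshold:
            # Stand-in for triggering the cloud identification core update.
            self.updates.append(batch.terminal_id)
            return "update"
        self.buffer.append(batch)
        return "buffer"

    def purify_round(self) -> int:
        """One purification round: drop blank records from buffered batches."""
        for batch in self.buffer:
            batch.records = [r for r in batch.records if r and r.strip()]
        return len(self.buffer)
```

In this sketch a batch with coefficient 0.9 is applied at once, while a batch with coefficient 0.2 waits in the buffer and is cleaned on each call to `purify_round`.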

Inventors

  • LV SHANGPING
  • CHEN XIANGHUI

Assignees

  • 福建五色神牛网络科技有限公司

Dates

Publication Date
20260508
Application Date
20260410

Claims (9)

  1. A cloud collaboration-based intelligent address data identification system, characterized by comprising: a multi-source heterogeneous data acquisition module, used for acquiring an original address data set in real time from a plurality of independently operating terminals, wherein each terminal collects address information containing unstructured text or images at its respective physical location to form an initial data stream; a two-dimensional characteristic quantitative analysis module, used for performing continuous value decay simulation on the time stamp sequence of the initial data stream generated by each terminal, estimating its current effective information density and generating a time sequence urgency characteristic value, and for performing semantic dispersion analysis in the content dimension to generate a content credibility characteristic value; a dynamic collaborative coefficient construction module, which maps the time sequence urgency characteristic value and the content credibility characteristic value to the same measurement space, performs normalization and correlation calibration, and generates a set of dynamic collaborative coefficients uniquely corresponding to each terminal; an asynchronous weighted collaborative scheduling module, used for performing differentiated scheduling at the cloud on the continuously inflowing terminal address data according to the dynamic collaborative coefficient, immediately triggering the update flow of the cloud identification core for terminal data with a high dynamic collaborative coefficient, temporarily storing terminal data with a low dynamic collaborative coefficient in a buffer queue for multiple rounds of iterative purification, integrating the differentiated processing results of different terminals asynchronously, and correcting and optimizing the address element analysis rule; and a structured address output module, used for applying the stable analysis rule formed by the asynchronous weighted collaborative processing to original address data that flows in in real time or is backlogged from history, and outputting structured address entries in a uniform format.
  2. The cloud collaboration-based intelligent address data identification system as claimed in claim 1, wherein the time sequence urgency characteristic value is generated as follows: performing continuous decay mapping on the time stamp sequence to generate an information value curve representing the decay of each data point over time; based on a preset information validity threshold, performing dynamic window interception on the information value curve and retaining only the curve segments that exceed the information validity threshold, to form an effective information window; and calculating the area enclosed by the curve and the time axis within the effective information window, and quantizing the value of that area into the time sequence urgency characteristic value.
  3. The cloud collaboration-based intelligent address data identification system as claimed in claim 1, wherein the content credibility characteristic value is generated as follows: decomposing the unstructured address text in the initial data stream according to a preset address element analysis rule and counting the occurrence frequency of each address element, to generate the address element distribution of the corresponding terminal; comparing the address element distribution layer by layer with a pre-stored standard address element mapping table, and calculating the matching completeness and logical conflict degree of the elements at each layer; and synthesizing, from the matching completeness and the logical conflict degree and by a predefined dispersion quantization method, a numerical value representing the overall degree of deviation of the corresponding terminal's data from the standard reference, and taking that value as the content credibility characteristic value.
  4. The cloud collaboration-based intelligent address data identification system as claimed in claim 1, wherein the dynamic collaborative coefficient is calculated as follows: constructing a two-dimensional interaction matrix of the time sequence urgency characteristic value and the content credibility characteristic value; performing spectral analysis on the two-dimensional interaction matrix and extracting its principal eigenvector; projecting the two-dimensional coordinate formed by the time sequence urgency characteristic value and the content credibility characteristic value onto the direction of the principal eigenvector; and quantizing the projected length, through a preset scaling, into the dynamic collaborative coefficient of the corresponding terminal.
  5. The cloud collaboration-based intelligent address data identification system as claimed in claim 4, wherein performing spectral analysis on the two-dimensional interaction matrix and extracting its principal eigenvector specifically comprises: symmetrizing and centering the two-dimensional interaction matrix to obtain a standardized real symmetric matrix; applying an orthogonal-direction iterative approximation method to the real symmetric matrix and obtaining, through repeated iteration, a stable vector direction that satisfies a preset convergence condition; calculating the Rayleigh quotient of the stable vector direction with respect to the real symmetric matrix, and using the Rayleigh quotient as the basis for verifying final convergence; and determining the unit vector of the verified stable vector direction as the principal eigenvector.
  6. The cloud collaboration-based intelligent address data identification system according to claim 1, wherein integrating the differentiated processing results of different terminals asynchronously and correcting and optimizing the address element analysis rule specifically comprises: extracting the latest address element analysis results from the immediately triggered update flow and from the buffer queue of multi-round iterative purification respectively, and comparing them with the analysis rule currently in use to generate a difference vector cluster; performing density-based spatial clustering on the difference vector cluster to identify a mainstream change direction set; performing logical fusion and conflict resolution between the mainstream change direction set and the current analysis rule to form a rule update draft; and testing the rule update draft on an independent historical address verification set, and adopting the rule update draft only when the improvement in analysis accuracy exceeds a preset threshold, thereby completing the formal correction and optimization of the address element analysis rule.
  7. The cloud collaboration-based intelligent address data identification system according to claim 6, wherein performing density-based spatial clustering on the difference vector cluster to identify the mainstream change direction set specifically comprises: for each vector in the difference vector cluster, counting the other vectors that lie within a preset radius of it to form the local density value of that vector, and constructing a local density distribution map of the whole vector space; according to a preset density reference, selecting from the local density distribution map the vectors whose local density value is higher than the density reference to form a core vector set; traversing the core vector set, establishing a connection between any two core vectors that satisfy the spatial adjacency relation, and merging all core vectors connected directly or indirectly into an independent core vector group; and combining all vector directions within each core vector group to generate an aggregate vector representing the mainstream change direction of that group, the set of aggregate vectors being the mainstream change direction set.
  8. The cloud collaboration-based intelligent address data identification system according to claim 1, wherein outputting the structured address entries in a uniform format specifically comprises: inputting the original address data into a hierarchical decision tree constructed according to the stable analysis rule, judging and marking the type and logical level of each address element layer by layer from top to bottom, and generating a temporary address tuple with level marks; performing combined verification on the address elements in the temporary address tuple, and identifying and correcting element conflicts or missing levels in the temporary address tuple according to predefined region constraint rules and logical completeness rules, to form a verified intermediate address tuple; assembling the elements of each level in the intermediate address tuple with separators in a fixed order according to a preset unified output specification; and performing a final consistency check on the assembled address string to ensure that it conforms to a preset complete address pattern, and outputting it as a structured address entry in the uniform format.
  9. The cloud collaboration-based intelligent address data identification system according to claim 8, wherein performing the final consistency check on the assembled address string specifically comprises: matching the address string against a preset address character pattern library, and checking whether the character type sequence, segment lengths and separator positions conform to a standard pattern; verifying, based on a pre-constructed address element spatial relation map, whether valid spatial containment or adjacency relations exist between address elements of different levels in the address string; performing an association check between the address string and the time stamp information in the original address data to ensure that the form of the address string is consistent with the temporal and spatial logic of its collection; and outputting the address string as a structured address entry in the uniform format if and only if it passes all of the foregoing checks.
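The three steps of claim 2 (decay mapping, threshold windowing, area under the curve) can be illustrated with a minimal Python sketch. The exponential decay form, the summation of per-point values into a single curve, the fixed sampling step and the function name `time_urgency` are assumptions made for illustration; the claims do not fix any of them.

```python
import math


def time_urgency(timestamps, now, decay_rate=0.1, threshold=0.5, step=1.0):
    """Area of the above-threshold part of a summed exponential-decay curve."""
    if not timestamps:
        return 0.0
    start = min(timestamps)
    n = int((now - start) / step)
    grid = [start + k * step for k in range(n + 1)]

    def value(t):
        # Assumed decay model: each data point contributes exp(-rate * age)
        # from the moment it was collected.
        return sum(math.exp(-decay_rate * (t - ts)) for ts in timestamps if ts <= t)

    curve = [value(t) for t in grid]
    area = 0.0
    for k in range(n):
        # Effective information window: keep only grid cells whose two
        # end points both exceed the validity threshold.
        if curve[k] > threshold and curve[k + 1] > threshold:
            area += 0.5 * (curve[k] + curve[k + 1]) * step  # trapezoid rule
    return area
```

Under these assumptions, a terminal whose recent data points keep the curve above the threshold for longer receives a larger urgency value than one whose data has mostly decayed.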
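Claims 4 and 5 describe extracting a principal eigenvector by iteration, checking it with the Rayleigh quotient, and projecting the two characteristic values onto it. The sketch below interprets the "orthogonal-direction iterative approximation method" as plain power iteration, which is an assumption; the centering step is omitted because the source does not define it, and the interaction matrix is taken as an input because its construction is not specified. The function names are hypothetical, and a nonzero matrix is assumed.

```python
import math


def principal_direction(matrix, tol=1e-10, max_iter=1000):
    """Power iteration on the symmetrized matrix.

    Returns (unit vector, Rayleigh quotient).
    """
    n = len(matrix)
    # Symmetrize: S = (M + M^T) / 2, so all eigenvalues are real.
    sym = [[0.5 * (matrix[i][j] + matrix[j][i]) for j in range(n)] for i in range(n)]

    def matvec(v):
        return [sum(sym[i][j] * v[j] for j in range(n)) for i in range(n)]

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    v = normalize([1.0 + 0.1 * i for i in range(n)])  # arbitrary starting vector
    for _ in range(max_iter):
        w = normalize(matvec(v))
        # Converged when the direction stops changing (up to sign).
        converged = 1.0 - abs(sum(a * b for a, b in zip(v, w))) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient v^T S v (v has unit length), used as the final check.
    rayleigh = sum(a * b for a, b in zip(v, matvec(v)))
    return v, rayleigh


def collaborative_coefficient(urgency, credibility, direction, scale=1.0):
    """Scaled length of the projection of (urgency, credibility) onto direction."""
    return scale * abs(urgency * direction[0] + credibility * direction[1])
```

For a converged direction the Rayleigh quotient equals the corresponding eigenvalue of the symmetrized matrix, which is why it can serve as a convergence check.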
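The density-based grouping of claim 7 can likewise be sketched in Python. The sketch follows the claimed steps (local density, core vector set, merging of connected core vectors, one aggregate vector per group), but the Euclidean distance, the use of the component-wise mean as the aggregate vector, the default parameter values and the name `mainstream_directions` are assumptions for illustration.

```python
import math


def mainstream_directions(vectors, radius=0.5, density_reference=1):
    """Group dense difference vectors and return one mean vector per group."""
    n = len(vectors)

    def close(i, j):
        return math.dist(vectors[i], vectors[j]) <= radius

    # Local density: number of other vectors within the preset radius.
    density = [sum(1 for j in range(n) if j != i and close(i, j)) for i in range(n)]
    # Core vectors: local density strictly higher than the density reference.
    core = [i for i in range(n) if density[i] > density_reference]

    # Merge core vectors that are directly or indirectly adjacent.
    groups, seen = [], set()
    for i in core:
        if i in seen:
            continue
        stack, group = [i], []
        seen.add(i)
        while stack:
            k = stack.pop()
            group.append(k)
            for j in core:
                if j not in seen and close(k, j):
                    seen.add(j)
                    stack.append(j)
        groups.append(group)

    # Aggregate each group into its component-wise mean (assumed aggregation).
    dim = len(vectors[0]) if vectors else 0
    return [
        [sum(vectors[k][d] for k in group) / len(group) for d in range(dim)]
        for group in groups
    ]
```

Isolated difference vectors never become core vectors in this sketch, so they do not contribute to any mainstream change direction.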

Description

Cloud collaboration-based intelligent address data identification system

Technical Field

The invention relates to the technical field of data processing, and in particular to an intelligent address data identification system based on cloud collaboration.

Background

In fields such as smart cities, modern logistics, emergency command and population management, fast and accurate intelligent identification and structuring of address data is a key technology supporting efficient business operation. Traditional address identification methods mostly rely on a single-point rule base or on a model trained with limited samples, and struggle to cope with the diversity, regional differences and dynamic variability of real-world address expressions. With the development of the Internet of Things and the mobile internet, address data has become multi-source, heterogeneous, massive and generated in real time, which has pushed the industry to adopt collaborative processing technologies based on cloud computing platforms, in an attempt to integrate scattered data and computing power and thereby improve recognition accuracy and coverage.

The prior art has the following defect: existing federated learning-based collaborative address identification methods cannot reconcile the core technical conflict between real-time performance and robustness in a dynamic, open environment.
Specifically, in scenarios such as emergency response and real-time logistics, data from multi-source terminals changes rapidly and varies widely in quality. Traditional methods adopt a synchronous federated learning framework with a fixed period, so model updates lag seriously behind changes in the field (poor real-time performance); and because all participants are aggregated with equal weight in each update, the model is easily contaminated by large volumes of low-quality data (low robustness).

Disclosure of Invention

The invention aims to provide an intelligent address data identification system based on cloud collaboration, so as to solve the problems described in the background.

The aim of the invention can be achieved by the following technical scheme: a cloud collaboration-based intelligent address data identification system, comprising: a multi-source heterogeneous data acquisition module, used for acquiring an original address data set in real time from a plurality of independently operating terminals, wherein each terminal collects address information containing unstructured text or images at its respective physical location to form an initial data stream; a two-dimensional characteristic quantitative analysis module, used for performing continuous value decay simulation on the time stamp sequence of the initial data stream generated by each terminal, estimating its current effective information density and generating a time sequence urgency characteristic value, and for performing semantic dispersion analysis in the content dimension to generate a content credibility characteristic value; a dynamic collaborative coefficient construction module, which maps the time sequence urgency characteristic value and the content credibility characteristic value to the same measurement space, performs normalization and correlation calibration, and generates a set of dynamic collaborative coefficients uniquely corresponding to each terminal; an asynchronous weighted collaborative scheduling module, used for performing differentiated scheduling at the cloud on the continuously inflowing terminal address data according to the dynamic collaborative coefficient, immediately triggering the update flow of the cloud identification core for terminal data with a high dynamic collaborative coefficient, temporarily storing terminal data with a low dynamic collaborative coefficient in a buffer queue for multiple rounds of iterative purification, integrating the differentiated processing results of different terminals asynchronously, and correcting and optimizing the address element analysis rule; and a structured address output module, used for applying the stable analysis rule formed by the asynchronous weighted collaborative processing to original address data that flows in in real time or is backlogged from history, and outputting structured address entries in a uniform format.

Further, the time sequence urgency characteristic value is generated by the following steps: performing continuous decay mapping on the time stamp sequence to generate an information value curve representin