CN-122019617-A - Method for constructing public security industry data asset catalogue

Abstract

The invention belongs to the technical field of data asset management and discloses a method for constructing a public security industry data asset catalogue. The method first acquires data from the business system databases of a public security institution and stores it in MaxCompute, then preprocesses the data, and finally creates a core directory table whose data the directory application calls and presents as a hierarchical tree. To address the disordered data management and low utilization efficiency in the current public safety industry, the method builds an associable public safety data asset catalogue through data acquisition, preprocessing, classification, and association relation construction. This solves the problem that, because data is stored in scattered locations and in differing formats, public safety personnel struggle to find the data they need quickly and accurately during case handling and daily work, which keeps data utilization efficiency low.

Inventors

  • QIAO YANG
  • ZHANG XIAOBO
  • YAN ZICHENG
  • CHANG XUECHENG
  • ZHAO XUEYI
  • LI XIN
  • LIU JINGJING
  • Shi Xingnan
  • Bai Menxin
  • Dai Xingkui

Assignees

  • 中电科联海创智信息科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2025-12-22

Claims (8)

  1. A method for constructing a public safety industry data asset directory, comprising the following steps: S1, data acquisition and access: defining connection parameters of a JDBC interface, then connecting through the JDBC interface to the MySQL, Oracle and KingBase databases of the business systems of a public safety institution to acquire the structured, semi-structured and unstructured data in the databases, and storing the data in MaxCompute; S2, data preprocessing and standardization: performing de-duplication, missing value processing, error correction and format standardization on the collected structured, semi-structured and unstructured data using UDF functions in MaxCompute; S3, data classification and identification: establishing a classification system based on public safety service characteristics, and adding classification labels to the data by rule-based marking and intelligent marking; S4, data association relation construction and mapping: connecting associated data in different data tables based on primary/foreign key association or rule association to generate association relation tables, and importing the association relation tables into GDB to generate a data association map; and S5, creating a data asset main table and a data relation table in MaxCompute, which respectively store the core information of the preprocessed data and all the association relation tables generated in S4, and then calling the data in the data asset main table and the data relation table through a RESTful API interface in the directory application, so as to present the data in the form of a hierarchical tree.
  2. The method of claim 1, wherein in S1, the connection parameters of the JDBC interface include user, password, useSSL, autoReconnect, useUnicode, serverTimezone, characterEncoding, cachePrepStmts, rewriteBatchedStatements, prepStmtCacheSize, prepStmtCacheSqlLimit, useServerPrepStmts, allowMultiQueries, and useJDBCCompliantTimezoneShift.
  3. The method for constructing a public safety industry data asset directory of claim 1, wherein in S2, the data de-duplication is performed by numbering repeated records in the structured data using a window function and retaining only the record numbered "1" in each group of repeated records.
  4. The method for constructing a public safety industry data asset directory of claim 1, wherein in S2, missing values are processed by marking or filling records of the structured, semi-structured and unstructured data whose key fields are NULL, using conditional expressions or the COALESCE() function.
  5. The method for constructing a public safety industry data asset directory of claim 1, wherein in S2, the data error correction is performed by verifying the structured, semi-structured and unstructured data using a regular expression function and extracting the data that meets the specifications.
  6. The method for constructing a public safety industry data asset directory of claim 1, wherein in S2, the data standardization comprises format unification and code value conversion, wherein the format unification converts dates in the extracted data uniformly into YYYY-MM-DD format and times uniformly into HH:MM:SS format, and unifies the precision and units of monetary values, and the code value conversion uses a JOIN to convert codes in the source system data uniformly into standard business terms.
  7. The method of claim 1, wherein in S4, the output fields of the association relation table include source_data_id, relationship_type, and target_data_id.
  8. The method for constructing a public safety industry data asset directory of claim 1, wherein in S5, the core information includes asset_id, asset_name, data_category, storage_path, confidentiality_level, and owner.
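
The S2 preprocessing described in claims 3 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: Python's built-in sqlite3 module stands in for MaxCompute SQL (window functions require SQLite 3.25 or later), and the table raw_person, its columns, the ID pattern, and the 'UNKNOWN' placeholder are hypothetical names chosen for the example. A Python function registered with SQLite plays the role of a UDF for the regular expression check.

```python
import re
import sqlite3

# Hypothetical raw records: (record_id, id_number, name, reg_date, version).
RAW_ROWS = [
    (1, "A1234567", "Zhang", "2025/01/05", 1),
    (2, "A1234567", "Zhang", "2025/01/05", 2),  # repeated id_number
    (3, "B7654321", None, "2025-02-10", 1),     # missing name
    (4, "bad-id", "Li", "2025-03-01", 1),       # fails the format check
]

# Assumed ID format, for illustration only.
ID_PATTERN = re.compile(r"[A-Z]\d{7}")


def is_valid_id(value):
    """Return 1 when value fully matches the assumed ID format, else 0."""
    return int(value is not None and ID_PATTERN.fullmatch(value) is not None)


def preprocess(rows):
    """De-duplicate, fill missing names, validate IDs, and unify dates."""
    conn = sqlite3.connect(":memory:")
    # Register the Python check so SQL can call it, similar in spirit to a UDF.
    conn.create_function("is_valid_id", 1, is_valid_id)
    conn.execute(
        "CREATE TABLE raw_person (record_id INTEGER, id_number TEXT, "
        "name TEXT, reg_date TEXT, version INTEGER)"
    )
    conn.executemany("INSERT INTO raw_person VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT id_number,
               COALESCE(name, 'UNKNOWN') AS name,        -- missing value fill
               REPLACE(reg_date, '/', '-') AS reg_date   -- date separator unification
        FROM (
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY id_number ORDER BY version DESC
            ) AS rn                                      -- number repeated records
            FROM raw_person
        )
        WHERE rn = 1                                     -- keep record numbered 1
          AND is_valid_id(id_number) = 1                 -- regex verification
        ORDER BY id_number
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Calling preprocess(RAW_ROWS) keeps one row per id_number (the highest version in this sketch), drops the row whose ID fails the pattern, and returns dates with a uniform separator. Which repeated record counts as number 1 depends on the ORDER BY inside the window, which the claims do not specify.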

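The hierarchical tree presentation of S5 can be sketched in a similar way, using the field names listed in claims 7 and 8. The sample rows, the slash-separated data_category path, and the helper names build_tree and render are assumptions made for illustration; the claims do not state how categories are nested or how the directory application renders them.

```python
from collections import defaultdict

# Hypothetical rows of the data asset main table (field names from claim 8).
ASSETS = [
    {"asset_id": "a1", "asset_name": "population_register",
     "data_category": "household/identity", "storage_path": "proj.t_pop",
     "confidentiality_level": "internal", "owner": "household_dept"},
    {"asset_id": "a2", "asset_name": "case_basic_info",
     "data_category": "criminal/case", "storage_path": "proj.t_case",
     "confidentiality_level": "restricted", "owner": "criminal_dept"},
]

# Hypothetical rows of the association relation table (field names from claim 7).
RELATIONS = [
    {"source_data_id": "a2", "relationship_type": "involves_person",
     "target_data_id": "a1"},
]


def build_tree(assets, relations):
    """Nest assets under their category path and attach outgoing relations."""
    related = defaultdict(list)
    for rel in relations:
        related[rel["source_data_id"]].append(
            (rel["relationship_type"], rel["target_data_id"])
        )
    root = {"name": "catalog", "children": {}, "assets": []}
    for asset in assets:
        node = root
        # Walk or create one tree level per category segment.
        for part in asset["data_category"].split("/"):
            node = node["children"].setdefault(
                part, {"name": part, "children": {}, "assets": []}
            )
        node["assets"].append(
            {**asset, "related": related.get(asset["asset_id"], [])}
        )
    return root


def render(node, depth=0):
    """Return the tree as indented text lines, two spaces per level."""
    lines = ["  " * depth + node["name"]]
    for asset in node["assets"]:
        lines.append("  " * (depth + 1) + "- " + asset["asset_name"])
    for child in node["children"].values():
        lines.extend(render(child, depth + 1))
    return lines
```

For example, print("\n".join(render(build_tree(ASSETS, RELATIONS)))) prints the category levels with each asset listed under its leaf category. In the described method this data would be served through the RESTful API rather than built in memory.
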
Description

Method for constructing public security industry data asset catalogue

Technical Field

The invention relates to the technical field of data asset management, in particular to a method for constructing a public security industry data asset directory.

Background

With the development of the social economy and the improvement of living standards, people's demands for safety keep rising. Public safety authorities need to collect and analyze more data to maintain social order and to prevent and combat abnormal behavior. For example, traffic management requires collecting large amounts of vehicle travel data and driver information, and security management requires collecting data such as people's movement tracks and abnormal behavior. This data covers not only the structured data generated by the business systems within public security, such as population registration information, basic case information, and traffic violation records, but also unstructured data, such as scanned case files, on-site surveillance videos, and law enforcement recorder footage, as well as semi-structured data from external channels such as the Internet and government collaboration platforms, for example public opinion information and cross-department shared ledgers. Collecting and analyzing this data helps public security departments discover and deal with potential safety hazards in time, thereby meeting society's demand for safety.
However, for lack of unified management specifications, this data is scattered across the independent business systems of departments such as criminal investigation, public security, transportation, and household registration, and its formats are diverse, including standard database formats, Excel tables, PDF documents, and streaming media files. Because the data is multi-source, multi-modal, and scattered, public safety industry data is used inefficiently: in actual case handling and daily work, public safety personnel often have to search for the required data system by system. For example, when handling a cross-regional fraud case, personnel must log in to the household registration system to query the suspect's identity information, retrieve related case records from the criminal investigation system, check travel tracks in the transportation system, and obtain fund information from the financial collaboration platform. This takes hours or even days, and key information may be missed because of system permission limits or incompatible data formats. As a result, much of the data's value cannot be released through rapid integration and accurate retrieval, which in turn constrains the response efficiency of public security work. Early on, data and the asset directory structure were generally organized by hand in Excel tables. Building a data asset directory through manual organization alone, however, takes a great deal of labor and time just to collect and sort the basic information about the data; as the data volume grows, staff need weeks or even months to finish the work, which is time-consuming and highly error-prone.
Therefore, a method for constructing a public safety industry data asset directory is needed to solve the problem that, because data is stored in scattered locations and in differing formats, public safety personnel cannot quickly and accurately find the required data during case handling and daily work, and data utilization efficiency is low.

Disclosure of Invention

The invention aims to provide a method for constructing a public safety industry data asset catalogue that solves the problem that, because data is stored in scattered locations and in differing formats, public safety personnel struggle to find the required data quickly and accurately during case handling and daily work, and data utilization efficiency is low.

To achieve this purpose, the basic scheme provided by the invention is a method for constructing a public security industry data asset directory, comprising the following steps: S1, data acquisition and access: defining connection parameters of a JDBC interface, then connecting through the JDBC interface to the MySQL, Oracle and KingBase databases of the public safety business systems to acquire the structured, semi-structured and unstructured data in the databases, and storing the data in MaxCompute; S2, data preprocessing and standardization: performing de-duplication, missing value p