CN-112907065-B - Data processing method and device for wind control system and electronic equipment
Abstract
The application provides a data processing method and device for a wind control (risk control) system, and electronic equipment. In the method, first data about a plurality of wind control objects, provided by different target data sources, are obtained. The data type of the first data is then judged, and the first data are quantized according to that data type to determine second data, yielding target data sets. Finally, the similarity between each target data set and a preset data set is determined according to a preset dimension reduction algorithm, so that the wind control system determines source pasting data from the target data sources according to each similarity. Redundancy in the data provided by each target data source is thereby identified on the basis of the similarity, and the source pasting data required by the wind control system are determined. This reduces the volume of mass data the wind control system must process when implementing a wind control strategy, safeguards the effect and timeliness of wind control, lowers the maintenance cost of the wind control system, and helps the wind control system accumulate effective data so that a data sharing mechanism can be formed.
Inventors
- YAN CHAO
- ZHAO HUANSHENG
Assignees
- 深圳前海微众银行股份有限公司
Dates
- Publication Date: 20260505
- Application Date: 20210210
Claims (10)
- 1. A data processing method for a wind control system, comprising: acquiring first data about a plurality of wind control objects provided by different target data sources, wherein the first data are used for representing preset characteristic information of the wind control objects; judging the data type of the first data to determine second data according to the data type, so as to obtain a target data set, wherein the second data are used for representing the first data after quantization, and the data type comprises a numerical type and a non-numerical type; wherein the judging the data type of the first data to determine second data according to the data type comprises: determining the first data as the second data if the data type of the first data is the numerical type; processing the first data according to a preset word segmentation algorithm to obtain the second data if the data type of the first data is the non-numerical type and the first data of the non-numerical type are represented by non-coded words; and processing the first data according to a preset quantization algorithm to obtain the second data if the data type of the first data is the non-numerical type and the first data of the non-numerical type are represented by characters or coded words, wherein the preset quantization algorithm comprises the preset coding rule; determining the similarity between each target data set and a preset data set according to a preset dimension reduction algorithm, so that the wind control system determines source pasting data according to the similarity, wherein the preset data set comprises data of a preset target dimension indication; wherein, before the similarity between each target data set and the preset data set is determined according to the preset dimension reduction algorithm, the method further comprises: improving a preset random neighborhood embedding algorithm, so that the variance of the normal distribution centered on sample data in the preset random neighborhood embedding algorithm is adjusted to the variance of the corresponding data set, thereby obtaining the preset dimension reduction algorithm, wherein the corresponding data sets comprise each target data set and the preset data set; and wherein the method comprises: obtaining a first expected value of each target data set and a second expected value of the preset data set; obtaining a first standard deviation of each target data set and a second standard deviation of the preset data set; and determining distance data between each first standard deviation and each second standard deviation according to the preset dimension reduction algorithm, each first expected value, each first standard deviation, each second expected value and each second standard deviation, so as to determine each distance data as the similarity between the corresponding target data set and the preset data set.
- 2. The data processing method for a wind control system according to claim 1, wherein after determining the similarity between each target data set and the preset data set according to the preset dimension reduction algorithm, the method further comprises: classifying the first data provided by the target data source according to the similarity and a preset service requirement, generating different data areas according to the classified first data and preset logic rules, and determining the use authority and the use range of the first data through the different data areas; and/or marking the target data sources according to the similarity and a preset marking rule, so as to record each target data source in a standardized way; and/or determining, according to the similarity, the source pasting data meeting a preset service scene from the first data provided by the target data source.
- 3. The data processing method for a wind control system according to claim 1, wherein after determining the similarity between each target data set and the preset data set according to the preset dimension reduction algorithm, the method further comprises: carrying out a storage operation or a deletion operation on the first data provided by the target data source according to the similarity, so as to form a standardized data pool.
- 4. The data processing method for a wind control system according to claim 1, wherein the processing the first data according to a preset word segmentation algorithm to obtain the second data comprises: performing word segmentation processing on the first data and on the data of the preset target dimension indication according to the preset word segmentation algorithm, to obtain target word segments and benchmark word segments respectively; and determining the proportion of the target word segments among the benchmark word segments, so as to determine the proportion as the second data corresponding to the first data.
- 5. The data processing method for a wind control system according to claim 1, further comprising, after determining each distance data as the similarity between the corresponding target data set and the preset data set: if the similarity is zero, determining that the first data provided by the current target data source and the data of the preset target dimension indication are completely redundant; if the similarity is not zero and is smaller than a preset redundancy threshold, determining that the first data provided by the current target data source and the data of the preset target dimension indication are redundant; and if the similarity is larger than the preset redundancy threshold, determining that the first data provided by the current target data source and the data of the preset target dimension indication have no correlation.
- 6. A data processing method for a wind control system according to any of claims 1-3, wherein the preset feature information comprises at least one of business information, tax information, credit information, judicial information, asset information, revenue information, and anti-illegal funds transfer suspicious information of the wind control object.
- 7. A data processing apparatus for a wind control system, comprising: an acquisition module, configured to acquire first data about a plurality of wind control objects provided by different target data sources, wherein the first data are used for representing preset characteristic information of the wind control objects; a first processing module, configured to judge the data type of the first data so as to determine second data according to the data type and obtain a target data set, wherein the second data are used for representing the first data after quantization, and the data type comprises a numerical type and a non-numerical type; wherein the first processing module is specifically configured to: determine the first data as the second data if the data type of the first data is the numerical type; process the first data according to a preset word segmentation algorithm to obtain the second data if the data type of the first data is the non-numerical type and the first data of the non-numerical type are represented by non-coded words, wherein the preset quantization algorithm comprises the preset word segmentation algorithm; and process the first data according to a preset coding rule to obtain the second data if the data type of the first data is the non-numerical type and the first data of the non-numerical type are represented by characters or coded words, wherein the preset coding rule comprises a unique correspondence between the characters or the coded words and the second data, and the preset quantization algorithm comprises the preset coding rule; and a second processing module, configured to determine the similarity between each target data set and a preset data set according to a preset dimension reduction algorithm, so that the wind control system determines the source pasting data according to the similarity, wherein the preset data set comprises data of a preset target dimension indication; wherein the second processing module is further configured to improve a preset random neighborhood embedding algorithm, so that the variance of the normal distribution centered on sample data in the preset random neighborhood embedding algorithm is adjusted to the variance of the corresponding data set, thereby obtaining the preset dimension reduction algorithm, wherein the corresponding data sets comprise each target data set and the preset data set; and the second processing module is specifically configured to obtain a first expected value of each target data set and a second expected value of the preset data set, obtain a first standard deviation of each target data set and a second standard deviation of the preset data set, and determine distance data between each first standard deviation and each second standard deviation according to the preset dimension reduction algorithm, each first expected value, each first standard deviation, each second expected value and the second standard deviation, so as to determine each distance data as the similarity between the corresponding target data set and the preset data set.
- 8. An electronic device, comprising: a processor; and a memory for storing a computer program of the processor; wherein the processor is configured to perform the data processing method for a wind control system according to any of claims 1 to 6 via execution of the computer program.
- 9. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the data processing method for a wind control system according to any one of claims 1 to 6.
- 10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the data processing method for a wind control system according to any of claims 1 to 6.
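Claims 1 and 5 describe reducing each data set to an expected value and a standard deviation, deriving distance data from them, and reading that distance against a preset redundancy threshold. The claims as reproduced here do not give the distance formula produced by the modified random neighborhood embedding algorithm, so the minimal Python sketch below swaps in a stand-in: the Euclidean distance between the (mean, standard deviation) pairs, which equals the 2-Wasserstein distance between two one-dimensional normal distributions. The function names, the returned labels, the tolerance `eps`, and the handling of a distance exactly equal to the threshold (a case claim 5 does not address) are all assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev


def similarity_distance(target_set, preset_set):
    """Stand-in distance between two quantized data sets.

    Assumption: each set is summarized by its mean and population
    standard deviation, and the distance is the Euclidean distance
    between those two summaries. This is NOT the patent's formula.
    """
    mean_gap = fmean(target_set) - fmean(preset_set)
    std_gap = pstdev(target_set) - pstdev(preset_set)
    return math.hypot(mean_gap, std_gap)


def classify_redundancy(distance, threshold, eps=1e-12):
    """Map a distance to the three outcomes listed in claim 5."""
    if distance <= eps:
        return "completely redundant"
    if distance < threshold:
        return "redundant"
    if distance > threshold:
        return "no correlation"
    # Claim 5 does not say what happens at exactly the threshold.
    return "unspecified"


if __name__ == "__main__":
    preset = [1.0, 2.0, 3.0]            # hypothetical preset data set
    sources = {
        "source_a": [1.0, 2.0, 3.0],    # same summary as the preset set
        "source_b": [2.0, 3.0, 4.0],    # shifted mean, same spread
        "source_c": [50.0, 90.0, 10.0]  # very different summary
    }
    for name, data in sources.items():
        d = similarity_distance(data, preset)
        print(name, round(d, 3), classify_redundancy(d, threshold=2.0))
```

Note that, as in the claims, a smaller value means the two data sets are closer, so the quantity called "similarity" behaves like a distance.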
Description
Data processing method and device for wind control system and electronic equipment
Technical Field
The application relates to the technical field of financial technology (Fintech), and in particular to a data processing method and device for a wind control system, and electronic equipment.
Background
With the rapid development of computer technology and internet technology, financial technology (Fintech), a product of the deep integration of finance and technology, has become a hotspot for innovation in the financial industry. In addition, small and micro enterprises currently account for more than 70% of all enterprises, and their number keeps growing. For financial institutions, a corresponding wind control system is built for each small and micro enterprise based on big data, and a corresponding wind control strategy is implemented for each small and micro enterprise through that system.
However, with the rapid development of big data technology, the data from various data sources that characterize the feature information of small and micro enterprises are growing exponentially. To reach these enterprises as fully as possible, a wind control system generally takes in as much feature data as it can, in order to ensure the wind control effect for small and micro enterprises.
These mass data are diverse and disordered, and uneven in quality and value density. In current applications, the data provided by each data source are nevertheless input directly as the source pasting data of the wind control system. This challenges the system's data processing capability, maintenance cost, timeliness, strategy and effect, and can even cause serious negative effects. For example, the data provided by the data sources may be redundant, but the wind control system cannot identify the redundancy, so the data are processed repeatedly; or the wind control system cannot accumulate effective data, so no data sharing mechanism can be formed.
A solution is therefore needed to overcome the problems in the prior art that arise from directly using the mass data provided by each data source as the source pasting data of the wind control system.
Disclosure of Invention
The application provides a data processing method and device for a wind control system, and electronic equipment, to solve the technical problem in the prior art that directly using the mass data provided by various data sources as the source pasting data of the wind control system poses greater challenges to the wind control system and may even seriously harm it.
In a first aspect, the present application provides a data processing method for a wind control system, including: acquiring first data about a plurality of wind control objects provided by different target data sources, wherein the first data are used for representing preset characteristic information of the wind control objects; judging the data type of the first data to determine second data according to the data type, wherein the second data are used for representing the first data after quantization, and the data type comprises a numerical type and a non-numerical type; and determining the similarity between the second data corresponding to each target data source and the data of the preset target dimension indication according to a preset dimension reduction algorithm, so that the wind control system determines the source pasting data according to the similarity.
In one possible design, after determining the similarity between each target data set and the preset data set according to the preset dimension reduction algorithm, the method further includes: classifying the first data provided by the target data source according to the similarity and a preset service requirement, generating different data areas according to the classified first data and preset logic rules, and determining the use authority and the use range of the first data through the different data areas; and/or marking the target data sources according to the similarity and a preset marking rule, so as to record each target data source in a standardized way; and/or determining, according to the similarity, the source pasting data meeting a preset service scene from the first data provided by the target data source.
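The type-dependent quantization step of the first aspect can be sketched as follows. The claims distinguish three branches: numerical data pass through unchanged, data expressed as characters or coded words go through a coding rule with a unique correspondence, and data expressed as non-coded words go through word segmentation followed by a proportion. This Python sketch is illustrative only. Whitespace splitting stands in for the preset word segmentation algorithm, a value is treated as "coded" when it appears as a key of the coding rule, and the proportion is read as the share of the reference segments that also occur in the first data. All three choices, together with every name and the sample coding rule, are assumptions not taken from the source.

```python
def segment(text):
    """Stand-in word segmentation: lowercase, split on whitespace."""
    return text.lower().split()


def segment_proportion(text, reference_text):
    """Share of the reference segments that also appear in `text`.

    Assumption: this is one possible reading of the proportion
    described in the claims; the source does not fix the direction.
    """
    target = set(segment(text))
    reference = set(segment(reference_text))
    if not reference:
        return 0.0
    return len(target & reference) / len(reference)


def quantize(value, coding_rule, reference_text):
    """Turn one item of first data into second data (a number)."""
    # Branch 1: numerical data are used as they are.
    if isinstance(value, (int, float)):
        return float(value)
    # Branch 2: characters or coded words map through the coding rule.
    if value in coding_rule:
        return float(coding_rule[value])
    # Branch 3: free text is segmented and reduced to a proportion.
    return segment_proportion(value, reference_text)


if __name__ == "__main__":
    # Hypothetical coding rule: one unique number per credit grade.
    credit_grades = {"A": 1, "B": 2, "C": 3}
    reference = "tax payment record"

    print(quantize(42, credit_grades, reference))
    print(quantize("B", credit_grades, reference))
    print(quantize("overdue tax payment", credit_grades, reference))
```

A production system handling Chinese text would replace `segment` with a real word segmenter; the branching structure would stay the same.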
In one possible design, after determining the similarity between each target data set and the preset data set according to the preset dimension reduction algorithm, the method further includes: carrying out a storage operation or a deletion operation on the first data provided by the target data source according to the similarity, so as to form a standardized data pool.
In one pos