CN-116821275-B - Data investigation processing method and device
Abstract
The invention provides a data investigation processing method and device. The method comprises: receiving data to be investigated, wherein the data to be investigated comprises at least one of customer information and account information; obtaining supplementary data of the data to be investigated by comparing the data to be investigated with the full data in a data lake, wherein the supplementary data comprises basic data and associated data corresponding to the data to be investigated; generating a relationship graph of the data to be investigated according to the data to be investigated and the supplementary data; and generating a data investigation result of the data to be investigated based on the relationship graph. This solves the problem in the related art that a post-event supervision and prevention mechanism based mainly on experience rules cannot predict the risk of abnormal data in advance.
Inventors
- QIN SHIZHONG
- SHI CHENYANG
- PEI YAMIN
- ZHANG JIE
- ZHANG TAO
- KE JI
Assignees
- China Everbright Bank Co., Ltd. (中国光大银行股份有限公司)
Dates
- Publication Date
- 20260505
- Application Date
- 20230609
Claims (10)
- 1. A data investigation processing method, comprising: receiving data to be investigated, wherein the data to be investigated comprises at least one of customer information and account information; obtaining supplementary data of the data to be investigated by comparing the data to be investigated with the full data in a data lake, wherein the supplementary data comprises basic data and associated data corresponding to the data to be investigated; generating a relationship graph of the data to be investigated according to the data to be investigated and the supplementary data; and generating a data investigation result of the data to be investigated based on the relationship graph; wherein generating the relationship graph of the data to be investigated according to the data to be investigated and the supplementary data comprises: acquiring a plurality of pieces of index information of the data to be investigated and the basic data; determining, from the associated data and according to a preset graph analysis rule, hit results corresponding to the plurality of pieces of index information, wherein the hit results comprise hits that satisfy the graph analysis rule and misses that do not satisfy the graph analysis rule; and generating the relationship graph according to the data to be investigated, the supplementary data, and the hit results corresponding to the plurality of pieces of index information.
- 2. The method of claim 1, wherein generating the relationship graph according to the data to be investigated, the supplementary data, and the hit results corresponding to the plurality of pieces of index information comprises: classifying the plurality of pieces of index information to obtain multiple categories of information; and generating the relationship graph by taking the customer information or the account information as the central node of the relationship graph, the multiple categories of information as child nodes of the central node, and the pieces of information contained in each category as leaf nodes of the corresponding child node, wherein each category of information corresponds to one child node.
- 3. The method of claim 2, wherein the index information comprises at least one of: a certificate address, an account address, an office address, a registration address, a contact address, a business transaction frequency, a transaction amount, an Internet Protocol (IP) address, a Media Access Control (MAC) address, a communication number, a certificate number, information on relatives, an account name, a responsible person, a manager, a business agent, an authorized manager, a frequent transaction account, an account with a transaction amount greater than a preset value, a business initiator, a business receiver, an account opening time, and an account-opening branch; and wherein the multiple categories of information comprise address information, account information, customer information, transaction information, and communication information, wherein: the address information comprises the certificate address, the account address, the office address, the registration address, and the contact address; the account information comprises the account name, the responsible person or manager, the business agent, the authorized manager, the account opening time, and the account-opening branch; the customer information comprises the communication number, the certificate number, and the information on relatives; the transaction information comprises the business transaction frequency, the transaction amount, the frequent transaction account, the account with a transaction amount greater than the preset value, the business initiator, and the business receiver; and the communication information comprises the IP address, the MAC address, and the communication number.
- 4. The method of claim 3, wherein the graph analysis rule comprises at least one of: the certificate address, the account address, the office address, the registration address, or the contact address is the same, or the distance between the addresses is less than a preset distance; the business transaction frequency of the account corresponding to the business application data is greater than or equal to a preset frequency; the transaction amount of the account corresponding to the business application data is greater than or equal to a preset amount; the IP address or the MAC address of the account corresponding to the business application data is the same; the communication number or the certificate number of the account corresponding to the business application data is the same; the responsible persons, managers, business agents, or authorized managers of the account corresponding to the business application data are relatives; the account name, responsible person, manager, business agent, or authorized manager of the account corresponding to the business application data is the same; the number of transactions of the frequent transaction account is greater than a preset number; an account has a transaction amount greater than the preset value; the account corresponding to the business application data has the same business initiator or business receiver; the account opening time of the account corresponding to the business application data is the same, or the time difference is less than a preset time; and the account-opening branch of the account corresponding to the business application data is the same, or the distance between the branches is less than the preset distance.
- 5. The method of claim 3, wherein generating the data investigation result of the data to be investigated based on the relationship graph comprises: for a personal account, generating, based on the relationship graph, a data investigation result comprising at least the number of hits of the graph analysis rule, the certificate type, the account nature, the total number of debit transactions of the account, the account opening information, and the regions to which all IP addresses used by the account belong; and for a corporate account, generating, based on the relationship graph, a data investigation result comprising at least the number of hits of the graph analysis rule, the account-opening institution information, the total number of debit transactions of the account, the account opening information, and all IP addresses used by the account.
- 6. The method according to any one of claims 1 to 5, further comprising: acquiring full data of personal business information and corporate business information, wherein the personal business information comprises at least personal customer information and personal account information, and the corporate business information comprises at least corporate customer information and corporate account information; and storing the full data in the data lake in a preset format.
- 7. The method according to any one of claims 1 to 5, wherein, after generating the data investigation result of the data to be investigated based on the relationship graph, the method further comprises: splitting the data investigation result into a total service unit and a sub-service unit; distributing the total service unit according to the total data distribution path corresponding to the total service unit; and distributing the sub-service unit according to the sub-service distribution path corresponding to the sub-service unit.
- 8. A data investigation processing apparatus, comprising: a receiving module, configured to receive data to be investigated, wherein the data to be investigated comprises at least one of customer information and account information; an acquisition module, configured to obtain supplementary data of the data to be investigated by comparing the data to be investigated with the full data in a data lake, wherein the supplementary data comprises basic data and associated data corresponding to the data to be investigated; a first generation module, configured to generate a relationship graph of the data to be investigated according to the data to be investigated and the supplementary data; and a second generation module, configured to generate a data investigation result of the data to be investigated based on the relationship graph; wherein the first generation module comprises: an acquisition submodule, configured to acquire a plurality of pieces of index information of the data to be investigated and the basic data; a determining submodule, configured to determine, from the associated data and according to a preset graph analysis rule, hit results corresponding to the plurality of pieces of index information, wherein the hit results comprise hits that satisfy the graph analysis rule and misses that do not satisfy the graph analysis rule; and a generation submodule, configured to generate the relationship graph according to the data to be investigated, the supplementary data, and the hit results corresponding to the plurality of pieces of index information.
- 9. A computer-readable storage medium, characterized in that the storage medium has a computer program stored therein, wherein the computer program is arranged to perform the method of any one of claims 1 to 7 when run.
- 10. An electronic device comprising a memory and a processor, characterized in that the memory has a computer program stored therein, and the processor is arranged to run the computer program to perform the method of any one of claims 1 to 7.
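By way of non-limiting illustration, a few of the graph analysis rules recited in claim 4 might be evaluated as in the following minimal Python sketch. The `Account` fields, the rule names, and the threshold values are all hypothetical; the claims only state that the thresholds are "preset" and do not prescribe any data layout. The sketch also assumes that the frequency and amount rules apply to the account under investigation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Account:
    """Hypothetical flattened view of the index information of one account."""
    account_id: str
    ip_address: str
    certificate_number: str
    business_transaction_frequency: int  # transactions per period (unit assumed)
    transaction_amount: float


# Assumed placeholder thresholds; the claims only call them "preset".
PRESET_FREQUENCY = 100
PRESET_AMOUNT = 50_000.0

# A rule takes the account under investigation and one associated account
# and returns True for a hit, False for a miss.
Rule = Callable[[Account, Account], bool]

RULES: Dict[str, Rule] = {
    "same_ip_address": lambda target, other: target.ip_address == other.ip_address,
    "same_certificate_number": lambda target, other: (
        target.certificate_number == other.certificate_number
    ),
    "frequency_at_least_preset": lambda target, other: (
        target.business_transaction_frequency >= PRESET_FREQUENCY
    ),
    "amount_at_least_preset": lambda target, other: (
        target.transaction_amount >= PRESET_AMOUNT
    ),
}


def evaluate_rules(target: Account, other: Account) -> Dict[str, bool]:
    """Return a hit (True) or miss (False) for every rule in RULES."""
    return {name: rule(target, other) for name, rule in RULES.items()}


target = Account("A1", "10.0.0.1", "ID-001", 120, 1_000.0)
other = Account("A2", "10.0.0.1", "ID-002", 5, 10.0)
hit_results = evaluate_rules(target, other)
```

Keeping the rules in a dictionary keyed by name is only one possible design; it makes it easy to count hits per rule, which the data investigation result of claim 5 refers to.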
Description
Data investigation processing method and device
Technical Field
The invention relates to the field of data processing, and in particular to a data investigation processing method and device.
Background
With the rapid development of science and technology, abnormal transaction chains have broken through the physical limits of space. They are mixed into the massive transaction data of daily business and are difficult to distinguish clearly and intuitively. The post-event supervision and prevention-and-control mechanism of the financial industry, which is based on traditional experience rules, is therefore being challenged. No solution has yet been proposed for the problem in the related art that a post-event supervision and prevention-and-control mechanism based on experience rules cannot predict the risk of abnormal data in advance.
Disclosure of Invention
Embodiments of the invention provide a data investigation processing method and device, which at least solve the problem in the related art that a post-event supervision and prevention-and-control mechanism based on experience rules cannot predict the risk of abnormal data in advance.
According to an embodiment of the present invention, there is provided a data investigation processing method, including: receiving data to be investigated, wherein the data to be investigated comprises at least one of customer information and account information; obtaining supplementary data of the data to be investigated by comparing the data to be investigated with the full data in a data lake, wherein the supplementary data comprises basic data and associated data corresponding to the data to be investigated; generating a relationship graph of the data to be investigated according to the data to be investigated and the supplementary data; and generating a data investigation result of the data to be investigated based on the relationship graph.
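By way of non-limiting illustration, the four steps above might be sketched in Python as follows. The in-memory `DATA_LAKE` list, the record fields, the `LINK_FIELDS` tuple, and the function names are all hypothetical stand-ins: the text does not specify how the data lake is stored or how the comparison with the full data is performed, and the graph and result here are deliberately simplified placeholders.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Hypothetical in-memory stand-in for the full data held in the data lake.
DATA_LAKE: List[Record] = [
    {"account_id": "A1", "customer": "Alice", "ip_address": "10.0.0.1",
     "contact_address": "1 Main St"},
    {"account_id": "A2", "customer": "Bob", "ip_address": "10.0.0.1",
     "contact_address": "9 Oak Ave"},
    {"account_id": "A3", "customer": "Carol", "ip_address": "10.0.0.7",
     "contact_address": "5 Elm Rd"},
]

# Assumed fields used to link records; the index information in the text is broader.
LINK_FIELDS = ("ip_address", "contact_address")


def get_supplementary_data(to_investigate: Record, lake: List[Record]) -> Dict[str, Any]:
    """Compare the input with the full data and return basic and associated data."""
    basic = next(
        (r for r in lake if r["account_id"] == to_investigate["account_id"]), None
    )
    if basic is None:
        return {"basic": None, "associated": []}
    # Associated data: other records sharing at least one link field with the basic data.
    associated = [
        r for r in lake
        if r["account_id"] != basic["account_id"]
        and any(r.get(f) == basic.get(f) for f in LINK_FIELDS)
    ]
    return {"basic": basic, "associated": associated}


def investigate(to_investigate: Record, lake: List[Record]) -> Dict[str, Any]:
    """Run the steps in order: receive, supplement, build graph, build result."""
    supplementary = get_supplementary_data(to_investigate, lake)
    # Simplified placeholder graph: the central account plus its linked accounts.
    graph = {
        "center": to_investigate["account_id"],
        "linked": [r["account_id"] for r in supplementary["associated"]],
    }
    return {
        "account_id": graph["center"],
        "linked_account_count": len(graph["linked"]),
    }


result = investigate({"account_id": "A1"}, DATA_LAKE)
```

In this sketch, account A2 is treated as associated data for A1 because the two share an IP address, while A3 shares no link field and is left out.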
Optionally, generating the relationship graph of the data to be investigated according to the data to be investigated and the supplementary data includes: acquiring a plurality of pieces of index information of the data to be investigated and the basic data; determining, from the associated data and according to a preset graph analysis rule, hit results corresponding to the plurality of pieces of index information, wherein the hit results comprise hits that satisfy the graph analysis rule and misses that do not satisfy the graph analysis rule; and generating the relationship graph according to the data to be investigated, the supplementary data, and the hit results corresponding to the plurality of pieces of index information.
Optionally, generating the relationship graph according to the data to be investigated, the supplementary data, and the hit results corresponding to the plurality of pieces of index information includes: classifying the plurality of pieces of index information to obtain multiple categories of information; and generating the relationship graph by taking the customer information or the account information as the central node of the relationship graph, the multiple categories of information as child nodes of the central node, and the pieces of information contained in each category as leaf nodes of the corresponding child node, wherein each category of information corresponds to one child node.
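By way of non-limiting illustration, the three-level graph structure described above (central node, one child node per category, leaf nodes per piece of index information) might be assembled as in the following minimal Python sketch. The nested-dictionary representation, the function name, and the abbreviated `CATEGORIES` mapping are assumptions made for illustration; the text does not specify how the graph is stored.

```python
from typing import Any, Dict, List

# Category -> index names, abbreviated for illustration from the classification
# of index information; only a few indices are listed here.
CATEGORIES: Dict[str, List[str]] = {
    "address information": ["certificate address", "contact address"],
    "transaction information": ["business transaction frequency", "transaction amount"],
    "communication information": ["IP address", "MAC address", "communication number"],
}


def build_relationship_graph(
    center: str,
    index_values: Dict[str, Any],
    hit_results: Dict[str, bool],
) -> Dict[str, Any]:
    """Build a tree: central node -> one child per category -> one leaf per index.

    hit_results maps an index name to True (hit) or False (miss); an index
    without an entry is treated as a miss, which is an assumption of this sketch.
    """
    children = []
    for category, index_names in CATEGORIES.items():
        leaves = [
            {
                "index": name,
                "value": index_values[name],
                "hit": hit_results.get(name, False),
            }
            for name in index_names
            if name in index_values
        ]
        # Each category always gets exactly one child node, even with no leaves.
        children.append({"category": category, "leaves": leaves})
    return {"center": center, "children": children}


graph = build_relationship_graph(
    center="account A1",
    index_values={"IP address": "10.0.0.1", "transaction amount": 80_000},
    hit_results={"IP address": True},
)
```

Storing the hit flag on each leaf is one possible way to carry the hit results into the graph, so that a later step can count hits per rule when producing the data investigation result.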
Optionally, the index information comprises at least one of: a certificate address, an account address, an office address, a registration address, a contact address, a business transaction frequency, a transaction amount, an Internet Protocol (IP) address, a Media Access Control (MAC) address, a communication number, a certificate number, information on relatives, an account name, a responsible person, a manager, a business agent, an authorized manager, a frequent transaction account, an account with a transaction amount greater than a preset value, a business initiator, a business receiver, an account opening time, and an account-opening branch. The multiple categories of information comprise address information, account information, customer information, transaction information, and communication information, wherein: the address information comprises the certificate address, the account address, the office address, the registration address, and the contact address; the account information comprises the account name, the responsible person or manager, the business agent, the authorized manager, the account opening time, and the account-opening branch; the customer information comprises the communication number, the certificate number, and the information on relatives; the transaction information comprises the business transaction frequency, the transaction amount, the frequent transaction account, the account with a transaction amount greater than the preset value, the business initiator, and the business receiver; and the communication information comprises the IP address, the MAC address, and the communication number.
Optionally, the graph analysis rule includes at least one of: the certificate address, the account address, the office address, the registration address, or the contact address is the same, or the distance between the addresses is less than a preset distance; the business transaction frequency of the account correspondi