CN-122027298-A - Advertisement anti-cheating method and system based on flow identification model


Abstract

The invention discloses an advertisement anti-cheating method and system based on a traffic identification model, relating to the technical field of internet advertising. The public network address, source port, and device identifier are extracted from each communication request, and numerical continuity analysis is performed on the source ports to divide them into port blocks, thereby establishing a port block ledger that records the associations among public network address, source port, and device identifier. Based on the ledger, cross-block dispersion and intra-block multiplexing are counted, and a contradiction index is calculated to quantify structural conflict. The primary home port block of each device is determined according to the contradiction index, records outside the home block are marked as structural conflicts, and a conflict list is generated. The conflict list is then searched using the public network address, device identifier, and current port block of a request to be detected; on a hit, differentiated handling is executed. By constructing this fine-grained topology map, the invention breaks through the granularity limit of public network addresses, accurately identifies abnormal traffic in carrier-grade network address translation (CGNAT) scenarios, and effectively reduces the false-positive rate.
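The port block division and ledger construction summarized above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `split_port_blocks` and `build_ledger`, the tuple layout of a request, and the default threshold of 64 are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def split_port_blocks(ports, continuity_threshold=64):
    """Sort the ports and start a new block whenever the gap between two
    adjacent ports exceeds the threshold (the value 64 is an arbitrary assumption)."""
    blocks = []
    for port in sorted(set(ports)):
        if blocks and port - blocks[-1][-1] <= continuity_threshold:
            blocks[-1].append(port)
        else:
            blocks.append([port])
    return blocks


def build_ledger(requests, continuity_threshold=64):
    """requests: list of (public_ip, source_port, device_id) tuples (hypothetical layout).

    Returns (ledger, port_to_block), where
    ledger[public_ip][block_no] is a Counter mapping device_id to its
    occurrence frequency, and port_to_block[(public_ip, port)] is the block number.
    """
    ports_by_ip = defaultdict(list)
    for ip, port, _device in requests:
        ports_by_ip[ip].append(port)

    # Number the blocks of each address in ascending port order.
    port_to_block = {}
    for ip, ports in ports_by_ip.items():
        for block_no, block in enumerate(split_port_blocks(ports, continuity_threshold)):
            for port in block:
                port_to_block[(ip, port)] = block_no

    ledger = defaultdict(lambda: defaultdict(Counter))
    for ip, port, device in requests:
        ledger[ip][port_to_block[(ip, port)]][device] += 1
    return ledger, port_to_block


requests = [
    ("203.0.113.7", 1024, "devA"),
    ("203.0.113.7", 1030, "devA"),
    ("203.0.113.7", 5000, "devB"),
    ("203.0.113.7", 5010, "devA"),
]
ledger, port_to_block = build_ledger(requests)
```

With this sample data, ports 1024 and 1030 fall into block 0 and ports 5000 and 5010 into block 1, so `devA` appears in both blocks of the same address. That is the kind of cross-block spread the contradiction index is meant to quantify.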

Inventors

ZHU YAN
WANG YINGZHAN
LIU YONG
ZHANG XIN

Assignees

上海聚告德业文化发展有限公司

Dates

Publication Date: 20260512
Application Date: 20260225

Claims (10)

1. An advertisement anti-cheating method based on a traffic identification model, characterized by comprising the following steps: acquiring communication request data, extracting a public network address, a source port, and a device identifier, performing numerical continuity analysis on the source ports to divide them into port blocks, counting the occurrence frequency of each device identifier in each port block, and establishing a port block ledger containing the association relations among the public network address, the port blocks, and the device identifiers; based on the port block ledger, counting the number of port blocks associated with each device identifier under the public network address as the cross-block dispersion, counting the number of distinct device identifiers carried in each port block as the intra-block multiplexing degree, and taking the ratio of the cross-block dispersion to the intra-block multiplexing degree as a contradiction index; performing risk rating on the public network address according to the contradiction index, and, for each device identifier under a target public network address, determining the port block with the highest occurrence frequency as the primary home port block, marking the association records of the device identifier in port blocks other than the primary home port block as structural conflicts, and generating a conflict list; and receiving a communication request to be detected, determining the current port block to which its source port belongs, searching the conflict list using the public network address, the device identifier, and the current port block of the communication request to be detected as search conditions, and executing differentiated handling if a matching record is found.
2. The advertisement anti-cheating method based on a traffic identification model according to claim 1, wherein performing numerical continuity analysis on the source ports to divide them into port blocks comprises: acquiring all source ports associated with the public network address, generating a port value list, and sorting the port value list in ascending order; calculating the difference between each pair of adjacent port values in the sorted port value list; and comparing the difference with a preset continuity threshold, wherein if the difference is less than or equal to the continuity threshold, the two corresponding source ports are judged to belong to the same port block, and if the difference is greater than the continuity threshold, the two corresponding source ports are judged to belong to different port blocks.
3. The advertisement anti-cheating method based on a traffic identification model according to claim 2, further comprising a step of adaptively adjusting the continuity threshold, wherein the step specifically comprises: identifying the autonomous system of the network operator to which the public network address belongs, and dynamically adjusting the value of the continuity threshold according to the historical average port dispersion of that network operator; or performing a small-sample test on the public network address in an initialization stage, and selecting as the continuity threshold the value that makes the number of port blocks most stable.
4. The advertisement anti-cheating method based on a traffic identification model according to claim 1, further comprising, before establishing the port block ledger: maintaining in a storage space a sample set containing a plurality of communication request data records, and dynamically updating the sample set according to a preset window capacity; wherein the dynamic updating specifically comprises, when the number of records in the sample set exceeds the preset window capacity, removing the record with the earliest timestamp according to a first-in, first-out policy, and performing numerical continuity analysis only on the records currently retained in the sample set.
5. The advertisement anti-cheating method based on a traffic identification model according to claim 1, wherein calculating the contradiction index comprises: for each device identifier associated with the public network address, subtracting 1 from the cross-block dispersion of that device identifier, and summing the results to obtain an excess cross-block aggregate value; for each port block associated with the public network address, subtracting 1 from the intra-block multiplexing degree of that port block, and summing the results to obtain an excess multiplexing aggregate value; and dividing the excess cross-block aggregate value by the excess multiplexing aggregate value to obtain the contradiction index of the public network address.
6. The advertisement anti-cheating method based on a traffic identification model according to claim 1, wherein determining the primary home port block further comprises: if the occurrence frequency of the device identifier reaches the same maximum value in a plurality of port blocks, selecting the port block with the smallest port block number as the primary home port block; or selecting the port block in which the device identifier's last active timestamp is the latest as the primary home port block.
7. The advertisement anti-cheating method based on a traffic identification model according to claim 1, wherein generating the conflict list comprises: for each port block other than the primary home port block, generating a conflict triplet record containing the public network address, the device identifier, and the number of the port block, and adding the conflict triplet record to the conflict list.
8. The advertisement anti-cheating method based on a traffic identification model according to claim 1, further comprising, after generating the conflict list: directly removing the occurrence frequencies of the device identifier in all port blocks other than the primary home port block; or removing the occurrence frequencies of the device identifier in all port blocks other than the primary home port block, and adding the removed frequency values to the record of the primary home port block.
9. The advertisement anti-cheating method based on a traffic identification model according to claim 1, wherein executing differentiated handling comprises: if the communication request to be detected is in the real-time bidding stage, returning a no-bid response to the party that sent the request, or reducing the bidding weight for the request; and if the communication request to be detected is in the advertisement monitoring or settlement stage, marking the impression or click corresponding to the request as invalid traffic, and refusing to pay the related fee during settlement reconciliation.
10. An advertisement anti-cheating system based on a traffic identification model, for implementing the advertisement anti-cheating method based on a traffic identification model according to any one of claims 1 to 9, comprising: a ledger construction module, configured to acquire communication request data, extract a public network address, a source port, and a device identifier, perform numerical continuity analysis on the source ports to divide them into port blocks, count the occurrence frequency of each device identifier in each port block, and establish a port block ledger containing the association relations among the public network address, the port blocks, and the device identifiers; a contradiction calculation module, configured to count, based on the port block ledger, the number of port blocks associated with each device identifier under the public network address as the cross-block dispersion, count the number of distinct device identifiers carried in each port block as the intra-block multiplexing degree, and take the ratio of the cross-block dispersion to the intra-block multiplexing degree as a contradiction index; a home cleaning module, configured to perform risk rating on the public network address according to the contradiction index, determine, for each device identifier under a target public network address, the port block with the highest occurrence frequency as the primary home port block, mark the association records of the device identifier in port blocks other than the primary home port block as structural conflicts, and generate a conflict list; and a traffic handling module, configured to receive a communication request to be detected, determine the current port block to which its source port belongs, search the conflict list using the public network address, the device identifier, and the current port block of the communication request to be detected as search conditions, and execute differentiated handling if a matching record is found.
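The contradiction index of claim 5, the tie-break of claim 6, the conflict triplets of claim 7, and the lookup of claim 1 can be sketched together as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: all function names are hypothetical, the ledger for one public address is assumed to be a dictionary of non-empty per-block device counters, and the handling of a zero denominator (which the claims do not specify) is a choice made only for this example.

```python
from collections import Counter


def contradiction_index(blocks):
    """blocks: {block_no: Counter({device_id: frequency})} for one public address.

    Returns (sum over devices of dispersion - 1) divided by
    (sum over blocks of multiplexing degree - 1).
    """
    device_blocks = Counter()
    for devices in blocks.values():
        for device in devices:
            device_blocks[device] += 1
    excess_cross = sum(n - 1 for n in device_blocks.values())
    excess_reuse = sum(len(devices) - 1 for devices in blocks.values())
    if excess_reuse == 0:
        # Assumption: the claims leave this case open.
        return float("inf") if excess_cross > 0 else 0.0
    return excess_cross / excess_reuse


def primary_home_block(blocks, device):
    """Block with the highest frequency; ties go to the smallest block number."""
    seen = {b: devs.get(device, 0) for b, devs in blocks.items() if devs.get(device, 0) > 0}
    return min(seen, key=lambda b: (-seen[b], b))


def build_conflict_list(ip, blocks):
    """Return the set of (public_ip, device_id, block_no) conflict triplets."""
    conflicts = set()
    devices = {d for devs in blocks.values() for d in devs}
    for device in devices:
        home = primary_home_block(blocks, device)
        for block_no, devs in blocks.items():
            if block_no != home and devs.get(device, 0) > 0:
                conflicts.add((ip, device, block_no))
    return conflicts


def is_conflict(conflicts, ip, device, block_no):
    """Lookup for a request to be detected; a hit would trigger differentiated handling."""
    return (ip, device, block_no) in conflicts


blocks = {
    0: Counter({"devA": 5, "devB": 1}),
    1: Counter({"devA": 1}),
    2: Counter({"devA": 1, "devC": 3}),
}
index = contradiction_index(blocks)        # (3 - 1) / ((2 - 1) + 0 + (2 - 1)) = 1.0
conflicts = build_conflict_list("203.0.113.7", blocks)
```

In this sample, `devA` spans three blocks and has block 0 as its primary home, so the conflict list contains triplets for blocks 1 and 2 only. Storing the triplets in a set keeps the per-request lookup at constant average cost, which suits a check made during real-time bidding.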

Description

Advertisement anti-cheating method and system based on flow identification model

Technical Field

The invention relates to the technical field of internet advertising, and in particular to an advertisement anti-cheating method and system based on a traffic identification model.

Background

In internet advertisement delivery and traffic distribution, accurately identifying the originator of a communication request is the basis for guaranteeing the authenticity of advertisement transactions and the security of the system. Conventional network anti-cheating systems typically treat the public IP address as the core identifier of user identity, assuming by default that each public network address corresponds to a separate home broadband account or a single terminal device. Under this architecture, the risk control system aggregates historical access data from the same public network address and monitors fluctuations in indicators such as request frequency, click-through rate distribution, or access time interval to judge whether bot-driven volume inflation or malicious attack behavior lies behind the address, and applies interception or rate limiting to the public network address accordingly.

With the growing exhaustion of IPv4 address resources and the evolution of mobile communication networks, basic network operators have widely deployed carrier-grade network address translation (CGNAT), so that massive numbers of mobile terminals or private network users must share a very small number of public network address exits to access the internet. In such a network architecture, a single public network address actually carries the superimposed traffic of hundreds to thousands of individual terminal devices.
Existing anti-cheating techniques face serious challenges in such shared-address scenarios. When a small number of terminal devices performing malicious behavior are mixed into a shared address, the high-frequency attack indicators generated by the abnormal terminals are diluted and masked by the compliant access data of the majority of normal users on the same address, so that detection models based on aggregate statistics or averages struggle to capture the weak risk signal, producing false negatives. On the other hand, if the system adopts a blanket blocking policy against a highly concurrent shared public network address in order to raise the detection rate, all innocent normal users behind that address lose access to the service, causing large-scale collateral blocking. How to break through the granularity limit of public network addresses in address-sharing scenarios, and accurately separate and locate abnormal terminals within mixed traffic, is therefore a problem that current anti-cheating technology needs to solve.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides an advertisement anti-cheating method and system based on a traffic identification model, which solve the problems of missed detection of abnormal traffic and collateral blocking of normal users that arise in carrier-grade network address translation scenarios because the granularity limit of public network addresses cannot be broken through.
To achieve the above purpose, the invention is realized by the following technical scheme: acquiring communication request data, extracting a public network address, a source port, and a device identifier, performing numerical continuity analysis on the source ports to divide them into port blocks, counting the occurrence frequency of each device identifier in each port block, and establishing a port block ledger containing the association relations among the public network address, the port blocks, and the device identifiers; based on the port block ledger, counting the number of port blocks associated with each device identifier under the public network address as the cross-block dispersion, counting the number of distinct device identifiers carried in each port block as the intra-block multiplexing degree, and taking the ratio of the cross-block dispersion to the intra-block multiplexing degree as a contradiction index; performing risk rating on the public network address according to the contradiction index, and, for each device identifier under a target public network address, determining the port block with the highest occurrence frequency as the primary home port block, marking the association records of the device identifier in port blocks other than the primary home port block as structural conflicts, and generating a conflict list; and receiving a communication request to be detected, determining the current port block to which its source port belongs, and searching in a con