
CN-121997190-A - Digital marketing big data processing method and system


Abstract

The invention relates to the technical field of data processing, and in particular to a digital marketing big data processing method and system. Channel distribution and click fields are extracted for each marketing participant identifier and training round number, and segment-wise left shift, exclusive-or (XOR) and remainder calculations are executed. Advertisement bid values are converted into integers by truncating the fractional part and multiplying by a fixed multiple. After out-of-range values are removed, anomaly scores are calculated from the splitting depth across the multiple trees of an isolation forest, and abnormal items are removed. The peak and valley values are then multiplied by a modulus, and bitwise addition with carry is performed to generate an accumulated ciphertext sequence. A secure aggregation gradient set is obtained by comparing this sequence bit by bit with the digest bit sequences and collecting statistics on the results. For the row vectors corresponding to the aggregation gradient set, index-aligned element-wise products and row-level accumulation yield an update segment; the update segment is added element-wise to the corresponding column vectors and multiplied by a set scale factor to perform a local update. A BitFit approach is adopted so that weights are adjusted only for key parameters of a submatrix, which avoids global synchronization and reduces the computational load.
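
The local update summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `row_update_values` and `apply_local_update`, the nested-list matrix layout, and the choice of a single target column are all assumptions made for illustration. The formula (old value plus update, then multiplied by a fixed scale factor) follows the wording of the abstract; note that in the wider literature "BitFit" usually refers to fine-tuning only bias terms, which this sketch does not attempt to reproduce.

```python
def row_update_values(grad_rows, weight_rows):
    """Index-aligned element-wise products, accumulated per row.

    Both arguments are lists of equal-length rows (assumed layout).
    Returns one update value per row.
    """
    updates = []
    for g_row, w_row in zip(grad_rows, weight_rows):
        if len(g_row) != len(w_row):
            raise ValueError("row lengths must match")
        updates.append(sum(g * w for g, w in zip(g_row, w_row)))
    return updates


def apply_local_update(matrix, col, updates, scale):
    """Return a copy of `matrix` in which only column `col` is changed.

    Each element of that column becomes (old value + update) * scale,
    where `scale` is a fixed scalar. All other columns are untouched,
    so no global synchronization of the matrix is needed.
    """
    out = [row[:] for row in matrix]
    for r, update in enumerate(updates):
        out[r][col] = (out[r][col] + update) * scale
    return out


# Hypothetical example data
grads = [[1.0, 2.0], [0.0, 1.0]]
weights = [[1.0, 2.0], [3.0, 4.0]]
updates = row_update_values(grads, weights)
adapted = apply_local_update(weights, col=1, updates=updates, scale=0.5)
```

Only the selected column is rewritten, which is the sense in which the update is local rather than global.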

Inventors

  • LI YAOYAO

Assignees

  • 广州大事件网络科技有限公司

Dates

Publication Date
20260508
Application Date
20251225

Claims (10)

  1. A digital marketing big data processing method, characterized by comprising the following steps: S1, extracting channel distribution vectors and click event fields based on the identifier of each marketing participant and the training round number, dividing the spliced identifier into 8-bit segments, left-shifting the segments and merging them sequentially by exclusive-or (XOR), and taking the remainder of the merged result with respect to the total number of nodes to generate an encrypted communication index table; S2, based on the encrypted communication index table, adopting an isolation forest algorithm, converting advertisement bid values into fixed-point form according to the numerical precision, then executing modular multiplication and accumulation on each digit value, and performing a consistency check by comparing the resulting ciphertext sequence bit by bit with the digest bit sequence of each participant to generate a secure aggregation gradient set; S3, based on the secure aggregation gradient set, adopting a BitFit algorithm to compute the inner product of the row vectors and column vectors of the advertisement delivery weight submatrix, subtracting the feature vector and the atomic array element by element to obtain a difference vector, multiplying each element of the difference vector by a learning rate, and adding the result to the corresponding elements of the original submatrix to obtain a local adaptive weight matrix; S4, based on the local adaptive weight matrix, cutting the matrix transversely, extracting the submatrices, overwriting each submatrix with the average gradient matrix of the corresponding columns in the secure aggregation gradient set, and re-splicing all submatrices in the original order to generate a field identification value vector sequence; S5, based on the scheduling task allocation list, dividing the sequence into a plurality of position segments of fixed length, rearranging the sequence, extracting the index mapping relation of each position segment in the original sequence, and executing segment rearrangement and number updating in combination with the displacement difference of each segment to obtain a recombined bit sequence index table.
  2. The digital marketing big data processing method according to claim 1, wherein the encrypted communication index table comprises a participant identifier mapping value, a training round index value, an XOR merging result and a remainder index value; the secure aggregation gradient set comprises an encrypted gradient sequence, an accumulated ciphertext result, a digest bit sequence and a consistency check flag of each participant; the local adaptive weight matrix comprises an updated backward vector value, an updated column vector value, a difference accumulation result and a learning rate product matrix; the field identification value vector sequence comprises an overwritten submatrix segment, a splicing order index, a vectorized value array and a sequence length flag; and the recombined bit sequence index table comprises an original index map, a new index map, a displacement difference list and a number update map.
  3. The digital marketing big data processing method according to claim 1, wherein the specific steps of generating the encrypted communication index table are: based on the identifier of each marketing participant and the training round number, extracting channel distribution vectors and click event fields, locating and reading the channel identifiers and click magnitude values through the fields, aligning their order, splicing the field values, and dividing them into 8-bit segments along byte boundaries to generate an identifier string segment set; based on the identifier string segment set, performing a left-shift operation on each segment value, merging the results by XOR, and dividing the merged result by the number of nodes using integer division to obtain the remainder, thereby generating the encrypted communication index table.
  4. The digital marketing big data processing method according to claim 1, wherein the specific steps of generating the secure aggregation gradient set are: based on the encrypted communication index table, extracting each advertisement bid value, truncating the decimal places, multiplying by a fixed multiple to convert the value into an integer, verifying the numerical range of the integer sequence, and clearing overflow items to generate a fixed-point number list; based on the fixed-point number list, adopting an isolation forest algorithm to evaluate the anomaly score of each value in the list in real time, removing bid items whose scores exceed a threshold, extracting the peak value and the valley value respectively, multiplying them by a specified modulus, adding the products bitwise and adjusting the carry of the sum item by item to generate an accumulated ciphertext sequence; based on the accumulated ciphertext sequence, comparing each bit of the ciphertext sequence with the digest bit provided by the corresponding participant, marking each comparison result as consistent or conflicting, and performing statistical analysis to obtain the secure aggregation gradient set.
  5. The digital marketing big data processing method according to claim 4, wherein, in the isolation forest algorithm, each value of the advertisement bid fixed-point number list extracted via the encrypted communication index table is input into an isolation forest model, and an anomaly score is calculated from the path lengths in binary trees built from a plurality of random subsamples; each tree is built by randomly selecting a subsample from the original data set, and nodes are split by randomly selecting a feature and a random split threshold until the samples are completely isolated or a preset tree depth is reached; the anomaly score is derived from the difference between the average path length of a sample and the theoretical expected path length; all fixed-point values are judged according to their scores, samples whose scores exceed the set threshold are removed, and bid items judged to be normal are retained and enter the subsequent processing flow.
  6. The digital marketing big data processing method according to claim 1, wherein the specific steps of generating the local adaptive weight matrix are: based on the secure aggregation gradient set, reading the row vectors at the corresponding positions of the gradients and of the global weight matrix, matching and verifying the row lengths, and recording the valid index range to generate a gradient row vector set; based on the gradient row vector set, multiplying the corresponding elements of each pair of vectors one by one in index-aligned order, recording the product sequence, performing row-level accumulation on the product sequence, and storing the result in a temporary storage area to obtain a row vector update segment; based on the row vector update segment, adopting a BitFit algorithm to add each update value to the current column vector element, multiply the sum by a set scale factor, and merge the result into the corresponding region of the original matrix according to the column index, thereby obtaining the local adaptive weight matrix.
  7. The digital marketing big data processing method according to claim 6, wherein the BitFit algorithm selects, in the original global weight matrix, the set of column vector elements corresponding to the update segment index, adds each update value to the corresponding column vector element, immediately multiplies the sum by a preset scale factor, the factor being a fixed scalar, and then writes the result into the corresponding matrix region at the original column index position to complete the local fine-tuning update of the weight matrix.
  8. The digital marketing big data processing method according to claim 1, wherein the specific steps of generating the field identification value vector sequence are: based on the local adaptive weight matrix, first segmenting the matrix from left to right by column width, then sequentially extracting the corresponding column block from each segment and storing it in a list, and marking the start column index and end column index of each segment to generate a submatrix block list; based on the submatrix block list, replacing the values at the corresponding column positions of each segment with the same-column average values in the secure aggregation gradient set, and splicing the segments back into a complete matrix in the original column index order to generate the field identification value vector sequence.
  9. The digital marketing big data processing method according to claim 1, wherein the specific steps of generating the recombined bit sequence index table are: based on the scheduling task allocation list, dividing the list from the beginning into a plurality of segments of fixed length, adding a number identifier to each segment, and recording the index positions in the original list of the elements in each segment to generate a position segment mapping set; based on the position segment mapping set, rearranging the order of the elements in each segment according to the original index mapping result, updating the segment number mapping table, and outputting all segments combined by number to generate the recombined bit sequence index table.
  10. A digital marketing big data processing system, characterized in that the system is used to implement the digital marketing big data processing method of any one of claims 1-9, the system comprising: an identifier index construction module, configured to extract channel distribution vectors and click event fields based on the identifiers of the marketing participants and the training round number, splice them and cut them into 8-bit segments, left-shift the segments and merge them by XOR, and then take the remainder with respect to the total number of nodes to generate an encrypted communication index table; a bid anomaly judging module, configured to extract advertisement bid values based on the encrypted communication index table, truncate the decimal places and multiply by a fixed multiple to convert the values into integers, remove out-of-range items to form a fixed-point number list, use an isolation forest to obtain average path values from multi-tree splitting and calculate anomaly scores, remove abnormal values, multiply the selected peak and valley values by a specified modulus, perform bitwise addition and carry adjustment to obtain an accumulated ciphertext sequence, and perform bit-by-bit comparison statistics against the digest bit sequence to obtain a secure aggregation gradient set; a weight matrix adaptation module, configured to extract the row vectors of the global matrix based on the secure aggregation gradient set, compare lengths and record indexes, accumulate the element-wise products row by row to generate a row vector update segment, and use BitFit to update the extracted columns and write them to the corresponding positions of the matrix to obtain a local adaptive weight matrix; a matrix gradient covering module, configured to divide the local adaptive weight matrix by columns to obtain a submatrix block list, replace its values with the corresponding column averages of the secure aggregation gradient set, and splice the blocks according to the original index to form a field identification value vector sequence; and a sequence index rearrangement module, configured to form a position segment mapping set from the field identification value vector sequence and the scheduling task allocation list by numbering fixed-length segments and recording their indexes, and to output a recombined bit sequence index table by rearranging and combining according to the mapping.
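
The index construction of step S1 and claim 3 (split into 8-bit segments, left shift, XOR merge, remainder by the number of nodes) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the claims do not specify the splicing format, the shift amount, or the text encoding, so the `"|"` separator, the shift of `i % 8` bits, UTF-8 encoding, and the function names `index_slot` and `build_index_table` are all hypothetical. The channel distribution vector and click event fields are omitted for brevity; only the participant identifier and round number are spliced.

```python
def split_into_bytes(spliced: str) -> list[int]:
    """Divide the spliced identifier into 8-bit segments (UTF-8 bytes)."""
    return list(spliced.encode("utf-8"))


def index_slot(participant_id: str, round_no: int, node_count: int) -> int:
    """Map a (participant, round) pair to a slot in [0, node_count)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Assumed splicing format: identifier, separator, round number.
    spliced = f"{participant_id}|{round_no}"
    merged = 0
    for i, segment in enumerate(split_into_bytes(spliced)):
        shifted = segment << (i % 8)  # assumed position-dependent shift
        merged ^= shifted             # sequential XOR merge
    return merged % node_count        # remainder by total node count


def build_index_table(participants, round_no, node_count):
    """Build a participant -> slot table for one training round."""
    return {p: index_slot(p, round_no, node_count) for p in participants}


# Hypothetical participants "A" and "B", round 1, 7 nodes
table = build_index_table(["A", "B"], round_no=1, node_count=7)
```

Because the mapping depends only on the spliced identifier and the node count, every participant can recompute the same slot without exchanging the table.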

Description

Digital marketing big data processing method and system

Technical Field

The invention relates to the technical field of data processing, and in particular to a digital marketing big data processing method and system.

Background

The field of data processing aims to convert and use raw data in an orderly, efficient and structured way, supporting subsequent calculation, analysis, modeling and decision making. It addresses the processing of massive, multi-source, heterogeneous data after acquisition, improves the usability, consistency and computing performance of the data, makes the data analyzable and operable, and provides stable and accurate data support for upper-layer algorithms, application systems and business processes. A digital marketing big data processing method aims to achieve accurate customer identification, marketing path evaluation and delivery effect prediction. It processes large-scale raw data from marketing scenarios in a systematic and structured way, supports quantitative analysis and dynamic adjustment of marketing strategy, improves the accuracy of customer conversion rate identification, shortens data processing and response time, and enhances data-driven marketing automation.
In the prior art, multi-source marketing data is usually stored as flat tables joined together, without any mapping based on the total number of nodes or on bit operations, which leads to unbalanced distribution and inefficient lookup. Bid values are screened only by fixed ranges or simple deviation values, without measuring abnormality by how a sample behaves at depth across multiple random trees, so short-term fluctuations are misjudged and hidden anomalies are missed, which affects the reliability of subsequent data. Matrices are usually not updated and column-wise average replacement is not performed, so parameter discontinuities easily arise between result fragments. Task allocation follows a static table order without preserving the original index mapping or dynamically rearranging segments, so scheduling lacks flexible traceability, the execution order is difficult to optimize dynamically according to data differences, and the level of scheduling intelligence is further limited.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a digital marketing big data processing method and system.
To achieve this purpose, the invention adopts the following technical scheme. The digital marketing big data processing method comprises the following steps:
S1, extracting channel distribution vectors and click event fields based on the identifier of each marketing participant and the training round number, dividing the spliced identifier into 8-bit segments, left-shifting the segments and merging them sequentially by exclusive-or (XOR), and taking the remainder of the merged result with respect to the total number of nodes to generate an encrypted communication index table;
S2, based on the encrypted communication index table, adopting an isolation forest algorithm, converting advertisement bid values into fixed-point form according to the numerical precision, then executing modular multiplication and accumulation on each digit value, and performing a consistency check by comparing the resulting ciphertext sequence bit by bit with the digest bit sequence of each participant to generate a secure aggregation gradient set;
S3, based on the secure aggregation gradient set, adopting a BitFit algorithm to compute the inner product of the row vectors and column vectors of the advertisement delivery weight submatrix, subtracting the feature vector and the atomic array element by element to obtain a difference vector, multiplying each element of the difference vector by a learning rate, and adding the result to the corresponding elements of the original submatrix to obtain a local adaptive weight matrix;
S4, based on the local adaptive weight matrix, cutting the matrix transversely, extracting the submatrices, overwriting each submatrix with the average gradient matrix of the corresponding columns in the secure aggregation gradient set, and re-splicing all submatrices in the original order to generate a field identification value vector sequence;
S5, based on the scheduling task allocation list, dividing the sequence into a plurality of position segments of fixed length, rearranging the sequence, extracting the index mapping relation of each position segment in the original sequence, and executing segment rearrangement and number updating in combination with the displacement difference of each segment to obtain a recombined bit sequence index table.
As a further scheme of the invention, the encryption communication index table comprises a participant identification mapping value, a training round i