CN-120634624-B - Customer portrait construction method based on big data acquisition

Abstract

The invention relates to the technical field of customer portrait construction and discloses a customer portrait construction method based on big data acquisition. The method collects two core data sources, website click logs and transaction records, constructs a customer feature matrix, and applies missing-value processing and standardization. From the processed matrix it automatically evaluates feature saliency, generates a feature weight vector, and computes a customer portrait scalar as a weighted sum. Unsupervised clustering is then performed by density clustering with an adaptive threshold derived from the portrait scalars and the neighbor distance set. Finally, differentiated marketing management and resource allocation are carried out by combining portrait scores and cluster labels, and the clustering parameters are corrected in real time according to changes in the marketing conversion rate, giving closed-loop optimization from data acquisition to marketing execution.
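
As an illustration of the weighting step summarized in the abstract, the following Python sketch standardizes a small feature matrix, derives one saliency weight per feature with the standard entropy weight method, and forms each portrait scalar as a weighted sum. The extracted text does not reproduce the patent's formulas, so the population standard deviation, the shift of each column to non-negative values before computing probabilities, and all function names are assumptions rather than the patented procedure.

```python
import math


def zscore_columns(X):
    """Column-wise z-score of a list-of-rows matrix (population std, an assumption)."""
    n, m = len(X), len(X[0])
    out = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
            out[i][j] = (col[i] - mean) / std if std > 0 else 0.0
    return out


def entropy_weights(Z):
    """Entropy-weight-method saliency scores, one per column, summing to 1."""
    n, m = len(Z), len(Z[0])
    if n < 2:
        raise ValueError("at least two customers are needed to normalise the entropy")
    diversities = []
    for j in range(m):
        col = [row[j] for row in Z]
        lowest = min(col)
        shifted = [v - lowest for v in col]  # assumption: shift so every value is >= 0
        total = sum(shifted)
        if total == 0:
            diversities.append(0.0)  # constant column carries no information
            continue
        probs = [v / total for v in shifted]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(max(0.0, 1.0 - entropy))
    denom = sum(diversities)
    if denom == 0:
        return [1.0 / m] * m  # no column is informative: fall back to equal weights
    return [d / denom for d in diversities]


def portrait_scalars(Z, weights):
    """Weighted sum of the standardized features for each customer."""
    return [sum(w * z for w, z in zip(weights, row)) for row in Z]
```

A caller would chain the three functions: `Z = zscore_columns(X)`, then `w = entropy_weights(Z)`, then `portrait_scalars(Z, w)`. A column with identical values for every customer receives weight 0 under this sketch, because its entropy-based diversity is zero.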

Inventors

  • ZHOU YULIN

Assignees

  • 深圳市天禹数智科技有限公司

Dates

Publication Date
20260512
Application Date
20250716

Claims (7)

  1. A customer portrait construction method based on big data acquisition, characterized by comprising the following steps: collecting multi-source behavior data and transaction data containing unique customer identifiers to generate original data sets D_clicks and D_tx; extracting, based on the original data sets, numerical features related to customer behavior and constructing a customer feature matrix X; performing missing-value processing and normalization on the customer feature matrix and outputting a standardized matrix X'; calculating a saliency score s_j for each feature based on the standardized matrix and generating a feature saliency vector; calculating a portrait scalar p_i for each customer based on the standardized features and the saliency vector; clustering the customers with a density-based clustering algorithm, based on the customer portrait scalars and a calculated neighbor distance set, to generate a customer cluster label set {C_i}; and performing personalized, tiered marketing management based on the customer portrait scalars and the cluster label set, including resource allocation and content delivery strategy formulation, and adaptively correcting the clustering parameters based on the change in customer conversion rate after clustering; wherein collecting the multi-source behavior data and transaction data containing unique customer identifiers to generate the original data sets D_clicks and D_tx comprises: applying for permission to collect customer website click logs and CRM transaction records; setting a time interval for collecting the website click logs, recorded as the log collection interval; setting a daily time point for collecting the CRM transaction records, recorded as the record collection node; for the website click logs, exporting the log fields from the Web server at every log collection interval; for the CRM transaction records, exporting a standard table of the record fields from the CRM system at each record collection node; establishing an original click-log record set D_clicks and an original transaction record set D_tx; and writing each exported website click log record into D_clicks and each exported CRM transaction record into D_tx in a defined record format, the records in D_clicks and D_tx being, respectively, the website click logs and the CRM transaction records of each customer; wherein calculating the saliency score of each feature based on the standardized matrix and generating the feature saliency vector comprises, for each column of the standardized matrix X': calculating a probability distribution matrix; calculating the information entropy of the corresponding feature; and assigning the saliency score from the information entropies of all features, the number of features in the customer feature matrix being 5; the saliency scores of all columns of the standardized matrix X' being collected into the saliency vector; and wherein clustering the customers with the density-based clustering algorithm, based on the customer portrait scalars and the calculated neighbor distance set, to generate the customer cluster label set comprises: setting a neighbor rank for neighbor distance statistics; for each customer portrait point, calculating the distance to its nearest neighbor of that rank, the distances forming a set with one element per customer; constructing an adaptive threshold from the median of the distances, the interquartile range of the distances, and an initial empirical coefficient used to adjust the distance threshold in density clustering, the distance median being the element at the median position of the distance set sorted in ascending order, and the interquartile range being obtained from two elements selected at specified positions of the same sorted set; and invoking the density clustering technique with the adaptive threshold to generate the customer cluster label set {C_i}.
  2. The customer portrait construction method based on big data acquisition according to claim 1, wherein extracting, based on the original data sets, numerical features related to customer behavior and constructing the customer feature matrix X comprises: constructing, from D_clicks and D_tx, the following basic features for each customer: the number of website visits over the past 7 days; the total transaction amount over the past 30 days; the number of days since the last visit; the number of days since the last transaction; and the primary transaction channel code, where online = 1 and offline = 2; and assembling all customers into a matrix in which each row represents one customer and the row for a given customer holds these five features in order.
  3. The customer portrait construction method based on big data acquisition according to claim 1, wherein performing missing-value processing and normalization on the customer feature matrix and outputting the standardized matrix X' comprises: obtaining the number of missing values in each column of the customer feature matrix X, a missing value being a null or a special placeholder present in the column; calculating, for each column, the missing ratio as the number of missing values divided by the number of rows of the customer feature matrix; setting a decision threshold for determining whether a column has too many missing values; if the missing ratio of a column is within the decision threshold, filling its missing values with the median of the column's non-null values, the median being taken from the non-null values of the column sorted in ascending order; otherwise, determining that the column has too many missing values and deleting the column.
  4. The customer portrait construction method based on big data acquisition according to claim 3, wherein performing missing-value processing and normalization on the customer feature matrix and outputting the standardized matrix X' further comprises: calculating, for every column retained after the missing-value screening, the mean and the standard deviation of the feature; performing the normalization transformation on each entry of the customer feature matrix, an entry being the value of a given feature in a given record, using the mean and standard deviation of that feature; and outputting the normalized matrix X'.
  5. The customer portrait construction method based on big data acquisition according to claim 1, wherein calculating the portrait scalar of each customer based on the standardized features and the saliency vector comprises: weighting the standardized features by the saliency scores to obtain the portrait scalar of each customer, p_i = s_1 * x'_i1 + ... + s_m * x'_im, where m is the number of features in the customer feature matrix and equals 5, s_j is the saliency score of the j-th feature, and x'_ij is the standardized value of the i-th customer on the j-th feature.
  6. The customer portrait construction method based on big data acquisition according to claim 5, wherein performing personalized, tiered marketing management based on the customer portrait scalars and the cluster label set, including resource allocation and content delivery strategy formulation, and adaptively correcting the clustering parameters based on the change in customer conversion rate after clustering, comprises: dividing all customers into clusters according to their labels in the cluster label set, the customers sharing a label forming one cluster; calculating, for each cluster, the average portrait value as the sum of the portrait scalars of all customers in the cluster divided by the number of customers in the cluster; and allocating the promotion budget proportionally, the marketing budget allocated to the customers of a cluster being the total marketing budget available in the current marketing period multiplied by the cluster's average portrait value and divided by the sum of the average portrait values of all clusters.
  7. The customer portrait construction method based on big data acquisition according to claim 6, wherein performing personalized, tiered marketing management based on the customer portrait scalars and the cluster label set, including resource allocation and content delivery strategy formulation, and adaptively correcting the clustering parameters based on the change in customer conversion rate after clustering, further comprises: setting a behavior target that counts as a successful conversion; after each round of marketing delivery, recording the total number of customers in each cluster for whom the behavior target occurred as the number of successfully converted customers; calculating the conversion rate of each cluster from that number and the number of customers in the cluster; calculating the conversion rate difference between two adjacent marketing rounds from the conversion rate after the current round and the conversion rate after the previous round; setting a threshold for deciding whether to update the clustering threshold coefficient; if the conversion rate difference falls within that threshold, keeping the original clustering threshold coefficient unchanged; otherwise, updating the clustering threshold coefficient from the coefficient used in the current round, a step-size factor that controls the magnitude of each update, and a sign function of the conversion rate difference, a positive sign increasing the threshold and a negative sign decreasing it; reconstructing the adaptive threshold with the updated coefficient; and using the reconstructed threshold to cluster the customer portraits for the next round of marketing.
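
The adaptive threshold of claim 1 and its feedback correction in claim 7 can be sketched in Python as follows. The formulas are missing from the extracted claims, so the forms used here (threshold = median + coefficient * interquartile range, and a coefficient moved by one step in the direction of the conversion-rate change) are assumptions inferred from the surrounding wording, and all function and parameter names are hypothetical. Quartiles come from Python's `statistics.quantiles`, which may not match the sorted-position rule the claims describe.

```python
import statistics


def kth_neighbor_distances(portraits, k):
    """Distance from each 1-D portrait scalar to its k-th nearest neighbour."""
    if not 1 <= k < len(portraits):
        raise ValueError("k must satisfy 1 <= k < number of customers")
    distances = []
    for i, p in enumerate(portraits):
        others = sorted(abs(p - q) for j, q in enumerate(portraits) if j != i)
        distances.append(others[k - 1])
    return distances


def adaptive_threshold(distances, alpha):
    """Assumed form: median of the distances plus alpha times their IQR."""
    q1, _, q3 = statistics.quantiles(distances, n=4)
    return statistics.median(distances) + alpha * (q3 - q1)


def update_alpha(alpha, rate_now, rate_prev, tolerance, step):
    """Keep alpha for small conversion-rate changes, else step it by the sign of the change."""
    delta = rate_now - rate_prev
    if abs(delta) <= tolerance:
        return alpha  # change too small to justify re-tuning
    return alpha + step * (1 if delta > 0 else -1)
```

After each marketing round, a caller would pass the updated coefficient back into `adaptive_threshold` and hand the result to a density clustering routine as its distance threshold. Which clustering implementation is invoked, and with what minimum-points setting, is not recoverable from the extracted text.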

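The proportional budget allocation described in claim 6 can be illustrated with a short Python sketch: each cluster's share is its average portrait value divided by the sum of the averages over all clusters. The function name and the error handling are hypothetical. Because portrait scalars built from standardized features can be negative, the sketch rejects negative cluster means instead of guessing how the method treats them.

```python
from collections import defaultdict


def allocate_budget(portraits, labels, total_budget):
    """Split total_budget across clusters in proportion to their mean portrait value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p, label in zip(portraits, labels):
        sums[label] += p
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    denom = sum(means.values())
    # Assumption: a proportional split is only meaningful for non-negative means.
    if denom <= 0 or any(mu < 0 for mu in means.values()):
        raise ValueError("cluster means must be non-negative with a positive sum")
    return {label: total_budget * mu / denom for label, mu in means.items()}
```

For example, two clusters with mean portrait values 2 and 6 would split a budget of 100 into 25 and 75.
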
Description

Customer portrait construction method based on big data acquisition

Technical Field

The invention relates to the technical field of customer portrait construction, in particular to a customer portrait construction method based on big data acquisition.

Background

Customer portraits are a key supporting technology for precision marketing and customer relationship management in modern enterprises, and their construction depends on in-depth acquisition, analysis, and modeling of customer behavior data. With the development of digital platforms and online business, customers leave frequent behavior traces (visits, transactions, interactions) across multiple channels, so portrait data sources are increasingly rich and their dimensional structure increasingly complex. Although data volume is sufficient, existing customer portrait construction techniques still fall short at several levels, particularly in linkage with actual marketing strategy and in the system's capacity for adaptive optimization.

Common customer portrait techniques today mostly collect data offline from platform logs and build user portraits from manually set rules or predefined weight combinations. For feature selection, expert experience still dominates, with subjective screening by fixed weights or single-index ranking; some techniques introduce statistical measures such as information gain or mutual information to evaluate feature importance, but they cannot comprehensively measure the stability of a feature's global distribution. For cluster modeling, most schemes rely on classical algorithms such as K-Means or DBSCAN, usually with fixed parameter settings and no mechanism for adapting to fluctuations in the data distribution, so model construction is hard to keep stable for cold-start customers or short-term fluctuation scenarios. For portrait output, most systems only generate static labels or classification results for later use and lack an effective feedback mechanism tied to actual marketing delivery results. The invention therefore provides a full-process adaptive customer portrait construction method that requires no empirical weights and supports online feedback, in order to support intelligent enterprise marketing and the continuous optimization of precision marketing schemes.
Disclosure of Invention

The invention provides a customer portrait construction method based on big data acquisition that addresses the problems of the prior art described in the Background: data acquisition that depends on offline logs; excessive reliance on expert-preset rules or fixed weights; no comprehensive evaluation of global distribution stability; fixed algorithm parameters with no ability to adapt dynamically to data fluctuation; weak handling of cold-start and short-term fluctuation; and output limited to static labels, with no closed-loop optimization mechanism tied to actual marketing results.

The invention provides a customer portrait construction method based on big data acquisition, which comprises the following steps: collecting multi-source behavior data and transaction data containing unique customer identifiers, and generating original data sets D_clicks and D_tx; extracting, based on the original data sets, numerical features related to customer behavior and constructing a customer feature matrix X; performing missing-value processing and normalization on the customer feature matrix and outputting a standardized matrix X' = [x'_ij]; calculating a saliency score s_j for each feature based on the standardized matrix to generate a feature saliency vector; calculating a portrait scalar p_i for each customer based on the standardized features and the saliency vector; clustering the customers with a density-based clustering algorithm, based on the customer portrait scalars and a calculated neighbor distance set, to generate a customer cluster label set {C_i}; and performing personalized, tiered marketing management based on the customer portrait scalars and the cluster label set, including resource allocation and content delivery strategy formulation, and adaptively correcting the clustering parameters based on the change in customer conversion rate after clustering.

Optionally, collecting the multi-source behavior data and transaction data containing unique customer identifiers to generate the original data sets D_clicks and D_tx comprises: applying for permission to collect customer website click logs and CRM transaction records; setting a time interval for collecting the website click logs, recorded as the log collection interval; setting an ac