CN-122022890-A - Cigarette consumption demand prediction method and device, electronic equipment and product
Abstract
The invention discloses a cigarette consumption demand prediction method, a device, electronic equipment and a product, and relates to the technical field of cigarette consumption prediction. The method comprises: obtaining social consumption index system data and cigarette consumption data for a plurality of historical years; obtaining, through correlation analysis, a first correlation between each item of index data in the social consumption index system data and the cigarette consumption data; removing data items with low correlation to obtain screened social consumption index system data; analyzing a second correlation between each data dimension and the annual cigarette sales and cigarette single-box coefficients; determining, based on the second correlation, first data dimensions related to the annual sales and second data dimensions related to the single-box coefficients; training a prediction model with all index data of the first and second data dimensions as input and the annual sales and single-box coefficients as output; and feeding the corresponding index data from the social consumption index data of the years before the year to be predicted into the trained model to obtain cigarette consumption prediction data for that year. The invention can accurately predict the consumption demand of cigarettes.
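The screening step summarized above can be illustrated with a minimal sketch. It follows the refinement given in claim 2 (linear detrending, then Pearson correlation on the residuals). The function names detrend, pearson and screen_indicators, the default threshold of 0.5, and the use of the absolute value of the correlation are assumptions made for illustration; the source only states that items below a preset correlation threshold are removed.

```python
from math import sqrt


def detrend(series):
    """Return residuals of a least-squares straight-line fit over time index 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either series has zero variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)


def screen_indicators(indicators, consumption, threshold=0.5):
    """Keep indicators whose detrended |r| with the consumption series reaches the threshold.

    Assumption: the comparison uses the absolute value of r, so strongly
    negative relationships are also retained.
    """
    target = detrend(consumption)
    kept = {}
    for name, values in indicators.items():
        r = pearson(detrend(values), target)
        if abs(r) >= threshold:
            kept[name] = r
    return kept
```

Here indicators maps a hypothetical indicator name to its yearly values, and consumption is one yearly cigarette consumption series (for example, the annual sales of one price segment); in practice the screening would be repeated for each target series.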
Inventors
- WANG XIAOFENG
- YANG PENG
- Yang Muhao
- ZHANG XINYUN
- CHEN FENGXIA
Assignees
- 四川省烟草公司资阳市公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (10)
- 1. A method for predicting cigarette consumption demand, comprising: acquiring social consumption index system data and cigarette consumption data of a target area for a plurality of historical years, wherein the social consumption index system data comprises a plurality of data dimensions, each data dimension comprises at least one item of index data, the plurality of data dimensions comprise an industry structure dimension, an urban-rural structure dimension, an economic aggregate dimension, an employment and wage dimension, an age structure dimension, a population size and mobility dimension, a human capital dimension, an income level dimension, a consumption expenditure structure dimension and/or an overall consumption activity dimension, and the cigarette consumption data comprises the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment; performing correlation analysis on each item of index data in the social consumption index system data of the plurality of historical years and the cigarette consumption data of the plurality of historical years to obtain a first correlation between each item of index data and the cigarette consumption data; removing, from the social consumption index system data of the plurality of historical years, data items whose first correlation with the cigarette consumption data is lower than a preset correlation threshold, to obtain screened social consumption index system data of the plurality of historical years; performing correlation analysis on each data dimension in the screened social consumption index system data and on the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years, to obtain a second correlation between each data dimension in the screened social consumption index system data and the annual cigarette sales volume and the cigarette single-box coefficient; determining, from the screened social consumption index system data and based on the second correlation, a first data dimension associated with the annual cigarette sales volume and a second data dimension associated with the cigarette single-box coefficient; training a pre-established prediction model by taking all index data of the first data dimension and all index data of the second data dimension in the screened social consumption index system data of at least one year before any given year as sample input, and taking the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment in the cigarette consumption data of that given year as sample output, to obtain a trained prediction model; and inputting, into the trained prediction model, the index data that corresponds to the sample input and is taken from the social consumption index data of the target area for at least one year before the year to be predicted, to obtain cigarette consumption prediction data of the target area for the year to be predicted, wherein the cigarette consumption prediction data comprises the predicted annual cigarette sales volume of each price segment and the predicted cigarette single-box coefficient of each price segment.
- 2. The method of claim 1, wherein performing correlation analysis on each item of index data in the social consumption index system data of the plurality of historical years and the cigarette consumption data of the plurality of historical years to obtain the first correlation comprises: extracting a residual sequence of each item of index data in the social consumption index system data of the plurality of historical years by a residual extraction method based on linear detrending; and determining, through Pearson correlation analysis, the first correlation between each item of index data in the social consumption index system data and the cigarette consumption data, based on the residual sequences of the index data in the social consumption index system data of the plurality of historical years and the residual sequences of the index data in the cigarette consumption data of the plurality of historical years.
- 3. The method of claim 1, wherein performing correlation analysis on each data dimension in the screened social consumption index system data of the plurality of historical years and on the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years to obtain the second correlation comprises: standardizing each item of index data in the screened social consumption index system data of the plurality of historical years, and determining a composite score of each data dimension in the screened social consumption index system data based on the standardized index data; determining, through Pearson correlation analysis, a third correlation between each data dimension in the screened social consumption index system data and the annual cigarette sales volume and cigarette single-box coefficient of each price segment, based on the composite score of each data dimension and on the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years; and training a pre-established XGBoost regression model by taking the composite scores of the data dimensions in the screened social consumption index system data of the plurality of historical years as sample input and the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years as sample output, quantifying the contribution of each input feature in the trained XGBoost regression model using the game-theoretic SHAP method, and determining a fourth correlation between each data dimension in the screened social consumption index system data and the annual cigarette sales volume and the cigarette single-box coefficient; wherein the second correlation includes the third correlation and the fourth correlation.
- 4. The method of claim 1, wherein determining the composite score of each data dimension in the screened social consumption index system data of the plurality of historical years based on the standardized index data comprises: taking the arithmetic mean of the standardized data items that belong to the same data dimension, to obtain the composite score of each data dimension in the screened social consumption index system data of the plurality of historical years.
- 5. The method of claim 1, wherein the prediction model comprises a first prediction model and a second prediction model; correspondingly, inputting, into the trained prediction model, the index data that corresponds to the sample input and is taken from the social consumption index data of the target area for the year to be predicted, to obtain the cigarette consumption prediction data of the target area for the year to be predicted, comprises: inputting that index data into the trained first prediction model to obtain first cigarette consumption prediction data of the target area for the year to be predicted, wherein the first cigarette consumption prediction data comprises a first predicted annual cigarette sales volume of each price segment and a first predicted cigarette single-box coefficient of each price segment; inputting that index data into the trained second prediction model to obtain second cigarette consumption prediction data of the target area for the year to be predicted, wherein the second cigarette consumption prediction data comprises a second predicted annual cigarette sales volume of each price segment and a second predicted cigarette single-box coefficient of each price segment; and performing a weighted operation on the first cigarette consumption prediction data and the second cigarette consumption prediction data to obtain the cigarette consumption prediction data of the target area for the year to be predicted.
- 6. The method of claim 5, wherein the first prediction model is an XGBoost model and the second prediction model is a BiLSTM model.
- 7. The method of claim 1, wherein the index data in the industry structure dimension comprises the annual gross output value of each industry; the index data in the urban-rural structure dimension includes year-end urban population, year-end rural population, year-end non-agricultural population, and/or urbanization rate; the index data in the economic aggregate dimension comprises gross regional product, per capita GDP and/or added value of the private economy; the index data in the employment and wage dimension includes average wage of employees in non-private units, average wage of employees in all units, number of employed persons in society, number of employed persons in primary industry, number of employed persons in secondary industry, and/or number of employed persons in tertiary industry; the index data in the age structure dimension includes total population under 18 years old, total population aged 18-34, total population aged 35-59, and/or total population over 60 years old; the index data in the population size and mobility dimension includes year-end total population, year-end male population, year-end female population, births during the year, migrants during the year, and/or resident population; the index data in the human capital dimension includes number of students enrolled in colleges and universities, number of students enrolled in higher vocational and technical schools, number of students enrolled in secondary vocational schools, and/or number of students enrolled in general middle schools; the index data in the income level dimension includes an income of a city resident, an income of a city resident of a third industry, an income of a city resident of a city, an income of a city resident of a third industry, an income of a country resident of a third industry, an income of a country resident of a country industry, an income of a country resident, and/or an income of a country resident of a third industry; the index data in the consumption expenditure structure dimension comprises average cash expenditure of urban households, average cash expenditure of urban households on food, tobacco and alcohol, average cash expenditure of rural households and/or average cash expenditure of rural households on food, tobacco and alcohol; and the index data in the overall consumption activity dimension includes total retail sales of consumer goods, total urban retail sales of consumer goods, total rural retail sales of consumer goods, total retail sales of commodities, and/or total catering revenue.
- 8. A cigarette consumption demand prediction apparatus, comprising: an acquisition unit, a storage unit and a control unit, wherein the acquisition unit is configured to acquire social consumption index system data and cigarette consumption data of a target area for a plurality of historical years, the social consumption index system data comprises a plurality of data dimensions, each data dimension comprises at least one item of index data, the plurality of data dimensions comprise an industry structure dimension, an urban-rural structure dimension, an economic aggregate dimension, an employment and wage dimension, an age structure dimension, a population size and mobility dimension, a human capital dimension, an income level dimension, a consumption expenditure structure dimension and/or an overall consumption activity dimension, and the cigarette consumption data comprises the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment; a first analysis unit, configured to perform correlation analysis on each item of index data in the social consumption index system data of the plurality of historical years and the cigarette consumption data of the plurality of historical years to obtain a first correlation between each item of index data and the cigarette consumption data; a screening unit, configured to remove, from the social consumption index system data of the plurality of historical years, data items whose first correlation with the cigarette consumption data is lower than a preset correlation threshold, to obtain screened social consumption index system data of the plurality of historical years; a second analysis unit, configured to perform correlation analysis on each data dimension in the screened social consumption index system data and on the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years, to obtain a second correlation between each data dimension in the screened social consumption index system data and the annual cigarette sales volume and the cigarette single-box coefficient; a determining unit, configured to determine, from the screened social consumption index system data and based on the second correlation, a first data dimension related to the annual cigarette sales volume and a second data dimension related to the cigarette single-box coefficient; a model training unit, configured to train a pre-established prediction model by taking the index data of the first data dimension and the index data of the second data dimension in the screened social consumption index system data of at least one year before any given year as sample input, and taking the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment in the cigarette consumption data of that given year as sample output, to obtain a trained prediction model; and a prediction unit, configured to input, into the trained prediction model, the index data that corresponds to the sample input and is taken from the social consumption index data of the target area for at least one year before the year to be predicted, to obtain cigarette consumption prediction data of the target area for the year to be predicted, wherein the cigarette consumption prediction data comprises the predicted annual cigarette sales volume of each price segment and the predicted cigarette single-box coefficient of each price segment.
- 9. An electronic device, comprising a memory, a processor and a transceiver that are communicatively connected in sequence, wherein the memory is used for storing a computer program, the transceiver is used for sending and receiving messages, and the processor is used for reading the computer program and executing the cigarette consumption demand prediction method according to any one of claims 1-7.
- 10. A computer program product comprising a computer program or instructions which, when executed by a computer, implements the cigarette consumption demand prediction method of any one of claims 1 to 7.
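As a rough illustration of two of the steps recited in the claims above, the sketch below computes a per-dimension composite score (standardization followed by an arithmetic mean over the indicators of the same dimension, as in claims 3 and 4) and blends the outputs of two prediction models (the weighted operation of claim 5). The choice of z-score standardization, the constraint that the two weights sum to 1, and all function and variable names are assumptions for illustration; the source does not specify them.

```python
from math import sqrt


def zscore(values):
    """Standardize one indicator series to zero mean and unit (population) variance.

    Assumption: z-score is used as the standardization; the source only says
    the index data is standardized.
    """
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def dimension_scores(indicators, dimension_of):
    """Average the standardized indicators of each dimension, year by year."""
    grouped = {}
    for name, values in indicators.items():
        grouped.setdefault(dimension_of[name], []).append(zscore(values))
    return {
        dim: [sum(year_vals) / len(year_vals) for year_vals in zip(*series)]
        for dim, series in grouped.items()
    }


def weighted_forecast(first, second, weight_first=0.5):
    """Blend two models' predictions per price segment with weights w and 1 - w."""
    return {
        seg: weight_first * first[seg] + (1 - weight_first) * second[seg]
        for seg in first
    }
```

For example, weighted_forecast could combine the per-segment sales predicted by the XGBoost model and the BiLSTM model named in claim 6; how the weights are chosen is left open here, as it is in the claims.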
Description
Cigarette consumption demand prediction method and device, electronic equipment and product
Technical Field
The invention belongs to the technical field of cigarette consumption prediction, and particularly relates to a cigarette consumption demand prediction method, a device, electronic equipment and a product.
Background
Cigarettes are a special commodity with both fiscal and consumer attributes. Their market operation is constrained by industry-internal policies such as quotas and structural adjustment, and is also deeply embedded in the macro socio-economic environment of the region where they are sold. For a prefecture-level tobacco commercial enterprise, the ability to assess cigarette market capacity and the room for structural upgrading scientifically, under the dual targets of tax-and-profit growth and total volume control, has gradually become an important yardstick of its marketing management capability and high-quality development level. At present, cigarette consumption demand is mostly predicted within the industry from the industry's own historical cigarette sales data, without considering the influence of external factors, which makes it difficult to predict cigarette sales and single-box coefficients accurately. How to provide an effective scheme for accurately predicting cigarette sales and single-box coefficients has therefore become an urgent problem to be solved in the prior art.
Disclosure of Invention
The invention aims to provide a cigarette consumption demand prediction method, a device, electronic equipment and a product, which are used for solving the problems in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme: In a first aspect, the present invention provides a method for predicting cigarette consumption demand, comprising: acquiring social consumption index system data and cigarette consumption data of a target area for a plurality of historical years, wherein the social consumption index system data comprises a plurality of data dimensions, each data dimension comprises at least one item of index data, the plurality of data dimensions comprise an industry structure dimension, an urban-rural structure dimension, an economic aggregate dimension, an employment and wage dimension, an age structure dimension, a population size and mobility dimension, a human capital dimension, an income level dimension, a consumption expenditure structure dimension and/or an overall consumption activity dimension, and the cigarette consumption data comprises the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment; performing correlation analysis on each item of index data in the social consumption index system data of the plurality of historical years and the cigarette consumption data of the plurality of historical years to obtain a first correlation between each item of index data and the cigarette consumption data; removing, from the social consumption index system data of the plurality of historical years, data items whose first correlation with the cigarette consumption data is lower than a preset correlation threshold, to obtain screened social consumption index system data of the plurality of historical years; performing correlation analysis on each data dimension in the screened social consumption index system data and on the annual cigarette sales volume and cigarette single-box coefficient of each price segment in the cigarette consumption data of the plurality of historical years, to obtain a second correlation between each data dimension in the screened social consumption index system data and the annual cigarette sales volume and the cigarette single-box coefficient; determining, from the screened social consumption index system data and based on the second correlation, a first data dimension associated with the annual cigarette sales volume and a second data dimension associated with the cigarette single-box coefficient; taking all index data of the first data dimension and all index data of the second data dimension in the screened social consumption index system data of at least one year before any given year as sample input of a pre-established prediction model, and taking the annual cigarette sales volume of each price segment and the cigarette single-box coefficient of each price segment in the cigarette consumption data of that given year as sample output of the prediction model to tra