CN-121996914-A - Dynamic crowd sensing true value discovery method for resource optimization

Abstract

The invention relates to a resource-optimized dynamic crowd sensing true value discovery method, belonging to the field of crowd sensing and crowdsourced data management. The method comprises: obtaining an initial capability estimate for each user through an initial gold standard question (GSQ) task; ranking the users and assigning ordinary sensing tasks to them; iteratively computing over the sensing data of all users participating in the same ordinary sensing task to obtain the final true value estimate for that task; calculating a temporary capability index for each user; constructing a state vector and inputting it into a reinforcement learning model to obtain a decision on whether to issue a GSQ calibration task; assigning GSQ calibration tasks to users according to that decision; and generating a reward signal based on the task release cost to update the parameters of the reinforcement learning model and thereby optimize the decisions. The method addresses the technical problems that, in the prior art, the task allocation strategy cannot be adjusted dynamically and fixed-frequency GSQ release causes either wasted resources or insufficient calibration precision.

Inventors

FU XIAODONG
LI JINGCHENG
LIU CHI
PENG WEI
DING JIAMAN

Assignees

Kunming University of Science and Technology (昆明理工大学)

Dates

Publication Date: 20260508
Application Date: 20260107

Claims (7)

1. A resource-optimized dynamic crowd sensing true value discovery method, characterized in that the method comprises the following steps: Step1, calculating an initial capability estimate of a user according to the error between the result the user actually submits for an initial gold standard question (GSQ) task and the known true value of that task; Step2, sorting the users by initial capability estimate from high to low, and assigning ordinary sensing tasks to the users according to the sorting result, wherein each ordinary sensing task is completed jointly by a plurality of users; Step3, iteratively computing over the sensing data of all users participating in the same ordinary sensing task to obtain the final true value estimate of the current ordinary sensing task, and calculating a temporary capability index for each user according to the deviation between that user's sensing data and the final true value estimate; Step4, constructing a state vector based on the initial capability estimate and the temporary capability index of a user, and inputting the state vector into a reinforcement learning model to obtain the model's decision on a GSQ calibration task, so as to assign the GSQ calibration task to the user based on the decision result, wherein the reinforcement learning model comprises the state vector, an action space and a reward function; Step5, generating a reward signal based on the user capability estimate and the task release cost, and updating the parameters of the reinforcement learning model with the reward signal so that subsequent decisions are generated from the updated parameters.
2. The resource-optimized dynamic crowd sensing true value discovery method according to claim 1, wherein the initial capability estimate of the user is given by an expression [expression omitted in source] whose terms are: the initial capability estimate of the user; the error between the user's measured value and the true value of the calibration task; the preset upper limit of the error; and the lower threshold of the user capability estimate.
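
The expression in claim 2 did not survive in this text, so the following is only an illustrative sketch of how the four named quantities could fit together. It assumes a linear form in which the capability falls from 1 as the error approaches the preset upper limit and is floored at the lower threshold. The function name `initial_capability`, the linear shape, and the default values are assumptions, not the claimed formula.

```python
def initial_capability(error: float, error_max: float = 10.0, floor: float = 0.1) -> float:
    """Hypothetical initial capability estimate from a GSQ error.

    ASSUMPTION: linear decay 1 - |error| / error_max, floored at `floor`.
    The source names these quantities but does not show the expression.
    """
    if error_max <= 0:
        raise ValueError("error_max must be positive")
    score = 1.0 - abs(error) / error_max
    # Never let the estimate drop below the lower threshold.
    return max(floor, score)
```

Under these assumptions a user with zero error on the initial GSQ task receives the maximum estimate of 1, and any user whose error reaches or exceeds the upper limit receives the floor value.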
3. The resource-optimized dynamic crowd sensing true value discovery method according to claim 2, wherein Step3 specifically comprises: Step3.1, normalizing the initial capability estimates of the users participating in the same ordinary sensing task to obtain each user's initial weight [expression omitted in source], the terms of which are the initial weight of the user, the number of users participating in the same ordinary sensing task in the current round, and a user index variable that ranges from 1 to that number and refers to each participating user in turn; Step3.2, calculating a weighted average based on the initial weights as the first aggregation result [expression omitted in source], the terms of which are the initial aggregate estimate and the measured value of the user; Step3.3, performing iterations, wherein in each iteration the absolute error between each user's measured value and the aggregate estimate of the current iteration is calculated [expression omitted in source], and error weights are assigned according to the error magnitudes [expression omitted in source], the terms of which are the error weight obtained by the user in the current iteration and a constant; Step3.4, combining the error weight with the user capability estimate to obtain a comprehensive weight [expression omitted in source], and normalizing the comprehensive weights to obtain the weights for the next iteration [expression omitted in source]; Step3.5, calculating the aggregation result of the next iteration from the updated weights [expression omitted in source]; Step3.6, stopping the iteration when the maximum relative change of the weights is smaller than a preset iteration convergence threshold [expression omitted in source]; Step3.7, after the user completes the ordinary sensing task, obtaining a temporary capability index according to the error between the user's measured value and the final aggregation result [expression omitted in source], the terms of which are the temporary capability value of the user and the error between the user's measurement and the final aggregation result.
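
Steps 3.1 to 3.6 can be sketched as follows. The normalization, weighted average, absolute error and convergence test follow the wording of the claim. The two expressions that cannot be recovered from this text are filled with assumptions: the error weight is taken as `1 / (error + eps)`, with `eps` standing in for the claim's unnamed constant, and the comprehensive weight is taken as the product of error weight and capability. The function name `aggregate` and all default values are hypothetical.

```python
from typing import List, Sequence, Tuple


def aggregate(values: Sequence[float], capabilities: Sequence[float],
              eps: float = 1e-6, tol: float = 1e-4,
              max_iter: int = 100) -> Tuple[float, List[float]]:
    """Iterative capability- and error-weighted aggregation (illustrative sketch)."""
    if not values or len(values) != len(capabilities):
        raise ValueError("values and capabilities must be non-empty and equal in length")
    if any(c <= 0 for c in capabilities):
        raise ValueError("capabilities must be positive")

    # Step3.1: normalize capability estimates into initial weights.
    total = sum(capabilities)
    weights = [c / total for c in capabilities]
    # Step3.2: first aggregation result is the weighted average.
    estimate = sum(w * x for w, x in zip(weights, values))

    for _ in range(max_iter):
        # Step3.3: absolute error of each measurement against the current estimate.
        errors = [abs(x - estimate) for x in values]
        # ASSUMPTION: inverse-error weighting; eps avoids division by zero.
        error_weights = [1.0 / (e + eps) for e in errors]
        # Step3.4 (ASSUMPTION: product form), then normalize.
        combined = [ew * c for ew, c in zip(error_weights, capabilities)]
        norm = sum(combined)
        new_weights = [cw / norm for cw in combined]
        # Step3.5: new aggregation result from the updated weights.
        estimate = sum(w * x for w, x in zip(new_weights, values))
        # Step3.6: stop when the maximum relative weight change is below tol.
        change = max(abs(nw - w) / max(w, 1e-12)
                     for nw, w in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return estimate, weights
```

With three consistent measurements near 10 and one outlier at 50, this sketch pulls the estimate toward the consistent group and drives the outlier's weight close to zero. Step3.7 is not covered because the temporary capability expression is not available in this text.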
4. The resource-optimized dynamic crowd sensing true value discovery method according to claim 1, wherein the reinforcement learning model is specifically: a state vector S_i = [ability, interval, variance, direction_mae], wherein ability is the user's current capability estimate and reflects the quality with which the user completes tasks, interval is the number of ordinary sensing tasks the user has completed since last completing a gold standard question (GSQ) task, variance is the standard deviation of the user's most recent n capability estimates and reflects the uncertainty or fluctuation of the capability estimate, and direction_mae is the user's mean absolute error over the most recent n ordinary sensing tasks and reflects the error between the user's sensing data and the aggregated true values; an action space A = {issue_GSQ, not_issue}, wherein issue_GSQ denotes issuing a GSQ to the user to calibrate the user's capability; and a reward function reward = base_reward + gsq_total + stability_reward, wherein base_reward reflects whether the GSQ release decision improves the accuracy of the user capability estimate, gsq_total controls the effect and frequency of GSQ use so that each GSQ release is worthwhile, and stability_reward encourages stable and reliable capability estimates.
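
As a rough illustration of claim 4, the sketch below builds the four-component state vector from a user's history and sums the three reward terms. The field names `ability`, `interval`, `variance` and `direction_mae`, the action names and the reward decomposition come from the claim. The `UserHistory` container, the window default `n = 5`, the use of the population standard deviation, and the zero defaults for an empty history are assumptions. The reinforcement learning algorithm itself is not specified in this text and is not sketched.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import List

# Action space A = {issue_GSQ, not_issue} as named in the claim.
ACTIONS = ("issue_GSQ", "not_issue")


@dataclass
class UserHistory:
    """Hypothetical per-user record; not a structure defined by the source."""
    capabilities: List[float] = field(default_factory=list)  # capability estimates, oldest first
    abs_errors: List[float] = field(default_factory=list)    # |measurement - aggregated value| per ordinary task
    tasks_since_gsq: int = 0                                  # ordinary tasks since the last GSQ


def build_state(history: UserHistory, n: int = 5) -> List[float]:
    """Return [ability, interval, variance, direction_mae] for one user."""
    recent_caps = history.capabilities[-n:]
    recent_errs = history.abs_errors[-n:]
    ability = recent_caps[-1] if recent_caps else 0.0
    # The claim defines "variance" as a standard deviation of recent estimates.
    # ASSUMPTION: population standard deviation, 0.0 when no estimates exist.
    variance = pstdev(recent_caps) if recent_caps else 0.0
    direction_mae = mean(recent_errs) if recent_errs else 0.0
    return [ability, float(history.tasks_since_gsq), variance, direction_mae]


def total_reward(base_reward: float, gsq_total: float, stability_reward: float) -> float:
    """reward = base_reward + gsq_total + stability_reward, as stated in the claim."""
    return base_reward + gsq_total + stability_reward
```

A policy would map each state returned by `build_state` to one of the two entries in `ACTIONS`; how that mapping is learned is left open here.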
5. The resource-optimized dynamic crowd sensing true value discovery method according to claim 4, wherein base_reward is given by an expression [expression omitted in source] whose terms are: an adjustment threshold, used to determine whether the magnitude of the adjustment is within a predefined range; the natural constant e; the difference between the new capability estimate and the old capability estimate, which measures the magnitude of the adjustment; a positive reward coefficient, which adjusts the reward strength; a negative penalty coefficient, which adjusts the penalty strength; and two error decay coefficients, which control how sensitively the reward and the penalty, respectively, respond to changes in the adjustment magnitude.
6. The resource-optimized dynamic crowd sensing true value discovery method according to claim 4, wherein gsq_total is given by an expression [expression omitted in source] in terms of a GSQ effect reward and a GSQ frequency reward; the GSQ effect reward is given by an expression [expression omitted in source] whose terms are: a GSQ effectiveness reward coefficient and a GSQ ineffectiveness penalty coefficient, which control the reward and penalty strength for the GSQ calibration effect; an effective-calibration error threshold and an ineffective-calibration error threshold, which determine whether a GSQ is effective or ineffective; and the absolute difference between the new and old capability estimates; and the GSQ frequency reward is given by an expression [expression omitted in source] whose terms are: the cumulative number of GSQs issued to the user; two GSQ frequency segmentation thresholds; an early-use reward coefficient and an overuse penalty coefficient, which control the reward and penalty strength for the GSQ usage frequency; and an overuse penalty decay coefficient.
7. The resource-optimized dynamic crowd sensing true value discovery method according to claim 4, wherein stability_reward is given by an expression [expression omitted in source] whose terms are: the variance of the capability estimates, which measures the stability of the user's capability; a stability reward base coefficient, which sets the upper limit of the stability reward; and a stability decay coefficient, which controls how quickly the reward decays as the variance of the capability estimates increases.
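
The stability_reward expression is not shown in this text, but the claim describes a base coefficient that caps the reward and a decay coefficient that reduces it as the variance grows. One form consistent with that description, assumed here purely for illustration, is an exponential decay. The function name, the exponential shape and the default coefficients are assumptions.

```python
import math


def stability_reward(variance: float, base_coeff: float = 1.0, decay: float = 5.0) -> float:
    """Hypothetical stability reward: base_coeff * exp(-decay * variance).

    ASSUMPTION: exponential decay. The source only states that base_coeff
    bounds the reward and that decay sets how fast it falls with variance.
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    return base_coeff * math.exp(-decay * variance)
```

Under this assumed form the reward equals the base coefficient when the capability estimates do not fluctuate at all and shrinks toward zero as they become more volatile.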

Description

Dynamic crowd sensing true value discovery method for resource optimization

Technical Field

The invention relates to a resource-optimized dynamic crowd sensing true value discovery method, belonging to the field of crowd sensing and crowdsourced data management.

Background

The rapid development of the internet and mobile computing has driven the wide application of crowd sensing (Crowdsensing) and crowdsourcing (Crowdsourcing) in fields such as environmental monitoring, traffic management, business decision making and data labeling. Crowd sensing relies on a large number of users who collect and submit sensing data through smart devices to complete various tasks. However, because user capability is uneven, the sensing environment is noisy, and malicious attacks are possible, the collected data can be of low quality and poor reliability, which directly affects the accuracy of subsequent data analysis and decision making. To improve data quality, truth discovery techniques are widely used to estimate the true value of a task from the noisy answers of multiple users. Well-known truth discovery methods include simple averaging and weighted averaging. In addition, to calibrate user capability and reduce systematic errors, Ipeirotis and Gabrilovich (Panagiotis G. Ipeirotis and Evgeniy Gabrilovich. 2014. Quizz: Targeted crowdsourcing with a billion (potential) users. In Proceedings of the 23rd International Conference on World Wide Web (WWW'14). ACM, 143–154) proposed gold standard questions (Gold Standard Questions, GSQ) as benchmark tasks whose true values are known, so that user reliability can be assessed by comparing user answers with the true values. However, the prior art has the following significant drawbacks in both its GSQ release policy and its data aggregation algorithms, and struggles to meet the requirements of complex scenarios.

On the one hand, existing GSQ release lacks dynamic adaptability: GSQs are issued at a fixed frequency or at random, without adjusting to user state or system resource constraints. This rigid strategy tends to cause one of two problems. Either too many GSQs are issued, which increases system cost, or too few are issued, so that changes in user capability are not tracked in time, the user capability estimates become invalid, and the deviation of the inferred true values grows. It is therefore difficult to balance calibration precision against resource cost.

On the other hand, traditional truth discovery aggregation algorithms resist interference poorly: they rely on static weight assignment or simple outlier filtering and cannot cope effectively with high-noise environments or malicious users. When the data contain noise such as device measurement errors, static weights cannot dynamically reduce the influence of low-quality data, so the accuracy of true value inference drops sharply and robustness is insufficient.

Against the background of crowd sensing data quality control, the invention provides a resource-optimized dynamic crowd sensing true value discovery method. By intelligently deciding when and to whom a GSQ is issued, and combining this with an iterative weighted aggregation method based on user capability and data error, it achieves continuous estimation of user capability and accurate true value inference in environments with high noise and malicious attacks, and offers a new technique for addressing the shortcomings of existing methods in dynamic adaptability, resource efficiency and resistance to malicious attacks.

Disclosure of Invention

The invention aims to provide a resource-optimized dynamic crowd sensing true value discovery method that addresses the following problems in the prior art: the task allocation strategy is not adjusted dynamically, so GSQs are issued too often or too rarely and cannot follow changes in user capability in real time; traditional data aggregation methods resist interference poorly when facing malicious users and high-noise data, which harms the accuracy of true value inference; and fixed-frequency GSQ release causes wasted resources or insufficient calibration accuracy, making it difficult to balance system cost against the accuracy of task results.

To achieve this purpose, the technical scheme of the invention is a resource-optimized dynamic crowd sensing true value discovery method that uses reinforcement learning to release gold standard questions dynamically, thereby optimizing the release of calibration tasks, and combines this with robust aggregation to achieve efficient true value inference, improving both the inference precision of true value discovery and the efficiency of resource use. The method comprises the following steps: Step1, calculating an initial capability estimate of the user according to an error between