CN-122018326-A - Personnel perception speed self-adaptive control method


Abstract

The invention provides a personnel perception speed self-adaptive control method, belonging to the technical field at the intersection of intelligent adaptive control and personnel perception. The method comprises: S1, multi-modal full-dimensional data acquisition and spatio-temporal standardization preprocessing; S2, personnel perception feature manifold embedding and risk index generation; S3, generation of dynamic constraint boundaries and multi-objective weight priors; S4, constrained maximum entropy deep reinforcement learning for the initial speed decision; S5, calculation and feedback of a quaternary weighted dynamic reward function; S6, three-way interactive iterative optimization and optimal speed output; and S7, generation and execution of a smooth speed control curve. The method ultimately achieves multi-objective collaborative optimization of safety priority, energy saving and efficiency, ride comfort and extended equipment life, and advances the intelligent control of passenger conveying equipment toward higher levels of safety and intelligence.

Inventors

LAN YONG
ZHOU ZHONGHUA
ZHOU YIFAN
LUO HENG
LIU CHUANQI
PU YANGQIANG
TANG YAJUN

Assignees

四川经准特种设备检验有限公司
重庆市特种设备检测研究院(重庆市特种设备事故应急调查处理中心)

Dates

Publication Date: 20260512
Application Date: 20260409

Claims (10)

1. A personnel-aware speed adaptive control method, comprising:
S1, acquiring personnel perception data, equipment operation data and environmental working-condition data of target conveying equipment, and executing spatio-temporal alignment processing and standardized preprocessing to obtain a time-synchronized, standardized multi-source data set;
S2, performing feature extraction and dimension reduction on the standardized multi-source data set by adopting a manifold regularized non-negative matrix factorization algorithm to generate a low-dimensional state vector and a personnel risk quantification index set;
S3, determining a personnel risk level based on the personnel risk quantification index set, and dynamically generating a hard constraint interval for the speed decision, a speed change rate constraint threshold and multi-objective weight prior parameters according to the personnel risk level and the inherent rated parameters of the target conveying equipment;
S4, constructing a maximum entropy deep reinforcement learning decision model fused with manifold regularization constraints, inputting the low-dimensional state vector, the hard constraint interval and the speed change rate constraint threshold into the maximum entropy deep reinforcement learning decision model, and outputting an initial optimal target running speed within the constrained feasible domain;
S5, based on the multi-objective weight prior parameters, and combining the personnel perception data, equipment operation data, environmental working-condition data and the initial optimal target running speed, constructing a quaternary weighted reward function covering safety, energy saving, riding experience and equipment protection, and calculating a real-time total reward value;
S6, adopting a Gaussian process regression anomaly detection algorithm, completing full-working-condition anomaly detection based on the low-dimensional state vector and real-time operation monitoring data to generate an anomaly detection result; combining the real-time total reward value, the anomaly detection result and the hard constraint interval, executing two-way interactive iterative updating and parameter optimization on the manifold regularized non-negative matrix factorization algorithm, the maximum entropy deep reinforcement learning decision model and the Gaussian process regression anomaly detection algorithm; and outputting an optimized optimal target running speed;
S7, generating a smooth speed control curve based on the optimized optimal target running speed and the speed change rate constraint threshold, and issuing a speed control instruction to a controller of the target conveying equipment to complete speed control.
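Step S7 of claim 1 can be illustrated with a rate-limited speed ramp. This is a minimal sketch only: the claim does not specify the curve shape, so the linear ramp, the function name `speed_ramp` and the parameters `max_rate` (speed change rate constraint threshold) and `dt` (control period) are assumptions made for illustration.

```python
import math


def speed_ramp(current, target, max_rate, dt):
    """Return setpoints from current to target whose per-step change
    never exceeds max_rate * dt (assumed reading of the S7 threshold)."""
    if max_rate <= 0 or dt <= 0:
        raise ValueError("max_rate and dt must be positive")
    step = max_rate * dt
    # Number of equal steps needed so that each step is at most `step`.
    n = math.ceil(abs(target - current) / step)
    if n == 0:
        return [current]
    points = [current + (target - current) * i / n for i in range(n + 1)]
    points[-1] = target  # avoid floating-point drift on the final setpoint
    return points
```

Each returned setpoint would be issued to the equipment controller once per control period; a jerk-limited (S-shaped) profile could replace the linear ramp without changing the interface.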
2. The personnel-aware speed adaptive control method, characterized in that the spatio-temporal alignment processing proceeds as follows: taking the unified clock of the acquisition system as the reference, millisecond-level timestamps are added to the personnel perception data, equipment operation data and environmental working-condition data respectively; the three types of data under the same timestamp are matched and aligned one by one; invalid data whose time deviation exceeds 2 times the acquisition period are removed; and missing data are completed by linear interpolation using the mean of historical operation data for the same equipment type and the same working condition. The standardized preprocessing adopts min-max normalization, so that all data are uniformly mapped to the interval 0-1 and differences in units and scale among the data dimensions are eliminated.
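The alignment and normalization in claim 2 can be sketched as follows. This is an illustrative sketch under assumptions: nearest-timestamp matching, the `(timestamp_ms, value)` tuple layout and all function names are hypothetical, and missing values are filled directly with a supplied historical mean rather than by a full interpolation scheme.

```python
def align_streams(person, equip, env, period_ms):
    """Match three timestamped streams of (timestamp_ms, value) samples.

    For each personnel sample, the nearest equipment and environment
    samples are taken; rows whose time deviation exceeds 2 x the
    acquisition period are discarded as invalid.
    """
    aligned = []
    for t, p in person:
        e = min(equip, key=lambda s: abs(s[0] - t))
        v = min(env, key=lambda s: abs(s[0] - t))
        if max(abs(e[0] - t), abs(v[0] - t)) > 2 * period_ms:
            continue
        aligned.append((t, p, e[1], v[1]))
    return aligned


def fill_missing(values, historical_mean):
    """Replace missing (None) entries with a historical mean value."""
    return [historical_mean if v is None else v for v in values]


def min_max(values):
    """Map values to the 0-1 interval; a constant series maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In practice the normalization bounds would more likely come from fixed equipment ranges than from each batch, so that values remain comparable over time.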
3. The personnel-aware speed self-adaptive control method according to claim 1, wherein, when the manifold regularized non-negative matrix factorization algorithm is applied, the decision-related feature constraints fed back by S6 are received synchronously; the decision-related feature constraints are derived from the policy gradient after the maximum entropy deep reinforcement learning decision model is iteratively updated in S6; the manifold regularization weight coefficient of the manifold regularized non-negative matrix factorization algorithm is adjusted according to the decision-related feature constraints; and the basis matrix and coefficient matrix of the algorithm are jointly fine-tuned through multiplicative iterative update rules, so that the generated low-dimensional state vector preferentially retains core personnel risk features and features strongly related to the speed decision, improving the specificity of the feature representation.
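For orientation, the multiplicative updates mentioned in claim 3 can be written out for the standard graph-regularized NMF formulation. This is an assumption: the claims do not give the objective, and the symbols below ($X$, $U$, $V$, $W$, $D$, $\lambda$) are introduced here for illustration only. $X \in \mathbb{R}^{m \times n}$ holds $n$ standardized samples as columns, $U$ is the basis matrix, row $j$ of $V$ is the low-dimensional state vector of sample $j$, $W$ is a neighbor affinity matrix, $D$ its diagonal degree matrix, and $\lambda$ the manifold regularization weight that the claim says is adjusted by the fed-back constraints.

```latex
\min_{U \ge 0,\; V \ge 0}\;
  \lVert X - U V^{\top} \rVert_F^{2}
  + \lambda \,\operatorname{Tr}\!\left(V^{\top} L V\right),
  \qquad L = D - W

u_{ik} \leftarrow u_{ik}\,
  \frac{\left(X V\right)_{ik}}{\left(U V^{\top} V\right)_{ik}},
\qquad
v_{jk} \leftarrow v_{jk}\,
  \frac{\left(X^{\top} U + \lambda W V\right)_{jk}}
       {\left(V U^{\top} U + \lambda D V\right)_{jk}}
```

Because every factor in the updates is non-negative, $U$ and $V$ stay non-negative throughout; how the fed-back constraints map to a new value of $\lambda$ is not specified in the claim and is left open here.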
4. The personnel-aware speed self-adaptive control method according to claim 1, wherein, when the personnel risk quantification index set is generated, four core indexes are extracted from the passenger age, action state, safety protection compliance and adjacent spacing data in the personnel perception data acquired in S1: a high-risk passenger ratio, a passenger average safety spacing deviation rate, a safety protection device compliance rate and a passenger unstable-state ratio; the high-risk passenger ratio is the sum of the proportions of elderly passengers, child passengers and passengers with limited mobility; and the four core indexes are mapped to the interval 0-1 through min-max normalization, a higher value representing a higher corresponding risk.
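Two of the four indexes in claim 4 can be computed directly from per-passenger labels, as the sketch below shows. The `Passenger` record, its field names and the clipping of the summed ratio to 1 are assumptions for illustration; the spacing deviation rate and compliance rate are omitted because their formulas are not given in the claim.

```python
from dataclasses import dataclass


@dataclass
class Passenger:
    age_group: str          # "elderly", "child" or "adult" (hypothetical labels)
    limited_mobility: bool  # passenger with limited mobility
    unstable: bool          # passenger currently in an unstable posture


def high_risk_ratio(passengers):
    """Sum of the elderly, child and limited-mobility proportions.

    A passenger in two categories is counted twice, as in a literal sum
    of proportions, so the result is clipped to 1 (assumption).
    """
    n = len(passengers)
    if n == 0:
        return 0.0
    elderly = sum(p.age_group == "elderly" for p in passengers)
    children = sum(p.age_group == "child" for p in passengers)
    limited = sum(p.limited_mobility for p in passengers)
    return min(1.0, (elderly + children + limited) / n)


def unstable_ratio(passengers):
    """Share of passengers currently detected in an unstable state."""
    n = len(passengers)
    return sum(p.unstable for p in passengers) / n if n else 0.0
```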
5. The personnel-aware speed self-adaptive control method according to claim 1, characterized in that an entropy regularization term and a constraint violation penalty term are both introduced into the policy optimization objective function of the maximum entropy deep reinforcement learning decision model; the entropy regularization term maintains the exploration capability of the model by calculating the entropy of the policy distribution, preventing the policy from being trapped in a local optimum; the constraint violation penalty term applies a linear penalty to candidate speed actions that exceed the hard constraint interval or the speed change rate constraint threshold generated in S3, with a penalty coefficient positively correlated with the degree of constraint violation; and the output initial optimal target running speed is thereby ensured to strictly satisfy the constraint requirements.
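One way to write the objective described in claim 5 is shown below. It is a sketch under assumptions: the claim gives no formula, and the symbols $\alpha$ (entropy weight), $\kappa$ (penalty coefficient), $d_t$ (violation magnitude), $v_{\min}$, $v_{\max}$ (hard constraint interval), $\Delta_{\max}$ (speed change rate threshold) and $v_t$ (current speed) are introduced here for illustration.

```latex
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t} \gamma^{t}
  \Bigl( r_t + \alpha\, \mathcal{H}\bigl(\pi(\cdot \mid s_t)\bigr)
         - \kappa(d_t)\, d_t \Bigr)\right]

\mathcal{H}\bigl(\pi(\cdot \mid s)\bigr)
  = -\sum_{a} \pi(a \mid s) \log \pi(a \mid s)

d_t = \max(0,\, v_{\min} - a_t) + \max(0,\, a_t - v_{\max})
    + \max\bigl(0,\, \lvert a_t - v_t \rvert - \Delta_{\max}\bigr)
```

Here $a_t$ is the candidate speed and $\kappa(\cdot)$ is any positive, non-decreasing function of the violation. A constant $\kappa$ gives a strictly linear penalty, while a growing $\kappa$ matches the statement that the coefficient rises with the degree of violation. For a continuous action space the sum in the entropy becomes an integral.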
6. The personnel-aware speed self-adaptive control method according to claim 1, wherein, after the maximum entropy deep reinforcement learning decision model outputs the initial optimal target running speed, a constraint verification process is additionally executed: the current actual running speed of the target conveying equipment is obtained from the equipment operation data acquired in S1; the difference between the initial optimal target running speed and the current actual running speed is calculated; and if the difference exceeds the speed change rate constraint threshold generated in S3, the speed with the next-best Q value among all candidate speed actions that satisfy the speed change rate constraint threshold is selected as the adjusted initial optimal target running speed, ensuring that the speed adjustment process produces no jerk or shock.
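The constraint verification in claim 6 amounts to a filtered arg-max over candidate speeds. The sketch below assumes a discrete candidate set given as `(speed, q_value)` pairs and holds the current speed when no candidate is feasible; both the data layout and that fallback are assumptions not stated in the claim.

```python
def verify_speed(candidates, current_speed, max_change):
    """Pick a target speed that respects the speed change threshold.

    candidates: list of (speed, q_value) pairs from the decision model.
    Returns the top-Q speed if it is within max_change of the current
    speed, otherwise the highest-Q speed among the feasible candidates.
    """
    best_speed, _ = max(candidates, key=lambda c: c[1])
    if abs(best_speed - current_speed) <= max_change:
        return best_speed
    feasible = [c for c in candidates
                if abs(c[0] - current_speed) <= max_change]
    if not feasible:
        # Assumed fallback: keep the current speed when nothing is feasible.
        return current_speed
    return max(feasible, key=lambda c: c[1])[0]
```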
7. The personnel-aware speed self-adaptive control method according to claim 1, wherein the weights of the quaternary weighted reward function are dynamically adjusted as follows: if the high-risk passenger ratio in the personnel risk quantification index set generated in S2 exceeds 40%, or the environmental working-condition data collected in S1 show bad weather and/or power grid voltage fluctuation beyond a preset range, the combined share of the safety reward weight and the equipment protection reward weight is raised to not less than 70%; if the equipment operation data collected in S1 show that the wear coefficient of the core components is below 0.2 and the working conditions are stable, the energy-saving reward weight and the riding experience reward weight are each increased by 5%-10%; and after all weights are adjusted, they are normalized so that their sum remains 1.
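The weight rules in claim 7 can be sketched as below. Several details are assumptions: the 5%-10% increase is read as an absolute addition (`boost`) before renormalization, the 70% floor is enforced last so that it holds even when both rules fire, and the dictionary keys and function name are hypothetical.

```python
def adjust_weights(weights, high_risk_ratio, adverse_conditions,
                   wear_coefficient, stable, boost=0.05, floor=0.70):
    """Return adjusted reward weights that sum to 1.

    weights: dict with keys "safety", "energy", "experience", "equipment".
    """
    w = dict(weights)
    # Low wear and stable conditions: favor energy saving and ride experience.
    if wear_coefficient < 0.2 and stable:
        w["energy"] += boost
        w["experience"] += boost
    total = sum(w.values())
    w = {k: v / total for k, v in w.items()}
    # High-risk passengers or adverse conditions: enforce the 70% floor.
    if high_risk_ratio > 0.40 or adverse_conditions:
        protect = w["safety"] + w["equipment"]
        if protect < floor:
            if protect == 0:
                w["safety"] = w["equipment"] = floor / 2
            else:
                w["safety"] *= floor / protect
                w["equipment"] *= floor / protect
            rest = w["energy"] + w["experience"]
            if rest > 0:
                w["energy"] *= (1 - floor) / rest
                w["experience"] *= (1 - floor) / rest
    return w
```

For example, starting from four equal weights of 0.25 with a high-risk ratio of 0.5, the safety and equipment weights rise to 0.35 each and the other two fall to 0.15 each.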
8. The personnel-aware speed self-adaptive control method according to claim 1, wherein, when the Gaussian process regression anomaly detection algorithm runs, the low-dimensional state vector output by S2 is used as the core input feature and the initial optimal target running speed output by S4 is received at the same time; a speed ratio is calculated from the initial optimal target running speed and the rated speed in the inherent rated parameters of the target conveying equipment; and the anomaly detection threshold is dynamically adjusted according to the speed ratio, such that the higher the speed ratio, the lower the anomaly detection threshold.
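The speed-dependent threshold in claim 8 can be illustrated with a simple linear schedule. The sketch assumes the threshold is expressed in predictive standard deviations of the Gaussian process and that it falls linearly from `base` to `floor` as the speed ratio goes from 0 to 1; these values, the linear form and the function names are hypothetical, and the Gaussian process itself is taken as given (its predictive mean and standard deviation are inputs).

```python
def anomaly_threshold(target_speed, rated_speed, base=3.0, floor=1.5):
    """Threshold (in standard deviations) that decreases as speed rises."""
    if rated_speed <= 0:
        raise ValueError("rated_speed must be positive")
    ratio = min(max(target_speed / rated_speed, 0.0), 1.0)
    return base - (base - floor) * ratio


def is_anomalous(observed, pred_mean, pred_std, threshold):
    """Flag an observation that deviates from the GP prediction by more
    than `threshold` predictive standard deviations."""
    return abs(observed - pred_mean) > threshold * pred_std
```

With this schedule the same deviation can be tolerated at low speed but flagged at rated speed, which is the behavior the claim describes.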
9. The personnel-aware speed self-adaptive control method, characterized by comprising: first, taking the real-time total reward value calculated in S5 as the core feedback signal for updating the parameters of the maximum entropy deep reinforcement learning decision model, taking the anomaly detection result as a hard constraint on policy adjustment, and updating the network parameters of the decision model through a proximal policy optimization algorithm; second, deriving optimal feature constraints from the policy gradient of the updated maximum entropy deep reinforcement learning decision model, feeding the optimal feature constraints back to S2, adjusting the objective function of the manifold regularized non-negative matrix factorization algorithm and optimizing the feature extraction direction; and third, adding the abnormal sample data marked by the anomaly detection result to the training set of the Gaussian process regression anomaly detection algorithm, and completing incremental optimization of the algorithm hyperparameters through maximum likelihood estimation, improving the detection accuracy for similar abnormal conditions.
10. The personnel-aware speed adaptive control method according to claim 1, wherein, for different types of target conveying equipment, only the inherent rated parameters and constraint threshold references of the corresponding equipment need to be adapted, and the core architectures of the manifold regularized non-negative matrix factorization algorithm, the maximum entropy deep reinforcement learning decision model and the Gaussian process regression anomaly detection algorithm need not be rebuilt, so that the method can be applied to various types of target conveying equipment, wherein the inherent rated parameters comprise rated speed, rated acceleration, rated load and core component tolerance thresholds.

Description

Personnel perception speed self-adaptive control method

Technical Field

The invention relates to the technical field at the intersection of intelligent adaptive control and personnel perception, and in particular to a personnel perception speed self-adaptive control method.

Background

Conveying equipment that carries people is a core facility in crowded settings such as public transportation, commercial complexes, scenic spots, transportation hubs, and industrial and mining areas. Whether its running speed is appropriate directly determines the operating safety of the equipment, its energy efficiency, the riding experience of passengers and the service life of core components. Current speed control technology for passenger conveying equipment still has several pain points and technical defects common across the industry:

The speed regulation mode is rigid and single, and the depth and dimensionality of personnel perception are seriously insufficient. Most passenger conveying equipment runs at a fixed rated speed and cannot adapt to dynamic changes in passenger load and passenger composition: passenger flow peaks readily cause crowding and insufficient evacuation efficiency, while passenger flow troughs cause serious energy waste and needless equipment wear. The small number of devices that do have a basic speed regulation function use only the real-time passenger count and load weight as the basis for regulation. They do not deeply integrate fine-grained personnel perception features such as passenger age, mobility, safety spacing and behavior state, cannot adapt the speed for high-risk passenger groups, and therefore have insufficient safety redundancy and poor speed regulation precision and scene adaptability.

Feature processing is disconnected from decision-making, and the multiple algorithms run in isolation without cooperation.
In the prior art, modules such as personnel perception data processing, speed decision, anomaly detection and safety control are mostly designed separately and run independently. No deep linkage and interaction mechanism is built, which creates serious data silos. The feature extraction stage is not tightly bound to the downstream decision objective, so key features related to personnel safety are easily lost during dimension reduction. Anomaly detection serves only as an after-the-fact alarm, and its results are not fed back into the decision model for optimization, so a full closed loop of perception, feature, decision, execution, feedback and optimization cannot be formed.

The control model lacks multi-objective coordination capability and easily falls into local optima. Existing speed control logic is mostly biased toward a single optimization objective: either it one-sidedly pursues maximum energy saving and keeps reducing speed, so that passengers stay on the equipment too long and the experience is poor, or it over-pursues carrying efficiency and runs at high speed for long periods, ignoring the safety risk to high-risk passengers and the wear of core equipment components. At the same time, conventional reinforcement learning control models suffer from an imbalance between exploration and exploitation, tend to converge to locally optimal policies, and have insufficient generalization capability and robustness under complex working conditions.

Disclosure of Invention

The invention provides a personnel perception speed self-adaptive control method. Through deep representation of core personnel perception features, deep fusion of intelligent algorithms and the construction of a full-process closed-loop interaction mechanism, the method adaptively matches the running speed of conveying equipment to personnel characteristics, equipment states and environmental working conditions in all scenarios. It ultimately achieves multi-objective collaborative optimization of safety priority, energy saving and efficiency, ride comfort and extended equipment life, and advances the intelligent control of passenger conveying equipment toward higher levels of safety and intelligence.

A personnel perception speed self-adaptive control method comprises the following steps: S1, acquiring personnel perception data, equipment operation data and environmental working-condition data of target conveying equipment, and executing spatio-temporal alignment processing and standardized preprocessing to obtain a time-synchronized, standardized multi-source data set; S2, performing feature extraction and dimension reduction on the standardized multi-source data set by adopting a manifold regularized non-negative matrix factorization algorithm to generate a low-dimensional state vector and a