CN-121973251-A - Intelligent daylily picking robot control system

Abstract

The invention discloses an intelligent daylily picking robot control system comprising a field picking data acquisition module, a picking robot motion regulation module, a sample library construction module, an integrated state evolution model design module, a self-adaptive picking strategy learning module and a picking action execution module. The invention belongs to the field of intelligent control. The picking robot motion regulation module applies differentiated, dynamic speed regulation to each motion direction, avoiding at the source the plant damage caused by pose deviation. Simulated and real samples are paired and calibrated on the basis of a self-adaptive line order design. The instant reward is constructed from maturity-damage, pose deviation guidance and equipment-plant protection terms, which suits a range of uncertain picking scenarios. The predicted instant reward and the predicted agronomic risk value serve as dual trigger conditions for re-decision, covering the control decision scenarios of the picking robot and thereby improving the control effect of the daylily picking robot.

Inventors

LI XINYI
HUO RAN
HAO WANGLI
WANG HAN
WANG XIAOYING
SUN GAOYUAN
LIU YONGQIANG
JI LINYA
JIA XICHAO
WU YUXIN

Assignees

Shanxi Agricultural University (山西农业大学)

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (8)

1. An intelligent daylily picking robot control system, characterized by comprising a field picking data acquisition module, a picking robot motion regulation module, a sample library construction module, an integrated state evolution model design module, a self-adaptive picking strategy learning module and a picking action execution module; the field picking data acquisition module acquires historical robot operation data to form an observation space; the picking robot motion regulation module determines the speed threshold of each motion direction through a continuous control coefficient, so that each motion direction of the picking robot can be regulated independently as required; the sample library construction module generates simulation samples by adaptively adjusting the picking line order based on the observation space, and constructs a high-quality sample library; the integrated state evolution model design module uses the high-quality sample library as training data, adopts maximum expected likelihood as the training target, fits the transition rule of state-action-next state and instant reward in the picking scene, and constructs a probabilistic integrated state evolution model; the self-adaptive picking strategy learning module fuses real-time observation data with the prediction result of the integrated state evolution model and, with the goal of maximizing the long-term cumulative reward, realizes self-adaptive picking strategy learning; and the picking action execution module executes the picking action for the real-time observation state.
2. The intelligent daylily picking robot control system according to claim 1, wherein the picking robot motion regulation module specifically comprises: calculating the speed threshold of each motion direction by configuring a rated maximum speed for each motion direction and using an independent continuous control coefficient as the speed regulation weight; and dynamically updating the continuous control coefficient based on the joint pose deviation in each motion direction of the robot.
3. The intelligent daylily picking robot control system according to claim 2, wherein the sample library construction module specifically comprises: adopting a self-adaptive line sequence order design to dynamically adjust the action steps of a single picking line sequence; generating a real-simulation dual sample structure through line sequence calibration, and generating a standardized simulation sample library; setting sample discrimination threshold conditions and screening the simulation samples; decomposing the Q-value deviation into partial terms and quantifying the influence of the state transition probability difference on future value; designing the instant reward; and constructing the high-quality sample library by combining the simulation samples that meet the sample discrimination threshold conditions with the real samples.
4. The intelligent daylily picking robot control system according to claim 3, wherein the instant reward is designed as a picking instant reward comprising a maturity-damage reward, a pose deviation guidance reward and an equipment-plant protection reward; the maturity-damage reward treats only an optimally ripe and undamaged flower as optimal picking, with rewards and penalties for all other cases set in decreasing order of value; the pose deviation guidance reward is quantified by the direction of maximum deviation; and the equipment-plant protection reward is designed around the core hardware protection indices of the robot and the clustered growth characteristics of daylily.
5. The intelligent daylily picking robot control system according to claim 4, wherein the integrated state evolution model design module specifically comprises: designing the integrated state evolution model to fit the continuous transition rule of state-action-next state plus instant reward in the daylily picking scene; and constructing the model training target using the maximum expected likelihood criterion.
6. The intelligent daylily picking robot control system according to claim 5, wherein the self-adaptive picking strategy learning module specifically comprises: constructing an end-to-end strategy network that takes the observation state as input and simultaneously outputs the picking action instruction distribution and the continuous control coefficient of each motion direction, wherein the strategy network adopts a fully connected structure, the input layer dimension equals the feature dimension of the observation space, the output layer uses a Gaussian distribution for the action instruction distribution, and the continuous control coefficients are output through a Sigmoid activation; defining and calculating a long-term cumulative reward that reflects the overall value of a series of picking actions; and constructing a clipped objective function to realize policy updating.
7. The intelligent daylily picking robot control system according to claim 6, wherein the picking action execution module specifically comprises: generating a picking strategy, in which the trained integrated state evolution model and the self-adaptive picking strategy are combined to form the final picking decision, the environment observation state perceived in real time is input into the strategy network, which outputs the optimal picking action instruction and the speed threshold of each motion direction, and the observation state and the action instruction are input into the state evolution model, which predicts the pose of the robot at the next moment and the picking execution effect; designing a re-decision mechanism that introduces agronomic risk early warning for daylily picking to minimize the daylily picking damage rate; and executing the end effector action, in which the optimal picking action instruction and the optimal speed thresholds output by the strategy network are converted into electric signal control values for the robot driving module to complete the daylily picking.
8. The intelligent daylily picking robot control system according to claim 7, wherein the field picking data acquisition module acquires historical robot operation data and performs preprocessing to finally form an observation space, and the preprocessing comprises missing value interpolation, outlier processing and standardization processing.
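
The speed regulation of claim 2 can be illustrated with a minimal sketch, assuming the speed threshold of each direction is the product of its rated maximum speed and a continuous control coefficient between 0 and 1. The claims do not state the update rule, so the exponential decay with pose deviation, the function names `update_coefficient` and `speed_thresholds`, the `sensitivity` parameter and all numeric values are illustrative assumptions.

```python
import math


def update_coefficient(pose_deviation: float, sensitivity: float = 5.0) -> float:
    """Map a joint pose deviation to a control coefficient in (0, 1].

    Assumed rule (not from the patent): a larger deviation gives a smaller
    coefficient, which slows the robot in that direction.
    """
    return math.exp(-sensitivity * abs(pose_deviation))


def speed_thresholds(rated_max: dict, pose_deviation: dict) -> dict:
    """Per-direction threshold = rated maximum speed * control coefficient."""
    return {
        axis: rated_max[axis] * update_coefficient(pose_deviation[axis])
        for axis in rated_max
    }


rated = {"x": 0.30, "y": 0.30, "z": 0.15}      # m/s, hypothetical values
deviation = {"x": 0.00, "y": 0.05, "z": 0.20}  # m, hypothetical values
limits = speed_thresholds(rated, deviation)
```

Because each direction carries its own coefficient, a large deviation on one axis slows only that axis, which is the independent regulation the claim describes.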
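
The three-part instant reward of claim 4 might be sketched as follows. Only the structure comes from the claim (a maturity-damage term, a pose deviation term based on the direction of maximum deviation, and an equipment-plant protection term); the reward table, the weights, the inputs `motor_load_ratio` and `neighbours_touched`, and the simple sum are hypothetical choices.

```python
# Hypothetical table: only "optimally ripe and undamaged" earns the top value;
# every other outcome scores lower, in decreasing order of value.
MATURITY_DAMAGE_REWARD = {
    ("optimal", False): 1.0,
    ("optimal", True): -0.5,
    ("immature", False): -0.3,
    ("overripe", False): -0.3,
    ("immature", True): -1.0,
    ("overripe", True): -1.0,
}


def pose_deviation_reward(deviations: dict, scale: float = 1.0) -> float:
    """Penalize the direction with the largest absolute pose deviation."""
    return -scale * max(abs(d) for d in deviations.values())


def protection_reward(motor_load_ratio: float, neighbours_touched: int) -> float:
    """Penalize load above the rated value (ratio > 1.0) and contact with
    neighbouring buds in the same cluster (assumed indicators)."""
    overload = max(0.0, motor_load_ratio - 1.0)
    return -(overload + 0.2 * neighbours_touched)


def instant_reward(ripeness, damaged, deviations, motor_load_ratio, neighbours_touched):
    """Sum of the three reward terms (equal weighting is an assumption)."""
    return (
        MATURITY_DAMAGE_REWARD[(ripeness, damaged)]
        + pose_deviation_reward(deviations)
        + protection_reward(motor_load_ratio, neighbours_touched)
    )


best = instant_reward("optimal", False, {"x": 0.0, "y": 0.0, "z": 0.0}, 0.8, 0)
worse = instant_reward("optimal", True, {"x": 0.0, "y": 0.05, "z": 0.0}, 1.1, 1)
```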
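
For the training target of claim 5, one common reading is a Gaussian likelihood averaged over several model members. The sketch below assumes that reading: each member predicts a mean and variance for the next state and instant reward, and training would minimize the average negative log-likelihood, which is equivalent to maximizing the likelihood. The diagonal Gaussian form, the function names and the numbers are assumptions, not details taken from the patent.

```python
import math


def gaussian_nll(target, mean, var):
    """Negative log-likelihood of `target` under a diagonal Gaussian."""
    return 0.5 * sum(
        math.log(2.0 * math.pi * v) + (t - m) ** 2 / v
        for t, m, v in zip(target, mean, var)
    )


def ensemble_nll(target, member_predictions):
    """Average NLL over members; each prediction is a (mean, var) pair."""
    losses = [gaussian_nll(target, mean, var) for mean, var in member_predictions]
    return sum(losses) / len(losses)


# Target = next state followed by the instant reward (hypothetical numbers).
target = [0.10, 0.20, 1.0]
good = [([0.10, 0.20, 1.0], [0.01, 0.01, 0.01])] * 3
poor = [([0.50, 0.60, 0.0], [0.01, 0.01, 0.01])] * 3
```

Predictions close to the observed transition give a lower loss, so `ensemble_nll(target, good)` is smaller than `ensemble_nll(target, poor)`.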
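
Claim 6 names a long-term cumulative reward and a clipped objective function without giving formulas. A minimal sketch, assuming a discounted return and a clipped surrogate in the style of proximal policy optimization (both are assumptions, as are the discount `gamma` and clip range `eps`):

```python
def discounted_returns(rewards, gamma=0.99):
    """Long-term cumulative reward at each step: G_t = r_t + gamma * G_{t+1}."""
    running = 0.0
    returns = []
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    return returns[::-1]


def clipped_objective(ratio, advantage, eps=0.2):
    """Per-sample clipped surrogate.

    `ratio` is new_policy_prob / old_policy_prob for the taken action.
    Clipping the ratio to [1 - eps, 1 + eps] limits the size of one update.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
```

With `gamma=0.5` the three returns are 1.75, 1.5 and 1.0. A ratio of 1.5 with a positive advantage is capped at 1.2 times the advantage, so a single batch cannot push the strategy far from the one that collected the data.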
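
The re-decision step of claim 7 can be sketched as a check on the state evolution model's forecast before an action is executed. The sketch assumes re-decision is triggered when either the predicted instant reward falls below a floor or the predicted agronomic risk exceeds a ceiling; the thresholds, the attempt limit, the action names and the `predict` callback are hypothetical.

```python
def needs_redecision(pred_reward, pred_risk, reward_floor=0.0, risk_ceiling=0.6):
    """Dual trigger: a low predicted reward or a high predicted agronomic risk."""
    return pred_reward < reward_floor or pred_risk > risk_ceiling


def decide(candidates, predict, max_attempts=3):
    """Return the first candidate action that passes both checks.

    `candidates` are actions ordered by strategy preference; `predict(action)`
    returns (predicted_reward, predicted_risk) from the state evolution model.
    Returns None when no candidate passes within `max_attempts`.
    """
    for action in candidates[:max_attempts]:
        pred_reward, pred_risk = predict(action)
        if not needs_redecision(pred_reward, pred_risk):
            return action
    return None


# Hypothetical forecasts: (predicted instant reward, predicted agronomic risk).
forecast = {"fast_grip": (0.8, 0.9), "slow_grip": (0.6, 0.2)}
chosen = decide(["fast_grip", "slow_grip"], forecast.get)
```

Here `fast_grip` is rejected for its high risk despite a good predicted reward, and `slow_grip` is chosen. Capping the number of attempts keeps a single failed decision from blocking the operation.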
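
The preprocessing named in claim 8 (missing value interpolation, outlier processing, standardization) could look like the sketch below for a single sensor channel. Linear interpolation, clipping at `k` standard deviations and z-score scaling are assumed choices; the claim does not specify the methods.

```python
import statistics


def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours;
    gaps at either end copy the nearest known value."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    for i, v in enumerate(out):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = out[right]
        elif right is None:
            out[i] = out[left]
        else:
            t = (i - left) / (right - left)
            out[i] = out[left] + t * (out[right] - out[left])
    return out


def clip_outliers(values, k=3.0):
    """Clip values lying more than k standard deviations from the mean."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    low, high = mu - k * sd, mu + k * sd
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Scale to zero mean and unit variance (z-score)."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


raw = [1.0, None, 3.0, 4.0, None]  # hypothetical sensor readings
filled = interpolate_missing(raw)
observation = standardize(clip_outliers(filled))
```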

Description

Intelligent daylily picking robot control system

Technical Field

The invention relates to the field of intelligent control, in particular to an intelligent daylily picking robot control system.

Background

A daylily picking robot control system is a closed-loop intelligent control system that relies on sensing, decision making, hardware execution and matched algorithms to realize environment sensing, identification and positioning of mature flowers, picking action planning and drive control during daylily picking. However, common daylily picking robot control systems regulate motion speed without differentiation, which easily causes plant damage and picking failure, and their reward mechanisms are coarsely designed and do not guide optimal picking behavior in a targeted way, so control reliability is poor. They also cannot capture the uncertainty of the picking process, which leads to many invalid picks and a high damage rate, and a single failed decision easily blocks the operation, so the control effect is poor.
Disclosure of Invention

To overcome the above defects of the prior art, the invention provides an intelligent daylily picking robot control system.

To address the problems that common daylily picking robot control systems regulate motion speed without differentiation, easily cause plant damage and picking failure, and have a coarsely designed reward mechanism that does not guide optimal picking behavior in a targeted way, all of which lead to poor control reliability, the scheme constructs a picking robot motion regulation module that avoids, at the source, the plant damage caused by pose deviation through differentiated and dynamic speed regulation. Sample pairing calibration is carried out based on a self-adaptive line order design, and the deviation is decomposed into an instant reward deviation and a state transition probability deviation. This accurately locates the sources of deviation between simulated and real samples, enables efficient sample discrimination, and ensures the control reliability of the final daylily picking robot.

To address the problems that common daylily picking robot control systems cannot capture the uncertainty of the picking process, perform many invalid picks, have a high damage rate, and are easily blocked by a single failed decision, all of which lead to a poor control effect, the instant reward is constructed from maturity-damage, pose deviation guidance and equipment-plant protection terms. This realizes a refined reward design under dual agronomic and hardware constraints and adapts to a range of uncertain picking scenarios. The predicted instant reward and the predicted agronomic risk value serve as dual trigger conditions, which ensures that re-decision covers all invalid and high-risk picking behaviors while avoiding meaningless blind re-decision. This covers the control decision scenarios of the daylily picking robot and further improves its control effect.

The intelligent daylily picking robot control system comprises a field picking data acquisition module, a picking robot motion regulation module, a sample library construction module, an integrated state evolution model design module, a self-adaptive picking strategy learning module and a picking action execution module; the field picking data acquisition module acquires historical robot operation data to form an observation space; the picking robot motion regulation module determines the speed threshold of each motion direction through a continuous control coefficient, so that each motion direction of the picking robot can be regulated independently as required; the sample library construction module generates simulation samples by adaptively adjusting the picking line order based on the observation space, and constructs a high-quality sample library; the integrated state evolution model design module uses the high-quality sample library as training data, adopts maximum expected likelihood as the training target, fits the transition rule of state-action-next state and instant reward in the picking scene, and constructs a probabilistic integrated state evolution model; the self-adaptive picking strategy learning module fuses real-time observation data with the prediction result of the integrated state evolution model and, with the goal of maximizing the long-term cumulative reward, realizes self-adaptive picking strategy learning; and the picking action execution module executes the picking action for the real-time observation state.

Further, the field picking data acquisition module acquires historical robot operation data, performs preprocessing, and finally forms an observation space, wherein the preprocessing comprises missing value interpolation, outlier processing and standardizat