
CN-122024204-A - Low-cost AI auxiliary forklift operating system


Abstract

The invention discloses a low-cost AI-assisted forklift operating system comprising a data acquisition and preprocessing module, a target detection module, a spatio-temporal modeling module, an operation scene understanding module, and a path planning and decision module. The data acquisition and preprocessing module acquires and processes video data from the forklift working environment. The target detection module identifies and localizes cargo, obstacles, and personnel using an RT-DETR target detection model, extracting target categories, position information, and motion trends. The spatio-temporal modeling module models the spatio-temporal information of target features with an improved TimeSformer network, extracting target motion trajectories and environmental relations. The operation scene understanding module analyzes the forklift's operating state, computes relative speeds and distance changes of targets, detects abnormal operation patterns, and predicts operational risk. The path planning and decision module adjusts the forklift's driving route and operation strategy and generates navigation and safety early-warning information. By combining AI visual perception with spatio-temporal modeling, the system improves understanding of the forklift working environment and realizes intelligent assisted operation.

Inventors

  • YUAN ZHENG

Assignees

  • 合肥协力仪表控制技术股份有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-02-03

Claims (7)

  1. A low-cost AI-assisted forklift operating system, comprising: a data acquisition and preprocessing module for acquiring video data from the forklift working environment and performing denoising, normalization, frame-rate adjustment, and key-frame extraction; a target detection module for analyzing the data with an RT-DETR target detection model, identifying and localizing cargo, obstacles, and personnel, and extracting target categories, position information, confidence, and motion trends; a spatio-temporal modeling module for modeling the spatio-temporal information of target features with an improved TimeSformer spatio-temporal modeling network; a feature extraction module for extracting each target's motion trajectory, speed-change information, and global environmental relations, and generating spatio-temporal feature vectors; an operation scene understanding module for analyzing the forklift's operating state, computing relative speeds and distance changes of targets, detecting abnormal operation patterns, and predicting operational risk; and a path planning and decision module for adjusting the forklift's driving route and operation strategy and generating real-time navigation and safety early-warning information.
  2. The low-cost AI-assisted forklift operating system of claim 1, wherein the modules are implemented by: S1, collecting video data from the forklift working environment, denoising, normalizing, adjusting the frame rate, and extracting key frames to construct a dataset; S2, analyzing the dataset with an RT-DETR target detection model, identifying and localizing cargo, obstacles, and personnel, extracting target categories, position information, confidence, and motion trends, and constructing a target feature set; S3, improving the TimeSformer spatio-temporal modeling network by adopting a local window attention mechanism to reduce computational complexity, introducing a dynamic attention allocation strategy into the multi-head self-attention module, adding bidirectional time-differential encoding to the temporal encoding layer, and adding a local feature pyramid fusion module to the Transformer encoding layer; S4, inputting the target feature set into the improved TimeSformer spatio-temporal modeling network and extracting the targets' dynamic change features along the time dimension to obtain spatio-temporal feature vectors comprising each target's motion trajectory, speed-change information, and global environmental relations; S5, constructing an operation scene understanding module based on the spatio-temporal feature vectors, analyzing the forklift's operating state, computing relative speeds and distance changes of targets, detecting abnormal operation patterns, and predicting operational risk in combination with historical operation data; and S6, adjusting the forklift's driving route and operation strategy based on the analysis results of the operation scene understanding module, and generating real-time navigation and safety early-warning information.
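The S1–S6 flow described above can be sketched end to end. Every function below is a hypothetical placeholder (trivial stand-ins for RT-DETR, the improved TimeSformer, and the decision rules), not the patented implementation:

```python
# Hypothetical end-to-end flow of steps S1-S6; all function names and the
# toy decision rules are illustrative assumptions, not the patent's method.

def preprocess(frames):
    # S1: denoise / normalize / extract key frames; pass-through here
    return frames

def detect(frame):
    # S2: detector stand-in; returns (category, box, confidence) tuples
    return [("cargo", (0, 0, 10, 10), 0.9)]

def spatiotemporal_features(dets_per_frame):
    # S3/S4: spatio-temporal modeling stand-in; flattens detections
    return [d for dets in dets_per_frame for d in dets]

def understand_scene(features):
    # S5: flag risk whenever any confident detection is present (toy rule)
    return {"risk": any(conf > 0.8 for _, _, conf in features)}

def plan(scene):
    # S6: trivial decision rule for illustration
    return "slow_down" if scene["risk"] else "proceed"

def pipeline(frames):
    dets = [detect(f) for f in preprocess(frames)]
    return plan(understand_scene(spatiotemporal_features(dets)))
```

In a real system each stage would be replaced by the corresponding model; the sketch only shows how the six steps compose.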
  3. The low-cost AI-assisted forklift operating system of claim 2, wherein S2 specifically comprises: S21, acquiring the image frames contained in the dataset, I_t(x, y, c), wherein I_t denotes an image frame at time step t, (x, y) denotes pixel coordinates, c denotes the color channel, W is the image width, and H is the image height; S22, performing convolution on the input image with the feature extraction network of the RT-DETR model to obtain a multi-scale feature map F_t = Backbone(I_t), wherein Backbone(·) denotes the feature extraction function of the backbone network, F_t denotes the feature map at time step t, and I_t denotes the input image at time step t; S23, computing the target category and position information from the detection head: c = softmax(W_c · F_t + b_c) and p = W_p · F_t + b_p, wherein W_c and W_p are weight matrices, b_c and b_p are bias terms, c is the target category, p is the position information including the center coordinates, width, height, and rotation angle of the target box, and softmax(·) is the activation function normalizing the target class scores to sum to 1; S24, screening targets by a confidence threshold τ, retaining effective targets satisfying s ≥ τ and adjusting their regressed locations; S25, computing the intersection-over-union of target boxes and performing non-maximum suppression to remove overlapping targets, keeping only the detection result with the highest confidence: IoU(B_i, B_j) = area(B_i ∩ B_j) / area(B_i ∪ B_j), wherein B_i and B_j are two candidate target boxes and IoU(B_i, B_j) is their intersection-over-union; when it exceeds a threshold, the target with the higher confidence is retained; S26, constructing a target feature set T from the detection results, comprising target category, position information, confidence, and motion trend, for subsequent spatio-temporal modeling.
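The confidence screening (S24) and IoU-based non-maximum suppression (S25) are standard operations and can be sketched directly; the corner-format boxes and the 0.5 thresholds below are illustrative assumptions:

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2) corner coordinates
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(dets, conf_thr=0.5, iou_thr=0.5):
    # dets: list of (box, confidence); drop low-confidence detections (S24),
    # then greedily keep the highest-confidence box and suppress any box
    # overlapping a kept one beyond iou_thr (S25)
    dets = sorted((d for d in dets if d[1] >= conf_thr),
                  key=lambda d: d[1], reverse=True)
    kept = []
    for box, conf in dets:
        if all(iou(box, kb) < iou_thr for kb, _ in kept):
            kept.append((box, conf))
    return kept
```

Production detectors typically use a library routine for this step; the greedy loop above matches the behavior described in the claim.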
  4. The low-cost AI-assisted forklift operating system of claim 2, wherein S3 specifically comprises: S31, acquiring the target feature set T = {(c_i, p_i, s_i, m_i)}, i = 1, …, N, and constructing input data for spatio-temporal modeling comprising target category, position information, confidence, and motion trend, wherein c_i is the target category, p_i denotes the center coordinates and size information of the target, s_i is the target confidence, m_i is the target motion trend, and N is the number of targets; S32, based on the target feature set, constructing the spatio-temporal modeling input sequence X_i = {(x_t, y_t, w_t, h_t, s_t, v_{x,t}, v_{y,t})}, t = 1, …, T, encoding the target state over successive time steps as time-series input data, wherein T denotes the length of the time sequence, ensuring the spatio-temporal features cover the target's motion states and serve as input to the TimeSformer spatio-temporal modeling network, (x_t, y_t) denotes the center coordinates of target i at time step t, (w_t, h_t) denotes the width and height of target i at time step t, s_t denotes the model's confidence in identifying the target, and (v_{x,t}, v_{y,t}) denotes the target's velocity components in the horizontal and vertical directions; S33, splitting global temporal modeling into local time-window computations with a local window attention mechanism: with window size w, the attention matrix is computed within each time window as A_w = softmax(Q_w K_w^T / sqrt(d_k)) V_w, reducing computational complexity and improving short-term dynamic perception, wherein Q_w and K_w are the query and key matrices within the window, d_k is the feature dimension, and the entries of Q_w K_w^T represent the correlation between time steps in the window, ensuring that target features within the time window are effectively correlated; S34, computing adaptive attention weights based on target confidence, dynamically adjusting the weights of the spatio-temporal information of different targets according to detection confidence: α_t = s_t / Σ_{τ=t_0}^{t_0+w−1} s_τ, wherein α_t denotes the adaptive attention weight of time step t, s_t denotes the confidence score at time step t, t_0 denotes the start time step of the local time window, and w denotes the window size; S35, introducing a dynamic attention allocation strategy into the multi-head self-attention module, adjusting the attention computation within a time window by the computed confidence weights: A'_w = α ⊙ A_w, wherein A'_w denotes the weighted attention matrix; S36, adding bidirectional time-differential encoding to the temporal encoding layer, computing the short-term dynamic change of the current time step to strengthen modeling of target acceleration, deceleration, and stagnation states: D_t = [f_t − f_{t−1}, f_{t+1} − f_t], wherein D_t denotes the bidirectional time-differential encoding of time step t and f_t denotes the feature vector of time step t; S37, adding a local feature pyramid fusion module to the Transformer encoding layer, computing spatio-temporal features over windows of different scales and concatenating them to enhance information fusion across time scales: F_fused = Concat(F_{w_1}, F_{w_2}, …), wherein F_fused denotes the fused spatio-temporal feature matrix, Concat(·) denotes the feature concatenation operation across scales, and F_{w_k} denotes the feature matrix computed with window size w_k; S38, normalizing the fused spatio-temporal features so that features of different scales remain consistent, feeding the processed spatio-temporal feature vectors into the TimeSformer Transformer encoding layer for final spatio-temporal feature extraction, and outputting high-dimensional spatio-temporal feature vectors for subsequent operation scene understanding.
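A minimal sketch of the local window attention (S33) with confidence-based adaptive reweighting (S34–S35), assuming per-step feature vectors and a causal window covering the last w steps; the final renormalization is an assumption about how the weighted attention is kept a convex combination:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def local_window_attention(q, k, v, conf, window):
    # q, k, v: per-time-step feature vectors; conf: per-step confidence.
    # For each step t, attend only within the local window [t-window+1, t].
    d = len(q[0])
    out = []
    for t in range(len(q)):
        start = max(0, t - window + 1)
        idx = list(range(start, t + 1))
        # scaled dot-product logits within the local window (S33)
        logits = [sum(a * b for a, b in zip(q[t], k[j])) / math.sqrt(d)
                  for j in idx]
        attn = softmax(logits)
        # confidence-based adaptive weights (S34), then reweight and
        # renormalize so the rows still sum to one (S35)
        csum = sum(conf[j] for j in idx)
        alpha = [conf[j] / csum for j in idx]
        attn = [a * w for a, w in zip(attn, alpha)]
        z = sum(attn)
        attn = [a / z for a in attn]
        out.append([sum(attn[n] * v[j][dim] for n, j in enumerate(idx))
                    for dim in range(d)])
    return out
```

The windowing reduces the attention cost from O(T²) to O(T·w) per sequence, which is the complexity saving the claim describes.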
  5. The low-cost AI-assisted forklift operating system of claim 2, wherein S4 specifically comprises: S41, inputting the target feature set, comprising the category, position information, confidence, and motion trend of each target, into the improved TimeSformer spatio-temporal modeling network, and constructing a time-step sequence whose input data comprise the target features of the current frame and the information of the preceding and following time frames; S42, performing time-dimension modeling on the input data and setting the time-window size so that each time step contains the state information of the preceding and following frames, enabling the model to compute the short-term and long-term motion features of the target; S43, computing the target's velocity and acceleration along the time dimension and extracting its motion-change pattern: v_t = (Δx_t, Δy_t) / Δt and a_t = (v_t − v_{t−1}) / Δt, wherein v_t is the target's velocity at time step t, a_t is its acceleration, (Δx_t, Δy_t) is the target's horizontal and vertical displacement change, and Δt is the time interval; S44, computing relative motion features between targets and constructing a target interaction matrix measuring inter-target distance change, velocity difference, and trajectory offset: d_{ij} = sqrt((x_i − x_j)² + (y_i − y_j)²) and Δv_{ij} = v_i − v_j, wherein d_{ij} is the Euclidean distance between target i and target j and Δv_{ij} is their relative velocity; S45, extracting time-step features, target motion features, and inter-target interaction information, computing the amplitude of each target's state change at different time steps, and obtaining the global motion pattern by combining environmental information; S46, inputting the time-series data into the improved TimeSformer network, extracting short-term and long-term motion patterns with multi-scale feature fusion, and normalizing so that features of different time spans remain consistent; S47, generating spatio-temporal feature vectors that integrate the targets' motion trajectories, velocities, interaction information, and global environmental relations into a spatio-temporal feature representation, which is input to the operation scene understanding module.
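The finite-difference kinematics of S43 and the pairwise distance/relative-speed features of S44 can be sketched directly; the only assumption is that targets are represented by their (x, y) center points:

```python
import math

def kinematics(track, dt=1.0):
    # track: list of (x, y) centers per frame; returns per-step velocity
    # v_t = (dx, dy)/dt and acceleration a_t = (v_t - v_{t-1})/dt (S43)
    vel = [((x1 - x0) / dt, (y1 - y0) / dt)
           for (x0, y0), (x1, y1) in zip(track, track[1:])]
    acc = [((vx1 - vx0) / dt, (vy1 - vy0) / dt)
           for (vx0, vy0), (vx1, vy1) in zip(vel, vel[1:])]
    return vel, acc

def pairwise(positions, velocities):
    # Euclidean distance d_ij and relative-speed magnitude |v_i - v_j|
    # for every target pair, as in the interaction matrix of S44
    feats = {}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dist = math.hypot(positions[i][0] - positions[j][0],
                              positions[i][1] - positions[j][1])
            rel = math.hypot(velocities[i][0] - velocities[j][0],
                             velocities[i][1] - velocities[j][1])
            feats[(i, j)] = (dist, rel)
    return feats
```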
  6. The low-cost AI-assisted forklift operating system of claim 2, wherein S5 specifically comprises: S51, constructing the operation scene understanding module based on the spatio-temporal feature vectors, associating the targets in the forklift working environment, and building a target state matrix S = [f_1, f_2, …, f_N], wherein S represents all target states at the current time, f_i is the spatio-temporal feature of target i, and N is the number of targets in the current scene; S52, computing the relative speed between the forklift and a target as the rate of change of their separation over successive time steps, wherein p_f denotes the current position of the forklift and p_o denotes the center coordinates of the target object; S53, computing the Euclidean distance between the forklift and the target: d = sqrt((x_f − x_o)² + (y_f − y_o)²); S54, computing the degree of path deviation of the target from its motion trend: given the target's current position p_t and its position p_{t−1} at the previous time step, computing the rate of change of the target's motion-direction angle Δθ_t = θ_t − θ_{t−1}, wherein Δθ_t represents the change of the target's motion direction over successive time steps, reflecting whether the target is turning abnormally or deviating from its normal trajectory; S55, computing an abnormal-behavior judgment parameter R of the operation scene from the relative speed, the Euclidean distance, and the trajectory deviation, and setting an abnormality threshold R_th; when R > R_th, the current operation scene is marked as high risk and risk early-warning information is output; S56, computing the risk distribution in the operation scene by combining historical operation data, predicting areas where operational abnormalities may occur, and outputting an operation risk assessment result.
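The direction-angle change of S54 and the threshold test of S55 can be sketched as follows. The linear weighted form of the risk score is an illustrative assumption, since the claim does not disclose how relative speed, distance, and trajectory deviation are combined:

```python
import math

def heading_change(p_prev2, p_prev, p_cur):
    # rate of change of the motion-direction angle over consecutive
    # time steps (S54), wrapped into (-pi, pi]
    th1 = math.atan2(p_prev[1] - p_prev2[1], p_prev[0] - p_prev2[0])
    th2 = math.atan2(p_cur[1] - p_prev[1], p_cur[0] - p_prev[0])
    d = th2 - th1
    while d <= -math.pi:
        d += 2 * math.pi
    while d > math.pi:
        d -= 2 * math.pi
    return d

def risk_score(rel_speed, distance, dtheta, w=(1.0, 1.0, 1.0), eps=1e-6):
    # illustrative weighted combination of relative speed, proximity,
    # and trajectory deviation (S55); the weights are assumptions
    return w[0] * rel_speed + w[1] / (distance + eps) + w[2] * abs(dtheta)

def is_high_risk(score, threshold):
    # S55 threshold test: mark the scene high risk when R > R_th
    return score > threshold
```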
  7. The low-cost AI-assisted forklift operating system of claim 2, wherein S6 specifically comprises: S61, acquiring, based on the analysis result of the operation scene understanding module, the forklift's current position p_f and speed v_f and the job target position p_g, and computing the distance between the forklift and the target: d = ||p_g − p_f||; S62, computing the estimated approach time from the relative speed of the forklift and the target: t_a = d / (v_rel + ε), wherein ε is a small constant preventing division by zero, t_a reflects the time for the forklift to approach the target, and v_rel denotes the relative speed of the forklift and the target; S63, computing the path adjustment amount from the forklift's current path and the target trajectory: ΔP = P_target − P_current and Δθ = θ_target − θ_forklift, wherein P_current denotes the forklift's current path, P_target denotes the target trajectory, ΔP denotes the path adjustment amount, θ_forklift denotes the forklift's driving direction angle, θ_target denotes the target direction angle, and Δθ denotes the angle adjustment amount; S64, computing the forklift's speed adjustment value Δv from the approach time t_a and the path adjustment amount ΔP, and updating the driving strategy, wherein Δv denotes the forklift speed adjustment amount and v denotes the forklift's velocity component; S65, taking the speed adjustment value Δv and the path adjustment amount ΔP as input, generating real-time navigation information including driving direction, speed adjustment, and obstacle-avoidance strategy, and outputting it to the forklift control system.
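The approach-time estimate of S62 and the heading adjustment of S63 can be sketched as follows. The speed-adjustment rule of S64 is not disclosed in detail, so the slowdown heuristic here (scale speed down for short approach times or sharp turns) is purely illustrative:

```python
import math

def time_to_contact(dist, rel_speed, eps=1e-6):
    # estimated approach time t_a = d / (v_rel + eps);
    # eps guards against division by zero (S62)
    return dist / (rel_speed + eps)

def path_adjustment(forklift_pos, forklift_heading, target_pos):
    # distance to target (S61) and steering-angle adjustment
    # delta_theta = theta_target - theta_forklift (S63), wrapped
    dx = target_pos[0] - forklift_pos[0]
    dy = target_pos[1] - forklift_pos[1]
    dist = math.hypot(dx, dy)
    dtheta = math.atan2(dy, dx) - forklift_heading
    while dtheta <= -math.pi:
        dtheta += 2 * math.pi
    while dtheta > math.pi:
        dtheta -= 2 * math.pi
    return dist, dtheta

def speed_adjustment(t_approach, dtheta, v_max=2.0, t_safe=3.0):
    # illustrative S64 rule: slow down when the approach time is short
    # or the required turn is sharp; v_max and t_safe are assumptions
    scale = min(1.0, t_approach / t_safe) * max(0.0, 1.0 - abs(dtheta) / math.pi)
    return v_max * scale
```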

Description

Low-cost AI auxiliary forklift operating system

Technical Field

The invention relates to the technical field of mechanical equipment, and in particular to a low-cost AI-assisted forklift operating system.

Background

As key equipment in logistics and warehousing, forklifts are widely used in cargo handling, loading and unloading, storage, and similar operations. Traditional forklifts rely on manual operation, and drivers need considerable experience to cope with complex and changing work environments. In practice, however, forklift working areas usually contain many obstacles, goods are stacked irregularly, and personnel move about frequently; manually driven forklifts are easily affected by misoperation, environmental interference, and similar factors, which reduces operating efficiency and can even cause safety accidents. In addition, in high-intensity scenarios, drivers tire easily during long shifts, further affecting the stability and safety of operations.

With the development of artificial intelligence and computer vision technologies, intelligent forklift auxiliary operation systems have gradually become a research hotspot. Some current automated forklift systems use technologies such as laser radar, millimeter-wave radar, and ultrasonic sensors for environment sensing, combined with path planning algorithms to achieve semi-automatic or fully automatic driving. Such systems can reduce manual intervention and improve operating efficiency to some extent, but the high cost of the hardware and its susceptibility to interference in complex environments limit large-scale deployment. Moreover, under large illumination changes or severe occlusion by obstacles, sensor detection accuracy may degrade, impairing the forklift's decision-making capability.

Computer vision technology provides a more cost-effective route to intelligent forklift operation: deep-learning-based target detection can acquire working-environment information through a camera and identify cargo, obstacles, and personnel in real time. Common target detection algorithms include YOLO, Faster R-CNN, and SSD. These algorithms detect well on static images but have certain limitations in dynamic working scenes: they usually identify targets from single-frame images, lack the ability to model target motion trends, and struggle to adapt to real-time changes in target position and state during forklift operation. In addition, some high-precision detection algorithms are computationally heavy and demand substantial computing power, making real-time processing difficult in edge computing environments.

To address the shortcomings of target detection algorithms in dynamic working environments, video spatio-temporal modeling technology has attracted growing attention. Transformer-based spatio-temporal modeling methods such as TimeSformer can model a target's temporal information through a self-attention mechanism and extract its motion features along the time dimension. However, the original TimeSformer structure has high computational complexity and struggles to meet the real-time requirements of forklift operation, while its global self-attention mechanism may introduce a large amount of redundant information that degrades modeling of key target motion patterns. Therefore, in the forklift working environment, the temporal modeling network needs to be optimized to reduce computational complexity and improve the modeling of dynamic targets.

In terms of operation scene understanding, existing methods mainly rely on target detection results for static environment modeling and lack deep analysis of dynamic working environments. For example, conventional path planning generally computes over static obstacle information and cannot accurately predict obstacle motion trends, so the forklift cannot adjust its driving strategy in time during operation. The complexity of operation scenes also makes decision systems based on fixed rules ill-suited to changing working environments, affecting the forklift's adaptability and operating efficiency. Existing AI forklift auxiliary operating systems thus still have certain limitations in target detection, spatio-temporal modeling, operation scene understanding, and path planning. The target detection algorithm lacks spatio-temporal information modeling capability