CN-120544075-B - Street quality evaluation method based on behavior recognition, tracking and analysis
Abstract
The invention discloses a street quality evaluation method based on behavior recognition, tracking and analysis. Using target recognition and target tracking techniques from computer vision, the behavior tracks of various crowds are extracted from street video shot from a high vantage point, and image coordinates are converted into real distances to achieve sub-meter-level dynamic behavior measurement from the aerial video. By analyzing the whole video, the instantaneous and overall moving speed, moving direction and crowd scale of each pedestrian are obtained, the behavior type of each individual is comprehensively judged, and an environment use quality evaluation accurate to the street segment is produced by combined operation on the behavior database and position information. The method markedly reduces the time designers must invest in field investigation and analysis at the initial stage of a project, improving working efficiency.
Inventors
- WANG CHUAN
- WU MINHAO
- HAN DONGQING
- LI HANCHI
- XIAO SHIKE
- SHEN YITING
- SHI DONGBO
Assignees
- SOUTHEAST UNIVERSITY
Dates
- Publication Date
- 20260505
- Application Date
- 20250512
Claims (4)
- 1. A street quality evaluation method based on behavior recognition, tracking and analysis, characterized by comprising the following steps: Step one, video acquisition, preprocessing and data input. Step two, basic data calculation: the calculation and analysis program is written in the Python language and comprises three modules run in sequence after the data input link, namely an identification and tracking module, a perspective transformation module, and a classification and index calculation module. Step three, judging the behavior target. Step four, calculating street space quality: the obtained pedestrian behavior category data are imported into a public space quality evaluation system, 6 public space evaluation indexes are calculated, and a final street quality radar chart is generated from the indexes; the 6 indexes are the average stay ratio, street traffic speed, track fluctuation index, street congestion degree, leisure activity ratio and social activity ratio. Specifically, the street space calculation module imports the acquired pedestrian behavior category data into the public space quality evaluation system and calculates the space evaluation index of each dimension; the 6 public space evaluation indexes are established from different angles, and the final street quality radar chart is generated from them; A1. 
Average stay ratio. The average stay ratio is an index measuring the attraction of a space, calculated as DS = N_staying / N_total, where N_staying represents the number of people exhibiting stay behaviors in the video clips, obtained through the stay determination method in the behavior type determination module, and N_total represents the total number of pedestrians; the average stay ratio reflects the degree to which people are held or attracted by the space; A2. Street traffic speed. The traffic speed of each object class is the average speed of that class in the street space divided by the maximum specified speed for that class in the space; the higher the value, the more fluent the flow. The indexes are calculated as TS_p = (Σ v(p_i) / l) / v_p0, TS_c = (Σ v(c_i) / m) / v_c0 and TS_w = (Σ v(w_i) / n) / v_w0, where TS_p is the average traffic speed of pedestrians, TS_c is the average traffic speed of non-motor vehicles, TS_w is the average traffic speed of motor vehicles, p_i, c_i and w_i are the automatically identified pedestrian, non-motor-vehicle and motor-vehicle targets, numbering l, m and n respectively, v_p0 is the maximum speed of pedestrians in the street, v_c0 is the maximum speed of non-motor vehicles on urban roads, and v_w0 is the maximum speed of motor vehicles on urban roads; A3. Track fluctuation index. Track fluctuation is an evaluation index related to the direction change (d) of the objects and characterizes the traffic flow and the variation of the tracks; TF is the average sum of the weighted mean and the weighted variance of the direction change, where d_i represents the instantaneous direction change of each object i at time t. The higher the track fluctuation index, the weaker the traffic flow of the space and the higher the complexity of the traffic; A4. 
Street congestion degree. The congestion degree is an index based on behavior characteristics and reflects the degree to which different objects are blocked in the traffic flow. It is based on a weighted sum of the loitering behaviors of the various object classes: loitering mainly manifests as non-uniform changes of traffic speed, and apart from the few pedestrians who actively loiter in a street, most abrupt speed changes of targets are produced by obstruction from vehicles or crowds, so the congestion degree reflects the degree of conflict between different objects during passage. It is calculated as BD = α·S2/l + β·N_v'/m + γ·M_v'/n, where α, β and γ are traffic weight coefficients, S2 represents the number of loitering pedestrians, N_v' and M_v' represent the numbers of riders and motor vehicles with loitering-like behaviors, and l, m and n represent the total numbers of pedestrians, riders and motor vehicles in the street respectively; A5. Leisure activity ratio. The index is the average proportion, over a period of time, of the people who are not merely passing through the area, namely the staying people (S1), the loitering people (S2), the multi-person walking (M2) and the group movement (G2), calculated as LA = (1/t)·Σ_i (S1_i + S2_i + M2_i + G2_i) / l_i, where t is the total duration of the period and l_i represents the total number of people appearing in the picture at moment i; the higher the value, the more suitable the street space is for leisure activities; A6. 
Social activity ratio. The proportion of social activities is related to the cluster number (N) and to group behaviors and reflects the social activity called for in the design targets of living streets; social activities refer to pedestrians meeting, walking side by side or interacting on the street, counted as group or multi-person behaviors of more than two people whose gathering lasts longer than a certain time. It is calculated as SA = (1/t)·Σ_i (M1_i + M2_i + G1_i + G2_i) / l_i, where M1 and M2 respectively represent multi-person stay and multi-person walking, G1 and G2 respectively represent group stay and group walking, and l_i represents the total number of people appearing in the picture at moment i; the index reflects the proportion of social activities occurring in the street space; A7. Radar chart. The 6 indexes are visualized in the form of a radar chart, where DS represents the average stay ratio, TS the street traffic speed, TF the track fluctuation index, BD the street congestion degree, LA the leisure activity ratio and SA the social activity ratio. The radar chart formed by the 6 indexes intuitively shows the characteristics of a street segment; comparing the radar charts of different segments of the same street reveals the tendencies of each segment, and periodically comparing radar charts of the same segment allows the change of its spatial indexes under different conditions to be monitored.
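As a minimal illustration of how the A1, A5 and A6 ratios above could be computed from per-moment counts of the behavior categories defined in claim 4, a sketch in Python (the function names and the data layout of one dict of category counts per sampled moment are illustrative assumptions, not the patent's actual program):

```python
# Hedged sketch: average stay ratio (A1), leisure activity ratio (A5) and
# social activity ratio (A6) from per-moment behavior-category counts.
# The input format is an assumption for illustration only.

def stay_ratio(n_staying, n_total):
    """A1: DS = N_staying / N_total."""
    return n_staying / n_total if n_total else 0.0

def activity_ratio(moments, categories):
    """Average over moments of (people in `categories`) / (people in frame)."""
    ratios = []
    for counts in moments:
        l_i = sum(counts.values())          # total people in the frame, l_i
        if l_i == 0:
            continue
        ratios.append(sum(counts.get(c, 0) for c in categories) / l_i)
    return sum(ratios) / len(ratios) if ratios else 0.0

moments = [
    {"S1": 2, "S2": 1, "S3": 5, "M1": 0, "M2": 2, "G1": 0, "G2": 0},
    {"S1": 1, "S2": 0, "S3": 6, "M1": 2, "M2": 1, "G1": 0, "G2": 0},
]
LA = activity_ratio(moments, ("S1", "S2", "M2", "G2"))  # leisure (A5)
SA = activity_ratio(moments, ("M1", "M2", "G1", "G2"))  # social (A6)
```

With the two sample moments above (10 people each), LA averages 0.5 and 0.2, and SA averages 0.2 and 0.3.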
- 2. The street quality evaluation method based on behavior recognition, tracking and analysis according to claim 1, wherein step one comprises the following: a selected street is divided using a satellite map; considering the physical environment of each street segment, a suitable position and angle are chosen, and a high-altitude camera obliquely shoots the tracks of pedestrians in the street segment for video acquisition. The filmed site is mapped: four non-collinear mark points are selected in the site in combination with the site environment, any 5 of the distances among the four points are measured with an infrared range finder, and the distances are input into the data operation interface. The acquired video is preprocessed in Adobe After Effects, which performs frame dropping and anti-shake processing: the frame rate of the video clips is adjusted to 10, and for video collected with an unmanned aerial vehicle, the stabilize-motion function of the tracker is used to track a prominent stationary object on the street in the video, realizing anti-shake processing. The program comprises a data operation interface written in the Python language. The interface is divided into four quadrants: the upper left quadrant is the perspective transformation operation area, the lower left quadrant the data input area, the upper right quadrant the perspective transformation preview area, and the lower right quadrant the file management and operation area. In use, the video clip requiring preprocessing is first selected in the file management and operation area; the program displays the first frame of the video in the perspective transformation operation area, where the mouse can be pressed to drag the picture. By clicking the buttons in the data input area, the four points corresponding to the mapping link are selected clockwise in the operation area; the program then automatically converts the measured point distances into point coordinates in the data input area. After 'perspective transformation' is clicked in the file management and operation area, a transformation matrix is obtained by calculation from the 8 points, and the perspective transformation preview area displays the picture under this matrix operation, so that the user can evaluate the transformation and improve its precision by reselecting points or correcting the mapping data. When the transformation meets the requirement, clicking 'start calculation' ends the data input link, and the subsequent calculation and analysis are performed automatically.
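The frame-dropping step above (reducing a clip to 10 fps) amounts to keeping every k-th frame; a minimal sketch, assuming a known source frame rate (the patent performs this step in Adobe After Effects rather than in code, so this function is an illustrative stand-in):

```python
# Hedged sketch: choose which frame indices to keep when resampling a clip
# from `src_fps` down to `dst_fps` (the patent uses 10 fps).

def kept_frames(n_frames, src_fps, dst_fps=10):
    """Return indices of frames to keep so ~dst_fps frames survive per second."""
    if dst_fps >= src_fps:
        return list(range(n_frames))        # nothing to drop
    step = src_fps / dst_fps                # e.g. 30 fps -> keep every 3rd frame
    kept, t = [], 0.0
    while round(t) < n_frames:
        kept.append(round(t))
        t += step
    return kept
```

For a 30 fps clip, `kept_frames(9, 30)` keeps frames 0, 3 and 6, so one second of data again corresponds to 10 frames downstream.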
- 3. The street quality evaluation method based on behavior recognition, tracking and analysis according to claim 1, wherein step two is as follows: the calculation and analysis program is written in the Python language and comprises the modules below, run in sequence after the data input link; the functions and algorithms of the modules are explained as follows. (1) Identification and tracking module. The recognition and tracking module uses the open-source YOLOv5 target detection algorithm and the ByteTrack multi-target tracking algorithm, with a recognition weight model trained on a dataset, so that the module can derive the space-time position data of street space targets and recognize three types of targets: pedestrians, riders and motor vehicles. YOLOv5 detects the required objects, namely the pedestrians, riders and motor vehicles closely related to the street space, against the street background, and multi-object tracking (MOT) associates the targets between different frames to form coherent motion trails. The detection weight model is trained on a database containing 2000 pictures: 1400 pictures collected and processed by the inventors, 300 pictures from the CrowdHuman dataset and 200 pictures from the VisDrone dataset, which respectively strengthen the detection of riders, of pedestrians occluding each other in a gathering environment, and of motor vehicles at a bird's-eye view angle. To accurately identify and track the same person across video frames, the ByteTrack tracking algorithm is adopted; by introducing a matching mechanism for low-confidence detection boxes, it effectively handles target occlusion, blurring and brief disappearance and further improves tracking robustness. Based on the YOLOv5 detections, ByteTrack accurately tracks each moving target in the street public space, providing reliable support for subsequent behavior analysis and data statistics. (2) Perspective transformation module. The space-time track data obtained by the recognition and tracking module lie in the perspective plane of the camera view, and their coordinates are pixel coordinates; the function of the perspective transformation module is to map the pixel coordinates into the plane of the actual street space. Perspective transformation, also called projective mapping, projects points of the original image to new coordinates by means of a transformation matrix. The matrix equation has 8 unknowns, so 4 pairs of known corresponding points are required to solve it; the 8 points are the 4 points obtained by mapping and the 4 points selected in the data operation interface. In the actual mapping process the coordinates of the 4 points are difficult to obtain directly, so the module further comprises a mapping verification tool. Its procedure is to input six distances between the four points (A, B, C and D) of the mapped quadrangle, namely the four sides and the two diagonals, and to obtain the point coordinates from a reference edge and a reference point by nonlinear solution of binary quadratic equations, converting the mapped distances into actual reference coordinates. According to plane geometry, the coordinates of a third point are determined by the coordinates of two known points and its two distances to them, which yields the set of binary quadratic equations (x_0 − x_1)² + (y_0 − y_1)² = Dis1² and (x_0 − x_2)² + (y_0 − y_2)² = Dis2², where (x_1, y_1) and (x_2, y_2) are the coordinates of the two known points, (x_0, y_0) are the coordinates of the third point, and Dis1 and Dis2 are the measured distances. The real-number solution of this nonlinear system is taken, eliminating imaginary solutions. When the lengths of four sides and one diagonal of the quadrangle are input, the shape of the quadrangle is determined by this method; when the lengths of the four sides and both diagonals are input, a verification and optimization module is additionally called. It first obtains an initial coordinate result by nonlinear solution of the binary quadratic equations from the four sides and one diagonal, and then compares the error against the remaining diagonal as Ina = |sqrt((x_B − x_D)² + (y_B − y_D)²) − Dis_BD| / Dis_BD, where (x_B, y_B) is the calculated initial B point coordinate, (x_D, y_D) the calculated initial D point coordinate, and Dis_BD the measured BD length. When the error value Ina is smaller than 0.3, the optimization module is executed directly; when Ina lies between 0.3 and 0.5, the optimization is executed and re-measurement is recommended; when Ina is larger than 0.5, the error is judged excessive and re-measurement is required. The optimization module applies a constrained optimization algorithm, realized in Python by calling the function scipy.optimize.minimize, with the objective function defined as the sum of squared errors between the actual and given side lengths, f = Σ_i (d_i − d_i')², where d_i is the actual side or diagonal length and d_i' the given value. The constraints impose quadrilateral convexity and side-length triangle inequalities. Starting from the initial coordinate result of the nonlinear equations as the initial guess, the problem is solved in parallel with three optimization algorithms, BFGS, SLSQP and trust-constr, and the result with the smallest error is selected as the final coordinates; (3) Classification and 
index calculation module. The classification and index calculation module calculates the basic motion and aggregation conditions of the street targets, using the space-time track data transformed by the perspective transformation module. The program first separates the data of pedestrians, riders and motor vehicles and then calculates the indexes for each. Three basic indexes are set: two basic motion indexes (v, d) and one cluster number (N); riders and motor vehicles only require v and d, so the algorithms and parameter choices are explained taking the index calculation of pedestrian data as an example. The instantaneous speed (v) measures pedestrian movement and is determined by the displacement between frames: Dis(t) = sqrt((x_{i+F} − x_i)² + (y_{i+F} − y_i)²) and v = Dis(t) / (F·Δt), where Dis(t) represents the displacement of the object after a time t has elapsed from instant i, (x_i, y_i) and (x_{i+F}, y_{i+F}) respectively represent the coordinates at the two times, and the time variation Δt is determined by the frame count. The video frame rate is set to 10, so each frame in the data represents 0.1 seconds; F is an integer meaning that an average speed is calculated every F frames, and F is set to 100, at which point Dis(t) represents the distance the object moves within 10 seconds and v the average speed per frame within those 10 seconds. The instantaneous direction change (d) measures the change of the pedestrian's moving direction; it is a relative angle in radians, determined from the difference of displacement vectors between frames. For the movement vectors of consecutive intervals of t frames, the sine and cosine of the included angle are first calculated, the absolute value of the angle is obtained with an inverse cosine function and its sign is judged by means of the vector (cross) product, or equivalently the signed angle is obtained in one step from the sine and cosine with an inverse tangent function. To reduce jitter and error, t takes the value 10, as in the instantaneous speed calculation, so the instantaneous direction change (d) represents the angle between the object's displacement in one second and its displacement in the next second. The cluster number (N) measures whether a pedestrian's behavior occurs within a larger group; the size of this group is critical to reflecting street vitality. DBSCAN is used to calculate the cluster number, which requires two parameters, the scan radius (eps) and the minimum number of contained points (minPts): the algorithm selects an unvisited point, finds all points within eps of it, and treats them as a cluster. The scan radius depends on the maximum social distance of pedestrians, so eps = 200 (i.e. 2 meters) and minPts = 2. The cluster number is calculated frame by frame, and each object receives a cluster number representing the size of the cluster it is in. To correct fluctuations of the cluster number, a segmented calculation is used with t = 10 as in the speed calculation: the cluster number N is taken as the mode of the per-frame cluster sizes from time i to time i + t. By introducing the time control parameter t, the data volume is compressed in the calculation of v, d and N, and each datum represents the motion state within one second.
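The binary quadratic equations used by the mapping verification tool in module (2) have a closed-form real solution once a reference edge fixes two of the points; a minimal sketch (function names are illustrative, and only the real, upper-half-plane branch is kept, matching the claim's restriction to real-number solutions):

```python
import math

# Hedged sketch of the claim's point-coordinate recovery: place reference
# point A at the origin and B on the x-axis, then locate a third point C
# from its measured distances to A and B by solving
#   (x0 - xA)^2 + (y0 - yA)^2 = Dis1^2
#   (x0 - xB)^2 + (y0 - yB)^2 = Dis2^2
def third_point(d_ab, dis1, dis2):
    """Coordinates of C given |AB| = d_ab, |AC| = dis1, |BC| = dis2."""
    x0 = (dis1**2 - dis2**2 + d_ab**2) / (2 * d_ab)
    y_sq = dis1**2 - x0**2
    if y_sq < 0:
        raise ValueError("distances are inconsistent (no real solution)")
    return x0, math.sqrt(y_sq)  # keep the real, upper-half-plane branch

# Error check against the second diagonal, mirroring the claim's
#   Ina = |dist(B, D) - Dis_BD| / Dis_BD
def diagonal_error(b, d, dis_bd):
    got = math.dist(b, d)
    return abs(got - dis_bd) / dis_bd
```

For example, with |AB| = 4, |AC| = 5 and |BC| = 3, the recovered point is C = (4, 3), a 3-4-5 triangle.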
- 4. The street quality evaluation method based on behavior recognition, tracking and analysis according to claim 1, wherein: the pedestrian type judging module uses the v and N data of pedestrians and determines the behavior mode of pedestrians in the street space by classification judgment, finally classifying pedestrian behaviors into 7 classes and establishing the conversion from the basic pedestrian indexes to pedestrian behavior classes. 7 basic behavior categories are set: single-person stay S1, single-person loitering S2, single-person walking S3, multi-person stay M1, multi-person walking M2, group stay G1 and group movement G2. The behavior categories are related to two overall trend indexes, the overall movement trend Mt and the overall clustering trend Nt, which are derived respectively from the real-time movement state Mi and the real-time clustering state Ni. The real-time movement state Mi is calibrated by a speed separation point s1: when the instantaneous speed v < s1, it is marked Mi = 1, i.e. the object is considered stationary; when v > s1, it is marked Mi = 2 and the object is considered moving. The overall movement trend Mt is determined by the stay proportion Ps, the proportion of moments at which the object is stationary, Ps = (number of moments with Mi = 1) / (total number of moments): when Ps is larger than K2 the object is considered always stationary and Mt = 1; when Ps is smaller than K1 the object is considered always moving and Mt = 3; when Ps lies between the two values the object is considered to be in a loitering state of stop-and-go motion and Mt = 2. The real-time clustering state Ni is calibrated by two cluster number separation points N1 and N2: when N > N2 the object is considered to be in a group and marked Ni = 3; when N1 < N ≤ N2 it is considered multi-person behavior and marked Ni = 2; when N ≤ N1 it is considered single-person behavior and marked Ni = 1. The overall clustering trend Nt is determined by the clustering proportions P1 and P2, where P1 = (number of moments with Ni = 1) / (total number of moments) and P2 = (number of moments with Ni = 3) / (total number of moments): when P1 is larger than K3 the object is considered always active alone and Nt = 1; when P2 is larger than K4 the object is considered always located in a group and Nt = 3; in all other cases Nt = 2, i.e. multiple persons are active together. Taking s1 = 2, K1 = 0.25, K2 = 0.8, N1 = 1, N2 = 3, K3 = 0.6 and K4 = 0.5 as preset values, the behavior type of an object is judged from the obtained overall movement trend Mt and overall clustering trend Nt: when Nt = 1, Mt = 1 is marked 'single-person stay' (S1), Mt = 2 'single-person loitering' (S2) and Mt = 3 'single-person walking' (S3); when Nt = 2, Mt = 1 is marked 'multi-person stay' (M1) and Mt = 2 or 3 'multi-person walking' (M2); when Nt = 3, Mt = 1 or 2 is marked 'group stay' (G1) and Mt = 3 'group movement' (G2).
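The threshold logic of claim 4 can be condensed into a short decision routine; the sketch below assumes per-second v and N sequences for one object and the stated presets, and reads the stay proportion Ps as the share of stationary moments, which is the reading under which the thresholds K1 < K2 are coherent (the program structure itself is an illustrative assumption):

```python
# Hedged sketch of the claim-4 behavior classification. Inputs are
# per-second instantaneous speeds `vs` and cluster numbers `ns` for one
# object; thresholds default to the claim's preset values.

def classify(vs, ns, s1=2, K1=0.25, K2=0.8, N1=1, N2=3, K3=0.6, K4=0.5):
    # Overall movement trend Mt from the stay proportion Ps.
    ps = sum(1 for v in vs if v < s1) / len(vs)
    mt = 1 if ps > K2 else 3 if ps < K1 else 2   # 1 stay, 2 loiter, 3 move

    # Overall clustering trend Nt from the single/group proportions P1, P2.
    p1 = sum(1 for n in ns if n <= N1) / len(ns)
    p2 = sum(1 for n in ns if n > N2) / len(ns)
    nt = 1 if p1 > K3 else 3 if p2 > K4 else 2   # 1 single, 2 multi, 3 group

    table = {
        (1, 1): "S1", (1, 2): "S2", (1, 3): "S3",
        (2, 1): "M1", (2, 2): "M2", (2, 3): "M2",
        (3, 1): "G1", (3, 2): "G1", (3, 3): "G2",
    }
    return table[(nt, mt)]
```

A lone pedestrian standing still (`vs=[0]*10, ns=[1]*10`) classifies as S1; a fast-moving group of five (`vs=[3]*10, ns=[5]*10`) classifies as G2.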
Description
Street quality evaluation method based on behavior recognition, tracking and analysis. Technical Field: The invention belongs to the field of city design and urban renewal, and particularly relates to a street quality evaluation method based on behavior recognition, tracking and analysis. Background: As urbanization transitions from incremental growth to stock optimization, urban renewal has entered the stage of improving the existing stock, and the improvement of street space quality is a key link. Creating a better public space for everyone is a persistent and central problem in public space planning and design practice, but the quality of a space depends on how people use it. Behavior observation is therefore a key way to measure the quality of public spaces. Urban researchers have commonly evaluated public spaces by computing behavior-related indicators from observation records and/or interviews, applied to assist renewal designs or to evaluate the vitality, safety and walkability of completed designs. However, this method is time-consuming and labor-intensive, has low accuracy, is difficult to repeat, and is not suitable for general evaluation of urban streets at present. There is therefore a need for a new tool based on computer vision technology that automatically identifies, tracks and analyzes activities in open public space for comprehensive assessment of public space quality. Disclosure of the Invention: To solve the above problems, the invention discloses a street quality evaluation method based on behavior recognition, tracking and analysis, which automatically recognizes, tracks and analyzes activities in open public space with accuracy at the decimeter level. The method obviously reduces the designer's time investment in site investigation and analysis at the initial stage of a project, thereby improving working efficiency. 
In order to achieve the above purpose, the technical scheme of the invention is as follows: a street quality evaluation method based on behavior recognition, tracking and analysis comprises the following steps. Step one, video acquisition, preprocessing and data input. A selected street is divided using a satellite map; considering the physical environment of each street segment, a suitable position and angle are chosen, and a high-altitude camera obliquely shoots the tracks of pedestrians, riders and motor vehicles in the street segment for video acquisition. The filmed site is mapped: four non-collinear mark points are selected in the site in combination with the site environment, any 5 of the distances among the four points are measured with an infrared range finder, and the distances are input into the data operation interface. The acquired video is preprocessed in Adobe After Effects, which performs frame dropping and anti-shake processing: the frame rate of the video clips is adjusted to 10, and for video acquired with an unmanned aerial vehicle, the stabilize-motion function of the tracker is used to track a prominent stationary object on the street in the video, realizing anti-shake processing. The program of the present invention includes a data operation interface written in the Python language. The interface is divided into four quadrants: the upper left quadrant is the perspective transformation operation area, the lower left quadrant is the data input area, the upper right quadrant is the perspective transformation preview area, and the lower right quadrant is the file management and operation area. 
In use, the video clip requiring preprocessing is first selected in the file management and operation area; the program displays the first frame of the video in the perspective transformation operation area, where the mouse can be pressed to drag the picture. By clicking the button for each point to be input in the data input area and then clicking in the operation area, the four points corresponding to the mapping link are selected clockwise in the picture; the four point coordinates of the actual plane space are then input in the data input area, or alternatively five point distances can be input and the program automatically converts the point distances into point coordinates. After 'perspective transformation' is clicked in the file management and operation area, a transformation matrix is obtained by calculation from the 8 points, and the perspective transformation preview area displays the picture under this matrix operation, so that the user can evaluate the transformation and improve its precision by reselecting points or correcting the mapping data. When the transformation meets the requirement, clicking 'start calculation' ends the data input link, and the subsequent calculation and analysis are performed automatically.
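The 8-unknown transformation matrix described above can be recovered from the 4 point correspondences by solving a linear system; a minimal self-contained sketch using plain Gaussian elimination (in practice a library routine such as OpenCV's getPerspectiveTransform performs this step; none of the code below is part of the claimed program):

```python
# Hedged sketch: solve the 8-unknown perspective (homography) matrix
#   u = (a*x + b*y + c) / (g*x + h*y + 1)
#   v = (d*x + e*y + f) / (g*x + h*y + 1)
# from 4 corresponding point pairs, then map further pixel points into
# the real street plane.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def perspective_matrix(src, dst):
    """The 8 coefficients (a..h) mapping the 4 src points onto the 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * u, -y * u]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -x * v, -y * v]); b.append(v)
    return solve_linear(A, b)

def apply_transform(H, x, y):
    """Map one pixel coordinate (x, y) through the recovered matrix."""
    a, b_, c, d, e, f, g, h = H
    w = g * x + h * y + 1
    return (a * x + b_ * y + c) / w, (d * x + e * y + f) / w
```

For example, mapping the unit square onto a square of side 2 sends the center (0.5, 0.5) to (1.0, 1.0).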