CN-122020341-A - Bus stop classification method based on payment data
Abstract
The invention relates to the technical field of station passenger flow analysis and discloses a bus station classification method based on payment data. The method comprises: obtaining payment data, line data and departure schedule data of buses; extracting the transaction time sequence of each vehicle on each bus line from the payment data; calculating the expected arrival time at each station on each bus line from the line data and the departure schedule data; matching passenger flow to stations according to each vehicle's transaction time sequence and the expected arrival times, to obtain, for each bus line, a matching result between passenger flow stations and transaction times; and performing cluster analysis based on time-series features of the matching results to obtain bus station classification results. The method solves the station matching errors caused by missing data and dynamic line adjustments in traditional methods, and improves the accuracy of bus station classification.
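The arrival-time estimation and station matching steps summarized above can be illustrated with a minimal Python sketch. It assumes that the expected arrival time is the departure time plus a cumulative travel time from the first station, and that each transaction is assigned to the station whose expected arrival time is nearest. The function names, the travel-time offsets and the sample timestamps are hypothetical and are not taken from the patent.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def expected_arrivals(departure, offsets_min):
    """Expected arrival per station: departure time plus cumulative minutes from the first station."""
    return [departure + timedelta(minutes=m) for m in offsets_min]


def match_to_stations(transactions, arrivals):
    """Return, for each transaction time, the index of the station with the closest expected arrival.

    `arrivals` must be sorted in ascending order (stations in running order).
    """
    matches = []
    for t in transactions:
        i = bisect_left(arrivals, t)
        # Only the neighbours around the insertion point can be the nearest arrival.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(arrivals)]
        matches.append(min(candidates, key=lambda j: abs(arrivals[j] - t)))
    return matches


# Hypothetical data for one vehicle on one line.
departure = datetime(2025, 1, 6, 7, 0)
offsets_min = [0, 4, 9, 15]  # assumed cumulative travel time to each station, in minutes
arrivals = expected_arrivals(departure, offsets_min)

transactions = [
    datetime(2025, 1, 6, 7, 0, 30),
    datetime(2025, 1, 6, 7, 3, 10),
    datetime(2025, 1, 6, 7, 4, 20),
    datetime(2025, 1, 6, 7, 8, 0),
    datetime(2025, 1, 6, 7, 16, 0),
]
matches = match_to_stations(transactions, arrivals)
print(matches)  # station index assigned to each transaction
```

In this sketch a tie between two neighbouring stations goes to the earlier station; the patent text does not state a tie-breaking rule.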
Inventors
- LIU YUNXI
- LI HAO
- ZHANG ZHAO
- LIU WENTAO
- WANG ZHENGWU
- WANG JIE
- XIANG JIAN
Assignees
- Changsha University of Science and Technology (长沙理工大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20251229
Claims (10)
- 1. A bus station classification method based on payment data, characterized by comprising the following steps: acquiring payment data, line data and departure schedule data of buses; extracting a transaction time sequence of each vehicle on each bus line according to the payment data; calculating the estimated arrival time at each station on each bus line according to the line data and the departure schedule data; matching passenger flow stations according to the transaction time sequence of each vehicle and the estimated arrival time at each station, to obtain a matching result of passenger flow stations and transaction times for each bus line; and performing cluster analysis based on time-series features according to the matching result of passenger flow stations and transaction times for each bus line, to obtain bus station classification results.
- 2. The bus stop classification method based on payment data according to claim 1, wherein calculating the estimated arrival time at each station on each bus line according to the line data and the departure schedule data comprises: extracting the inter-station travel time of each bus line at different times according to the line data; extracting the departure time of each vehicle on each bus line according to the departure schedule data; and adding the matched inter-station travel time to the departure time of each vehicle to obtain the estimated arrival time at each station.
- 3. The bus stop classification method based on payment data according to claim 2, wherein matching the inter-station travel time comprises: determining matching intervals of departure time according to the operating time range of buses on the bus line; matching each vehicle to the matching interval that contains its departure time; and using the inter-station travel time at the midpoint of that matching interval, together with the departure time of each vehicle, to obtain the inter-station travel time of each vehicle.
- 4. The bus stop classification method based on payment data according to claim 2, wherein extracting the inter-station travel time of each bus line at different times according to the line data comprises: converting station names into station geographic position information by geocoding; obtaining the bus route from the station geographic position information through bus route planning; extracting, from the bus route, the travel time, the inter-station distance and the number of stations apart for each bus line at different times; and extracting the inter-station travel time of each bus line at different times according to that travel time, inter-station distance and number of stations apart.
- 5. The bus stop classification method based on payment data according to claim 4, wherein extracting the inter-station travel time of each bus line at different times according to the travel time, the inter-station distance and the number of stations apart comprises: selecting the departure station as the starting station, and calculating the time interval between each station and the starting station by subtracting travel times between stations, to obtain the inter-station travel time of each station.
- 6. The bus stop classification method based on payment data according to claim 1, wherein matching passenger flow stations according to the transaction time sequence of each vehicle and the expected arrival time at each station, to obtain a matching result of passenger flow stations and transaction times for each bus line, comprises: with the objective of minimizing the difference between transaction time and expected arrival time, screening the transaction time sequence and the expected arrival times of the stations for the same vehicle ID, and matching each transaction time to the passenger flow station whose expected arrival time is closest, to obtain the matching result of passenger flow stations and transaction times.
- 7. The bus stop classification method based on payment data according to claim 1, wherein performing cluster analysis based on time-series features according to the matching result of passenger flow stations and transaction times for each bus line, to obtain bus station classification results, comprises: constructing a station-time point matrix according to the matching result of passenger flow stations and transaction times for each bus line, and converting the station-time point matrix into long-format time-series data comprising stations, timestamps and observation values; extracting multidimensional time-series features from the long-format time-series data; dynamically reducing the dimension of the multidimensional time-series features by principal component analysis; and clustering the dimension-reduced time-series features by K-means clustering to obtain the bus station classification results.
- 8. The bus stop classification method based on payment data according to claim 7, wherein the bus stop classification results comprise peak-dense, peak-balanced and remote-sparse bus stops.
- 9. The bus stop classification method based on payment data according to claim 1, wherein the payment data of the buses is obtained from one-card (transit smart card) data and mobile payment data.
- 10. The bus stop classification method based on payment data according to claim 1, wherein the acquired route data of the buses comprises geographic location data of the bus stops.
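The clustering steps recited in claim 7 can be illustrated with a minimal, standard-library-only Python sketch. It converts a station-time matrix to long format, derives three simple per-station features, standardizes them and runs a small K-means. Several parts are assumptions made for illustration only: the feature set (mean, standard deviation, peak share), the farthest-point initialization, the sample boardings, and the omission of the principal component analysis step, which is skipped here for brevity and would normally precede clustering.

```python
from statistics import mean, pstdev


def to_long(matrix, stations, slots):
    """Station x time-slot matrix -> long-format rows (station, timestamp, observation)."""
    return [(s, t, matrix[i][j]) for i, s in enumerate(stations) for j, t in enumerate(slots)]


def station_features(long_rows):
    """Assumed features per station: mean, standard deviation, share of the busiest slot."""
    series = {}
    for station, _, value in long_rows:
        series.setdefault(station, []).append(value)
    return {s: (mean(v), pstdev(v), max(v) / sum(v) if sum(v) else 0.0)
            for s, v in series.items()}


def standardize(rows):
    """Z-score each feature column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]


def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Lloyd's K-means with deterministic farthest-point initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(tuple(mean(col) for col in zip(*members)) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Hypothetical boardings per station (rows) and hour of day (columns).
stations = ["A", "B", "C", "D"]
slots = [7, 8, 12, 17, 18]
matrix = [
    [90, 120, 30, 110, 95],
    [80, 110, 25, 100, 90],
    [5, 6, 5, 6, 5],
    [4, 5, 6, 5, 4],
]

long_rows = to_long(matrix, stations, slots)
feats = station_features(long_rows)
points = standardize([feats[s] for s in stations])
classes = dict(zip(stations, kmeans(points, 2)))
print(classes)  # cluster label per station
```

With this toy data the two busy, peaked stations fall into one cluster and the two low-volume stations into the other. A fuller implementation would add the principal component analysis step and choose the number of clusters to match the station types named in claim 8.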
Description
Bus stop classification method based on payment data
Technical Field
The invention relates to the technical field of station passenger flow analysis, and in particular to a bus station classification method based on payment data.
Background
Passenger flow data is an important basis for classifying bus stops and analyzing passenger flow characteristics. In recent years, domestic researchers have paid increasing attention to bus passenger flow analysis using payment data. In the big data era in particular, researchers can obtain finer-grained passenger flow data from payment means such as smart cards, QR code payment and mobile payment. By analyzing this data, researchers can more accurately identify the passenger flow characteristics, distribution and trends of bus stops, and thereby support bus dispatching, route optimization and similar tasks. Traditional station classification and passenger flow analysis, however, are limited by survey methods and cost. They rely mainly on manual counting and questionnaire surveys, which yield small and highly random samples, so it is difficult to analyze and classify bus stations and their passenger flow characteristics accurately.
At present, traditional methods for bus station classification and passenger flow characterization based on passenger flow data mainly include cluster analysis, weighted regression, and combined qualitative and quantitative analysis. For example, Qiao Rui et al. proposed using cluster analysis methods such as hierarchical clustering to classify different types of bus stops, and obtained a reasonable classification and grading of stops based on the core functions the stops serve, the urban area in which they are located, and the influence of road class on the stops.
Pang Lei et al. proposed using three regression models, ordinary least squares (OLS), geographically weighted regression (GWR) and multi-scale geographically weighted regression (MGWR), to explore the factors influencing passenger flow at different types of stations and the extent of their influence. Han Xu et al. proposed K-Means cluster analysis on the daily passenger flow trends of different stations, dividing all rail transit stations into 5 types, and further studied the effect of factors such as land use, built environment and station type on morning- and evening-peak entry and exit passenger flow through a GWR model. Xia Xue et al. classified urban rail transit stations with the K-Means clustering algorithm and analyzed passenger flow characteristics based on the station classification. Based on automatic fare collection (AFC) data, Deng Pingxin et al. combined several validity indexes to determine the number of classes, and classified stations by combining qualitative and quantitative analysis using principal component analysis, K-means clustering and multiple linear regression. He Xin et al. proposed establishing a station function positioning database and obtained station classification results from a quantitative perspective by combining principal component analysis and cluster analysis. Li Xiangnan selected 11 factors relating to the stations' own characteristics, their environmental characteristics and so on as initial variables for cluster analysis, clustered on the extracted common factors with the K-means method, and finally divided the operating stations into five categories.
Fu Bofeng et al. comprehensively considered the traffic function and place characteristics of stations and classified rail stations by combining qualitative and quantitative analysis. However, bus card-swiping data is at present usually recorded by time only, and because GPS information is lacking, traditional methods cannot determine the boarding station that corresponds to a card-swiping record. They therefore cannot resolve the station matching errors caused by missing data and dynamic line adjustments, which in turn reduces the accuracy of bus station classification.
Disclosure of Invention
In view of the defects in the prior art, the invention provides a bus stop classification method based on payment data which, based on the geographic position and line information of bus stops, analyzes the origin-destination (OD) relationship of passenger trips using the payment data (including card-swiping and code-scanning records), and combines spatio-temporal feature analysis and a deep learning model,