CN-115796942-B - Clustering-based shared bicycle space-time layout determining method

Abstract

The invention discloses a clustering-based method for determining the space-time layout of shared bicycles, and belongs to the technical field of intelligent transportation. First, the current state of the public bicycle system is systematically analyzed to understand its characteristics, and buffer zones are established using ArcMap software. Second, fuzzy C-means cluster analysis is used to cluster the shared bicycle stations according to their daily temporal variation, yielding the activity characteristics of each cluster. Finally, a POI-based regression analysis with a binary Logit model yields the location characteristics of the shared bicycle parking layout. The invention provides an effective basis for travelers to make better travel decisions and distributes public bicycle demand reasonably, thereby improving the traffic environment.

Inventors

MEI ZHENYU
GONG JINRUI
ZHANG HONGYANG
CAO YUEXIAN
TANG WAI

Assignees

Zhejiang University (浙江大学)

Dates

Publication Date: 20260505
Application Date: 20221108

Claims (6)

1. A clustering-based shared bicycle space-time layout determining method, characterized by comprising the following steps:
S1, analyzing and processing shared bicycle travel data: taking the inbound and outbound times of shared bicycles as basic data, and establishing traffic areas centered on subway stations and bus stations; establishing buffer zones using ArcMap software, namely establishing a study area with a radius of h meters centered on each station; dividing the analysis period into weekdays and weekends, and performing a spatial intersection analysis of the shared bicycle inbound and outbound events with the station buffer zones to obtain all inbound or outbound data streams, the morning-peak inbound or outbound data streams, and the evening-peak inbound or outbound data streams; dividing the data of all shared bicycles entering and leaving the public bicycle sharing system (BSS) into 8 groups, and integrating the inbound or outbound data into three sets of overall analysis data; calculating, from all the obtained data, the average flow per g minutes of the shared bicycle inbound or outbound data for all BSS stations; and, based on the average flow per g minutes of all BSS inbound or outbound data, calculating the ratios of the morning-peak and evening-peak inbound or outbound average flows per g minutes to that overall average, recorded as the inbound or outbound morning-peak ratio and evening-peak ratio, and calculating the ratio of the corresponding morning-peak flow to the evening-peak flow;
S2, obtaining the activity characteristics of each cluster by fuzzy C-means cluster analysis: clustering the shared bicycle stations according to their daily temporal variation using a fuzzy C-means clustering algorithm; after presetting the number of clusters, selecting three indicators, namely the average flow per g minutes of all inbound or outbound data, the inbound or outbound morning-peak ratio, and the inbound or outbound evening-peak ratio, as the objects of the dissimilarity measure, using the squared error as the dissimilarity value, and modeling the 8 groups; selecting the silhouette coefficient method as the main basis for determining the number of clusters of each group, and determining the number of clusters of each group of data in combination with the sum of squared errors; statistically analyzing each station using the all-day average number of inbound/outbound trips per g minutes, the morning-peak inbound/outbound ratio, and the evening-peak inbound/outbound ratio, and dividing clusters by the daily flow volume and timing of shared bicycles using the C-means clustering algorithm based on minimization of a weighted squared-error function, thereby classifying each station by the cluster in which it has the highest membership and obtaining the activity characteristics of the shared bicycle clusters;
S3, performing a POI-based regression analysis of shared bicycle location features: acquiring geographic information data within h meters of each subway station and within h meters of each bus station, building a data acquisition model in Python through the data cloud platform of the Amap (Gaode Map) open platform, and acquiring the points of interest (POI) within h meters of each station; using a binary logistic regression model on the continuous membership values produced by the C-means clustering method to explore the determinants of cluster membership while controlling for spatial autocorrelation among stations; and analyzing, through the binary logistic regression model, the spatial correlation between the number of each cluster in each group of cluster data and the number of the base cluster, to obtain the relationship between location features and the shared bicycle parking layout;
wherein, in step S2, the fuzzy C-means clustering algorithm iteratively optimizes an objective function to obtain the membership of each sample point in each cluster center, and takes the highest membership as the basis for classification, thereby automatically classifying the sample data; the clustering algorithm aims to minimize the following objective function:
$$J = \sum_{i}\sum_{k=1}^{c} u_{ik}^{m}\, L(x_i, x_k)$$
where c is the preset number of clusters; m is the fuzziness exponent, a parameter that determines the influence of outliers on the cluster centroid values; $u_{ik}$ is the membership value of the i-th station in the k-th cluster, where for a given station the membership values over all clusters sum to 1; $x_i$ and $x_k$ denote the i-th object value and the k-th cluster centroid value, respectively; and $L(x_i, x_k)$ is the dissimilarity measure between the i-th object and the k-th cluster centroid; the residual sum of squares is used as the dissimilarity measure, and the clustering is optimal when the membership values satisfy the following condition:
$$u_{ik} = \frac{1}{\sum_{j=1}^{c}\left(\frac{L(x_i, x_k)}{L(x_i, x_j)}\right)^{\frac{1}{m-1}}}$$
2. The clustering-based shared bicycle space-time layout determining method of claim 1, wherein in step S1, before data processing, the analysis conditions are determined, including the number of valid subway stations, the number of valid bus stations, a set number of randomly selected ordinary stations, and the morning-peak and evening-peak periods; and in calculating the average flow per ten minutes of all shared bicycle inbound or outbound data, the flow between 00:30 and 04:30 is ignored, and the remaining 20 hours of the day are taken as the effective period.
3. The clustering-based shared bicycle space-time layout determining method of claim 1, wherein the 8 groups in step S1 are: weekday outbound data around subway stations, weekday inbound data around subway stations, weekday outbound data around bus stations, weekday inbound data around bus stations, weekend outbound data around subway stations, weekend inbound data around subway stations, weekend outbound data around bus stations, and weekend inbound data around bus stations.
4. The clustering-based shared bicycle space-time layout determining method of claim 1, wherein in step S2 the core indicator of the silhouette coefficient method is the silhouette coefficient, and the silhouette coefficient of a sample point $X_i$ is defined as follows:
$$S(X_i) = \frac{b - a}{\max(a, b)}$$
where a is the average distance between $X_i$ and the other samples in the same cluster, called the cohesion, and b is the average distance between $X_i$ and all samples in the nearest cluster, called the separation; the nearest cluster is defined as follows:
$$C_j = \arg\min_{C_k} \frac{1}{|C_k|}\sum_{p \in C_k} \lVert p - X_i \rVert$$
where p is a sample in a cluster $C_k$; that is, the average distance between $X_i$ and all samples in a cluster is taken as the distance between the point and that cluster, and the cluster closest to $X_i$ is selected as the nearest cluster; the average silhouette coefficient is obtained by calculating the silhouette coefficients of all samples according to the formula and averaging them, its value range is [-1, 1], and the k with the maximum average silhouette coefficient is the optimal number of clusters.
5. The clustering-based shared bicycle space-time layout determining method of claim 1, wherein in step S3 the binary logistic regression model is established as follows:
$$\operatorname{Logit}(P) = \ln\frac{P}{1-P} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots$$
In the above formula, $\operatorname{Logit}(P)$ follows a binary logistic distribution, P represents the probability that the event occurs, and 1-P represents the probability that it does not occur; $X_1, X_2, \ldots$ on the right side of the equal sign are the research objects or influence factors, specifically the points of interest of each POI type; $\beta_0$ is the constant term, and $\beta_1, \beta_2, \ldots$ are the influence coefficients, which represent the degree of influence of each factor on the research data.
6. The method of claim 5, wherein the POIs comprise schools, residential areas, dining services, sports venues, shopping services and corporate enterprises.
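
The fuzzy C-means iteration referred to in claim 1 can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the patented implementation: the function name `fuzzy_c_means`, the default fuzziness m = 2, the random initialization, the stopping tolerance, and the sample feature values are all hypothetical and chosen only for illustration. The dissimilarity is the squared Euclidean distance, matching the squared-error measure named in the claim.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Minimal fuzzy C-means with squared Euclidean dissimilarity.

    X: (n, d) array of station features; c: number of clusters;
    m: fuzziness exponent (> 1). Returns (centroids, memberships).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships (an assumption); each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids are membership-weighted means of the samples.
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared distance of every sample to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d2_ik ** (-1 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centroids, U

# Hypothetical station features:
# [average flow per 10 min, morning-peak ratio, evening-peak ratio]
features = np.array([
    [12.0, 2.1, 0.8],
    [11.5, 2.0, 0.9],
    [3.0, 0.7, 1.9],
    [2.5, 0.8, 2.2],
])
centroids, U = fuzzy_c_means(features, c=2)
# Each station is assigned to the cluster with its highest membership.
labels = U.argmax(axis=1)
```

In practice the three indicators would likely need to be put on a comparable scale before clustering, since the raw flow values otherwise dominate the squared distance; the claims do not say whether or how this is done.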

Description

Clustering-based shared bicycle space-time layout determining method

Technical Field

The invention belongs to the technical field of intelligent transportation, and particularly relates to a clustering-based shared bicycle space-time layout determining method.

Background

In recent years, the rapid growth in the number of private cars has caused problems such as traffic congestion, and developing public transportation is regarded as an ideal countermeasure. Guiding drivers to shift from car-only travel to combined "car + public transportation" travel, and eventually to rely entirely on public transportation, has become a focus of current attention. Meanwhile, travelers often choose combined trips involving transfers among multiple modes, either to exploit the advantages of the various modes and minimize their overall cost, or because a single mode cannot complete the trip. Shared bicycles offer a reasonable solution to the "first kilometer" and "last kilometer" problems.

At present, the domestic public bicycle industry is developing rapidly, and public bicycle sharing systems (BSS) continue to expand across China, but the challenges in their design, deployment, management and operation are also growing. The spatiotemporal distribution of BSS supply-and-demand patterns and usage patterns needs to be understood comprehensively, so that knowledge of the system can be refined through a series of models. Conventional models treat the inbound and outbound analyses of BSS stations separately, ignoring the close relationship between the inbound and outbound (arrival and departure) flows of each station; current modeling work also often ignores interactions between neighboring stations, which leads to biased, inconsistent, or inefficient estimates.
Disclosure of Invention

To address the shortcomings of existing models and analysis techniques, the invention provides a clustering-based shared bicycle space-time layout determining method. The invention specifically comprises the following steps:

1. Shared bicycle travel data analysis and processing

The current state of the public bicycle system is systematically analyzed to understand the characteristics of the existing system, and a spatiotemporal distribution analysis is performed on the acquired public bicycle travel data. Traffic areas centered on subway stations and bus stations are established, taking the inbound and outbound times of shared bicycles as basic data. Before processing, the analysis conditions are determined: the number of valid subway stations, the number of valid bus stations, a certain number of randomly selected ordinary BSS stations, and the morning-peak and evening-peak periods. Before the fuzzy clustering model of the next step is run, buffer zones are established using ArcMap software (a study area with a radius of 300 meters centered on each station), the analysis period is divided into weekdays and weekends, and a spatial intersection analysis is performed between the shared bicycle inflow and outflow events and the buffer zones of the two types of stations, yielding all inflow/outflow data streams, the morning-peak inflow/outflow data streams, and the evening-peak inflow/outflow data streams.
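
The buffer and time-period split described above can be sketched in plain Python. This is only an illustrative stand-in for the ArcMap workflow, under several assumptions that the text does not state: events and stations are given as WGS84 latitude/longitude, an event that falls in several buffers is assigned to the nearest station, and the peak windows (07:00 to 09:00 and 17:00 to 19:00) are placeholders. The names `haversine_m` and `classify_event`, the station ID, and the sample coordinates are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000.0

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def classify_event(event, stations, radius_m=300.0,
                   am_peak=(7, 9), pm_peak=(17, 19)):
    """Return (station_id, day_type, period) for one inbound/outbound
    event, or None if it lies outside every station buffer."""
    # Assumption: pick the nearest station whose buffer contains the event.
    best_id, best_d = None, radius_m
    for sid, (lat, lon) in stations.items():
        d = haversine_m(event["lat"], event["lon"], lat, lon)
        if d <= best_d:
            best_id, best_d = sid, d
    if best_id is None:
        return None
    t = event["time"]
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    day_type = "weekend" if t.weekday() >= 5 else "weekday"
    if am_peak[0] <= t.hour < am_peak[1]:
        period = "am_peak"
    elif pm_peak[0] <= t.hour < pm_peak[1]:
        period = "pm_peak"
    else:
        period = "off_peak"
    return best_id, day_type, period

# Hypothetical station and events for illustration only.
stations = {"metro_A": (30.2600, 120.1300)}
e1 = {"lat": 30.2610, "lon": 120.1300, "time": datetime(2022, 11, 8, 8, 15)}
e2 = {"lat": 30.2700, "lon": 120.1300, "time": datetime(2022, 11, 8, 8, 15)}
e3 = {"lat": 30.2610, "lon": 120.1300, "time": datetime(2022, 11, 12, 18, 0)}
results = [classify_event(e, stations) for e in (e1, e2, e3)]
```

Grouping the classified events by station, day type, and direction then gives the all-day, morning-peak, and evening-peak streams that the later steps aggregate.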
The data of all shared bicycles entering and leaving the BSS are divided into 8 groups: weekday outbound data around subway stations, weekday inbound data around subway stations, weekday outbound data around bus stations, weekday inbound data around bus stations, weekend outbound data around subway stations, weekend inbound data around subway stations, weekend outbound data around bus stations, and weekend inbound data around bus stations. The inbound/outbound data are integrated into three sets of overall analysis data. From all the obtained data, the average flow per ten minutes is calculated for the shared bicycle inbound/outbound data of all BSS stations, which facilitates analysis and clustering. In calculating the average flow per ten minutes of all shared bicycle inbound/outbound data, the flow between 00:30 and 04:30 is considered negligible, and the remaining 20 hours of the day are taken as the effective period. Based on the average flow per ten minutes of all BSS inbound/outbound data, the ratios of the morning-peak and evening-peak inbound/outbound average flows per ten minutes to that overall average are calculated and recorded as the inbound/outbound morning-peak ratio and evening-peak ratio, and the ratio of the corresponding morning-peak flow to the evening-peak flow is also calculated.

2. Obtaining the