CN-122022896-A - Large-model-driven urban rail newly-opened site flow prediction method
Abstract
A large-model-driven method for predicting passenger flow at newly opened urban rail stations, relating to the technical field of urban rail transit operation management. The method addresses the low prediction accuracy caused by the lack of historical passenger flow data at newly opened urban rail stations, as well as the poor interpretability of traditional models and their insufficient reasoning flexibility in complex urban scenes. The method comprises: first collecting and preprocessing multidimensional data related to passenger flow; then constructing a multidimensional reference system to retrieve similar references for the target station or, failing that, to extract macroscopic index references; determining the scene type through a rule engine, invoking the corresponding template, and filling in information to generate a structured prompt; inputting the prompt into a large language model, which follows a chain-of-thought reasoning template and outputs a quantized predicted value together with a natural-language reasoning chain; and finally verifying the result and optimizing and tuning the method based on feedback. The invention has good application prospects in the technical field of urban rail transit operation management.
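The rule-engine and template-filling step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Scene`, `Station`, `TEMPLATES`, `classify_scene`, `density_rule` and `build_prompt`, and the template wording, are hypothetical. The three scene categories are inferred from the two retrieval branches of step S2 and from claim 4, and the threshold of 3 similar references is the one stated in claim 4.

```python
from dataclasses import dataclass
from enum import Enum


class Scene(Enum):
    """Scene categories (hypothetical names)."""
    OPERATING_FEW_SAMPLES = "operating_few_samples"
    UNOPENED_WITH_REFERENCES = "unopened_with_references"
    UNOPENED_NO_REFERENCES = "unopened_no_references"


@dataclass
class Station:
    name: str
    type_label: str      # e.g. "commute", "mixed"
    is_operating: bool   # True once the station has opened


# Hypothetical prompt templates, one per scene category.
TEMPLATES = {
    Scene.OPERATING_FEW_SAMPLES:
        "Station {name} ({type_label}) is operating. Observed samples: {context}.",
    Scene.UNOPENED_WITH_REFERENCES:
        "Station {name} ({type_label}) is not yet open. "
        "Similar references: {context}. Density rule: {density_rule}.",
    Scene.UNOPENED_NO_REFERENCES:
        "Station {name} ({type_label}) is not yet open and has no similar "
        "reference. Macroscopic indices: {context}.",
}


def classify_scene(station, similar_refs):
    """Pick the scene from the operating state and the S2 retrieval result."""
    if station.is_operating:
        return Scene.OPERATING_FEW_SAMPLES
    if similar_refs:
        return Scene.UNOPENED_WITH_REFERENCES
    return Scene.UNOPENED_NO_REFERENCES


def density_rule(num_refs):
    """More than 3 similar references: diversion; otherwise induced flow."""
    return "diversion effect" if num_refs > 3 else "induced passenger flow effect"


def build_prompt(station, similar_refs, context):
    """Fill the template matching the scene; unused fields are ignored."""
    scene = classify_scene(station, similar_refs)
    return TEMPLATES[scene].format(
        name=station.name,
        type_label=station.type_label,
        context=context,
        density_rule=density_rule(len(similar_refs)),
    )
```

For example, `build_prompt(Station("X", "commute", False), ["A", "B"], "A: 12000/day")` selects the not-yet-open, with-references template and marks the induced passenger flow effect, because only two similar references were found.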
Inventors
- ZHOU YONG
- LUO JIACHEN
- SONG JIAHUA
- CHEN ZHENWU
- ZHANG XIAOYONG
- HU FANGQIAO
Assignees
- 深城交科技集团股份有限公司
- 深圳市交通科学研究院有限公司
Dates
- Publication Date: 20260512
- Application Date: 20260414
Claims (7)
- 1. A large-model-driven urban rail newly opened station flow prediction method, characterized by comprising the following steps: S1, collecting and preprocessing multidimensional data related to station passenger flow; S2, constructing a multidimensional reference system of grid space, station type, and macroscopic index for a target newly opened station, and searching the spatial geographic range of the target newly opened station for existing stations bearing the same type label as the target newly opened station to serve as similar references; if such stations exist, extracting their daily average entry and exit volumes within 30 days for at least one characteristic day among a normal working day, a normal rest day, the day before a holiday, and the holiday period; if not, extracting the static urban environment data of the spatial geographic range of the target newly opened station as a macroscopic index reference; S3, invoking a prompt template based on a rule engine, wherein the rule engine automatically determines the scene category of the target newly opened station from its operating state and the retrieval result of S2, and invokes the preset prompt template corresponding to that scene category; S4, large-model-driven reasoning and prediction generation, namely inputting the prompt from S3 into a large language model, guiding the large language model to analyze according to a preset chain-of-thought reasoning template, and outputting a prediction result comprising a quantized predicted value and a natural-language reasoning chain; and S5, verifying the prediction result of S4 and optimizing the prediction method based on the verification feedback.
- 2. The large-model-driven urban rail newly opened station flow prediction method according to claim 1, wherein in S1 the multidimensional data specifically comprises: spatial base data, comprising urban geographic information grids divided at a preset size and the longitude and latitude coordinates of the newly opened stations and all existing stations; station characteristic data, comprising a type label assigned to each existing station based on its historical passenger flow pattern, basic attributes including the line, the transfer attribute, and the number of entrances and exits of the station, and the historical entry and exit flow data of the existing stations; urban environment cross-domain data, comprising static data and dynamic data, wherein the static data comprises the number of points of interest of each type, population density, and employment density mapped to the grid, and the dynamic data comprises at least calendar information, weather information, and event information; and knowledge vector base data, constructed by vectorizing expert experience and research report text.
- 3. The large-model-driven urban rail newly opened station flow prediction method according to claim 2, wherein in S2 the spatial geographic range is a 2 km by 2 km urban geographic information grid, and the station type label is obtained by cluster analysis of the historical passenger flow patterns of existing stations and comprises at least one of commuter type, commercial and entertainment type, transport hub type, residential type, and mixed type.
- 4. The large-model-driven urban rail newly opened station flow prediction method according to claim 3, wherein in S3 the scene categories comprise: the new station is already operating with a small amount of sample data; the new station has not opened and has similar references; and the new station has not opened and has no similar reference; when the new station has not opened and has similar references, the corresponding prompt template contains a station density passenger flow rule, under which a diversion effect is marked when the number of similar references is greater than 3, and an induced passenger flow effect is marked when the number of similar references is less than or equal to 3.
- 5. The large-model-driven urban rail newly opened station flow prediction method according to claim 4, wherein in S4 the chain-of-thought reasoning template is a causal analysis template applied after the large language model has been assigned the role of a traffic prediction expert, and the analysis steps comprise assessing, in sequence: station attributes, reference values, passenger flow trends, holidays, special weather, large events, operational and policy adjustments, newly built POIs, station-specific influences, and the superposition or offsetting of multiple factors.
- 6. The large-model-driven urban rail newly opened station flow prediction method according to claim 5, wherein in S4 the quantized predicted value is output as CSV text with a header row, and the natural-language reasoning chain is separated from the quantized predicted value by a preset marker and output together with it.
- 7. The large-model-driven urban rail newly opened station flow prediction method according to claim 6, wherein in S5 the verification consists of cross-validating the prediction result of the large language model against the result of a traditional regression model, with manual review triggered when the difference is significant; the optimization and tuning comprise adding review feedback to the knowledge vector base and optimizing the templates; and after the new station begins operating, real passenger flow data is input into the large language model by way of few-shot learning.
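The output format of claim 6 and the cross-validation of claim 7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation. The marker string `###REASONING###`, the ordering (CSV first, reasoning chain second), the column names in the usage example, and the 30 % relative tolerance are all hypothetical. The claims state only that a preset marker exists and that a "significant" difference triggers manual review.

```python
import csv
import io

# Hypothetical preset marker separating the CSV block from the reasoning chain.
REASONING_MARKER = "###REASONING###"


def parse_model_output(raw):
    """Split raw model text into CSV prediction rows and the reasoning chain.

    Assumes the CSV block (with header row) precedes the marker and the
    natural-language reasoning chain follows it.
    """
    csv_part, _, reasoning = raw.partition(REASONING_MARKER)
    rows = list(csv.DictReader(io.StringIO(csv_part.strip())))
    return rows, reasoning.strip()


def needs_manual_review(llm_value, regression_value, rel_tol=0.3):
    """Return True when the two predictions differ by more than rel_tol.

    rel_tol is an assumed threshold; the source does not quantify
    what counts as a significant difference.
    """
    if regression_value == 0:
        return llm_value != 0
    return abs(llm_value - regression_value) / abs(regression_value) > rel_tol
```

A usage example with made-up numbers:

```python
raw = (
    "day_type,entries,exits\n"
    "workday,12000,11500\n"
    "###REASONING###\n"
    "Commuter station; two similar references nearby."
)
rows, reasoning = parse_model_output(raw)
flag = needs_manual_review(float(rows[0]["entries"]), 8000.0)
```

Keeping the numeric block machine-readable while keeping the reasoning chain as free text lets the verification step compare the numbers automatically and pass the explanation to a human reviewer only when the check fails.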
Description
Large-model-driven urban rail newly-opened site flow prediction method
Technical Field
The invention relates to the technical field of urban rail transit operation management, and in particular to a large-model-driven method for predicting passenger flow at newly opened urban rail stations.
Background
The opening of new urban rail transit lines and stations has become a normal part of urban development. The passenger flow forecast for a newly opened station is an important basis for operational work such as capacity allocation and train operation organization. The core difficulty is the lack of historical passenger flow data: traditional prediction methods are hard to apply directly, so prediction accuracy is insufficient, which limits how scientific and fine-grained operation organization can be.
The prior art mainly attempts to solve this problem with three types of methods. The first is the virtual passenger flow allocation method based on line similarity calculation, which retrieves similar lines or stations and migrates their passenger flow data. The second is the multi-source data prediction method based on a BP neural network, which mines associations in the data through the network's nonlinear mapping capability; however, the model is a black box, its reasoning process lacks interpretability, and its results are difficult to understand and use. The third is the attribute association prediction method based on historical data clustering, which constructs a fixed statistical relationship between station attributes and passenger flow; it depends heavily on historical data, its reasoning logic is rigid, and its predictions easily fail when no similar station is available as a reference.
The prior art has not broken through the core technical bottlenecks: it suffers from data cold start, poor model interpretability, and insufficient reasoning flexibility in complex scenes; it lacks an effective closed-loop optimization mechanism; it cannot adapt to the prediction requirements of atypical scenes such as new urban districts and special functional zones; and it struggles to meet the practical needs of refined rail transit operation. A passenger flow prediction method for newly opened stations with stronger adaptability and higher accuracy is therefore needed.
Disclosure of Invention
To solve the technical problems of low prediction accuracy caused by the lack of historical passenger flow data at newly opened urban rail stations, poor interpretability of traditional models, and insufficient reasoning flexibility in complex urban scenes, the invention provides a large-model-driven urban rail newly opened station flow prediction method, comprising the following steps:
S1, collecting and preprocessing multidimensional data related to station passenger flow;
S2, constructing a multidimensional reference system of grid space, station type, and macroscopic index for a target newly opened station, and searching the spatial geographic range of the target newly opened station for existing stations bearing the same type label as the target newly opened station to serve as similar references; if such stations exist, extracting their daily average entry and exit volumes within 30 days for at least one characteristic day among a normal working day, a normal rest day, the day before a holiday, and the holiday period; if not, extracting the static urban environment data of the spatial geographic range of the target newly opened station as a macroscopic index reference;
S3, invoking a prompt template based on a rule engine, wherein the rule engine automatically determines the scene category of the target newly opened station from its operating state and the retrieval result of S2, and invokes the preset prompt template corresponding to that scene category;
S4, large-model-driven reasoning and prediction generation, namely inputting the prompt from S3 into a large language model, guiding the large language model to analyze according to a preset chain-of-thought reasoning template, and outputting a prediction result comprising a quantized predicted value and a natural-language reasoning chain; and
S5, verifying the prediction result of S4 and optimizing the prediction method based on the verification feedback.
Further, in S1, the multidimensional data specifically comprises:
spatial base data, comprising urban geographic information grids divided at a preset size and the longitude and latitude coordinates of the newly opened stations and all existing stations;
station characteristic data, comprising a type label assigned to each existing station based on its historical passenger flow pattern, basic attributes including the line, the transfer attribute, and the number of entrances and exits of the station, and the historical entry and exit flow data of the existing stations;
urban environment cross-domain data, comprising static data and dynamic data, wherein the static data comprises the number of points of interest of each type, population density and employment p