CN-121979922-A - Method for constructing a spatial semantic retrieval agent for smart expressways


Abstract

The invention relates to a method for constructing a spatial semantic retrieval agent for smart expressways. The method comprises: constructing an entity word segmentation library from multi-source expressway traffic spatial information data; converting the names from different sources and the coordinates of each entity in the library into vectors to build a vector library; and building an agent workflow in which a large language model acquires the user's query information, entity information is extracted using the word segmentation library to obtain an entity recognition candidate set, vector representations of the entities are retrieved from the vector library based on the candidate set while entity relations are retrieved and reasoned over in a knowledge graph, the candidate set together with the information obtained from the vector library and the knowledge graph is taken as background information for spatial retrieval, the background information is converted into a heterogeneous engine scheduling instruction set, and a dynamic execution routing graph for multi-dimensional spatial queries is invoked to obtain the retrieval result. The method handles spatial information retrieval efficiently, eliminates ambiguity in entity recognition, and supports complex spatial query processes.

Inventors

HU XINYU
LI JIEYANG
XIE JIE
BIAN JIAJIA
QIU LING

Assignees

南京感动科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260409

Claims (10)

1. A method for constructing a spatial semantic retrieval agent for smart expressways, characterized by comprising the following steps: constructing an entity word segmentation library based on multi-source expressway traffic spatial information data, and constructing a pre-segmenter based on that library; converting the names from different sources and the geographic coordinates of each entity in the entity word segmentation library into vectors, and storing the vectors together with the encoded entity coordinates in a vector library; constructing a four-layer dynamic knowledge graph consisting of an entity layer, a relation layer, a business rule layer, and a real-time road network state layer; building an agent workflow to perform spatial retrieval tasks, comprising: acquiring user query information by a large language model, invoking the spatial retrieval agent, and extracting entity information from the query information with the pre-segmenter to obtain an entity recognition candidate set; generating, by the large language model, a heterogeneous engine scheduling instruction set from the background information combined with predefined expressway traffic spatial rules, wherein the instruction set comprises a plurality of basic instruction elements that are computationally isolated from one another and carry execution-engine attribution labels, each basic instruction element corresponding to a preset retrieval logic; and converting the instruction set into a dynamic execution routing graph for multi-dimensional spatial queries, and, with a basic instruction element as the intent root node, triggering the corresponding underlying physical engine to perform the computation and generate the final retrieval result.
2. The method of claim 1, wherein the multi-source expressway traffic spatial information data includes at least spatial information data from the construction period and spatial information data from the operation period; credibility parameters are assigned to the expressway traffic spatial information data from different sources, and fusion weights for the different source names of the same entity are constructed based on the number of semantic information fields the entity contains and the credibility of the data source, the fusion weights being used to rank the priority of entity names in the word segmentation library.
3. The method of claim 1, wherein, when extracting an entity name, the pre-segmenter marks the position of the entity name in the query information and records it as the position code corresponding to that entity name, which serves as a common spatial feature parameter for the subsequent calculation of entity relationship association degree.
4. The method of claim 1, wherein the entity word segmentation library is preprocessed for entity extraction, comprising: normalizing the entity names and unifying the direction identifiers; calculating a final word frequency with a weighted average algorithm according to how often the entity occurs in the different data sources; performing part-of-speech tagging according to entity type; forming the preprocessed data into triples of entity name, word frequency, and part of speech; and, starting from each position of the user query information, performing reverse-order matching against the triple-structured entity word segmentation library with a sliding window bounded by a preset maximum matching length.
5. The method of claim 1, wherein extracting the vector representation of the entity from the vector library comprises: performing spatial range screening based on the encoded coordinates; within the distance range obtained from the spatial range screening, calculating vector similarity and screening the matching vectors; and re-ranking the retrieval results by relevance to obtain the final result.
6. The method of claim 1, wherein the basic instruction elements comprise attribute retrieval, topology retrieval, range retrieval, path retrieval, and compound nested retrieval that defines a combined instruction; the topology retrieval has preset topology direction and driving direction parameters; and the path retrieval has preset path type and driving direction parameters.
7. The method of claim 1, wherein the dynamic execution routing graph comprises: a spatial topology operator node, serving as a direct child node of the intent root node, which attaches a direction pointer to the incoming data; an entity node, serving as a bottom-level leaf node under the operator node, which stores the static identifier of the entity; or a dynamic state constraint node, serving as a conditional branch of the intent root node, which contains real-time traffic state parameters that the physical execution layer uses to filter dynamic road network data.
8. The method of claim 1, further comprising constructing a three-layer index system for spatial indexing, comprising: a semantic similarity index, which builds a graph index from the entity name vectors and performs semantic similarity retrieval; a topological relation index, which builds for each entity a relation vector based on road network connectivity and performs semantic retrieval of topological relations; and a weighted model built from the similarity between the query vector and the entity vector, the topological relation score of the entity, and the business rule matching degree of the entity, which calculates an entity score and retrieves on that score, wherein the topological relation score of the entity is obtained from the spatial distance between entities and the topological relation type, and the spatial distance is calculated from the coordinate vectors.
9. The method of claim 1, further comprising building a multi-level context storage architecture that stores the session records generated by user queries separately by level and invokes them when a user queries multiple times, comprising: a session-level context, formed from the retrieval history of a given user, which stores the complete history of the user's session; an entity-level context, bound to the session-level context, which caches detailed entity information based on the user retrieval history in the session-level context; a relation-level context, bound to the session-level context, which records the spatial topological relations among entities based on the user retrieval history in the session-level context; and a state-level context, which stores real-time road network state information, is shared among multiple users, and is invoked on demand when any user queries.
10. The method of claim 1, further comprising presetting a GIS action instruction protocol and outputting the retrieval result as a nested action structure, wherein each action comprises an action type and result data; and applying different processing strategies to different action types, including text display, map annotation, and integrated map display.
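
Illustrative sketch for claims 3 and 4 (not part of the claims). The reverse-order, sliding-window matching against a triple-structured library might look roughly like the following. The names `Entry`, `build_lexicon`, and `reverse_max_match`, the sample entries, and their word frequencies are hypothetical; the patent does not specify an implementation, and character-level windows are an assumption.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One triple in the segmentation library: (name, word frequency, part of speech)."""
    name: str
    freq: float
    pos: str


def build_lexicon(entries):
    """Index the triples by entity name for constant-time lookup."""
    return {e.name: e for e in entries}


def reverse_max_match(query, lexicon, max_len=8):
    """Scan the query from its end, trying the longest window first.

    Returns (start_offset, Entry) pairs in left-to-right order. The start
    offset plays the role of the position code described in claim 3.
    """
    matches = []
    end = len(query)
    while end > 0:
        matched = False
        # Shrink the window from the preset maximum length down to 1 character.
        for size in range(min(max_len, end), 0, -1):
            start = end - size
            piece = query[start:end]
            if piece in lexicon:
                matches.append((start, lexicon[piece]))
                end = start  # continue scanning to the left of the match
                matched = True
                break
        if not matched:
            end -= 1  # no entity ends here; move the window one character left
    matches.reverse()
    return matches


if __name__ == "__main__":
    lexicon = build_lexicon([
        Entry("Taizhou Bridge", 12.0, "bridge"),  # hypothetical sample data
        Entry("Bridge", 3.0, "bridge"),
    ])
    for offset, entry in reverse_max_match("is Taizhou Bridge open", lexicon, max_len=16):
        print(offset, entry)
```

Because the longest window is tried first, the full name "Taizhou Bridge" is preferred over the shorter entry "Bridge" that ends at the same position.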

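Illustrative sketch for claim 5 (not part of the claims). The three stages of spatial screening on encoded coordinates, vector-similarity filtering, and re-ranking could be arranged as below. Everything specific here is an assumption: the grid-cell code stands in for the unspecified coordinate encoding, the similarity threshold and the relevance formula `similarity / (1 + distance_km)` are placeholders, and all names and sample records are hypothetical.

```python
import math


def cell_code(lat, lon, size_deg=0.1):
    """Assumed coordinate encoding: index of a fixed-size latitude/longitude grid cell."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def make_record(name, lat, lon, vec):
    """Store the vector together with the coordinate code, as in the vector library."""
    return {"name": name, "lat": lat, "lon": lon, "vec": vec, "cell": cell_code(lat, lon)}


def retrieve(query_vec, lat, lon, records, min_sim=0.5, top_k=3):
    qi, qj = cell_code(lat, lon)
    # Stage 1: spatial screening, keeping records in the same or an adjacent cell.
    nearby = [r for r in records
              if abs(r["cell"][0] - qi) <= 1 and abs(r["cell"][1] - qj) <= 1]
    # Stage 2: vector-similarity screening within the spatially screened set.
    scored = [(cosine(query_vec, r["vec"]), r) for r in nearby]
    scored = [(s, r) for s, r in scored if s >= min_sim]

    # Stage 3: re-rank by an assumed relevance that discounts similarity by distance.
    def relevance(item):
        sim, rec = item
        return sim / (1.0 + haversine_km(lat, lon, rec["lat"], rec["lon"]))

    scored.sort(key=relevance, reverse=True)
    return [r["name"] for _, r in scored[:top_k]]


if __name__ == "__main__":
    records = [  # hypothetical sample data
        make_record("Service Area A", 32.03, 118.83, [1.0, 0.0]),
        make_record("Toll Station B", 32.05, 118.85, [0.9, 0.1]),
        make_record("Service Area C", 35.04, 120.03, [1.0, 0.0]),
        make_record("Hub D", 32.04, 118.84, [0.0, 1.0]),
    ]
    print(retrieve([1.0, 0.0], 32.03, 118.83, records))
```

Running the cheap cell comparison first keeps the similarity computation confined to a small candidate set, which is the apparent motivation for ordering the stages this way.
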
Description

Method for constructing a spatial semantic retrieval agent for smart expressways

Technical Field

The invention belongs to the technical field of intelligent transportation in the digital economy, and particularly relates to a method for constructing a spatial semantic retrieval agent for smart expressways.

Background

With the rapid development of the expressway corridor economy and intelligent traffic technology, spatial retrieval has become important in smart expressway management, operation, monitoring, and related fields. In the context of the state-driven digital transformation of expressways in particular, users and administrators need to obtain static geographic information quickly (such as the location of a toll station or service area), and further want to obtain complex dynamic topological relation information quickly (such as the nearest service area or the upstream hubs). Current spatial retrieval techniques do not fully meet these needs, and fall short especially in semantic understanding and in computing dynamic, complex topological relations.

Commonly used spatial retrieval schemes have three significant drawbacks:

1. Semantic gap between query and retrieval. Existing GIS systems rely on SQL/GeoJSON and similar database interfaces to perform spatial queries, so natural language must be converted manually into a structured language the database can execute (for example, "service areas along the route" corresponds to a topological relation query plus a range query on the service area category). Semantic understanding that relies directly on a large language model is limited in capability and places very high demands on the quality of the proprietary entity name corpus. The generated SQL may also contain errors introduced during conversion, which makes actual execution inefficient.

2. Ambiguity in entity recognition. Current entity recognition algorithms suffer from ambiguity. For example, in a query asking whether the "Nanjing bridge" is congested, the text is easily misrecognized as the single entity "Nanjing Bridge" or as the two entities "Nanjing" and "bridge", while the meaning the user most likely intends, "bridges in Nanjing city", cannot be recognized accurately. In addition, because there is no mechanism for checking the spatial entity retrieval conditions, entities are recognized inaccurately or key information is lost, which further degrades retrieval.

3. Low spatial computation efficiency. Complex spatial retrieval questions, such as a query about which bridge in Jiangsu Province is in Taizhou, or "which two toll stations are nearest to the Xionless West station", usually require multiple calls to the spatial database, and the response time exceeds 10 seconds. For large-scale data sets the efficiency problem is more serious and the user experience is poor.

Because of these drawbacks and the corresponding technical bottlenecks, existing schemes have a low query success rate in expressway management and intelligent customer service scenarios. Roughly 70% or more of complex queries still require manual intervention, which greatly limits the automation, informatization, and intelligence of the system.
Disclosure of Invention

The invention aims to solve the problems in the prior art and provides a method for constructing a spatial semantic retrieval agent for smart expressways.

To achieve this purpose, the invention adopts the following technical scheme:

A method for constructing a spatial semantic retrieval agent for smart expressways comprises the following steps: constructing an entity word segmentation library based on multi-source expressway traffic spatial information data, and constructing a pre-segmenter based on that library; converting the names from different sources and the geographic coordinates of each entity in the entity word segmentation library into vectors, and storing the vectors together with the encoded entity coordinates in a vector library; constructing a four-layer dynamic knowledge graph consisting of an entity layer, a relation layer, a business rule layer, and a real-time road network state layer; building an agent workflow to perform spatial retrieval tasks, comprising: acquiring user query information by a large language model, invoking the spatial retrieval agent, and extracting entity information from the query information with the pre-segmenter to obtain an entity recognition candidate set; generating a heterogeneous engine scheduling instruction set by combining the background information with the predefined expressway traffic spatial rules by utilizi