CN-122024021-A - Terminal position estimation method based on street view picture features

CN122024021A

Abstract

The invention relates to a terminal position estimation method based on street-view picture features, belonging to the fields of geographic information analysis, image recognition, and offline terminal applications. The method comprises the following steps: a server crawls street-view pictures of a target area, names them by longitude and latitude, and, after multi-angle cropping and normalization of the panoramic images, extracts feature vectors to build a database indexed by longitude and latitude; the mobile terminal loads the model, applies to an on-site street photo the same normalization used on the server, extracts features, matches them against the vector library, and outputs the longitude and latitude with the highest similarity as the position estimation result. The invention helps solve the problem of estimating the current position when neither network connectivity nor satellite positioning signals are available.

Inventors

  • Huang Zijun
  • He Peichao
  • Chen Jing
  • Zang Min
  • Sui Xin
  • Yang Hui
  • Wang Zhiduo
  • Cheng Hongjun

Assignees

  • 华东计算技术研究所(中国电子科技集团公司第三十二研究所) (East China Institute of Computing Technology (The 32nd Research Institute of China Electronics Technology Group Corporation))

Dates

Publication Date
2026-05-12
Application Date
2026-01-14

Claims (10)

  1. A terminal position estimation method based on street-view picture features, characterized by comprising the following steps: at a server, crawling the full set of street-view images of a target area, naming them by longitude and latitude, performing multi-angle cropping and normalization on the panoramic images, extracting image feature vectors with a model, and building a street-view feature-vector database indexed by longitude and latitude; on the terminal side, converting the model into a mobile-terminal model format supported by the MNN framework and migrating the street-view feature-vector database to an embedded vector database; and on the mobile terminal, loading the mobile-terminal model, normalizing a street photo taken on site in the same way as on the server, extracting its feature vector, matching it by similarity against the feature vectors in the embedded vector database, and outputting the records with the highest similarity, together with their corresponding longitude and latitude information, as the position estimation result.
  2. The method of claim 1, wherein the server's crawling step comprises using a GPU server to crawl all street-view images of the task area in a map application, naming and storing them in a format containing longitude and latitude, and building a training data set with accurate geographic labels.
  3. The method of claim 1, wherein the multi-angle cropping comprises cutting a sub-image from the 360° panoramic street-view image every 60° in the horizontal direction, starting from north at 0°, each cropped output picture retaining its original longitude and latitude information as a geographic tag.
  4. The method of claim 1, wherein the normalization comprises applying a unified pipeline of image denoising, resizing, and format standardization on both server and terminal, to eliminate illumination differences and noise interference and to ensure that the feature extraction results at the two ends are consistent.
  5. The method of claim 1, wherein feature vector extraction at the server uses a DINOv-base model to convert each street-view image into a multidimensional feature vector, and a <key, value> vector database is built with longitude and latitude as the key and the feature vector as the value.
  6. The method of claim 1, wherein the model format conversion comprises using the MNN framework to convert the server-trained model into a mobile-terminal model in MNN format, achieving lightweight deployment of the neural network model on mobile hardware.
  7. The method of claim 6, wherein the embedded vector database is an ObjectBox database, which compresses the feature-vector database generated by the server into a mobile storage format.
  8. The method of claim 1 or 7, wherein the similarity matching comprises computing, in the mobile-terminal app, the cosine similarity between the feature vector extracted from the on-site photo and every feature vector in the embedded vector database, and retrieving the records with the highest similarity together with their corresponding longitudes and latitudes.
  9. The method of claim 1, wherein outputting the position estimation result comprises displaying in the app interface several model-estimated longitude and latitude records with their similarity scores, the final position being determined by the user from the record with the highest similarity, realizing human-machine collaborative decision-making.
  10. The method of claim 3, wherein the 60° cropping angle is determined from the difference between the actual shooting field of view and that of the panorama, so that the training samples are closer to the local views captured by the mobile terminal, improving the model's recognition robustness for limited fields of view.
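The <key, value> database and cosine-similarity retrieval of claims 5 and 8 can be sketched as follows. This is an illustrative Python/NumPy sketch only, not the patented implementation; the function names and the two-dimensional toy vectors are invented for the example (real DINO-style feature vectors would have several hundred dimensions).

```python
import numpy as np

def build_database(entries):
    """Build a <key, value> store: key = (lat, lon), value = feature vector.

    `entries` is a list of ((lat, lon), vector) pairs. Vectors are
    L2-normalized once at build time, so cosine similarity against a
    normalized query reduces to a single matrix-vector product.
    """
    keys = [k for k, _ in entries]
    mat = np.stack([np.asarray(v, dtype=np.float32) for _, v in entries])
    mat /= np.linalg.norm(mat, axis=1, keepdims=True)
    return keys, mat

def match(query_vec, keys, mat, top_k=3):
    """Cosine-similarity search: return the top_k (similarity, (lat, lon))
    records, highest similarity first."""
    q = np.asarray(query_vec, dtype=np.float32)
    q /= np.linalg.norm(q)
    sims = mat @ q                         # cosine similarities, one per record
    order = np.argsort(sims)[::-1][:top_k]  # indices of the best matches
    return [(float(sims[i]), keys[i]) for i in order]
```

Returning several top records rather than a single one matches claim 9, where the user picks the final position from the ranked list.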

Description

Terminal position estimation method based on street view picture features

Technical Field

The invention relates to the fields of geographic information analysis, image recognition, and offline terminal applications, and in particular to a terminal position estimation method based on street-view picture features.

Background

At present, in geographic information analysis applications, attaching a geographic position to picture information is widely supported. For example, common iPhone and Android phones can, when a photo is taken, embed the current position into the picture as a picture attribute based on GPS/BeiDou positioning information, forming a specific picture format. Using this format, terminal applications can later recover the location at which a picture was taken. However, in special application scenarios (for example, when executing a special task), a user in an unfamiliar environment may face an abnormal communication network or weak or lost satellite positioning signals; the terminal then cannot obtain local geographic position information, so the current position is unknown, and the feature information of a current street-view picture can instead be used to assist the user in position estimation. The street-view function in the prior art is an online browsing function: relying on the server's strong supporting capability, it associates each street-view panorama with its longitude and latitude so that the street view at a specified position can be displayed. However, this function only allows street views to be browsed; it cannot perform recognition or reverse retrieval from image information.
Disclosure of Invention

Aiming at the limitations and narrow scope of current street-view applications, the invention provides a position estimation method based on street-view picture features that requires no geographic information: on a handheld terminal (such as a pad or mobile phone), without network connectivity or satellite positioning signals, the feature elements of a street-view picture or photo are identified and compared against a street-view vector library to deduce the geographic position of the picture content. The invention aims to solve the problem of estimating the current position when neither network nor satellite positioning signals are available.

To achieve this objective, the invention provides a terminal position estimation method based on street-view picture features, used to estimate the geographic position of a mobile terminal in an environment without network connectivity and with satellite positioning failure, comprising the following steps: at a server, crawling the full set of street-view images of a target area, naming them by longitude and latitude, performing multi-angle cropping and normalization on the panoramic images, extracting image feature vectors with a model, and building a street-view feature-vector database indexed by longitude and latitude; on the terminal side, converting the model into a mobile-terminal model format supported by the MNN framework and migrating the street-view feature-vector database to an embedded vector database; and on the mobile terminal, loading the mobile-terminal model, normalizing a street photo taken on site in the same way as on the server, extracting its feature vector, matching it by similarity against the feature vectors in the embedded vector database, and outputting the records with the highest similarity, together with their corresponding longitude and latitude information, as the position estimation result.

Preferably, the server's crawling step comprises using a GPU server to crawl all street-view images of the task area in a map application, naming and storing them in a format containing longitude and latitude, and building a training data set with accurate geographic labels. Preferably, the multi-angle cropping comprises cutting a sub-image from the 360° panoramic street view every 60° in the horizontal direction, starting from north at 0°, each cropped output image retaining its original longitude and latitude information as a geographic tag. Preferably, the normalization comprises applying a unified pipeline of image denoising, resizing, and format standardization on both server and terminal, eliminating illumination differences and noise interference and ensuring consistent feature extraction results at the two ends. Prefe
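The multi-angle cropping and shared normalization steps described above can be sketched as follows. This is a minimal illustrative sketch in Python/NumPy, not the patented implementation: the 224×224 target size and the mean/std constants are assumptions borrowed from common ViT-style preprocessing, since the patent only requires that the server and the terminal use identical parameters, not these particular values.

```python
import numpy as np

def crop_panorama(pano, step_deg=60, north_offset_deg=0):
    """Cut an equirectangular 360° street-view panorama into horizontal
    sub-images, one every `step_deg` degrees starting from north (0°).

    `pano` is an H x W x C array whose width spans the full 360°. Returns a
    list of (heading_deg, sub_image) pairs, each sub-image spanning
    `step_deg` degrees (W * step_deg / 360 pixels wide).
    """
    h, w = pano.shape[:2]
    slice_w = w * step_deg // 360
    views = []
    for heading in range(north_offset_deg, north_offset_deg + 360, step_deg):
        x0 = (heading % 360) * w // 360
        x1 = x0 + slice_w
        if x1 <= w:
            sub = pano[:, x0:x1]
        else:  # wrap around the 0°/360° seam
            sub = np.concatenate([pano[:, x0:], pano[:, : x1 - w]], axis=1)
        views.append((heading % 360, sub))
    return views

# Assumed preprocessing constants; server and terminal must share them.
TARGET_SIZE = (224, 224)
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(img):
    """Shared preprocessing applied identically at both ends:
    nearest-neighbor resize to TARGET_SIZE, scale to [0, 1], then
    channel-wise standardization. `img` is an H x W x 3 uint8 array."""
    h, w = img.shape[:2]
    ys = np.arange(TARGET_SIZE[0]) * h // TARGET_SIZE[0]
    xs = np.arange(TARGET_SIZE[1]) * w // TARGET_SIZE[1]
    resized = img[ys][:, xs].astype(np.float32) / 255.0
    return (resized - MEAN) / STD
```

With the default 60° step, a panorama yields six sub-images at headings 0°, 60°, ..., 300°, each inheriting the panorama's longitude and latitude as its geographic tag.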