KR-20260065667-A - MULTI-MODALITY BASED VEHICLE RE-IDENTIFICATION DEVICE AND METHOD


Abstract

The present disclosure relates to a multimodality-based vehicle re-identification device and method that use a vehicle's appearance information, license plate information, and environmental context information together. The multimodality-based vehicle re-identification device according to the present disclosure comprises: a camera that captures a visual image of the vehicle's exterior in real time; a communication unit that supports wired or wireless communication between the vehicle re-identification device and an external electronic device; a memory that stores one or more instructions; and a processor that executes the one or more instructions stored in the memory. When the instructions are executed by the processor, the processor may perform the following operations: detecting a vehicle and the surrounding environment within the visual image captured by the camera; cropping the image of the detected vehicle; extracting license plate information of the vehicle from the cropped image; extracting appearance information of the vehicle from the cropped image; generating multimodality integrated information by combining the extracted appearance information, the license plate information, and the detected surrounding environment information; and re-identifying the vehicle based on the multimodality integrated information.
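
By way of non-limiting illustration, the final re-identification operation can be sketched as a best-match search over previously stored integrated vectors. The sketch assumes that the three kinds of information have already been converted to numeric vectors, that they are integrated by simple concatenation, and that matching uses cosine similarity with a fixed threshold. The names `VehicleRecord`, `integrate`, and `re_identify`, the threshold value, and the toy vectors are hypothetical and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleRecord:
    """Hypothetical gallery entry: a known vehicle and its integrated vector."""
    vehicle_id: str
    integrated: list


def integrate(appearance, plate, context):
    # Assumption: integration by plain concatenation of the three vectors.
    return list(appearance) + list(plate) + list(context)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def re_identify(query, gallery, threshold=0.9):
    """Return the id of the best gallery match at or above the threshold, else None."""
    best_id, best_score = None, threshold
    for record in gallery:
        score = cosine_similarity(query, record.integrated)
        if score >= best_score:
            best_id, best_score = record.vehicle_id, score
    return best_id


# Toy gallery: same appearance and context vectors, different plate vectors.
gallery = [
    VehicleRecord("car-A", integrate([0.9, 0.1], [1.0, 0.0], [0.2, 0.8])),
    VehicleRecord("car-B", integrate([0.9, 0.1], [0.0, 1.0], [0.2, 0.8])),
]

# A new observation close to car-A, and one that resembles neither vehicle.
query = integrate([0.88, 0.12], [1.0, 0.0], [0.25, 0.75])
unknown = integrate([0.1, 0.9], [0.5, 0.5], [0.9, 0.1])

match = re_identify(query, gallery)
no_match = re_identify(unknown, gallery)
```

In this toy gallery the two vehicles have identical appearance and context vectors and differ only in the plate vector, which is the situation in which appearance-only re-identification tends to confuse vehicles.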

Inventors

김수훈
남명우
암마르 울 하산 무함마드
무함마드 우마이르 칸
이브로히모브 분요드벡 패즐리딘 우글리

Assignees

주식회사 델타엑스

Dates

Publication Date: 2026-05-11
Application Date: 2024-11-01

Claims (12)

1. A multimodality-based vehicle re-identification device mounted on a vehicle, comprising: a camera that captures a visual image of the exterior of the vehicle in real time; a communication unit that supports wired or wireless communication between the vehicle re-identification device and an external electronic device; a memory that stores one or more instructions; and a processor that executes the one or more instructions stored in the memory, wherein the instructions, when executed by the processor, cause the processor to perform: an operation of detecting a vehicle and a surrounding environment within the visual image captured by the camera; an operation of cropping an image of the detected vehicle; an operation of extracting license plate information of the vehicle from the cropped image; an operation of extracting appearance information of the vehicle from the cropped image; an operation of generating multimodality integrated information by combining the extracted appearance information, the license plate information, and the detected surrounding environment information; and an operation of re-identifying the vehicle based on the multimodality integrated information.
2. The device of claim 1, wherein the operation of generating the multimodality integrated information by combining the extracted appearance information, the license plate information, and the detected surrounding environment information comprises: an operation of converting each feature included in the extracted appearance information, the license plate information, and the detected surrounding environment information into a respective high-dimensional vector; an operation of generating a multimodality integration vector by combining the high-dimensional vectors; and an operation of storing the multimodality integration vector as the multimodality integrated information.
3. The device of claim 2, wherein the operation of generating the multimodality integration vector by combining the high-dimensional vectors is performed by any one of: a concatenation-based combination method that generates the multimodality integration vector by concatenating the high-dimensional vectors; a weighted-sum combination method that generates the multimodality integration vector by assigning a weight to each of the high-dimensional vectors and summing them; a multilayer neural network-based combination method that generates the multimodality integration vector by receiving the high-dimensional vectors as input values and combining them non-linearly through a multilayer neural network; and an attention-based combination method that generates the multimodality integration vector by learning the importance of each of the high-dimensional vectors and assigning higher weights to the more important vectors.
4. The device of claim 1, wherein the operation of detecting a vehicle and a surrounding environment within the visual image captured by the camera is performed using a YOLO (You Only Look Once) model pre-trained on an object recognition training dataset to detect a vehicle and a surrounding environment within an image.
5. The device of claim 4, wherein the operation of extracting license plate information of the vehicle from the cropped image is performed through OCR (Optical Character Recognition) processing.
6. The device of claim 5, wherein the operation of extracting appearance information of the vehicle from the cropped image is performed using a ResNet (Residual Network) model pre-trained on a vehicle recognition training dataset to extract appearance information of a vehicle within an image.
7. A multimodality-based vehicle re-identification method used in a vehicle, comprising: a step of detecting a vehicle and a surrounding environment within a visual image captured by a camera mounted on the vehicle; a step of cropping an image of the detected vehicle; a step of extracting license plate information of the vehicle from the cropped image; a step of extracting appearance information of the vehicle from the cropped image; a step of generating multimodality integrated information by combining the extracted appearance information, the license plate information, and the detected surrounding environment information; and a step of re-identifying the vehicle based on the multimodality integrated information.
8. The method of claim 7, wherein the step of generating the multimodality integrated information by combining the extracted appearance information, the license plate information, and the detected surrounding environment information comprises: a step of converting each feature included in the extracted appearance information, the license plate information, and the detected surrounding environment information into a respective high-dimensional vector; a step of generating a multimodality integration vector by combining the high-dimensional vectors; and a step of storing the multimodality integration vector as the multimodality integrated information.
9. The method of claim 8, wherein the step of generating the multimodality integration vector by combining the high-dimensional vectors is performed by any one of: a concatenation-based combination method that generates the multimodality integration vector by concatenating the high-dimensional vectors; a weighted-sum combination method that generates the multimodality integration vector by assigning a weight to each of the high-dimensional vectors and summing them; a multilayer neural network-based combination method that generates the multimodality integration vector by receiving the high-dimensional vectors as input values and combining them non-linearly through a multilayer neural network; and an attention-based combination method that generates the multimodality integration vector by learning the importance of each of the high-dimensional vectors and assigning higher weights to the more important vectors.
10. The method of claim 9, wherein the step of detecting a vehicle and a surrounding environment within the visual image captured by the camera is performed using a YOLO (You Only Look Once) model pre-trained on an object recognition training dataset to detect a vehicle and a surrounding environment within an image.
11. The method of claim 10, wherein the step of extracting license plate information of the vehicle from the cropped image is performed through OCR (Optical Character Recognition) processing.
12. The method of claim 11, wherein the step of extracting appearance information of the vehicle from the cropped image is performed using a ResNet (Residual Network) model pre-trained on a vehicle recognition training dataset to extract appearance information of a vehicle within an image.
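
By way of non-limiting illustration, the four combination methods recited in the claims (concatenation, weighted sum, multilayer neural network, and attention) can be sketched on toy vectors as follows. The sketch makes several assumptions that are not in the disclosure: each kind of information has already been converted to a vector of equal length, the weights, the attention `query` vector, and the network matrices are fixed by hand rather than learned, and all function names are hypothetical.

```python
import math


def concat_fusion(vectors):
    """Concatenation: join the per-modality vectors end to end."""
    return [x for v in vectors for x in v]


def weighted_sum_fusion(vectors, weights):
    """Weighted sum: scale each vector by its weight and add element-wise."""
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("weighted sum needs vectors of equal length")
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fusion(vectors, query):
    """Attention: score each vector against a query (learned in practice,
    fixed here), convert the scores to softmax weights, then weighted-sum."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    return weighted_sum_fusion(vectors, softmax(scores))


def mlp_fusion(vectors, w_hidden, w_out):
    """Multilayer network: concatenate, apply a tanh hidden layer,
    then a linear output layer (bias terms omitted for brevity)."""
    x = concat_fusion(vectors)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


# Toy two-dimensional vectors standing in for the three kinds of information.
appearance = [1.0, 0.0]
plate = [0.0, 1.0]
context = [1.0, 1.0]
modalities = [appearance, plate, context]

fused_concat = concat_fusion(modalities)   # -> [1.0, 0.0, 0.0, 1.0, 1.0, 1.0]
fused_sum = weighted_sum_fusion(modalities, [0.5, 0.25, 0.25])   # -> [0.75, 0.5]
fused_attn = attention_fusion(modalities, query=[1.0, 1.0])
fused_mlp = mlp_fusion(
    modalities,
    w_hidden=[[0.1] * 6, [-0.1] * 6],   # 2 hidden units over 6 inputs
    w_out=[[1.0, 0.0], [0.0, 1.0]],     # 2 outputs over 2 hidden units
)
```

Concatenation preserves every component but grows the vector length with each added modality, whereas the weighted-sum and attention variants keep the original length at the cost of requiring vectors of equal dimension.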

Description

Multi-modality Based Vehicle Re-identification Device and Method
The present disclosure relates to a multimodality-based vehicle re-identification device and method, and more specifically, to a multimodality-based vehicle re-identification device and method that use a vehicle's appearance information, license plate information, and environmental context information together.
Existing vehicle re-identification (Re-ID) technologies identify vehicles primarily by their external characteristics, and can therefore produce incorrect identification results because of similar exteriors, varying lighting conditions, and changes in viewpoint. For example, vehicles of the same manufacturer and model may be almost indistinguishable by their external characteristics (color, shape, model, etc.), so an autonomous driving system or vehicle tracking system that relies on existing Re-ID technology may fail to identify the vehicle. In addition, a vehicle's external characteristics may appear to change with lighting or weather conditions: the vehicle's color or outline may look different in dim lighting, and the exterior may appear blurry in rain, snow, or fog. Autonomous driving systems and vehicle tracking systems that apply existing vehicle Re-ID technology are therefore susceptible to misidentification caused by various environmental factors.
FIG. 1 is a block diagram of a vehicle re-identification device according to one embodiment of the present disclosure.
FIG. 2 is a drawing for explaining a specific usage state of a vehicle re-identification device according to one embodiment of the present disclosure.
FIG. 3 is a block diagram of a vehicle re-identification method according to one embodiment of the present disclosure.
FIG. 4 is a drawing for explaining a specific usage state of a vehicle re-identification method according to one embodiment of the present disclosure.
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In assigning reference numerals to the components of each drawing, the same components are given the same reference numeral wherever possible, even when they appear in different drawings. Furthermore, where a detailed description of related known components or functions could obscure the essence of the present disclosure, that detailed description is omitted.
Throughout the specification, when a part is described as "including" a certain component, this means that, unless specifically stated otherwise, it does not exclude other components and may include additional components. Furthermore, terms such as "…part," "…unit," and "module" as used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.
In addition, to facilitate understanding of the present disclosure, the terms used in this specification are explained as follows.
As used in this specification, the term "vehicle appearance information" refers to information regarding the various appearance elements used to visually identify a vehicle, such as vehicle color, vehicle shape, vehicle angle, additional accessories and appearance variations, vehicle condition, etc.
As used in this specification, the term "vehicle environmental context information (or vehicle surrounding environment information)" refers to information regarding elements of the physical environment and situation surrounding the vehicle while it is driving, such as roads, traffic signals, signs, surrounding vehicles, lighting, weather, etc.
As used in this specification, the term "vehicle license plate information" refers to the unique number or character string assigned to a vehicle in the country or region where the vehicle is registered.
As used in this specification, the term "multimodality integrated information" refers to information that combines vehicle appearance information, license plate information, and environmental context information and is used to uniquely identify a vehicle.
Hereinafter, a multimodality-based vehicle re-identification device and method according to the present disclosure will be described with reference to the drawings.
[Multi-modality based vehicle re-identification device (10)]
FIG. 1 is a block diagram of a multi-modality-based vehicle re-identification device (10) (hereinafter abbreviated as vehicle re-identification device (10)) according to one embodiment of the present disclosure, and FIG. 2 is a diagram showing a specific usage state in which the vehicle re-identification device (10) is used. Referring to FIG. 1 and FIG. 2 together, the vehicle re-identification device (10) includes a camera (110)