CN-121999480-A - Fish tracking and identifying method


Abstract

The application provides a fish tracking and identifying method, relating to the technical field of data processing. The method comprises: acquiring real-time underwater video data collected by an image acquisition device on a bionic fish device; identifying fish in the real-time underwater video data using a preset fish tracking and identification model to obtain a bounding box of a target fish; and sending the coordinates of the bounding box to a computing unit in the bionic fish device, which determines the position coordinates of the target fish from the bounding-box coordinates and controls the bionic fish device to follow the target fish accordingly. The method locates the target fish accurately and efficiently, thereby realizing closed-loop control of the motion posture of the bionic fish.
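The closed loop summarized in the abstract can be sketched in a few lines. This is an illustrative sketch only; `detect_fish` and `send_to_compute_unit` are hypothetical placeholders standing in for the patent's detection model and on-board link, not interfaces defined by the application.

```python
# Illustrative sketch of the closed loop in the abstract.
# detect_fish and send_to_compute_unit are hypothetical placeholders,
# not APIs defined by the application.

def bbox_center(bbox):
    """Center pixel (u, v) of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def tracking_step(frame, detect_fish, send_to_compute_unit):
    """One iteration: detect the target fish on the edge device, then
    forward its bounding-box coordinates to the compute unit, which
    derives the fish's position and drives the bionic fish."""
    bbox = detect_fish(frame)       # model inference on the edge device
    if bbox is not None:
        send_to_compute_unit(bbox)  # compute unit closes the control loop
    return bbox
```

The compute unit would typically reduce the bounding box to its center with `bbox_center` before converting it to a world position.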

Inventors

  • Liu Chunna
  • Gao Dongqi
  • Du Jin
  • Lu Jia
  • Shen Jian
  • Wang Luhai
  • Yue Qingsong
  • Liu Die
  • Zhu Mengen
  • Li Rui
  • Wang Hongxing
  • Zhou Rongpan

Assignees

  • China Institute of Water Resources and Hydropower Research (中国水利水电科学研究院)
  • Huadian Tibet Energy Co., Ltd. (华电西藏能源有限公司)

Dates

Publication Date
2026-05-08
Application Date
2026-01-23

Claims (10)

  1. A fish tracking and identification method, characterized by being applied to an edge computing device, the method comprising: acquiring real-time underwater video data collected by an image acquisition device on a bionic fish device; performing fish identification on the real-time underwater video data using a preset fish tracking and identification model to obtain a bounding box of a target fish; and sending the coordinates of the bounding box of the target fish to a computing unit in the bionic fish device, so that the computing unit determines the position coordinates of the target fish according to the coordinates of the bounding box and controls the bionic fish device to follow the target fish according to the position coordinates of the target fish.
  2. The method of claim 1, wherein the preset fish tracking and identification model comprises a backbone network, a neck network and a detection head, and wherein performing fish identification on the real-time underwater video data using the preset fish tracking and identification model to obtain the bounding box of the target fish comprises: performing feature extraction on a plurality of frames of underwater images in the real-time underwater video data using the backbone network to obtain a multi-scale feature sequence of the plurality of frames; performing feature enhancement on the multi-scale feature sequence using the neck network to obtain an enhanced multi-scale feature sequence; detecting on the enhanced multi-scale feature sequence using the detection head to obtain bounding boxes of a plurality of fish; and determining the bounding box of the target fish from the bounding boxes of the plurality of fish.
  3. The method of claim 2, wherein the multi-scale feature sequence comprises a first-scale feature sequence, a second-scale feature sequence and a third-scale feature sequence, and the backbone network comprises a first feature extraction module, a second feature extraction module and a third feature extraction module; and wherein performing feature extraction on the plurality of frames of underwater images using the backbone network to obtain the multi-scale feature sequence comprises: downsampling the underwater images using the first feature extraction module to obtain the first-scale feature sequence; downsampling the first-scale feature sequence using the second feature extraction module to obtain the second-scale feature sequence; and downsampling the second-scale feature sequence using the third feature extraction module to obtain the third-scale feature sequence.
  4. The method of claim 3, wherein the first feature extraction module comprises a first convolution layer, a second convolution layer, a first depthwise separable convolution layer, a third convolution layer and a second depthwise separable convolution layer; the second feature extraction module comprises a fourth convolution layer and a third depthwise separable convolution layer; and the third feature extraction module comprises a fifth convolution layer, a fourth depthwise separable convolution layer and a spatial pyramid pooling-fast (SPPF) layer; wherein downsampling the underwater images using the first feature extraction module to obtain the first-scale feature sequence comprises: downsampling the underwater images using the first convolution layer to obtain a first downsampled feature; downsampling the first downsampled feature using the second convolution layer to obtain a second downsampled feature; performing feature enhancement on the second downsampled feature using the first depthwise separable convolution layer to obtain a first enhanced feature; downsampling the first enhanced feature using the third convolution layer to obtain a third downsampled feature; and performing feature enhancement on the third downsampled feature using the second depthwise separable convolution layer to obtain the first-scale feature sequence; wherein downsampling the first-scale feature sequence using the second feature extraction module to obtain the second-scale feature sequence comprises: downsampling the first-scale feature sequence using the fourth convolution layer to obtain a fourth downsampled feature; and performing feature enhancement on the fourth downsampled feature using the third depthwise separable convolution layer to obtain the second-scale feature sequence; and wherein downsampling the second-scale feature sequence using the third feature extraction module to obtain the third-scale feature sequence comprises: downsampling the second-scale feature sequence using the fifth convolution layer to obtain a fifth downsampled feature; performing feature enhancement on the fifth downsampled feature using the fourth depthwise separable convolution layer to obtain a fifth enhanced downsampled feature; and processing the fifth enhanced downsampled feature with the SPPF layer to obtain the third-scale feature sequence.
  5. The method of claim 3, wherein the enhanced multi-scale feature sequence comprises a first enhancement feature sequence, a second enhancement feature sequence and a third enhancement feature sequence, and the neck network comprises a first attention layer, a first upsampling layer, a first concatenation layer, a fifth depthwise separable convolution layer, a second upsampling layer, a second concatenation layer, a second attention layer, a sixth depthwise separable convolution layer, a sixth convolution layer, a third concatenation layer, a seventh depthwise separable convolution layer, a seventh convolution layer, a fourth concatenation layer and an eighth depthwise separable convolution layer; and wherein performing feature enhancement on the multi-scale feature sequence using the neck network to obtain the enhanced multi-scale feature sequence comprises: applying the first attention layer to the third-scale feature sequence to obtain a third-scale enhanced feature sequence; upsampling the third-scale enhanced feature sequence using the first upsampling layer to obtain a first upsampled feature; concatenating the first upsampled feature and the second-scale feature sequence using the first concatenation layer to obtain a first concatenated feature; enhancing the first concatenated feature using the fifth depthwise separable convolution layer to obtain a first enhanced concatenated feature; upsampling the first enhanced concatenated feature using the second upsampling layer to obtain a second upsampled feature; concatenating the second upsampled feature and the first-scale feature sequence using the second concatenation layer to obtain a second concatenated feature; applying the second attention layer to the second concatenated feature to obtain a second concatenation enhancement feature; enhancing the second concatenation enhancement feature using the sixth depthwise separable convolution layer to obtain the first enhancement feature sequence; downsampling the first enhancement feature sequence using the sixth convolution layer to obtain a sixth downsampled feature; concatenating the sixth downsampled feature and the first enhanced concatenated feature using the third concatenation layer to obtain a third concatenated feature; enhancing the third concatenated feature using the seventh depthwise separable convolution layer to obtain the second enhancement feature sequence; downsampling the second enhancement feature sequence using the seventh convolution layer to obtain a seventh downsampled feature; concatenating the seventh downsampled feature and the second concatenation enhancement feature using the fourth concatenation layer to obtain a fourth concatenated feature; and enhancing the fourth concatenated feature using the eighth depthwise separable convolution layer to obtain the third enhancement feature sequence.
  6. The method of claim 4 or 5, wherein each depthwise separable convolution layer comprises at least two bottleneck layers connected in sequence, each bottleneck layer comprising two convolution layers, and the output of the last of the at least two bottleneck layers is the output of the depthwise separable convolution layer.
  7. A fish tracking and identification method, characterized by being applied to a computing unit in a bionic fish device, the method comprising: acquiring underwater video data collected by an image acquisition device; transmitting the underwater video data to an edge computing device, so that the edge computing device performs fish identification on the underwater video data using a preset fish tracking and identification model to obtain a bounding box of a target fish; receiving the coordinates of the bounding box of the target fish sent by the edge computing device; determining the position coordinates of the target fish according to the coordinates of the bounding box; and controlling the bionic fish device to follow the target fish according to the position coordinates of the target fish.
  8. The method of claim 7, wherein determining the position coordinates of the target fish according to the coordinates of the bounding box of the target fish comprises: determining the center point coordinates of the bounding box according to the coordinates of the bounding box; and converting the center point coordinates of the target fish from the camera coordinate system of the image acquisition device to a world coordinate system to obtain the position coordinates of the target fish.
  9. The method of claim 7, wherein controlling the bionic fish device to follow the target fish according to the position coordinates of the target fish comprises: processing a plurality of frames of underwater images in the underwater video data using a sparse optical flow method to obtain motion vectors of target feature points across consecutive frames; determining the motion vector and the movement speed of the target fish according to the motion vectors of the target feature points; predicting a future position of the target fish according to the position coordinates and the movement speed of the target fish; and controlling the bionic fish device to follow the target fish according to the motion vector and the future position of the target fish.
  10. The method of claim 9, wherein controlling the bionic fish device to follow the target fish according to the motion vector of the target fish and the future position of the target fish comprises: determining a motion control quantity of the bionic fish device according to the motion vector of the target fish; and controlling the bionic fish device to follow the target fish according to the motion control quantity and the future position of the target fish.
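Claims 4-6 build the backbone and neck almost entirely from depthwise separable convolution layers, which suits deployment on an edge device: a k x k standard convolution is factored into a per-channel depthwise convolution plus a 1x1 pointwise convolution, sharply cutting the parameter count. The functions below are an illustrative parameter-count comparison, not part of the claimed model.

```python
# Parameter counts for a standard k x k convolution versus the
# depthwise separable factorization used throughout claims 4-6.
# Bias terms are omitted for simplicity.

def standard_conv_params(c_in, c_out, k):
    """k x k convolution mapping c_in channels to c_out channels."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise stage (one k x k filter per input channel) followed
    by a 1x1 pointwise convolution that mixes channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise
```

For a typical 3x3 layer with 64 input and 128 output channels, the standard form needs 73,728 parameters while the separable form needs 8,768, roughly an 8x reduction.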

Description

Fish tracking and identifying method

Technical Field

The application relates to the technical field of data processing, and in particular to a fish tracking and identifying method.

Background

Fish are key indicator species in the ecosystems of natural rivers, lakes and reservoirs: their population structure, distribution characteristics and behavior patterns directly reflect the health and dynamic changes of the aquatic environment. Long-term, continuous and accurate observation of fish is therefore important for ecological monitoring, fishery resource evaluation, endangered species protection, aquatic biodiversity research and related fields. At present, the main technical means for monitoring underwater fish are manual fishing, fixed-camera monitoring, acoustic detection and passive vision-based video analysis. In the manual fishing method, fish are collected on a set section using fixed nets, trawl nets, gill nets or electrofishing, with on-site photography or video recording as auxiliary records; a fixed underwater camera system records video over a long period and the footage is analyzed off-line to obtain the frequency and species of the fish. However, the prior art depends heavily on manual operation, has low monitoring efficiency and poor repeatability, and is difficult to operate in real time. Manual sampling disturbs the ecosystem and cannot provide continuous observation; a fixed camera has a limited field of view and easily loses the target; and subsequent video analysis is time-consuming and labor-intensive.
Disclosure of Invention

In view of the above shortcomings of the prior art, the application aims to provide a fish tracking and identifying method that locates a target fish accurately and efficiently, thereby realizing closed-loop control of the motion posture of a bionic fish. To this end, the technical scheme adopted by the embodiments of the application is as follows.

In a first aspect, an embodiment of the application provides a fish tracking and identifying method, comprising: acquiring real-time underwater video data collected by an image acquisition device on a bionic fish device; performing fish identification on the real-time underwater video data using a preset fish tracking and identification model to obtain a bounding box of a target fish; and sending the coordinates of the bounding box of the target fish to a computing unit in the bionic fish device, so that the computing unit determines the position coordinates of the target fish according to the coordinates of the bounding box and controls the bionic fish device to follow the target fish according to the position coordinates of the target fish.
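The compute unit's step of turning bounding-box coordinates into a position can be sketched with a standard pinhole-camera back-projection. This assumes known camera intrinsics K, extrinsics (R, t), and a depth estimate for the fish; the application does not specify how depth is obtained, so the function below is an illustrative assumption under the usual convention p_cam = R @ p_world + t, not the claimed procedure.

```python
import numpy as np

def pixel_to_world(u, v, depth, K, R, t):
    """Back-project pixel (u, v) at the given depth into world
    coordinates, assuming intrinsics K and extrinsics (R, t) with
    the convention p_cam = R @ p_world + t."""
    # Ray through the pixel, scaled to the estimated depth.
    p_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Invert the rigid transform to recover world coordinates.
    return R.T @ (p_cam - t)
```

With identity intrinsics and extrinsics, pixel (3, 4) at depth 2 back-projects to the world point (6, 8, 2).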
Optionally, the preset fish tracking and identification model comprises a backbone network, a neck network and a detection head, and performing fish identification on the real-time underwater video data using the preset fish tracking and identification model to obtain the bounding box of the target fish comprises: performing feature extraction on a plurality of frames of underwater images in the real-time underwater video data using the backbone network to obtain a multi-scale feature sequence of the plurality of frames; performing feature enhancement on the multi-scale feature sequence using the neck network to obtain an enhanced multi-scale feature sequence; detecting on the enhanced multi-scale feature sequence using the detection head to obtain bounding boxes of a plurality of fish; and determining the bounding box of the target fish from the bounding boxes of the plurality of fish. Optionally, the multi-scale feature sequence comprises a first-scale feature sequence, a second-scale feature sequence and a third-scale feature sequence, and the backbone network comprises a first feature extraction module, a second feature extraction module and a third feature extraction module; performing feature extraction on the plurality of frames of underwater images using the backbone network to obtain the multi-scale feature sequence comprises: downsampling the underwater images using the first feature extraction module to obtain the first-scale feature sequence; downsampling the first-scale feature sequence using the second feature extraction module to obtain the second-scale feature sequence; and downsampling the second-scale feature sequence using the third feature extraction module to obtain a third