
KR-20260065224-A - SERVER FOR UNDERWATER FISH DETECTION BASED ON PREDICTION MODEL AND METHOD OF OPERATION THEREOF

KR 20260065224 A

Abstract

According to various embodiments, a server for detecting fish underwater based on a prediction model comprises a communication interface and a processor. The processor is configured to: acquire, through the communication interface, at least one underwater image information containing at least one fish from an external electronic device of an aquaculture facility; input the at least one underwater image information into a first sub-deep-learning model to output at least one first scale information; input the at least one underwater image information into a second sub-deep-learning model to output at least one second scale information; input the at least one underwater image information, the at least one first scale information, and the at least one second scale information into the prediction model to output at least one fish movement prediction information; and determine, when the at least one fish movement prediction information exceeds a preset first threshold value, that the at least one fish movement prediction information includes information indicating that the fish is moving in an accurately predicted direction. The prediction model is trained based on a plurality of underwater image information, a plurality of first scale information, a plurality of second scale information, a plurality of fish movement prediction information, and result information indicating that a plurality of fish have moved in an accurate direction.
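The claimed flow, in which two sub-deep-learning models each produce scale information, a prediction model fuses the image with both outputs, and a preset threshold gates the result, can be sketched as below. The function and parameter names are illustrative assumptions, and the model arguments are hypothetical stand-ins for the trained networks described in the abstract.

```python
def detect_fish_movement(image, first_sub_model, second_sub_model,
                         prediction_model, first_threshold):
    """Sketch of the claimed decision flow, not the patent's implementation.

    Each model argument is a hypothetical callable standing in for the
    corresponding trained network from the claims.
    """
    # Step 1: each sub-deep-learning model outputs its scale information.
    first_scale = first_sub_model(image)
    second_scale = second_sub_model(image)

    # Step 2: the prediction model fuses the image with both scale outputs
    # into a fish movement prediction score.
    prediction = prediction_model(image, first_scale, second_scale)

    # Step 3: a score above the preset first threshold is treated as
    # movement in the accurately predicted direction.
    return {"prediction": prediction,
            "accurate_direction": prediction > first_threshold}
```

With stub models (e.g., `lambda` functions) in place of the trained networks, the gate behaves as claimed: any score at or below `first_threshold` is not flagged as an accurately predicted direction.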

Inventors

  • 이상현

Assignees

  • 호남대학교 산학협력단 (Industry-Academic Cooperation Foundation of Honam University)

Dates

Publication Date
2026-05-08
Application Date
2024-11-01

Claims (12)

  1. A server for detecting fish underwater based on a prediction model, the server comprising: a communication interface; and a processor, wherein the processor is configured to: obtain, through the communication interface, at least one underwater image information including at least one fish from an external electronic device of an aquaculture facility; input the at least one underwater image information into a first sub-deep-learning model to output at least one first scale information, and input the at least one underwater image information into a second sub-deep-learning model to output at least one second scale information; input the at least one underwater image information, the at least one first scale information, and the at least one second scale information into the prediction model to output at least one fish movement prediction information; and determine, when the at least one fish movement prediction information exceeds a preset first threshold value, that the at least one fish movement prediction information includes information indicating movement in an accurately predicted direction, wherein the prediction model is trained based on a plurality of underwater image information, a plurality of first scale information, a plurality of second scale information, a plurality of fish movement prediction information, and a plurality of fish movement result information.
  2. The server of claim 1, wherein the at least one underwater image information includes at least one fish image and/or video information of fish inhabiting the aquaculture farm, at least one underwater environment image and/or video information of the surroundings of the fish inhabiting the aquaculture farm, and at least one aquaculture image information regarding modern aquaculture facilities configured in the aquaculture farm.
  3. The server of claim 2, wherein the processor is configured to: input the at least one underwater image information into the first sub-deep-learning model, which includes a backbone (fusion deep network) structure; perform multilayer combination of the at least one underwater environment image information to extract various types of the at least one underwater environment image information captured at different depths; and extract the at least one first scale information by assembling features extracted from the at least one multilayer-combined underwater environment image information, wherein the first sub-deep-learning model is trained based on a plurality of underwater image information, a plurality of underwater environment image information, the backbone structure, the multilayer combination structure, and a plurality of first scale information.
  4. The server of claim 2, wherein the processor is configured to: input the at least one underwater image information into the second sub-deep-learning model, which comprises the backbone structure, a head structure performing detection, and a neck structure connecting the backbone and the head to refine and reconstruct a feature map; obtain, from the at least one underwater image information, at least one fish movement information configured in the form of a vector representing the direction and/or trajectory of a fish moving in the at least one fish image information; and input the at least one fish image information and the at least one fish movement information into the second sub-deep-learning model to obtain the at least one second scale information, wherein the second sub-deep-learning model is trained based on a plurality of underwater image information, a plurality of fish image information, the backbone structure, the head structure, the neck structure, a plurality of fish movement information, and a plurality of second scale information.
  5. The server of claim 4, wherein the processor is configured to: estimate the directions in which the fish move as multiple objects, based on the at least one fish movement prediction information output by inputting the at least one first scale information and the at least one second scale information into the prediction model; and evaluate, for the fish movement prediction information estimated as multiple objects, the appearance feature vector of each object and the similarity between objects based on [Equation 1] below,

     [Equation 1]
     (1) $K_k = P_{k|k-1} H^{\top} (H P_{k|k-1} H^{\top} + R)^{-1}$
     (2) $\hat{x}_k = x_{k|k-1} + K_k (z_k - H x_{k|k-1})$
     (3) $P_k = (I - K_k H) P_{k|k-1}$
     (4) $d_k = (z_k - H x_{k|k-1})^{\top} S_k^{-1} (z_k - H x_{k|k-1})$, with $S_k = H P_{k|k-1} H^{\top} + R$

     where $K_k$ is the Kalman gain value, $P_{k|k-1}$ is the currently input covariance matrix and $P_k$ the updated covariance matrix, $H$ is the observation model, $R$ is the noise covariance model, $\hat{x}_k$ is the final updated appearance feature vector of the object, $x_{k|k-1}$ is the currently input vector value carried forward from the previously output at least one underwater image information, $z_k$ is the value measured from the currently input at least one underwater image information, $H x_{k|k-1}$ is the value predicted from the at least one fish movement prediction information, $S_k$ is the innovation covariance matrix, and $d_k$ is the value evaluating the similarity between the measured value and the predicted value.
  6. The server of claim 5, wherein the processor is configured to: determine the performance of the prediction model by assessing, based on [Equation 2] below, the tracking accuracy, tracking precision, and consistency of the at least one fish movement prediction information whose object appearance feature vectors and inter-object similarity have been evaluated; compare the at least one fish movement prediction information identified as accurate from the performance of the prediction model with the first threshold value, and determine, when the at least one fish movement prediction information exceeds the first threshold value, that the at least one fish movement prediction information includes information indicating movement in an accurately predicted direction; and transmit the at least one fish movement prediction information exceeding the first threshold value to the external electronic device of the aquaculture facility through the communication interface,

     [Equation 2]
     (1) $\mathrm{MOTA} = 1 - \dfrac{\sum_t (\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t)}{\sum_t \mathrm{GT}_t}$
     (2) $\mathrm{MOTP} = \dfrac{\sum_{t,i} d_{t,i}}{\sum_t c_t}$
     (3) $\mathrm{IDF1} = \dfrac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}}$

     where MOTA (multiple object tracking accuracy) is the tracking accuracy, $\mathrm{FN}_t$ is the false negative value, $\mathrm{FP}_t$ is the false positive value, $\mathrm{IDSW}_t$ is the number of identity switches, and $\mathrm{GT}_t$ is the number of ground-truth objects at time $t$; MOTP (multiple object tracking precision) is the tracking precision, $d_{t,i}$ is the distance error of matched object $i$ at time $t$, and $c_t$ is the total number of objects matched at time $t$; IDTP is the total number of exact identity matches, IDFP is the number of incorrect identity matches, IDFN is the number of missing identity matches, and IDF1 is the consistency.
  7. A method of operating a server for detecting fish underwater based on a prediction model, the method comprising: obtaining, through a communication interface, at least one underwater image information including at least one fish from an external electronic device of an aquaculture facility; inputting, through a processor, the at least one underwater image information into a first sub-deep-learning model to output at least one first scale information, and inputting the at least one underwater image information into a second sub-deep-learning model to output at least one second scale information; inputting, through the processor, the at least one underwater image information, the at least one first scale information, and the at least one second scale information into the prediction model to output at least one fish movement prediction information; and determining, through the processor, when the at least one fish movement prediction information exceeds a preset first threshold value, that the at least one fish movement prediction information includes information indicating movement in an accurately predicted direction, wherein the prediction model is trained based on a plurality of underwater image information, a plurality of first scale information, a plurality of second scale information, a plurality of fish movement prediction information, and a plurality of fish movement result information.
  8. The method of claim 7, wherein the at least one underwater image information includes at least one fish image and/or video information of fish inhabiting the aquaculture farm, at least one underwater environment image and/or video information of the surroundings of the fish inhabiting the aquaculture farm, and at least one aquaculture image information regarding modern aquaculture facilities configured in the aquaculture farm.
  9. The method of claim 8, wherein the method comprises: inputting the at least one underwater image information into the first sub-deep-learning model, which includes a backbone (fusion deep network) structure; performing multilayer combination of the at least one underwater environment image information to extract various types of the at least one underwater environment image information captured at different depths; and extracting the at least one first scale information by assembling features extracted from the at least one multilayer-combined underwater environment image information, wherein the first sub-deep-learning model is trained based on a plurality of underwater image information, a plurality of underwater environment image information, the backbone structure, the multilayer combination structure, and a plurality of first scale information.
  10. The method of claim 8, wherein the method comprises: inputting the at least one underwater image information into the second sub-deep-learning model, which comprises the backbone structure, a head structure performing detection, and a neck structure connecting the backbone and the head to refine and reconstruct a feature map; obtaining, from the at least one underwater image information, at least one fish movement information configured in the form of a vector representing the direction and/or trajectory of a fish moving in the at least one fish image information; and inputting the at least one fish image information and the at least one fish movement information into the second sub-deep-learning model to obtain the at least one second scale information, wherein the second sub-deep-learning model is trained based on a plurality of underwater image information, a plurality of fish image information, the backbone structure, the head structure, the neck structure, a plurality of fish movement information, and a plurality of second scale information.
  11. The method of claim 10, wherein the method comprises: estimating the directions in which the fish move as multiple objects, based on the at least one fish movement prediction information output by inputting the at least one first scale information and the at least one second scale information into the prediction model; and evaluating, for the fish movement prediction information estimated as multiple objects, the appearance feature vector of each object and the similarity between objects based on [Equation 1] below,

     [Equation 1]
     (1) $K_k = P_{k|k-1} H^{\top} (H P_{k|k-1} H^{\top} + R)^{-1}$
     (2) $\hat{x}_k = x_{k|k-1} + K_k (z_k - H x_{k|k-1})$
     (3) $P_k = (I - K_k H) P_{k|k-1}$
     (4) $d_k = (z_k - H x_{k|k-1})^{\top} S_k^{-1} (z_k - H x_{k|k-1})$, with $S_k = H P_{k|k-1} H^{\top} + R$

     where $K_k$ is the Kalman gain value, $P_{k|k-1}$ is the currently input covariance matrix and $P_k$ the updated covariance matrix, $H$ is the observation model, $R$ is the noise covariance model, $\hat{x}_k$ is the final updated appearance feature vector of the object, $x_{k|k-1}$ is the currently input vector value carried forward from the previously output at least one underwater image information, $z_k$ is the value measured from the currently input at least one underwater image information, $H x_{k|k-1}$ is the value predicted from the at least one fish movement prediction information, $S_k$ is the innovation covariance matrix, and $d_k$ is the value evaluating the similarity between the measured value and the predicted value.
  12. The method of claim 11, wherein the method comprises: determining the performance of the prediction model by assessing, based on [Equation 2] below, the tracking accuracy, tracking precision, and consistency of the at least one fish movement prediction information whose object appearance feature vectors and inter-object similarity have been evaluated; comparing the at least one fish movement prediction information identified as accurate from the performance of the prediction model with the first threshold value, and determining, when the at least one fish movement prediction information exceeds the first threshold value, that the at least one fish movement prediction information includes information indicating movement in an accurately predicted direction; and transmitting the at least one fish movement prediction information exceeding the first threshold value to the external electronic device of the aquaculture facility through the communication interface,

     [Equation 2]
     (1) $\mathrm{MOTA} = 1 - \dfrac{\sum_t (\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t)}{\sum_t \mathrm{GT}_t}$
     (2) $\mathrm{MOTP} = \dfrac{\sum_{t,i} d_{t,i}}{\sum_t c_t}$
     (3) $\mathrm{IDF1} = \dfrac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}}$

     where MOTA (multiple object tracking accuracy) is the tracking accuracy, $\mathrm{FN}_t$ is the false negative value, $\mathrm{FP}_t$ is the false positive value, $\mathrm{IDSW}_t$ is the number of identity switches, and $\mathrm{GT}_t$ is the number of ground-truth objects at time $t$; MOTP (multiple object tracking precision) is the tracking precision, $d_{t,i}$ is the distance error of matched object $i$ at time $t$, and $c_t$ is the total number of objects matched at time $t$; IDTP is the total number of exact identity matches, IDFP is the number of incorrect identity matches, IDFN is the number of missing identity matches, and IDF1 is the consistency.
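The terms listed under [Equation 1] in claims 5 and 11 (Kalman gain, covariance matrices, observation model, noise covariance, and a measured-versus-predicted similarity value) match the standard Kalman measurement update with a Mahalanobis-distance gate used in multi-object trackers such as SORT and DeepSORT. A minimal sketch under that assumption follows; the exact formulation in the original patent images is not preserved in this text.

```python
import numpy as np

def kalman_update(x_pred, P_pred, z, H, R):
    """Kalman measurement update: gain (1), state update (2), covariance
    update (3). x_pred/P_pred are the predicted state vector and covariance,
    z is the current measurement, H the observation model, R the
    measurement-noise covariance model."""
    S = H @ P_pred @ H.T + R                        # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)             # (1) Kalman gain
    x_upd = x_pred + K @ (z - H @ x_pred)           # (2) updated vector
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred  # (3) updated covariance
    return x_upd, P_upd, S

def measurement_similarity(z, x_pred, H, S):
    """(4) squared Mahalanobis distance between the measured value z and
    the predicted observation H @ x_pred; smaller means more similar."""
    y = z - H @ x_pred
    return float(y @ np.linalg.inv(S) @ y)
```

In a tracker, `measurement_similarity` scores how well a new detection `z` matches an existing track's prediction; low distances identify candidate matches between objects across frames.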
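[Equation 2] in claims 6 and 12 names the standard multi-object-tracking evaluation metrics: MOTA for tracking accuracy, MOTP for tracking precision, and an identity-match score (IDF1) for consistency. Assuming the conventional CLEAR-MOT definitions (the patent's own formula images are not preserved here), they compute as:

```python
def mota(fn, fp, idsw, num_gt):
    """Multiple Object Tracking Accuracy: 1 minus the rate of false
    negatives, false positives, and identity switches over all
    ground-truth objects."""
    return 1.0 - (fn + fp + idsw) / num_gt

def motp(match_distances, num_matches):
    """Multiple Object Tracking Precision: average localization error
    d over all matched object-hypothesis pairs c."""
    return sum(match_distances) / num_matches

def idf1(idtp, idfp, idfn):
    """Identity F1 (consistency): harmonic mean of ID precision and ID
    recall, from exact (IDTP), incorrect (IDFP), and missing (IDFN)
    identity matches."""
    return 2 * idtp / (2 * idtp + idfp + idfn)
```

A perfect tracker scores MOTA = 1 and IDF1 = 1, with MOTP equal to zero average distance error.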

Description

Server for Underwater Fish Detection Based on Prediction Model and Method of Operation Thereof

Various embodiments of the present invention relate to a server for detecting fish underwater based on a prediction model and a method of operating the same.

In modern aquaculture, continuous observation is necessary to assess fish health and water quality due to the characteristics of the underwater environment. To facilitate this management, the aquaculture sector has developed various methods that use computer vision technology to analyze fish behavior and condition, aiming to improve water quality control and production efficiency. Conventionally, however, this process has been performed manually, which is inefficient, so there is a need for a method of monitoring fish underwater by installing cameras and sensors and analyzing the footage with computer vision technology. In particular, although the detection and tracking of fish are essential for aquaculture farms, no such prediction model existed previously, and multiple fish could not be identified simultaneously. Therefore, in view of these problems in aquaculture farms, there is a need for a technology and system that can easily detect and track fish by outputting prediction information based on a prediction model.

FIG. 1 illustrates a block diagram of a server and a network according to various embodiments of the present invention.
FIG. 2 is a flowchart illustrating how a server operates according to various embodiments.
FIG. 3 is an example diagram showing a server acquiring at least one underwater image information according to various embodiments.
FIG. 4 is an example diagram of a first sub-deep-learning model included in a server according to various embodiments.
FIG. 5 is an example diagram of a second sub-deep-learning model included in a server according to various embodiments.
FIG. 6 is an example diagram of a prediction model included in a server according to various embodiments.
FIG. 7 is an example diagram of at least one piece of prediction information captured and extracted by a prediction model according to various embodiments.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings. The embodiments and the terms used therein are not intended to limit the technology described in this document to specific embodiments, and should be understood to include various modifications, equivalents, and/or substitutions of the embodiments. In the description of the drawings, similar reference numerals may be used for similar components. A singular expression may include a plural expression unless the context clearly indicates otherwise. In this document, expressions such as "A or B" or "at least one of A and/or B" may include all possible combinations of the items listed together. Expressions such as "first" or "second" may modify components regardless of order or importance, are used only to distinguish one component from another, and do not limit the components. When a certain (e.g., first) component is said to be "(functionally or communicatively) connected" or "coupled" to another (e.g., second) component, the component may be connected to the other component directly or through yet another component (e.g., a third component). In this document, "configured to" may be used interchangeably, depending on the context, with, for example, "suitable for," "having the capacity to," "adapted to," "made to," "capable of," or "designed to" in hardware or software. In some cases, the expression "a device configured to" may mean that the device is "capable of" operating in conjunction with other devices or components.
For example, the phrase "processor configured to perform A, B, and C" may mean a dedicated processor for performing the corresponding operations (e.g., an embedded processor), or a general-purpose processor (e.g., a CPU or application processor) capable of performing the corresponding operations by executing one or more software programs stored in a memory device. An electronic device according to various embodiments of this document may include, for example, at least one of a smartphone, a tablet PC, a desktop PC, a laptop PC, a netbook computer, a workstation, and a server. Referring to FIG. 1, a server (108) within a network environment (100) according to various embodiments is described. The server (108) may include a bus (110), a processor (120), a memory (130), an input/output interface (140), a display (150), and a communication interface (160). In some embodiments, the server (108) may omit at least one of the components or additionally include other components. The bus (110) may include a circuit that connects the components (110-160) to each other and conveys communication (e.g., control messages or data) between the components. The processor (120) may include one or more of a central processing unit, an application processor