CN-122023165-A - Visual communication image sharpening processing method and system

Abstract

The invention relates to the field of image recognition processing and discloses a visual communication image sharpening processing method and system. The method comprises: performing global motion compensation to obtain an aligned historical frame; calculating the difference between the current frame and the aligned historical frame, refining it with a morphological opening operation, and generating a normalized motion amplitude map through linear mapping; performing linear interpolation according to the amplitude map to calculate a dynamic filter coefficient specific to each pixel; applying that coefficient in a recursive filter that combines the current-frame pixel with the previous frame's output value to generate the output pixel of the current frame; and using the final synthesized output frame for recognition.
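
The per-pixel chain described above (linear mapping, coefficient interpolation, recursive filtering) can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the example thresholds (10 and 50), the coefficient values (0.1 and 1.0), and the filter form `coeff * current + (1 - coeff) * previous` are all assumptions chosen for illustration.

```python
import numpy as np

def motion_amplitude(diff, noise_floor, saturation):
    """Map a refined difference map to [0.0, 1.0].

    Values at or below noise_floor become 0, values at or above
    saturation become 1, and values in between are scaled linearly.
    """
    scaled = (diff.astype(np.float32) - noise_floor) / float(saturation - noise_floor)
    return np.clip(scaled, 0.0, 1.0)

def dynamic_coefficient(amplitude, k_static, k_dynamic):
    """Interpolate per pixel between the static and dynamic coefficients."""
    return k_static + (k_dynamic - k_static) * amplitude

def recursive_filter(current, previous_output, coeff):
    """Assumed recursive average: small coeff keeps more of the history."""
    return coeff * current + (1.0 - coeff) * previous_output

# Hypothetical example values, one row of five pixels.
diff = np.array([[0, 10, 30, 50, 80]], dtype=np.uint8)
amplitude = motion_amplitude(diff, noise_floor=10, saturation=50)
coeff = dynamic_coefficient(amplitude, k_static=0.1, k_dynamic=1.0)
current = np.full((1, 5), 100.0, dtype=np.float32)
frame_buffer = np.zeros((1, 5), dtype=np.float32)
output = recursive_filter(current, frame_buffer, coeff)
frame_buffer = output.copy()  # the frame buffer is updated with the new output
```

With these assumed values, static pixels (amplitude 0) take only 10% of the current frame and so are smoothed heavily over time, while saturated pixels (amplitude 1) pass the current frame through unchanged.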

Inventors

Xiong Xuezhu

Assignees

Chongqing Normal University (重庆师范大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-08

Claims (10)

1. A visual communication image sharpening processing method, applied to an image recognition system, the method comprising:
Step 101, a global motion compensation step: acquiring a current frame and a historical frame in a video stream, estimating the global motion caused by camera shake, and transforming the historical frame based on the global motion to generate an aligned historical frame;
Step 102, a mask refining step: calculating pixel original difference values between the current frame and the aligned historical frame, and performing a morphological opening operation on the pixel original difference values to suppress isolated noise and preserve aggregated motion regions, thereby generating a refined difference map;
Step 103, a motion amplitude generating step: performing a linear mapping operation on the refined difference map, the linear mapping operation comprising setting values in the refined difference map that are below a preset noise floor threshold to 0, and linearly mapping values between the noise floor threshold and a preset motion saturation threshold to the floating point interval 0.0 to 1.0, thereby generating a normalized motion amplitude map;
Step 104, a proportional filtering control step: setting a static filter coefficient and a dynamic retention coefficient, and performing a linear interpolation operation between the static filter coefficient and the dynamic retention coefficient according to the pixel values in the normalized motion amplitude map, thereby generating a pixel-level dynamic filter coefficient specific to each pixel;
Step 105, a time domain filtering step: maintaining a frame buffer that stores the output result of the previous frame, substituting the pixel-level dynamic filter coefficient into a recursive average filter, and combining the pixel value of the current frame with the output pixel value of the previous frame stored in the frame buffer to calculate the output pixel value of the current frame;
Step 106, a frame synthesizing and updating step: combining the output pixel values of the current frame for all pixels to generate an output frame for subsequent image recognition, and updating the frame buffer with the output pixel values of the current frame.
2. The method of claim 1, further comprising: step 201, an instantaneous motion determining step: comparing the normalized motion amplitude map with a preset motion threshold to generate an instantaneous motion mask; step 202, a history mask updating step: maintaining a history mask buffer and performing an asymmetric update on the history mask buffer, wherein for pixels marked as dynamic in the instantaneous motion mask, the corresponding value in the history mask buffer is set to a preset maximum value, and for pixels marked as static in the instantaneous motion mask, the corresponding value in the history mask buffer undergoes a step-by-step decay operation; and step 203, normalizing the values in the history mask buffer to generate a provisional motion amplitude map; wherein the proportional filtering control step is modified to perform the linear interpolation operation based on the pixel values in the provisional motion amplitude map to generate the pixel-level dynamic filter coefficient.
3. The method of claim 1, wherein the static filter coefficient in the proportional filtering control step is defined as $K_{static}$, a floating point number in the range 0 to 0.3; the dynamic retention coefficient is defined as $K_{dynamic}$; and the recursive average filter in the time domain filtering step calculates the output pixel value of the current frame according to the following rule: $Y_t = K \cdot X_t + (1 - K) \cdot Y_{t-1}$, wherein $Y_t$ is the output pixel value of the current frame, $K$ is the pixel-level dynamic filter coefficient, $X_t$ is the pixel value of the current frame, and $Y_{t-1}$ is the output pixel value of the previous frame stored in the frame buffer.
4. The method of claim 1, wherein the morphological opening operation in the mask refining step comprises performing an erosion operation first and then performing a dilation operation on the eroded result.
5. The method of claim 1, wherein the pixel original difference values in the mask refining step are calculated by performing an inter-frame difference operation on the current frame and the aligned historical frame.
6. The method of claim 1, wherein the linear mapping operation in the motion amplitude generating step further comprises setting to 1.0, in the normalized motion amplitude map, those pixel values that are greater than the preset motion saturation threshold in the refined difference map.
7. The method according to claim 1, wherein in the global motion compensation step, the estimation of the global motion is performed based on matching and tracking feature points of static background areas in the current frame and the historical frame.
8. The method according to claim 2, wherein the ratio of the preset maximum value to the per-step decay amount in the step-by-step decay operation is set in advance such that the time required for the value in the history mask buffer corresponding to a target that has stopped moving to decay to 0 is greater than a preset number of video frames.
9. The method according to claim 1, wherein the pixel original difference values in the mask refining step are calculated by performing a three-frame difference operation on the current frame, the aligned historical frame, and an earlier aligned historical frame; and the morphological opening operation in the mask refining step is replaced with a spatio-temporal consistency arbitration operation, which comprises checking, for each pixel in the pixel original difference values, whether it persists for a preset number of consecutive frames in the time dimension and whether it lies within a neighborhood cluster of a preset size in the spatial dimension, and retaining the pixel's value in the refined difference map only when the pixel satisfies both the temporal constraint and the spatial constraint.
10. A visual communication image sharpening processing system for implementing the visual communication image sharpening processing method of claim 1, the system comprising:
a global motion compensation unit for acquiring a current frame and a historical frame in the video stream, estimating the global motion caused by camera shake, and transforming the historical frame based on the global motion to generate an aligned historical frame;
a mask refining unit for calculating pixel original difference values between the current frame and the aligned historical frame, and performing a morphological opening operation on the pixel original difference values to suppress isolated noise and preserve aggregated motion regions, thereby generating a refined difference map;
a motion amplitude generating unit for performing a linear mapping operation on the refined difference map, the linear mapping operation comprising setting values in the refined difference map that are below a preset noise floor threshold to 0, and linearly mapping values between the noise floor threshold and a preset motion saturation threshold to the floating point interval 0.0 to 1.0, thereby generating a normalized motion amplitude map;
a proportional filtering control unit for setting a static filter coefficient and a dynamic retention coefficient, and performing a linear interpolation operation between the static filter coefficient and the dynamic retention coefficient according to the pixel values in the normalized motion amplitude map, thereby generating a pixel-level dynamic filter coefficient specific to each pixel;
a time domain filtering unit for substituting the pixel-level dynamic filter coefficient into a recursive average filter, and combining the pixel value of the current frame with the output pixel value of the previous frame stored in the frame buffer to calculate the output pixel value of the current frame;
a frame synthesizing and updating unit for combining the output pixel values of the current frame for all pixels to generate an output frame for subsequent image recognition, and updating the frame buffer with the output pixel values of the current frame.
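
The asymmetric history mask update of claim 2 and the decay-time condition of claim 8 can be sketched as follows. This is an illustrative sketch only: the function name `update_history_mask` and the default values (motion threshold 0.5, maximum value 30, decay step 1) are assumptions, not values stated in the claims.

```python
import numpy as np

def update_history_mask(history, amplitude, motion_threshold=0.5,
                        max_value=30, decay_step=1):
    """One asymmetric update of the history mask buffer.

    Dynamic pixels jump straight to max_value; static pixels decay by
    decay_step per frame and never go below 0.
    """
    dynamic = amplitude > motion_threshold            # instantaneous motion mask
    decayed = np.maximum(history - decay_step, 0)     # step-by-step decay
    history = np.where(dynamic, max_value, decayed)   # asymmetric update
    provisional = history.astype(np.float32) / max_value  # normalize to [0, 1]
    return history, provisional

# Hypothetical example: four pixels with different histories.
history = np.array([0, 30, 5, 0])
amplitude = np.array([0.9, 0.1, 0.0, 0.2], dtype=np.float32)
history, provisional = update_history_mask(history, amplitude)

# A pixel whose target stopped moving needs max_value / decay_step
# static frames (30 here) before its history value reaches 0.
stopped = np.array([30])
for _ in range(30):
    stopped, _ = update_history_mask(stopped, np.zeros(1, dtype=np.float32))
```

Under these assumptions the rise is immediate while the fall takes `max_value / decay_step` frames, which is the ratio that claim 8 requires to exceed a preset number of video frames.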

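The morphological opening operation of claim 4 (erosion followed by dilation) can be sketched in plain NumPy as below. The 3x3 square structuring element, the border padding choice, and the helper names are assumptions made for illustration; in practice a library routine such as OpenCV's `cv2.morphologyEx` with `cv2.MORPH_OPEN` would typically be used instead.

```python
import numpy as np

def _neighborhood_stack(img, pad_value):
    """Stack the nine 3x3-shifted views of img (assumed structuring element)."""
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def erode(img):
    # Pad with the image maximum so the border does not force extra erosion.
    return _neighborhood_stack(img, img.max()).min(axis=0)

def dilate(img):
    # Pad with the image minimum so the border does not force extra dilation.
    return _neighborhood_stack(img, img.min()).max(axis=0)

def morphological_open(img):
    """Opening: erosion first, then dilation of the eroded result."""
    return dilate(erode(img))

# Hypothetical difference map: one isolated noisy pixel and one 3x3 motion blob.
diff = np.zeros((7, 7), dtype=np.uint8)
diff[1, 1] = 200        # isolated noise
diff[3:6, 3:6] = 200    # aggregated motion region
refined = morphological_open(diff)
```

In this example the isolated pixel is removed by the erosion and never restored, while the 3x3 blob shrinks to its centre and is then grown back to its original extent by the dilation.
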
Description

Visual communication image sharpening processing method and system

Technical Field

The invention relates to a visual communication image sharpening processing method and system, and belongs to the technical field of image recognition processing.

Background

In image processing and machine vision applications, especially analysis and recognition tasks on video streams, a system must extract stable and clear information from consecutive image frames. To cope with the interference of sensor noise with downstream recognition algorithms, a basic and widely adopted technique in the field is time domain filtering: for example, multi-frame averaging or recursive filtering effectively suppresses random noise and improves the signal-to-noise ratio and recognizability of static background areas in the image. However, conventional approaches that rely on time domain processing run into an inherent constraint in a high-value and challenging application scenario, namely dynamic target recognition in low-illumination, high-noise environments. While improving the clarity of the static background, time domain filtering irreversibly contaminates the transient information of any moving target in the image, causing severe smearing or ghosting distortion, and that transient information is precisely what the recognition system most needs. The prior art therefore forces the system into a dilemma: either time domain filtering is enabled so that the background can be seen clearly, at the cost of blurring the contour of the dynamic target; or time domain filtering is disabled to preserve that contour, in which case the signal-to-noise ratio is too low, the recognition target is submerged in noise, and the recognition algorithm fails.
In the field, various improvements have been tried to reconcile this conflict between dynamic and static regions, such as compromise filter strengths or adaptive noise reduction methods designed to improve visual perception for the human eye. These approaches tend to seek a visual balance rather than provide an optimal signal for machine recognition. The requirements of a downstream recognition algorithm are almost absolute: it needs a noise-free, stable background as a reference, together with a dynamic target that retains all instantaneous details without any time domain contamination, and the compromise approaches above cannot satisfy these signal requirements. Furthermore, even when a simple partitioned processing idea is adopted, the prior art still generally ignores the more complex secondary technical limitations introduced by real environments, which cause simple partition logic to fail quickly in practice. Besides the loss of a stable reference on the hardware acquisition side, for example through camera shake, fundamental shortcomings also exist on the software side, in how moving targets are recognized and processed. For example, the scheme disclosed in Chinese invention patent CN110544222A cannot solve the problems of signal contamination between dynamic and static areas and edge artifacts caused by camera shake, lingering of moving objects, and motion detection misjudgment in high-noise environments. Therefore, how to construct a processing method that fundamentally resolves the dynamic/static conflict in time domain processing, simultaneously provides the recognition system with a high-definition background context and high-fidelity dynamic targets, and is robust enough to actively cope with real engineering challenges such as camera shake, target lingering, and high-noise misjudgment has become the technical problem to be solved by the invention.

Disclosure of Invention

The invention provides a visual communication image sharpening processing method, which mainly aims to solve the dynamic/static conflict in time domain processing and the insufficient recognition robustness that this conflict causes under real working conditions such as