CN-122023541-A - Self-adaptive green curtain image matting management system and method based on artificial intelligence
Abstract
The invention discloses an artificial-intelligence-based self-adaptive green curtain image matting management system and method, and relates to the technical field of neural networks. Different sensors collect multi-source data from the green curtain site, and an end-to-end self-adaptive matting neural network is constructed. The method comprises the steps of: processing the multi-source data with an encoder to output features of different dimensions; fusing the three feature paths into a comprehensive feature using self-adaptive fusion weights; constructing a decoder with a U-Net structure that outputs three predicted values; correcting the foreground with the three predicted values to output the final foreground; constructing a self-adaptive resolution adjustment mechanism that dynamically adjusts the image downsampling ratio; setting physical constraints to separate foreground and background; constructing an optical-flow-guided temporal smoothing mechanism to refine and filter image edge regions; inputting green curtain multi-source data into the inference pipeline to output a matting result; and performing quality evaluation and feedback updating on the matting result.
Inventors
- YANG JIANRU
Assignees
- 数智云库(北京)科技有限公司
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-01-27
Claims (10)
- 1. A self-adaptive green curtain image matting management method based on artificial intelligence, characterized by comprising the following steps: S100, acquiring multi-source data of a green curtain site with different sensors, performing hardware synchronization and spatial calibration during acquisition, and constructing an end-to-end self-adaptive matting neural network; S200, constructing three parallel encoder branches in the self-adaptive matting neural network using a deep residual network, processing the multi-source data with the encoder to output features of different dimensions, and assigning attention weights to the multi-source features for feature enhancement; S300, computing self-adaptive fusion weights for the three output feature paths, and fusing the three feature paths into a comprehensive feature using the self-adaptive fusion weights; S400, monitoring the performance of the matting neural network in real time, constructing a self-adaptive resolution adjustment mechanism, and dynamically adjusting the image downsampling ratio according to the deviation between the actual processing time and the target processing time; S500, setting physical constraints to separate foreground and background, constructing an optical-flow-guided temporal smoothing mechanism, and applying refinement filtering to image edge regions; S600, building a real-time inference pipeline in the self-adaptive matting neural network, inputting green curtain multi-source data into the inference pipeline to output a matting result, and performing quality evaluation and feedback updating on the matting result.
- 2. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 1, wherein the specific steps of performing hardware synchronization and spatial calibration when collecting the multi-source data in S100 are as follows: S101, acquiring a color image sequence I_rgb(x, y, t) of the green curtain scene with an HDR main camera, generating a pixel-by-pixel depth sequence D(x, y, t) with a ToF camera, and acquiring the reflection characteristics I_nir(x, y, t) of the green curtain material with an NIR camera, wherein x and y denote pixel spatial coordinates and t denotes the time frame; S102, hard-synchronizing all sensors with a genlock (synchronous phase-lock) signal and establishing a pixel-level mapping through checkerboard calibration; obtaining a four-channel tensor X(x, y, t) = [R, G, B, D] by time synchronization and spatial mapping of the acquired multi-source data, wherein R, G and B denote the three primary color channels of the color image and D denotes the spatial depth of the depth image; an end-to-end self-adaptive matting neural network is constructed, the self-adaptive matting network comprising an input layer, an encoder, a decoder and an output layer.
- 3. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 2, wherein the specific steps of assigning attention weights to the multi-source features for feature enhancement in S200 are as follows (see the encoder sketch following this claims list): S201, constructing three parallel encoder branches in the self-adaptive matting neural network with the deep residual network ResNet-50 as the backbone, the encoder branches comprising an RGB encoder, a depth encoder and a fusion encoder; for the RGB encoder, inputting the RGB color image, extracting color, texture and edge features, and outputting an RGB feature map of dimension 256×H/8×W/8, wherein 256 is the feature channel dimension and H/8 and W/8 denote the 8× downsampled output of ResNet-50; for the depth encoder, normalizing the spatial distance between foreground and background to [0, 1] for the input depth image, separating foreground from background, and outputting a depth feature map of dimension 128×H/8×W/8, wherein 128 is the output feature channel dimension; for the fusion encoder, channel-concatenating the input color image and depth image so that the RGB and depth features interact at an early stage, and outputting a concatenated feature map of dimension 192×H/8×W/8, wherein 192 is the output feature channel dimension; S202, assigning spatial attention weights in the RGB encoder branch according to the foreground region separated by the depth encoder branch; S203, enhancing the RGB feature map with the spatial attention weight map according to F'_rgb = F_rgb ⊙ A, wherein F'_rgb denotes the enhanced RGB feature map and ⊙ denotes element-wise multiplication of the RGB feature map with the spatial attention weight map A.
- 4. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 3, wherein the specific steps of correcting the foreground and outputting the final foreground using the three predicted values in S300 are as follows (an assumed composition is sketched after this claims list): S301, extracting a global pooling vector from the RGB feature map to obtain an environmental context vector, the environmental context vector covering environmental illumination and green curtain uniformity; weighting and summing the RGB feature map, the depth feature map and the concatenated feature map with the fusion weights to obtain a comprehensive feature map F_used; S302, the decoder adopts the skip connections of a U-Net structure and simultaneously outputs three predictions, namely an Alpha mask, a foreground residual and an edge confidence; for the Alpha mask prediction, the number of channels is 1, the activation function is Sigmoid, and the foreground transparency Alpha is output in the range [0, 1], where 0 is pure background, 1 is pure foreground, and values between 0 and 1 are semitransparent edges; for the foreground residual prediction, the number of channels is 3, the activation function is Tanh, the foreground colors are corrected, and the corrected RGB values are output by applying the residual correction to the original foreground in the semitransparent foreground regions; for the edge confidence prediction, the number of channels is 1, the activation function is Sigmoid, and edge reliability is marked; S303, combining the original foreground RGB values and the corrected RGB values with the foreground transparency to output the final foreground, wherein F_final denotes the final foreground, F_orig denotes the original foreground RGB values, Alpha denotes the foreground transparency predicted by the Alpha mask, R_fg denotes the corrected RGB values, and ⊙ denotes element-wise multiplication; the final foreground is output as the green curtain matting image.
- 5. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 4, wherein the step of dynamically adjusting the image downsampling ratio in S400 is as follows: S401, monitoring the performance of the matting neural network in real time, the performance comprising frame processing delay and CPU memory occupation, defining a target time for the frame processing delay, and constructing a self-adaptive resolution adjustment mechanism in which S_scale denotes the downsampling ratio of the input image, clip limits the value to the range [0.5, 1], λ denotes the adjustment coefficient, T_target denotes the target processing time and T_process denotes the measured frame processing time (an assumed update rule is sketched after this claims list); and computing the downsampling ratio of the input image of the self-adaptive matting neural network in real time with the self-adaptive resolution adjustment mechanism so as to update and adjust it.
- 6. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 5, wherein the specific steps of constructing the optical-flow-guided temporal smoothing mechanism in S500 to refine and filter the image edge regions are as follows (see the sketch following this claims list): S501, computing the average depth D_bg of the green curtain background from the depth image, presetting a tolerance threshold δ between the green curtain background and the foreground, and setting physical constraints to separate foreground and background, specifically: when D(x, y) < D_bg - δ, forcing the Alpha mask prediction to α = 1 and judging the pixel as foreground; when D(x, y) > D_bg + δ, forcing the Alpha mask prediction to α = 0 and judging the pixel as background; S502, constructing an optical-flow-guided temporal smoothing mechanism that takes the Alpha mask and the edge confidence from the decoder as inputs, and filtering the final foreground with the temporal smoothing mechanism.
- 7. The artificial-intelligence-based self-adaptive green curtain matting management method according to claim 6, wherein the specific steps of performing quality evaluation and feedback updating on the matting result in S600 are as follows: S601, building a real-time inference pipeline in the self-adaptive matting neural network, the pipeline specifically comprising: inputting the multi-source data, synchronizing the data, encoder feature extraction, self-adaptive fusion, multi-task decoding, physical constraints, temporal filtering and outputting the matting result; S602, collecting the standard deviation of the Alpha mask change across consecutive adjacent frames as the temporal jitter, presetting a standard-deviation threshold, and increasing the current-frame Alpha mask weight in the temporal smoothing mechanism by 0.1 when the temporal jitter exceeds that threshold (see the feedback sketch following this claims list).
- 8. A self-adaptive green curtain image matting management system based on artificial intelligence, characterized by comprising a data acquisition module, a feature extraction module, a multi-task decoding module, a result optimization module, an image matting output module and an evaluation updating module; the data acquisition module is used for acquiring multi-source data of the green curtain site with different sensors, performing hardware synchronization and spatial calibration during acquisition, and constructing an end-to-end self-adaptive image matting neural network; the feature extraction module is used for constructing three parallel encoder branches with a deep residual network, processing the multi-source data with the encoder to output features of different dimensions, assigning attention weights to the multi-source features for feature enhancement, computing self-adaptive fusion weights for the three output feature paths, and fusing the three feature paths into a comprehensive feature using the self-adaptive fusion weights; the multi-task decoding module is used for constructing a decoder with a U-Net structure, outputting three predicted values, and correcting and outputting the final foreground with the three predicted values; the result optimization module is used for monitoring the performance of the matting neural network in real time, constructing a self-adaptive resolution adjustment mechanism, and dynamically adjusting the image downsampling ratio according to the deviation between the actual processing time and the target processing time; the image matting output module is used for building a real-time inference pipeline in the self-adaptive image matting neural network, inputting green curtain multi-source data into the inference pipeline and outputting a matting result; and the evaluation updating module is used for performing quality evaluation and feedback updating on the matting result.
- 9. The artificial-intelligence-based self-adaptive green curtain matting management system according to claim 8, wherein the feature extraction module comprises an encoder unit, a feature enhancement unit and a feature fusion unit; the encoder unit is used for constructing three parallel encoder branches in the self-adaptive matting neural network with deep residual networks ResNet-50 as the backbones; the feature enhancement unit is used for enhancing the RGB feature map with the spatial attention weight map; and the feature fusion unit is used for computing self-adaptive fusion weights for the three output feature paths and fusing the three feature paths into a comprehensive feature using the self-adaptive fusion weights.
- 10. The artificial-intelligence-based self-adaptive green curtain matting management system according to claim 8, wherein the result optimization module comprises a self-adaptive resolution adjustment unit, a physical constraint unit and a temporal smoothing unit; the self-adaptive resolution adjustment unit is used for computing the downsampling ratio of the input image of the self-adaptive matting neural network in real time with the self-adaptive resolution adjustment mechanism so as to update and adjust it; the physical constraint unit is used for computing the average depth of the green curtain background from the depth image and presetting the tolerance threshold between the green curtain background and the foreground; and the temporal smoothing unit is used for constructing an optical-flow-guided temporal smoothing mechanism and filtering the final foreground with it.
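To make the feature-extraction side of claims 1 (S200, S300) and 3 concrete, the following PyTorch sketch shows a three-branch encoder with depth-guided spatial attention (F'_rgb = F_rgb ⊙ A) and softmax-normalized adaptive fusion weights. It is a minimal sketch under stated assumptions: the lightweight convolutional stacks stand in for the ResNet-50 backbones named in claim 3, and the pooled-context softmax fusion is one plausible realisation of the "adaptive fusion weights"; only the channel widths (256/128/192) and the 1/8 resolution come from the claim.

```python
# Minimal sketch of the three-branch encoder, spatial-attention enhancement
# (F'_rgb = F_rgb ⊙ A) and adaptive fusion weights of claims 1 and 3.
import torch
import torch.nn as nn


def conv_branch(in_ch: int, out_ch: int) -> nn.Sequential:
    """Stride-8 encoder stub standing in for a ResNet-50 branch."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(128, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    )


class ThreeBranchEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_enc = conv_branch(3, 256)     # colour / texture / edge features
        self.depth_enc = conv_branch(1, 128)   # normalised depth features
        self.fuse_enc = conv_branch(4, 192)    # channel-concatenated RGB-D input
        # spatial attention map predicted from the depth branch (foreground prior)
        self.attn = nn.Sequential(nn.Conv2d(128, 1, 1), nn.Sigmoid())
        # adaptive fusion: per-branch scalar weights from a pooled context vector
        self.fusion_weights = nn.Linear(256 + 128 + 192, 3)
        # project every branch to a common width before the weighted sum
        self.proj = nn.ModuleList([nn.Conv2d(c, 256, 1) for c in (256, 128, 192)])

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        f_rgb = self.rgb_enc(rgb)                          # 256 x H/8 x W/8
        f_d = self.depth_enc(depth)                        # 128 x H/8 x W/8
        f_cat = self.fuse_enc(torch.cat([rgb, depth], 1))  # 192 x H/8 x W/8

        a = self.attn(f_d)                                 # spatial attention A
        f_rgb = f_rgb * a                                  # F'_rgb = F_rgb ⊙ A

        ctx = torch.cat([f.mean(dim=(2, 3)) for f in (f_rgb, f_d, f_cat)], 1)
        w = torch.softmax(self.fusion_weights(ctx), dim=1)  # adaptive weights
        feats = [p(f) for p, f in zip(self.proj, (f_rgb, f_d, f_cat))]
        fused = sum(w[:, i, None, None, None] * feats[i] for i in range(3))
        return fused                                       # comprehensive feature map
```

A batch of RGB frames (B×3×H×W) and aligned depth maps (B×1×H×W) produces a single fused feature map at 1/8 resolution, which would then feed the U-Net decoder of claim 4.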
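Claim 4's decoder outputs three heads (Alpha mask, foreground residual, edge confidence), but its final-foreground composition formula is omitted from the published text. The sketch below therefore assumes the composition F_final = α ⊙ (F_orig + R_fg), i.e. the Tanh residual corrects the original foreground colours before alpha-weighting; treat that formula, and the 64-channel decoder width, as illustrative assumptions rather than the patent's verbatim equation.

```python
# Sketch of the three decoder heads of claim 4 and one *assumed* way of
# composing the final foreground from alpha, original colours and residual.
import torch
import torch.nn as nn


class MattingHeads(nn.Module):
    def __init__(self, in_ch: int = 64):
        super().__init__()
        self.alpha_head = nn.Sequential(nn.Conv2d(in_ch, 1, 3, padding=1), nn.Sigmoid())     # [0,1] transparency
        self.residual_head = nn.Sequential(nn.Conv2d(in_ch, 3, 3, padding=1), nn.Tanh())     # RGB colour correction
        self.edge_head = nn.Sequential(nn.Conv2d(in_ch, 1, 3, padding=1), nn.Sigmoid())      # edge reliability

    def forward(self, decoder_feat: torch.Tensor):
        return (self.alpha_head(decoder_feat),
                self.residual_head(decoder_feat),
                self.edge_head(decoder_feat))


def compose_foreground(f_orig: torch.Tensor, alpha: torch.Tensor,
                       residual: torch.Tensor) -> torch.Tensor:
    """Assumed composition: residual-corrected foreground weighted by alpha."""
    corrected = (f_orig + residual).clamp(0.0, 1.0)   # R_fg: corrected RGB values
    return alpha * corrected                          # element-wise, per claim 4
```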
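Claim 5's adaptive resolution formula is likewise missing from the published text; only the quantities are named (S_scale clipped to [0.5, 1], adjustment coefficient λ, target time T_target, measured time T_process). The sketch below assumes a simple proportional update driven by the relative deviation between the two times.

```python
# Assumed realisation of the adaptive resolution mechanism of claim 5:
# shrink the input resolution when frames run slow, grow it when they run fast.
def update_scale(s_scale: float, t_process: float, t_target: float,
                 lam: float = 0.3) -> float:
    deviation = (t_process - t_target) / t_target      # > 0 means too slow
    s_scale = s_scale * (1.0 - lam * deviation)        # proportional correction
    return min(1.0, max(0.5, s_scale))                 # clip to [0.5, 1]


# usage: resize the next frame to (round(H * s), round(W * s)) before inference
s = 1.0
s = update_scale(s, t_process=0.045, t_target=0.033)   # e.g. 45 ms vs a 33 ms target
```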
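The physical constraints and optical-flow-guided temporal smoothing of claim 6 can be sketched with NumPy and OpenCV as follows. The Farnebäck optical flow, the 0.7 current-frame weight and the 0.5 edge-confidence cutoff are illustrative choices; the claim itself only specifies the depth constraints (α forced to 1 below D_bg - δ and to 0 above D_bg + δ) and that the Alpha mask and edge confidence feed a flow-guided temporal filter.

```python
# Sketch of claim 6: depth-based physical constraints on the alpha matte and an
# optical-flow-guided temporal smoothing pass (inputs are float32 arrays).
import cv2
import numpy as np


def apply_depth_constraint(alpha: np.ndarray, depth: np.ndarray,
                           d_bg: float, delta: float) -> np.ndarray:
    """Force alpha=1 clearly in front of the screen, alpha=0 clearly behind it."""
    alpha = alpha.copy()
    alpha[depth < d_bg - delta] = 1.0   # foreground by physical distance
    alpha[depth > d_bg + delta] = 0.0   # background (the green screen plane)
    return alpha


def temporal_smooth(alpha_prev: np.ndarray, alpha_cur: np.ndarray,
                    gray_prev: np.ndarray, gray_cur: np.ndarray,
                    edge_conf: np.ndarray, cur_weight: float = 0.7) -> np.ndarray:
    """Warp the previous matte along dense optical flow and blend at edge pixels."""
    # flow from the current frame back to the previous one (for backward warping)
    flow = cv2.calcOpticalFlowFarneback(gray_cur, gray_prev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = alpha_cur.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    alpha_warp = cv2.remap(alpha_prev, map_x, map_y, cv2.INTER_LINEAR)
    blended = cur_weight * alpha_cur + (1.0 - cur_weight) * alpha_warp
    # only low-confidence edge pixels are smoothed; confident pixels keep alpha_cur
    return np.where(edge_conf < 0.5, blended, alpha_cur)
```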
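Finally, the quality-feedback loop of claim 7 tracks the standard deviation of frame-to-frame Alpha mask changes as temporal jitter and raises the current-frame weight of the temporal smoother by 0.1 when the jitter exceeds a preset threshold. The window length and threshold in this sketch are assumed values.

```python
# Sketch of the jitter-based feedback update of claim 7 (step S602).
from collections import deque
import numpy as np


class JitterFeedback:
    def __init__(self, threshold: float = 0.02, window: int = 10,
                 init_weight: float = 0.7):
        self.threshold = threshold
        self.deltas = deque(maxlen=window)   # recent mean |alpha_t - alpha_{t-1}|
        self.prev_alpha = None
        self.current_weight = init_weight    # current-frame weight in the smoother

    def update(self, alpha: np.ndarray) -> float:
        if self.prev_alpha is not None:
            self.deltas.append(float(np.mean(np.abs(alpha - self.prev_alpha))))
        self.prev_alpha = alpha
        if len(self.deltas) >= 2 and np.std(self.deltas) > self.threshold:
            # jitter above threshold: raise the current-frame weight by 0.1 (claim 7)
            self.current_weight = min(1.0, self.current_weight + 0.1)
        return self.current_weight
```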
Description
Self-adaptive green curtain image matting management system and method based on artificial intelligence

Technical Field

The invention relates to the technical field of neural networks, and in particular to an artificial-intelligence-based self-adaptive green curtain image matting management system and method.

Background

Traditional green curtain matting relies on fixed color-threshold segmentation in spaces such as HSV/YUV and places extremely high demands on a uniformly colored green curtain and stable lighting. In real scenes, wrinkles, shadows and reflections on the green curtain easily cause incomplete matting and rough edges, foregrounds containing green elements can be mistakenly keyed out, and dynamic foregrounds and complex lighting changes greatly reduce matting precision; manual retouching is time-consuming and inconsistent, and finishing a single frame can take several minutes. With the development of science, technology and culture, the demand for real-time, flexible matting in film and television production, live streaming, online education, virtual humans and other fields has grown. For example, live streaming requires millisecond-level background replacement, virtual production must adapt to moving camera positions and complex foregrounds, and mobile applications are limited by computing power and need lightweight schemes. Traditional chroma keying struggles to balance real-time performance and precision and cannot adapt its parameters to different scenes, which severely restricts application efficiency. With the development of artificial intelligence, deep learning can learn complex foreground and background features and achieve sub-pixel edge segmentation, solving detailed matting problems such as hair and semitransparent objects; improvements in GPU/TPU computing power and the growth of edge computing make real-time AI matting possible on the terminal side; and multi-source data fusion further enhances scene adaptability, providing the data and computing foundation for self-adaptive management and closed-loop optimization.

Disclosure of Invention

The invention aims to provide an artificial-intelligence-based self-adaptive green curtain image matting management system and method to solve the problems in the prior art.
In order to achieve the above purpose, the present invention provides the following technical solution: an artificial-intelligence-based self-adaptive green curtain matting management method comprising the following steps: S100, acquiring multi-source data of a green curtain site with different sensors, performing hardware synchronization and spatial calibration during acquisition, and constructing an end-to-end self-adaptive matting neural network. Further, the specific steps of performing hardware synchronization and spatial calibration when collecting the multi-source data are as follows: S101, acquiring a color image sequence I_rgb(x, y, t) of the green curtain scene with an HDR main camera, generating a pixel-by-pixel depth sequence D(x, y, t) with a ToF camera, and acquiring the reflection characteristics I_nir(x, y, t) of the green curtain material with an NIR camera, wherein x and y denote pixel spatial coordinates and t denotes the time frame; S102, hard-synchronizing all sensors with a genlock (synchronous phase-lock) signal and establishing a pixel-level mapping through checkerboard calibration according to the homogeneous-coordinate relation (u_rgb, v_rgb, 1)^T ≃ H · (u_d, v_d, 1)^T, wherein (u_d, v_d) denotes the pixel coordinates of the depth map, (u_rgb, v_rgb) denotes the pixel coordinates of the RGB image, and H denotes the 3×3 homography matrix computed by the checkerboard calibration method; obtaining a four-channel tensor X(x, y, t) = [R, G, B, D] by time synchronization and spatial mapping of the acquired multi-source data, wherein R, G and B denote the three primary color channels of the color image and D denotes the spatial depth of the depth image. Color images, depth data and reflection characteristics are acquired by the HDR main camera, the ToF camera and the NIR camera, covering multi-dimensional information such as color, spatial distance and material and addressing the matting errors caused by the insufficient data dimension of a single sensor. The genlock hard synchronization guarantees the temporal alignment of the multi-source data, and the checkerboard calibration establishes a pixel-level mapping, avoiding the feature-matching deviation caused by data misalignment and laying a precise foundation for the subsequent cross-modal feature fusion. An end-to-end self-adaptive matting neural network is constructed, the self-adaptive matting network comprising an input layer, an encoder, a decoder and an output layer.
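As an illustration of the S102 calibration step described above, the OpenCV sketch below estimates the 3×3 homography H from matched checkerboard corners, resamples the ToF depth map into the RGB pixel grid, and stacks the result into the four-channel tensor X = [R, G, B, D]. Function and variable names are illustrative; cv2.findHomography and cv2.warpPerspective are the standard OpenCV calls for this kind of mapping.

```python
# Sketch of the checkerboard homography calibration and four-channel fusion of S102.
import cv2
import numpy as np


def calibrate_and_fuse(depth_corners: np.ndarray, rgb_corners: np.ndarray,
                       depth_frame: np.ndarray, rgb_frame: np.ndarray) -> np.ndarray:
    """depth_corners/rgb_corners: matched checkerboard corners, shape (N, 2) each."""
    # H maps depth-map pixel coordinates (u_d, v_d) onto RGB coordinates (u_rgb, v_rgb)
    H, _ = cv2.findHomography(depth_corners, rgb_corners, cv2.RANSAC)
    h, w = rgb_frame.shape[:2]
    # resample the depth map into the RGB camera's pixel grid
    depth_aligned = cv2.warpPerspective(depth_frame.astype(np.float32), H, (w, h))
    # four-channel tensor X(x, y) = [R, G, B, D]
    return np.dstack([rgb_frame.astype(np.float32) / 255.0,
                      depth_aligned[..., None]])
```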