CN-122023716-A - Geometric representation and inverse transformation restoration method of image and related equipment


Abstract

The invention belongs to the technical field of digital images and relates to a method for geometric representation and inverse-transformation restoration of an image, and to related equipment. The method comprises: converting an input image into a standard grid and defining a uniform source measure; constructing a target measure function based on attribute information of the image; transporting the source measure to the target measure through an optimal transmission (optimal transport) mapping to generate an optimal transmission grid; and calculating the Beltrami coefficient field of the mapping as the geometric representation of images with different dimensions. During reconstruction, the quasi-conformal mapping is solved from the coefficient field to rebuild the grid, and the original image is recovered according to the local area proportion of the grid. The invention also provides an implementation that uses a Fourier neural operator to solve the mapping quickly and constructs an end-to-end geometric variational autoencoder. The technical scheme achieves high-fidelity image encoding and decoding, can effectively eliminate ghosting (double images) in tasks such as image interpolation and generation, and preserves structural continuity.
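
As a rough illustration of the measure construction and the area-based recovery summarized above, the following Python sketch builds a discrete target measure from pixel intensities and then inverts it. It is a minimal sketch under stated assumptions, not the patented implementation: the function names `target_measure` and `intensity_from_area`, the min-max normalization, the value of the positive offset `eps`, and the premise that the local area proportion at a vertex equals the target measure there are all assumptions made for illustration. The optimal transmission and quasi-conformal solves are omitted.

```python
import numpy as np


def target_measure(intensity, eps=1e-2):
    """Turn pixel intensities into a discrete probability measure.

    Assumption: min-max normalization followed by a positive offset `eps`
    so that zero-intensity regions still receive positive mass.
    """
    img = np.asarray(intensity, dtype=float)
    span = img.max() - img.min()
    norm = (img - img.min()) / span if span > 0 else np.zeros_like(img)
    weights = norm + eps
    return weights / weights.sum()


def intensity_from_area(area_fraction, eps=1e-2):
    """Invert `target_measure` for a non-constant image.

    Assumption: area_fraction[i] equals the target measure at vertex i.
    The largest normalized intensity is 1 by construction, which fixes
    the unknown normalization constant.
    """
    nu = np.asarray(area_fraction, dtype=float)
    scale = (1.0 + eps) / nu.max()
    return nu * scale - eps


# Round trip on a tiny 2x2 "image" (hypothetical values).
image = np.array([[0.0, 50.0], [100.0, 200.0]])
nu = target_measure(image)
recovered = intensity_from_area(nu)
print(recovered)
```

The round trip returns the min-max normalized intensities rather than the raw pixel values, so a real pipeline would also need to store the original intensity range if absolute values matter.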

Inventors

ZENG WEI
XU ZONGBEN
CHEN XIN
ZHAO QI
YANG YIJUN
MAO YAOXIN

Assignees

Xi'an Jiaotong University (西安交通大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-19

Claims (10)

1. A method of geometrically representing an image, comprising: processing an input image into a standard grid, defining a source measure, and converting attribute information of the image into a target measure function on the standard grid; based on the source measure and the target measure function, solving with an optimal transmission solving model of a Fourier neural operator to obtain an optimal transmission mapping, and converting the standard grid into an optimal transmission grid according to the optimal transmission mapping; and calculating a deformation field from the standard grid to the optimal transmission grid, representing the deformation field as a quasi-conformal mapping, and calculating a Beltrami coefficient field corresponding to the quasi-conformal mapping, wherein the Beltrami coefficient field forms the geometric representation of images with different dimensions.
2. The geometric representation method of an image according to claim 1, wherein processing the input image into a standard grid, defining the source measure, and converting the attribute information of the image into a target measure function on the grid specifically comprises: gridding the input image, setting feature points of the attribute information of the image as grid vertices, and constructing a standard triangular grid from the grid vertices; defining the uniform probability distribution on the standard triangular grid as the source measure; normalizing the attribute information of the input image and defining it as a specific function on the image domain; constructing the target measure function on the standard grid based on the specific function on the image domain; and, before constructing the target measure function on the standard grid, adding a positive hyperparameter to handle zero-intensity regions of the image; wherein the attribute information of the image includes at least one of pixel intensity, gradient, curvature, texture, and depth of the image.
3. The geometric representation method of an image according to claim 1, wherein obtaining the optimal transmission mapping by solving with the optimal transmission solving model of the Fourier neural operator based on the source measure and the target measure function, and converting the standard grid into the optimal transmission grid according to the optimal transmission mapping, specifically comprises: based on the source measure and the target measure function, concatenating the target measure function and the coordinates of the standard grid along the channel dimension to form an input tensor, inputting the tensor into the optimal transmission solving model of a pre-trained Fourier neural operator, and mapping the tensor from a low-dimensional space to a high-dimensional latent feature space to obtain the optimal transmission mapping; and applying the optimal transmission mapping to each vertex of the standard grid and forming the optimal transmission grid through an energy optimization method.
4. The geometric representation method of an image according to claim 1, wherein the Beltrami coefficient field is obtained as follows: for each triangular patch in the standard grid, calculating the Beltrami coefficient of the triangular patch from the coordinates of its vertices in the standard grid and in the optimal transmission grid; for each vertex of the standard grid, taking the average of the Beltrami coefficients of all triangular patches adjacent to the vertex as the Beltrami coefficient of the vertex; and combining the Beltrami coefficients of all vertices of the standard grid to form a Beltrami coefficient field over the image domain.
5. An inverse image transformation restoration method based on geometric representation, comprising: according to the geometric representation of images with different dimensions, calculating the corresponding quasi-conformal mapping by solving a linear Beltrami equation, and transforming the standard grid into a reconstructed quasi-conformal grid by the quasi-conformal mapping; and calculating the local area at each vertex of the reconstructed quasi-conformal grid, computing the attribute information of the image at each vertex from the local area through the inverse operation of the target measure function, and recovering the reconstructed image.
6. An image processing method, implemented on the basis of the geometric representation method of an image and the inverse image transformation restoration method based on geometric representation according to claim 1 or 5, comprising: taking the geometric representations of images with different dimensions as inputs to an autoencoder model of the image; performing interpolation or other operations on the geometric representations corresponding to the images with different dimensions in the latent space of the autoencoder model to obtain an interpolation or operation result; and decoding the interpolation or operation result into an output image by the inverse image transformation restoration method based on geometric representation.
7. An image geometric representation system for performing the geometric representation method of an image according to any one of claims 1-4, comprising: a preprocessing module for processing the input image into a standard grid, defining a source measure, and converting attribute information of the image into a target measure function on the grid; a mapping calculation module for solving with a Fourier neural operator based on the source measure and the target measure function to obtain an optimal transmission mapping, and converting the standard grid into an optimal transmission grid according to the optimal transmission mapping; and a geometric feature extraction module for calculating a deformation field from the standard grid to the optimal transmission grid, representing the deformation field as a quasi-conformal mapping, and calculating a Beltrami coefficient field corresponding to the quasi-conformal mapping, wherein the Beltrami coefficient field serves as the geometric representation of images with different dimensions.
8. An image restoration system for performing the inverse image transformation restoration method based on geometric representation of claim 5, comprising: a grid reconstruction module for calculating the corresponding quasi-conformal mapping by solving a linear Beltrami equation according to the geometric representation of input images with different dimensions, and transforming the standard grid into a reconstructed quasi-conformal grid by the quasi-conformal mapping; and an image recovery module for calculating the local area at each vertex of the reconstructed quasi-conformal grid, computing the attribute information at each vertex from the local area through the inverse operation of the target measure function, mapping the attribute information at the vertices to the image pixel grid, and recovering the reconstructed image.
9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the geometric representation method of an image according to any one of claims 1-4.
10. An image processing apparatus, comprising: one or more processors; and a memory storing computer program instructions; wherein the computer program instructions, when executed by the one or more processors, cause the apparatus to perform the geometric representation method of an image as claimed in any one of claims 1-4, or to implement the inverse image transformation restoration method based on geometric representation as claimed in claim 5, or to run the image processing method as claimed in claim 6.
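
The per-triangle and per-vertex computation described in claim 4 can be sketched in Python as follows. This is a minimal illustration, assuming the mapping is affine on each triangle, so that f(z) = a*z + b*conj(z) + c and the Beltrami coefficient of the triangle is mu = b / a. The function names `triangle_beltrami` and `vertex_beltrami` and the NumPy array layout are hypothetical and are not taken from the patent.

```python
import numpy as np


def triangle_beltrami(src, dst, faces):
    """Beltrami coefficient of the piecewise-linear map src -> dst per triangle.

    src, dst: (V, 2) float arrays of vertex coordinates.
    faces:    (F, 3) int array of vertex indices.
    """
    z = src[:, 0] + 1j * src[:, 1]  # standard-grid vertices as complex numbers
    w = dst[:, 0] + 1j * dst[:, 1]  # mapped vertices as complex numbers
    z0, z1, z2 = (z[faces[:, k]] for k in range(3))
    w0, w1, w2 = (w[faces[:, k]] for k in range(3))
    e1, e2 = z1 - z0, z2 - z0       # source edge vectors
    d1, d2 = w1 - w0, w2 - w0       # image edge vectors
    # Solve d_k = a * e_k + b * conj(e_k) for a = f_z and b = f_zbar.
    det = e1 * np.conj(e2) - e2 * np.conj(e1)
    a = (d1 * np.conj(e2) - d2 * np.conj(e1)) / det
    b = (e1 * d2 - e2 * d1) / det
    return b / a


def vertex_beltrami(mu_face, faces, n_vertices):
    """Average the coefficients of the triangles adjacent to each vertex."""
    total = np.zeros(n_vertices, dtype=complex)
    count = np.zeros(n_vertices)
    for k in range(3):
        np.add.at(total, faces[:, k], mu_face)
        np.add.at(count, faces[:, k], 1.0)
    return total / np.maximum(count, 1.0)


# Unit square split into two triangles, stretched horizontally by 2.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
faces = np.array([[0, 1, 2], [0, 2, 3]])
dst = src * np.array([2.0, 1.0])
mu_vertex = vertex_beltrami(triangle_beltrami(src, dst, faces), faces, len(src))
print(mu_vertex)
```

For this horizontal stretch, f(z) = 1.5*z + 0.5*conj(z), so every triangle and every vertex receives mu = 1/3, while the identity map gives mu = 0. A magnitude below 1 indicates that the triangle keeps its orientation.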

Description

Geometric representation and inverse transformation restoration method of image and related equipment

Technical Field

The invention belongs to the technical field of digital images, and relates to a geometric representation and inverse transformation restoration method of an image and related equipment.

Background

In the fields of computer vision and machine learning, an image is commonly represented as intensity or color values on a grid of pixels. This discrete, pixel-based representation is the direct input to various prior-art learning models (e.g., convolutional neural networks). However, such pixel-based representations have inherent technical limitations. First, a single pixel contains only local information, so it is difficult to capture the global structure and context of an image effectively. Second, pixel-based statistics may lose fine texture and internal structure in the image.

To overcome the locality of pixel representations, the prior art (e.g., the convolutional neural network, CNN) expands the receptive field by stacking convolutional layers in an attempt to capture a broader context. However, this approach learns structure only indirectly and implicitly and does not fundamentally solve the problem: the learned features still depend heavily on the local neighborhood of each pixel.

Other image representation methods exist in the prior art, such as the Fourier transform or the wavelet transform. Although these methods can capture the global frequency information of an image, their localization in the spatial domain is poor, and they have difficulty describing and retaining the local geometric characteristics of the image accurately.

In recent years, deep learning models have been able to learn powerful hierarchical features from raw pixels, but these learned features remain deeply entangled with photometric properties of the image (e.g., brightness, contrast). In these representations, the intrinsic geometry of the image is treated as an implicit, uncontrollable by-product rather than a primary entity that can be directly analyzed, manipulated, and retained. This entanglement of photometry and geometry makes it difficult to ensure the continuity and integrity of the intrinsic structure of the image, and artifacts or structural distortions arise easily when performing image transformation, editing, or generation tasks.

In addition, other coordinate-based image representation methods have emerged in the prior art, such as implicit neural representations and two-dimensional Gaussian splatting. These methods model the image as a continuous function given by a neural network or by Gaussian primitives. While they perform well in image fitting and compression, they essentially "remember" or "fit" the appearance of an image implicitly through network weights. Such a representation still lacks an explicit and controllable description of the geometry within the image. More importantly, when performing image transformation or interpolation, they do not provide the theoretical guarantees of preserving topological structure and geometric continuity that diffeomorphism-based methods provide; their ability to preserve structure depends entirely on black-box network learning results and lacks interpretability and stability.

Therefore, how to decouple the representation of an image from its photometric appearance, and to extract a more intrinsic representation that focuses on the internal geometry of the image and provides explicit structural continuity and integrity with theoretical guarantees during transformation and generation, is a technical problem that currently needs to be solved in the field of image processing and computer vision.
Disclosure of Invention

The invention aims to overcome the shortcomings of prior-art image representation methods and provides a representation method capable of decoupling the appearance of an image from its internal geometry. In particular, the invention addresses how to non-invasively and reversibly transform attribute information of an image into a geometric representation that preserves topology and can be used efficiently by a machine learning model.

The invention is realized by the following technical scheme:

A geometric representation method of an image, comprising:

processing the input image into a standard grid, defining a source measure, and converting attribute information of the image into a target measure function on the standard grid, wherein the attribute information comprises pixel intensity or other attributes derived from the original image;

based on the source measure and the target measure function, solving with an optimal transmission solving model of a Fourier neural operator to obtain an optimal transmission mapping, converting the standard grid into an initial optimal transmission grid according to the optimal transmission mapping, and match