CN-122023350-A - Training method for image quality evaluation model, image quality evaluation method, device, electronic apparatus, storage medium, and computer program product


Abstract

The present disclosure provides a training method of an image quality evaluation model, an image quality evaluation method, an apparatus, an electronic device, a storage medium, and a computer program product. The training method comprises: obtaining a training image dataset comprising an image pair, wherein the image pair comprises two similar images and information about the relative preference relationship of the two images; inputting the image pair of the training image dataset into the image quality evaluation model to obtain respective scores for the two images of the image pair; determining a reward for each score based on the consistency between the comparison relationship between the scores for the two images and the relative preference relationship of the two images; and adjusting parameters of the image quality evaluation model based on the reward for each score.

Inventors

YUAN YANLONG
GONG JIACHAO
SUN MING
ZHOU CHAO

Assignees

Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-30

Claims (11)

1. A training method of an image quality evaluation model, comprising: acquiring a training image dataset comprising an image pair, wherein the image pair comprises two similar images and information about the relative preference relationship of the two images of the image pair; inputting the image pair of the training image dataset into the image quality evaluation model to obtain respective scores for the two images of the image pair; determining a reward for each score based on the consistency between a comparison relationship between the scores for the two images and the relative preference relationship of the two images; and adjusting parameters of the image quality evaluation model based on the reward for each score.
2. The method of claim 1, wherein the relative preference relationship of the two images of the image pair is determined based on weighted scores for a plurality of quality dimensions of the two images, wherein the plurality of quality dimensions includes at least two of the following image quality dimensions: sharpness, texture, artifacts, color, and brightness.
3. The method of claim 1, wherein the image quality evaluation model is a multimodal large language model based on a group relative policy optimization (GRPO) method and is configured to score a first image in the image pair to obtain at least one first score for the first image and to score a second image in the image pair to obtain at least one second score for the second image.
4. The method of claim 3, wherein determining the reward for each score based on the consistency between the comparison relationship between the scores for the two images and the relative preference relationship of the two images comprises: comparing each of the at least one first score with each of the at least one second score, and determining a reward corresponding to each of the at least one first score based on the consistency of the comparison result with the relative preference relationship of the two images; and comparing each of the at least one second score with each of the at least one first score, and determining a reward corresponding to each of the at least one second score based on the consistency of the comparison result with the relative preference relationship of the two images.
5. The method of claim 1, wherein obtaining a score for each of the two images of the image pair comprises: resampling a plurality of scores for each image based on the mean and standard deviation of the plurality of scores to obtain a score having a predetermined accuracy.
6. An image quality evaluation method, comprising: acquiring an image to be evaluated; and inputting the image to be evaluated into an image quality evaluation model trained using the training method of any one of claims 1-5 to obtain a score for the image to be evaluated.
7. An apparatus for training an image quality evaluation model, comprising: a training data acquisition unit configured to acquire a training image dataset including an image pair, wherein the image pair includes two similar images and information about the relative preference relationship of the two images of the image pair; a scoring unit configured to input the image pair of the training image dataset into the image quality evaluation model to obtain a score for each of the two images of the image pair; a reward determination unit configured to determine a reward for each score based on the consistency between a comparison relationship between the scores for the two images and the relative preference relationship of the two images; and a parameter adjustment unit configured to adjust parameters of the image quality evaluation model based on the reward for each score.
8. An image quality evaluation device, comprising: an image acquisition unit configured to acquire an image to be evaluated; and an evaluation unit configured to input the image to be evaluated into an image quality evaluation model trained using the training method of any one of claims 1-5 to obtain a score for the image to be evaluated.
9. An electronic device, comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the training method of the image quality evaluation model of any one of claims 1 to 5 or to perform the image quality evaluation method of claim 6.
10. A computer-readable storage medium storing instructions which, when executed by at least one processor, cause the at least one processor to perform the training method of the image quality evaluation model of any one of claims 1 to 5, or to perform the image quality evaluation method of claim 6.
11. A computer program product, characterized in that instructions in the computer program product, when executed by at least one processor, perform the training method of the image quality evaluation model according to any one of claims 1 to 5 or perform the image quality evaluation method according to claim 6.
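
The cross-comparison recited in claim 4 can be illustrated with a short Python sketch. It is only one possible reading, not the claimed implementation: the function names, the choice of "fraction of consistent comparisons" as the reward value, and the treatment of ties as inconsistent are all assumptions, since the claims do not fix them. The group-normalized advantage at the end follows common GRPO practice (reward minus group mean, divided by group standard deviation) and is likewise not stated in the claims.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def pairwise_rewards(
    first_scores: List[float],
    second_scores: List[float],
    first_preferred: bool,
) -> Tuple[List[float], List[float]]:
    """Reward each sampled score by how often it agrees with the preference label.

    Assumption: a score's reward is the fraction of its cross-comparisons
    that are consistent with the label; ties count as inconsistent.
    """

    def agrees(a: float, b: float) -> bool:
        # a is a score for the first image, b is a score for the second image.
        return a > b if first_preferred else a < b

    # Each first score is compared with every second score.
    first_rewards = [
        mean(1.0 if agrees(a, b) else 0.0 for b in second_scores)
        for a in first_scores
    ]
    # Each second score is compared with every first score.
    second_rewards = [
        mean(1.0 if agrees(a, b) else 0.0 for a in first_scores)
        for b in second_scores
    ]
    return first_rewards, second_rewards


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one sampled group (assumed GRPO-style step)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical sampled scores for one image pair; the first image is preferred.
first_rewards, second_rewards = pairwise_rewards(
    first_scores=[4.2, 3.1, 3.9],
    second_scores=[3.5, 3.8],
    first_preferred=True,
)
advantages = group_relative_advantages(first_rewards)
```

In this example the first-image score 3.1 falls below both second-image scores, so it receives the lowest reward and a negative advantage, while 4.2 and 3.9 are rewarded for agreeing with the label.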

Description

Training method for image quality evaluation model, image quality evaluation method, device, electronic apparatus, storage medium, and computer program product

Technical Field

The present disclosure relates to the field of computer vision, and more particularly, to a training method of an image quality evaluation model, an image quality evaluation method, an apparatus, an electronic device, a storage medium, and a computer program product.

Background

Image quality assessment (IQA) is an important technology in the field of computer vision. Its core objective is to simulate how the human visual system perceives the sharpness, naturalness, and overall look and feel of images or videos, so as to produce a quantitative judgment of the quality of visual content. In general, this process can be described as follows: given a video frame or image, multiple viewers are invited to score it subjectively (e.g., 1-5 points) based on established evaluation criteria (e.g., sharpness, distortion level, etc.), thereby obtaining a subjective mean opinion score (MOS) for the visual content. A quality evaluation algorithm aims to achieve automatic, objective quality scoring by constructing a computational model that learns and fits the complex mapping from the visual signal to the subjective perception score. In recent years, multimodal large models (MMLM) have been used to construct accurate and robust IQA models. Such models have strong general feature extraction capability, rich semantic understanding capability, and the capability to model complex patterns, and can significantly improve generalization to diverse and unknown distortion types.
Meanwhile, the rich knowledge contained in a large model and its deep understanding of image content provide a potential basis for explaining the model's predictions, and help to clarify the grounds for a specific quality judgment made by the model. However, conventional image quality evaluation models based on multimodal large models (MMLM) generally suffer from a small base-model training scale, weak generalization, poor ability to discriminate between paired images that differ only subtly, and low quality-scoring precision.

Disclosure of Invention

According to a first aspect of embodiments of the present disclosure, there is provided a training method of an image quality evaluation model, comprising: acquiring a training image dataset comprising an image pair, wherein the image pair comprises two similar images and information about the relative preference relationship of the two images of the image pair; inputting the image pair of the training image dataset into the image quality evaluation model to obtain respective scores for the two images of the image pair; determining a reward for each score based on the consistency between the comparison relationship between the scores for the two images and the relative preference relationship of the two images; and adjusting parameters of the image quality evaluation model based on the reward for each score.

According to the first aspect of embodiments of the present disclosure, the relative preference relationship of the two images of the image pair is determined from weighted scores for a plurality of quality dimensions of the two images, wherein the plurality of quality dimensions includes at least two of the following image quality dimensions: sharpness, texture, artifacts, color, and brightness.
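
As an illustration of the two ideas above (a preference label derived from weighted quality-dimension scores, and a reward based on whether the model's scores agree with that label), a minimal Python sketch follows. Everything beyond the five dimension names is an assumption made for the example: the weight values, the convention that a higher dimension score means better quality, the tie handling, and the 0/1 reward values are not specified in the disclosure.

```python
from typing import Dict

# Hypothetical weights: the disclosure names the dimensions but gives no values.
WEIGHTS: Dict[str, float] = {
    "sharpness": 0.30,
    "texture": 0.20,
    "artifacts": 0.20,
    "color": 0.15,
    "brightness": 0.15,
}


def weighted_quality(dim_scores: Dict[str, float]) -> float:
    """Weighted average over the annotated dimensions (higher is assumed better)."""
    used = [d for d in dim_scores if d in WEIGHTS]
    if len(used) < 2:
        raise ValueError("at least two quality dimensions are required")
    total_weight = sum(WEIGHTS[d] for d in used)
    return sum(WEIGHTS[d] * dim_scores[d] for d in used) / total_weight


def relative_preference(first: Dict[str, float], second: Dict[str, float]) -> int:
    """Return 1 if the first image is preferred, -1 if the second, 0 on a tie."""
    qa, qb = weighted_quality(first), weighted_quality(second)
    return (qa > qb) - (qa < qb)


def consistency_reward(score_first: float, score_second: float, preference: int) -> float:
    """1.0 when the ordering of model scores matches the label, else 0.0.

    Assumption: pairs labelled as ties (preference == 0) give no reward.
    """
    if preference == 0:
        return 0.0
    predicted = (score_first > score_second) - (score_first < score_second)
    return 1.0 if predicted == preference else 0.0


# Hypothetical per-dimension annotations for two similar images.
image_a = {"sharpness": 4.0, "texture": 3.5, "color": 4.0}
image_b = {"sharpness": 3.0, "texture": 3.5, "color": 4.5}

label = relative_preference(image_a, image_b)
reward_consistent = consistency_reward(3.9, 3.4, label)
reward_inconsistent = consistency_reward(3.2, 3.4, label)
```

Here image_a wins on the weighted score because sharpness carries the largest hypothetical weight, so a model that scores it above image_b is rewarded and one that scores it below is not.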
According to the first aspect of embodiments of the present disclosure, the image quality evaluation model is a multimodal large language model based on a Group Relative Policy Optimization (GRPO) method and is configured to score a first image of the image pair to obtain at least one first score for the first image and to score a second image of the image pair to obtain at least one second score for the second image.

According to the first aspect of embodiments of the present disclosure, determining the reward for each score based on the consistency between the comparison relationship between the scores for the two images and the relative preference relationship of the two images includes: comparing each of the at least one first score with each of the at least one second score, and determining a reward corresponding to each of the at least one first score based on the consistency of the comparison result with the relative preference relationship of the two images; and comparing each of the at least one second score with each of the at least one first score, and determining a reward corresponding to each of the at least one second score based on the consistency of the comparison result with the relative preference relationship of the two images.

According to the first aspect of embodiments of the present disclosure, obtaining the scores for each of the two images of the image pair includes resampling the plurality of scores for each image based