JP-7856381-B2 - Image storage system for images with overlapping sections

JP7856381B2JP 7856381 B2JP7856381 B2JP 7856381B2JP-7856381-B2

Inventors

フ、ロン
ワン、ウェン
イン、ジアンドン
シェン、ハオ
ジャン、カン
ユ、チュアンチン

Assignees

インターナショナル・ビジネス・マシーンズ・コーポレーション

Dates

Publication Date: 20260511
Application Date: 20221013
Priority Date: 20211021

Claims (20)

In the step where multiple processor units identify a benchmark image within a group of similar images, where multiple other images within the group of similar images that are not identified as the benchmark image constitute a set of similar images; A computer implementation method for managing an image storage space, comprising the steps of: the plurality of processor units creating an image mapping tree, wherein the image mapping tree has a root block for the benchmark image and blocks arranged in a set of layers below the root block based on the set of similar images; the blocks represent portions of the benchmark image ; and a plurality of lower blocks within the block in the lower layer correspond to sub-partitions within the upper block within the block in the upper layer; and the plurality of processor units storing a set of selected blocks in the set of similar images, wherein the set of selected blocks in the set of similar images has differences from the set of blocks for the benchmark image located at corresponding locations in the image mapping tree; and the plurality of processor units storing metadata for the set of selected blocks in the set of similar images, describing a set of paths in the image mapping tree from the set of blocks at corresponding locations in the image mapping tree to the root block.
The computer implementation method according to claim 1 , further comprising the steps of: the plurality of processor units storing in the image mapping tree a new set of blocks for a new image, wherein the set of new blocks for the new image is different from the set of blocks at corresponding locations in the benchmark image ; and the plurality of processor units storing in the image mapping tree new metadata for the set of new blocks that describes a new set of paths from the set of blocks at corresponding locations in the image mapping tree to the root block for the benchmark image.
The computer implementation method according to claim 1 or 2, further comprising the step of the plurality of processor units storing the new set of images using the image mapping tree when the new set of images is within the maximum threshold of images for the image mapping tree.
The computer implementation method according to claim 3, further comprising the step of the plurality of processor units recreating the image mapping tree using the new set of images and the images in the similar image group when the new set of images is equal to or exceeds a new image threshold.
The steps by which the multiple processor units create the image mapping tree are as follows : The computer implementation method according to claim 1 or 2, wherein the plurality of processor units divide the benchmark image to form sub-parts within the benchmark image; and create upper layers of upper blocks within the lower layers of the root block, wherein each upper block within the upper block has a sub-part within the sub-parts within the benchmark image.
The steps by which the multiple processor units create the image mapping tree are as follows : The step of the plurality of processor units determining the number of similar images in the set of similar images that have similar blocks different from the corresponding upper blocks in the upper blocks, in order to form a set of corresponding upper blocks; The computer implementation method according to claim 5, further comprising the steps of: the plurality of processor units dividing each corresponding upper block in a set of corresponding upper blocks into the sub-partitions; and the plurality of processor units creating lower layers of lower blocks, wherein each lower block in the lower block further comprises a sub-partition in the sub-partitions in the set of corresponding upper blocks.
For the purpose of creating subsequent lower layers, the lower block becomes the upper block. The computer implementation method according to claim 6, further comprising the step of forming the subsequent lower layer by repeatedly performing the determination step, the division step and the creation step of the plurality of processor units.
The computer implementation method according to claim 6, further comprising the step of the plurality of processor units repeatedly performing the determination step, the division step, and the creation step to form another lower layer based on the storage to be reduced and the processing resources used.
The computer implementation method according to claim 6, wherein the number of lower blocks formed by dividing the upper block is based on the number of similar images having similar blocks for the upper block being divided.
The computer implementation method according to claim 1 or 2, wherein the metadata comprises a set of image identifiers for the selected set of blocks, a reference to the benchmark image, and a set of paths in the image mapping tree from the corresponding set of blocks in the image mapping tree to the root block.
The step in which the plurality of processor units identify the benchmark image within the similar image group is: The computer implementation method according to claim 1 or 2, wherein the plurality of processor units have a step of identifying the benchmark image from the plurality of similar images based on the minimum overall difference between the plurality of similar images.
The computer implementation method according to claim 11, wherein the minimum overall difference is determined using at least one of the following methods: RGB color histogram method, keypoint matching method, image hash method, or bit calculation method.
The stage in which multiple processor units receive new images to store; A step in which multiple processor units store a new set of blocks for the new image using an image mapping tree having a root block for a benchmark image and blocks arranged in a set of layers below the root block based on a set of similar images; where the set of new blocks for the new image is different from the set of blocks in the corresponding location in the image mapping tree, the blocks in the lower layers represent portions of the benchmark image, and multiple lower blocks within the block in the lower layer correspond to sub-partitions within the upper block within the block in the upper layer; and A computer implementation method for managing an image storage space, comprising the step of the plurality of processor units storing in the image mapping tree new metadata for the new set of blocks that describes a new set of paths from the set of blocks in corresponding locations in the image mapping tree to the root block for the benchmark image.
A computer system comprising multiple processor units, wherein the multiple processor units are: The procedure involves the plurality of processor units identifying a benchmark image within a group of similar images, wherein any other images within the group of similar images that are not identified as the benchmark image constitute a set of similar images; A computer system in which the plurality of processor units execute program instructions for performing the steps of: creating an image mapping tree, wherein the image mapping tree has a root block for the benchmark image and blocks arranged in a set of layers below the root block based on the set of similar images; the blocks represent portions of the benchmark image; and a plurality of lower blocks within the block in the lower layer correspond to sub-parts within the upper block within the block in the upper layer; and storing a set of selected blocks in the set of similar images , wherein the set of selected blocks in the set of similar images has differences from the set of blocks for the benchmark image located at corresponding locations in the image mapping tree; and storing metadata for the set of selected blocks in the set of similar images , which describes a set of paths in the image mapping tree from the set of blocks at corresponding locations in the image mapping tree to the root block.
The computer system according to claim 14, further comprising: a step of the plurality of processor units storing in the image mapping tree a new set of blocks for a new image , wherein the set of new blocks for the new image is different from the set of blocks at corresponding locations in the benchmark image ; and a step of the plurality of processor units storing in the image mapping tree new metadata for the set of new blocks that describes a new set of paths from the set of blocks at corresponding locations in the image mapping tree to the root block for the benchmark image.
The computer system according to claim 14 or 15, further comprising the steps of the plurality of processor units storing the new set of images using the image mapping tree when the new set of images is within the maximum threshold of images for the image mapping tree.
The computer system according to claim 16, further comprising the steps of the plurality of processor units to recreate the image mapping tree using the new set of images and the images in the similar image group when the new set of images is equal to or exceeds a new image threshold.
The procedure by which the multiple processor units create the image mapping tree is as follows : The computer system according to claim 14 or 15, wherein the plurality of processor units have the steps of dividing the benchmark image to form sub-partitions within the benchmark image; and creating upper layers of upper blocks within the lower layers of the root block, wherein each upper block within the upper block has a sub-partition within the sub-partitions within the benchmark image.
The procedure by which the multiple processor units create the image mapping tree is as follows : A procedure for determining the number of similar images in a set of similar images that have similar blocks different from the corresponding upper blocks within the upper blocks, in order for the plurality of processor units to form a set of corresponding upper blocks; The computer system according to claim 18, wherein the plurality of processor units further comprises the steps of dividing each corresponding upper block in a corresponding set of upper blocks into the sub-partitions; and creating lower layers of lower blocks, wherein each lower block in the lower block further comprises the sub-partitions in the sub-partitions in the corresponding set of upper blocks.
For the purpose of creating subsequent lower layers, the lower block becomes the upper block. The computer system according to claim 19, wherein the plurality of processor units further have a procedure for forming the subsequent lower layers by repeatedly performing the determination procedure, the division procedure and the creation procedure.

Description

This invention generally relates to improvements to computer systems, and more specifically, to a computer system for efficiently storing images with overlapping portions. With the development of the internet and social networking platforms, a vast amount of information is being stored. Images can constitute a large proportion of the information stored on social networking platforms. For example, users of social media platforms can upload a very large number of images, including over 1.8 billion images per day. The storage of images on the internet can be used for creating training datasets for image recognition training of artificial intelligence systems, and for other purposes where a large number of images may be required. Storing these images requires a vast amount of storage space and incurs significant costs. To manage image storage space and reduce the amount of storage required, compression techniques can be applied to images. Data compression applied to digital images reduces their size, potentially lowering the cost of storage or transmission. Image compression can be either lossy or lossless. Lossless compression is preferred, but it may not always produce the desired image size. Furthermore, when managing image storage space, duplicate images may be identified. Identifying duplicate images allows for deduplication to be performed to remove duplicate copies of the image. A single image may be retained as a shared copy. However, because the number of stored images continues to increase rapidly, deduplication of identical images may not provide the desired reduction in image storage space. According to one exemplary embodiment, a computer implementation method for managing image storage space is provided. Multiple processor units identify benchmark images within a group of similar images. Multiple other images not identified as benchmark images within the similar image group constitute a set of similar images. Multiple processor units create an image mapping tree. The image mapping tree has a root block for the benchmark image and blocks arranged in a set of layers below the root block based on the set of similar images. Each block represents a portion of the benchmark image, and multiple sub-blocks within a block in a lower layer correspond to sub-partitions within a higher block in a block in a higher layer. Multiple processor units store a selected set of blocks in the set of similar images that have differences from the corresponding set of blocks in the image mapping tree for the benchmark image. Multiple processor units store metadata about the selected set of blocks, describing a set of paths in the image mapping tree from the corresponding set of blocks to the root block. According to another exemplary embodiment, a computer system and computer program product for managing image storage space are provided. As a result, the exemplary embodiment may provide a technical effect of improving performance in the computer system by reducing the amount of image storage space required through the storage of portions of similar images. An exemplary embodiment may also tolerate storing in the image mapping tree a new set of blocks for a new image that is different from the new corresponding set of blocks in the benchmark image, and may store in the image mapping tree new metadata for the new set of blocks that describes a new set of paths from the new corresponding set of blocks in the image mapping tree to the root block for the benchmark image. The exemplary embodiment may further tolerate storing the new set of images using the image mapping tree if the new set of images is within the maximum threshold for images for the image mapping tree. The exemplary embodiment may also tolerate reconstruct the image mapping tree using the new set of images and the images in the similar image group if the new set of images is equal to or exceeds a new image threshold. In creating the image mapping tree for the similar image group, the exemplary embodiment may tolerate dividing the benchmark image to form sub-parts within the benchmark image, creating a layer above the upper-level blocks within a layer below the root block, where each upper-level block within the upper-level blocks corresponds to a sub-part within a sub-part in the benchmark image. In creating an image mapping tree for a group of similar images, the exemplary embodiment allows for further determining the number of similar images in a set of similar images that have different similar blocks from the corresponding higher-level blocks within the higher-level blocks, in order to form a set of corresponding higher-level blocks; dividing each corresponding higher-level block within the set of corresponding higher-level blocks into sub-partitions; and creating lower layers of lower-level blocks, where each lower-level block corresponds to a sub-partition within a sub-partition in the set of corresponding higher-level blocks. As a result, the exemplary e