CN-121979843-A - Mail system single copy storage system and method based on object storage

Abstract

The invention relates to a mail system single-copy storage system and method based on object storage. The system comprises a mail index service module, an object storage service module, a single-copy storage management service module and a mail client. When a mail is sent, a hash value of the mail body file is calculated, only one unique copy is stored in object storage, and its reference count is managed through a metadata table. When a mail is read, the file in object storage is accessed directly through the mail index. When a mail is deleted, the reference count is decremented, and the file is physically deleted only when the count reaches zero. The invention realizes single-copy storage of mail attachments, significantly reduces storage space usage in mass-sending scenarios, and ensures data consistency and safety through the reference counting mechanism.
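
As a rough illustration of the storage saving in a mass-sending scenario, the sketch below compares the footprint of per-recipient copies with that of a single copy plus bookkeeping. The per-entry sizes (`index_entry_bytes`, `metadata_record_bytes`) and the assumption of one index entry per recipient are hypothetical values chosen for the example, not figures from the patent.

```python
def multi_copy_bytes(mail_size: int, recipients: int) -> int:
    """Traditional mode: one full copy of the mail body per recipient."""
    return mail_size * recipients


def single_copy_bytes(
    mail_size: int,
    recipients: int,
    index_entry_bytes: int = 256,      # assumed size of one mail index entry
    metadata_record_bytes: int = 128,  # assumed size of one metadata table record
) -> int:
    """Single-copy mode: one stored body, one metadata record,
    and (by assumption) one index entry per recipient."""
    return mail_size + metadata_record_bytes + index_entry_bytes * recipients


if __name__ == "__main__":
    size = 5 * 1024 * 1024   # a 5 MiB mail body with attachment
    n = 10_000               # recipients of a mass-sent mail
    print("multi-copy :", multi_copy_bytes(size, n), "bytes")
    print("single-copy:", single_copy_bytes(size, n), "bytes")
```

Under these assumptions the multi-copy footprint grows with the product of mail size and recipient count, while the single-copy footprint grows only with the small per-recipient index entry.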

Inventors

  • LIU YULUAN
  • CHI SHAONING
  • CHEN FENGXI
  • MI KAI

Assignees

  • 福建亿榕信息技术有限公司

Dates

Publication Date
20260505
Application Date
20260121

Claims (6)

  1. A mail system single-copy storage system based on object storage, characterized by comprising a mail index service module, an object storage service module, a single-copy storage management service module and a mail client; the mail index service module is used for storing and managing metadata indexes of mails, each metadata index comprising a mail identifier and the access path of the corresponding mail body file in object storage; the object storage service module serves as the physical storage medium and stores the unique mail body file data; the single-copy storage management service module comprises a metadata management unit and a storage logic control unit; the metadata management unit maintains a metadata table that takes file hash values as primary keys and records the corresponding object storage path, reference counter, file size and creation time; the storage logic control unit, in response to mail sending, reading and deleting operations, calls the interfaces of the object storage service module and operates on the metadata table so as to achieve single-copy storage and life cycle management of the mail body files; and the mail client provides a mail operation interface for users and interacts with the mail index service module and the single-copy storage management service module.
  2. The mail system single-copy storage system based on object storage according to claim 1, wherein the metadata index maintained by the mail index service module adopts a structured data organization and comprises two key fields, a mail identifier and an object storage access path; specifically, the mail identifier serves as the unique primary key and is generated as a globally unique UUID or by a timestamp-based ordered ID generation strategy, so that each mail has a unique identity in the system; and the corresponding mail body file access path records the specific storage location of the mail in the object storage system, including the storage bucket name, object key, version information and the like.
  3. The mail system single-copy storage system based on object storage according to claim 2, wherein the mail index service module provides its service externally through a RESTful API or an RPC interface and supports add, delete and query operations on mail indexes; in the mail sending scenario, the mail index service module receives an index creation request from the single-copy storage management service, establishes the mapping between the identifier of the new mail and its path information in object storage, and persists the mapping; in the mail reading scenario, the mail index service module quickly looks up and returns the corresponding storage path according to the mail identifier, so that the mail client can access object storage directly to obtain the mail content; and in the mail deleting scenario, the mail index service module clears the corresponding index record and coordinates with the single-copy storage management service to update the reference count.
  4. The mail system single-copy storage system based on object storage according to claim 1, wherein the metadata management unit maintains a metadata table that uses file hash values as the primary key; the table is implemented with a high-performance key-value store or a relational database; each record in the table corresponds to one unique mail body file and comprises four key fields: the object storage path identifies the precise location of the file in the distributed storage system; the reference counter records how many mails currently reference the file, with atomic operations ensuring concurrency safety; the file size field facilitates storage space statistics and quota management; and the creation time field supports life cycle management and audit trails; and the metadata management unit implements a CRUD operation interface and supports fast queries by hash value, atomic increment and decrement of the reference count, and batch cleanup of expired data.
  5. The mail system single-copy storage system based on object storage according to claim 1, wherein the storage logic control unit acts as the scheduler of the service flow and handles the complete life cycle of mail sending, reading and deleting operations; in the mail sending flow, the storage logic control unit first calculates the hash value of the mail body file and queries the metadata table to determine whether a file with the same content already exists; if so, it directly increases the reference count; if not, it calls the object storage service interface to upload the new file and creates a new record in the metadata table; in the mail reading flow, the storage logic control unit obtains the corresponding file hash value according to the mail identifier, looks up the object storage path in the metadata table, and provides file access credentials to the client; and in the mail deleting flow, the storage logic control unit decrements the reference count and, when the count reaches zero, triggers a garbage collection mechanism that calls the object storage interface to delete the physical file and cleans up the metadata record.
  6. A storage method based on the mail system single-copy storage system of any one of claims 1 to 5, comprising the following steps:
     Step S1, mail sending and storing flow:
       S1.1, a mail client prepares to send a mail with an attachment, treats the mail body as one integral file, and calculates the hash value of that file;
       S1.2, the single-copy storage management service module receives the storage request, queries the metadata table, and determines whether a record with the same hash value exists;
       S1.3, if no record with the same hash value exists, the mail body file is uploaded to the object storage service module, a new record is added to the metadata table, the initial value of its reference counter is set to 0, and the object storage path is recorded;
       S1.4, the reference counter of the record corresponding to the hash value is increased by N, where N is the actual number of recipients;
       S1.5, the mail index service module stores a mail metadata index that contains information pointing to the object storage path;
       S1.6, a success response is returned to the mail client;
     Step S2, mail reading flow:
       S2.1, the mail client requests to read a mail;
       S2.2, the mail index service module queries the corresponding object storage path according to the mail ID;
       S2.3, the mail client or the mail server back end reads the mail body file content directly through the interface of the object storage service module according to the path;
       S2.4, the mail content is presented to the user;
       note that the reading flow accesses object storage directly, without going through the single-copy storage management service module, in order to improve read performance;
     Step S3, mail deleting flow:
       S3.1, the mail client requests to delete a mail;
       S3.2, the mail index service module marks the mail metadata index as deleted;
       S3.3, the single-copy storage management service module queries the associated file hash values according to the ID of the deleted mail;
       S3.4, for each associated file hash value, the reference counter in the metadata table is decreased by 1;
       S3.5, the reference counter value is checked after the decrement: if it has dropped to 0, the interface of the object storage service module is called to delete the corresponding physical file; if it is greater than 0, only the metadata table is updated and the physical file is retained.
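
The sending, reading and deleting flows described in claims 5 and 6 can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the class and field names, the use of SHA-256, the `mail-bodies/` key prefix, the Python dictionaries standing in for object storage and the metadata table, and the choice to keep the file hash alongside the path in the mail index are all assumptions made for the example. A single lock stands in for the atomic reference count operations mentioned in claim 4.

```python
import hashlib
import threading
import time
import uuid
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """One row of the metadata table, keyed externally by file hash."""
    object_path: str
    ref_count: int
    file_size: int
    created_at: float


class SingleCopyStore:
    """Hypothetical in-memory stand-in for the three service modules."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}             # object storage: path -> body
        self.metadata: dict[str, MetadataRecord] = {}   # metadata table: hash -> record
        self.mail_index: dict[str, tuple[str, str]] = {}  # mail id -> (hash, path)
        self._lock = threading.Lock()                   # stands in for atomic updates

    def send(self, body: bytes, recipients: int) -> list[str]:
        """Steps S1.1 to S1.6: store at most one copy, count N references."""
        digest = hashlib.sha256(body).hexdigest()       # S1.1 (hash choice assumed)
        with self._lock:
            record = self.metadata.get(digest)          # S1.2
            if record is None:                          # S1.3: upload, counter starts at 0
                path = f"mail-bodies/{digest}"
                self.objects[path] = body
                record = MetadataRecord(path, 0, len(body), time.time())
                self.metadata[digest] = record
            record.ref_count += recipients              # S1.4
            mail_ids = []
            for _ in range(recipients):                 # S1.5: one index entry per mail
                mail_id = str(uuid.uuid4())
                self.mail_index[mail_id] = (digest, record.object_path)
                mail_ids.append(mail_id)
        return mail_ids                                 # S1.6

    def read(self, mail_id: str) -> bytes:
        """Steps S2.1 to S2.3: index lookup, then direct object access."""
        _, path = self.mail_index[mail_id]
        return self.objects[path]

    def delete(self, mail_id: str) -> None:
        """Steps S3.1 to S3.5: drop the index entry, decrement, collect at zero."""
        with self._lock:
            digest, _ = self.mail_index.pop(mail_id)    # S3.2 and S3.3
            record = self.metadata[digest]
            record.ref_count -= 1                       # S3.4
            if record.ref_count == 0:                   # S3.5
                del self.objects[record.object_path]
                del self.metadata[digest]


if __name__ == "__main__":
    store = SingleCopyStore()
    ids = store.send(b"quarterly report", recipients=3)
    print(len(store.objects), "object(s) stored for", len(ids), "mails")
    for mail_id in ids:
        store.delete(mail_id)
    print(len(store.objects), "object(s) left after all mails are deleted")
```

In a real deployment the dictionaries would be replaced by an object storage service and a key-value store or relational database, and the lock by that store's own atomic increment and decrement operations.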

Description

Mail system single-copy storage system and method based on object storage

Technical Field

The invention relates to the technical field of computers, and in particular to a mail system single-copy storage system and method based on object storage.

Background

As enterprise informatization deepens, internal mail systems have become an indispensable tool for daily office work. Especially in large organizations, it is common for a mail with an attachment to be mass-sent to a large number of recipients (e.g., thousands to tens of thousands of people). The traditional mail system storage mode has the following obvious defects:

1. Storage of ordinary mails and attachments: when a user sends a mail with an attachment to N recipients, the system typically stores a complete copy of the mail body and attachment under the mailbox directory of each recipient. This multi-copy storage mode wastes a large amount of storage space; in mass-sending scenarios in particular, the storage overhead grows linearly with the number of recipients and puts heavy pressure on storage resources.

2. Oversized attachment handling: although existing systems generally adopt single-copy storage for oversized attachments (that is, one attachment entity is stored and recipients access it through a link), ordinary attachments (generally, attachments whose size is below a certain threshold) are still stored in multi-copy mode, so storage efficiency cannot be optimized globally.

3. Complex data consistency management: if single-copy storage is attempted, the reference relationships of files must be managed accurately. Traditional file systems or simple storage modes have difficulty handling life cycle events such as file creation, reference count increments and decrements, and deletion efficiently and reliably, and are prone to data inconsistency (such as mistakenly deleting a file that is still referenced) or to storage space not being reclaimed in time.

Disclosure of Invention

To solve the above problems, the invention aims to provide a mail system single-copy storage system and method based on object storage that realize single-copy storage of mail data, in particular of ordinary attachments, significantly reduce the storage space occupied, and at the same time ensure the consistency and safety of the data.

To achieve the above purpose, the invention adopts the following technical scheme:

The mail system single-copy storage system based on object storage comprises a mail index service module, an object storage service module, a single-copy storage management service module and a mail client. The mail index service module is used for storing and managing metadata indexes of mails, each metadata index comprising a mail identifier and the access path of the corresponding mail body file in object storage. The object storage service module serves as the physical storage medium and stores the unique mail body file data. The single-copy storage management service module comprises a metadata management unit and a storage logic control unit. The metadata management unit maintains a metadata table that takes file hash values as primary keys and records the corresponding object storage path, reference counter, file size and creation time. The storage logic control unit, in response to mail sending, reading and deleting operations, calls the interfaces of the object storage service module and operates on the metadata table so as to achieve single-copy storage and life cycle management of the mail body files. The mail client provides a mail operation interface for users and interacts with the mail index service module and the single-copy storage management service module.

Further, the metadata index maintained by the mail index service module adopts a structured data organization and comprises two key fields, a mail identifier and an object storage access path. The mail identifier serves as the unique primary key and is generated as a globally unique UUID or by a timestamp-based ordered ID generation strategy, so that each mail has a unique identity in the system. The corresponding mail body file access path records the specific storage location of the mail in the object storage system, including the complete access coordinates such as the storage bucket name, object key, version information and the like.

Further, the mail index service module provides its service externally through a RESTful API or an RPC interface and supports add, delete and query operations on mail indexes. In the mail sending scenario, the mail index service module receives an index creation request from the single-copy storage management service and establishes the mapping between the identifier of the new mail and the path information in the ob