CN-122018807-A - File storage method and device
Abstract
The disclosure provides a file storage method and device, and relates to the technical field of computers, in particular to the technical field of distributed data storage. The method comprises: segmenting a file to be stored into data objects; segmenting a disk file into data blocks with the same size as the data objects; applying for idle data blocks from the disk file, and correspondingly writing the data objects in the file to be stored into the idle data blocks in the disk file; determining a space to be released from the disk file in response to determining that a space reclamation condition is met; and migrating data in the space to be released to idle space in the disk file.
Inventors
- LIANG MINGYUAN
- HOU JINCHAO
- CHEN ZHESI
- MIAO LEI
- WEN DONGJIU
- HUANG SONG
- XING YAN
- DUAN LIGUO
- QIU TENGFEI
Assignees
- 北京百度网讯科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-30
Claims (17)
- 1. A method of storing a file, comprising: splitting a file to be stored into data objects; splitting a disk file into data blocks with the same size as the data objects; applying for idle data blocks from the disk file, and correspondingly writing the data objects in the file to be stored into the idle data blocks in the disk file; determining a space to be released from the disk file in response to determining that a space reclamation condition is met; and migrating data in the space to be released to an idle space in the disk file.
- 2. The method of claim 1, wherein the determining a space to be released from the disk file in response to determining that a space reclamation condition is met comprises: determining a disk file to be released from the disk files in response to determining that a disk file space reclamation condition is satisfied; and the migrating data in the space to be released to an idle space in the disk file comprises: in response to determining that the number of idle data blocks of other disk files except the disk file to be released is not smaller than the number of occupied data blocks of the disk file to be released, migrating the data objects in the disk file to be released to the idle data blocks in the other disk files.
- 3. The method of claim 2, wherein the determining that the disk file space reclamation condition is satisfied comprises: determining the idle data block proportion of the disk file; and if the idle data block proportion reaches a first preset proportion threshold value, determining that the disk file space reclamation condition is satisfied.
- 4. The method of claim 3, wherein the determining the idle data block proportion of the disk file comprises: counting the total number of data blocks and the number of occupied data blocks of the disk file; calculating the difference between the total number of data blocks and the number of occupied data blocks to obtain the number of idle data blocks of the disk file; and calculating the ratio of the number of idle data blocks to the total number of data blocks to obtain the idle data block proportion.
- 5. The method of claim 2, wherein the determining a disk file to be released from the disk files comprises: selecting the disk file to be released from the disk files based on the number of occupied data blocks of each disk file.
- 6. The method of claim 2, wherein the migrating the data objects in the disk file to be released to the idle data blocks in the other disk files comprises: applying for idle data blocks from the other disk files, and correspondingly writing the data objects in the disk file to be released into the idle data blocks in the other disk files; and deleting the disk file to be released in response to determining that all data objects in the disk file to be released are written into the other disk files.
- 7. The method of claim 1, wherein the determining a space to be released from the disk file in response to determining that a space reclamation condition is met comprises: determining a data block to be released from the occupied data blocks in the disk file in response to determining that a data block space reclamation condition is satisfied; and the migrating data in the space to be released to an idle space in the disk file comprises: in response to determining that the idle data amount of other occupied data blocks except the data block to be released is not smaller than the occupied data amount of the data block to be released, migrating the data in the data block to be released to the other occupied data blocks.
- 8. The method of claim 7, wherein the determining that the data block space reclamation condition is satisfied comprises: determining the idle data amount proportion of the occupied data blocks; and if the idle data amount proportion reaches a second preset proportion threshold value, determining that the data block space reclamation condition is satisfied.
- 9. The method of claim 7, wherein the determining a data block to be released from the occupied data blocks in the disk file comprises: selecting the data block to be released from the occupied data blocks based on the occupied data amount of each occupied data block.
- 10. The method of claim 7, wherein the migrating the data in the data block to be released to the other occupied data blocks comprises: writing the data in the data block to be released into the other occupied data blocks; and in response to determining that all data in the data block to be released is written into the other occupied data blocks, marking the data block to be released as being in an idle state.
- 11. The method of any of claims 1-10, wherein the method further comprises: recording the disk file identification of each occupied data block and the offset of the occupied data block in the disk file.
- 12. The method of any of claims 1-10, wherein the method further comprises: in response to determining that the disk file is full and the file to be stored has an unwritten data object, creating a new disk file and continuing to store the unwritten data object.
- 13. The method of any of claims 1-10, wherein the method further comprises: in response to determining that a data block whose data objects have all been deleted exists in the disk file, marking that data block as being in an idle state.
- 14. A file storage device, comprising: a first splitting module configured to split a file to be stored into data objects; a second splitting module configured to split a disk file into data blocks with the same size as the data objects; a writing module configured to apply for idle data blocks from the disk file, and correspondingly write the data objects in the file to be stored into the idle data blocks in the disk file; a determining module configured to determine a space to be released from the disk file in response to determining that a space reclamation condition is met; and a migration module configured to migrate data in the space to be released to an idle space in the disk file.
- 15. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-13.
- 16. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-13.
- 17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-13.
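The disk-file-level reclamation recited in claims 2 to 6 can be pictured with a minimal Python sketch. It is illustrative only and not the applicant's implementation: disk files are modelled as in-memory dictionaries, and the names `idle_block_ratio`, `reclaim_disk_file`, `DiskFiles` and `Moves` are hypothetical. Two points the claims leave open are filled by assumption: the idle proportion is computed over all disk files together, and the disk file with the fewest occupied data blocks is chosen for release.

```python
from typing import Dict, Optional, Tuple

# disk file id -> {block index: data object}; a missing index is an idle block
DiskFiles = Dict[int, Dict[int, bytes]]
# old block index -> (new disk file id, new block index)
Moves = Dict[int, Tuple[int, int]]


def idle_block_ratio(total_blocks: int, occupied_blocks: int) -> float:
    """Idle count is total minus occupied; the ratio is idle over total (claim 4)."""
    return (total_blocks - occupied_blocks) / total_blocks


def reclaim_disk_file(files: DiskFiles, blocks_per_file: int,
                      threshold: float) -> Optional[Tuple[int, Moves]]:
    """Try to empty and delete one disk file.

    Returns (released disk file id, moves), or None when nothing was reclaimed.
    """
    if not files:
        return None
    total = blocks_per_file * len(files)
    occupied = sum(len(blocks) for blocks in files.values())
    # Assumption: the ratio is taken over all disk files together.
    if idle_block_ratio(total, occupied) < threshold:
        return None
    # Assumption: the disk file with the fewest occupied blocks is released.
    victim = min(files, key=lambda fid: len(files[fid]))
    others = [fid for fid in files if fid != victim]
    idle_elsewhere = sum(blocks_per_file - len(files[fid]) for fid in others)
    if idle_elsewhere < len(files[victim]):
        return None  # not enough idle blocks in the other disk files
    moves: Moves = {}
    for old_idx, data in files[victim].items():
        for fid in others:
            # First idle block index in this disk file, if any.
            free = next((i for i in range(blocks_per_file)
                         if i not in files[fid]), None)
            if free is not None:
                files[fid][free] = data
                moves[old_idx] = (fid, free)
                break
    del files[victim]  # every data object has been rewritten elsewhere
    return victim, moves
```

The returned `moves` mapping is what a caller would use to update the recorded disk file identification and offset of each migrated data object (claim 11).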
Description
File storage method and device
Technical Field
The present disclosure relates to the field of computer technology, and in particular to the field of distributed data storage technology.
Background
In a distributed storage system, the final storage carrier of data is the magnetic disk. A dedicated storage engine is therefore needed to manage user data in a distributed manner, to ensure the safety of stored data through a reasonable redundancy mechanism, and to organize and schedule data on disk in an orderly and efficient way, so as to meet the practical requirements of large-scale data storage.
The mainstream industry scheme is as follows: a user's original file is split into data objects (objects) with a granularity of 4M, and the system maps the data objects to a physical aggregation (Volume) for storage through a hash algorithm. Meanwhile, based on Erasure Coding (EC) technology, data redundancy protection is implemented with the Volume as the basic unit. In the specific storage logic, a Volume is split into data fragments (vlets) according to a preset EC ratio, each vlet corresponds one-to-one to a 32G disk file, and data objects are stored to the disk files through the vlets. The widely used Linked engine adopts this core architecture: Volumes are created on demand, and after data is deleted, idle identifiers are retained so that the space can be filled by subsequent writes from the same user.
Disclosure of Invention
Embodiments of the present disclosure provide a file storage method, a file storage device, a storage medium and a program product.
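As a rough orientation, the write path of the file storage method (splitting a file into data objects, applying for idle data blocks, recording each block's disk file identification and offset, and creating a new disk file when the current ones are full) can be sketched in Python as follows. This is a simplified in-memory model under stated assumptions, not the disclosed implementation: the class `BlockStore` and its methods are hypothetical, no real disk I/O is performed, and the object granularity and disk file size quoted in the Background are read here as 4 MiB and 32 GiB and assumed to carry over.

```python
from typing import Dict, List, Tuple

OBJECT_SIZE = 4 * 1024**2        # "4M" read as 4 MiB (assumption)
DISK_FILE_SIZE = 32 * 1024**3    # "32G" read as 32 GiB (assumption)
BLOCKS_PER_DISK_FILE = DISK_FILE_SIZE // OBJECT_SIZE  # 8192 under these assumptions


class BlockStore:
    """In-memory stand-in for a set of disk files split into fixed-size blocks."""

    def __init__(self, block_size: int = OBJECT_SIZE,
                 blocks_per_file: int = BLOCKS_PER_DISK_FILE) -> None:
        self.block_size = block_size
        self.blocks_per_file = blocks_per_file
        # disk file id -> {block index: data}; a missing index is an idle block
        self.disk_files: Dict[int, Dict[int, bytes]] = {}
        # (file name, object number) -> (disk file id, byte offset in disk file)
        self.locations: Dict[Tuple[str, int], Tuple[int, int]] = {}

    def _apply_for_idle_block(self) -> Tuple[int, int]:
        """Return (disk file id, block index) of an idle block, creating a
        new disk file when every existing one is full."""
        # Linear scan kept for clarity; a free list would be faster.
        for file_id, blocks in self.disk_files.items():
            for idx in range(self.blocks_per_file):
                if idx not in blocks:
                    return file_id, idx
        new_id = len(self.disk_files)  # ids stay sequential: nothing is deleted here
        self.disk_files[new_id] = {}
        return new_id, 0

    def write(self, name: str, payload: bytes) -> None:
        """Split payload into data objects and write each into an idle block."""
        for obj_no, start in enumerate(range(0, len(payload), self.block_size)):
            file_id, idx = self._apply_for_idle_block()
            self.disk_files[file_id][idx] = payload[start:start + self.block_size]
            self.locations[(name, obj_no)] = (file_id, idx * self.block_size)

    def read(self, name: str) -> bytes:
        """Reassemble a stored file from its recorded block locations."""
        parts: List[bytes] = []
        obj_no = 0
        while (name, obj_no) in self.locations:
            file_id, offset = self.locations[(name, obj_no)]
            parts.append(self.disk_files[file_id][offset // self.block_size])
            obj_no += 1
        return b"".join(parts)
```

With a toy configuration such as `BlockStore(block_size=4, blocks_per_file=2)`, writing a 10-byte payload occupies both blocks of the first disk file and spills the final 2-byte object into a newly created second disk file.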
In a first aspect, an embodiment of the disclosure provides a file storage method, which includes: segmenting a file to be stored into data objects; segmenting a disk file into data blocks with the same size as the data objects; applying for idle data blocks from the disk file, and correspondingly writing the data objects in the file to be stored into the idle data blocks in the disk file; determining a space to be released from the disk file in response to determining that a space reclamation condition is met; and migrating data in the space to be released to idle space in the disk file.
In a second aspect, an embodiment of the disclosure provides a file storage device, which includes: a first splitting module configured to split a file to be stored into data objects; a second splitting module configured to split a disk file into data blocks with the same size as the data objects; a writing module configured to apply for idle data blocks from the disk file, and correspondingly write the data objects in the file to be stored into the idle data blocks in the disk file; a determining module configured to determine a space to be released from the disk file in response to determining that a space reclamation condition is met; and a migration module configured to migrate data in the space to be released to idle space in the disk file.
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method described in the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method described in the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the method described in the first aspect.
It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings. The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of one embodiment of a file storage method according to the present disclosure;
FIG. 2 is a flow chart of yet another embodiment of a file storage method according to the present disclosure;
FIG. 3 is a flow chart of another embodiment of a file storage method according to the present disclosure;
FIG. 4 is a schematic structural view of one embodiment of a file storage device according to the present disclosure;
FIG. 5 is a block diagram of an electronic device for implementing a file storage method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodi