CN-121979915-A - Data processing method, apparatus, computer device, storage medium and program product based on shared cache
Abstract
The present application relates to a data processing method, apparatus, computer device, computer-readable storage medium and computer program product based on a shared cache. The method comprises: if a data pointer of target data is not found in a local cache, obtaining the target data and a corresponding data identifier from a system disk, wherein the data pointer indicates a storage address of the target data in a shared cache; querying the shared cache for the target data according to the data identifier; and creating the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache. Because each local cache holds only pointers, the back-end processes no longer need to maintain independent copies of cached data, which avoids the memory redundancy caused in the traditional mode by every back-end process storing the same data in its own local cache. The shared cache serves only as data storage, which avoids the lock conflicts and the visibility-judgment complexity caused by a multi-version mechanism, thereby keeping system performance stable.
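The flow summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `SharedCache`, `Backend`, `find`, `store` and `read` are hypothetical, the shared memory region is modeled as an ordinary Python list whose index stands in for the storage address, the system disk is a dictionary, and SHA-256 is assumed as the data identifier.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SharedCache:
    """Hypothetical stand-in for the shared cache; an address is a list index."""
    slots: list = field(default_factory=list)   # address -> data
    by_id: dict = field(default_factory=dict)   # data identifier -> address

    def find(self, data_id):
        # Returns the storage address, or None if the data is not cached.
        return self.by_id.get(data_id)

    def store(self, data_id, data):
        self.slots.append(data)
        addr = len(self.slots) - 1
        self.by_id[data_id] = addr
        return addr

    def read(self, addr):
        return self.slots[addr]


class Backend:
    """One back-end process; its local cache holds only pointers (addresses)."""

    def __init__(self, shared, disk):
        self.shared = shared
        self.disk = disk          # assumed model of the system disk: key -> bytes
        self.pointers = {}        # local cache: key -> address in the shared cache
        self.disk_reads = 0

    def get(self, key):
        addr = self.pointers.get(key)
        if addr is None:
            # Pointer not found locally: fetch the data and its identifier from disk.
            data = self.disk[key]
            self.disk_reads += 1
            data_id = hashlib.sha256(data).hexdigest()  # assumed identifier scheme
            # Query the shared cache by identifier; store the data if it is absent.
            addr = self.shared.find(data_id)
            if addr is None:
                addr = self.shared.store(data_id, data)
            # Create the local data pointer from the shared storage address.
            self.pointers[key] = addr
        # Pointer found (or just created): read through it from the shared cache.
        return self.shared.read(addr)
```

In this sketch two back ends that request the same key end up with pointers to one shared slot, so the data is stored once regardless of the number of processes.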
Inventors
- JIANG DIAN
- WANG ZAO
Assignees
- 天翼云科技有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-12-22
Claims (15)
- 1. A data processing method based on a shared cache, the method comprising: if a data pointer of target data is not found in a local cache, acquiring the target data and a corresponding data identifier from a system disk, wherein the data pointer indicates a storage address of the target data in a shared cache; querying the shared cache for the target data according to the data identifier; and creating the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache.
- 2. The method according to claim 1, wherein the method further comprises: if the target data is not found in the shared cache, storing the target data and the data identifier in the shared cache in correspondence with each other; and creating the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache.
- 3. The method of claim 2, wherein storing the target data and the data identifier in the shared cache in correspondence with each other comprises: detecting the space occupancy of the shared cache; if the space occupancy satisfies a storage condition, storing the target data and the data identifier in the shared cache in correspondence with each other; and if the space occupancy does not satisfy the storage condition, storing the target data in the local cache.
- 4. The method of claim 1 or 2, wherein after the data pointer of the target data is created in the local cache, the method further comprises: releasing the storage space occupied by the target data in the local cache, the target data having been read from the system disk and written to the local cache.
- 5. The method according to claim 1, wherein the method further comprises: if the data pointer of the target data is found in the local cache, reading the target data from the shared cache based on the data pointer.
- 6. The method of claim 1, wherein after the data pointer of the target data is created in the local cache, the method further comprises: incrementing a reference count of the target data in the shared cache; if any data pointer in the local cache is changed, decrementing the reference count of the cached data in the shared cache that the data pointer pointed to before the change; and when the reference count of any cached data in the shared cache equals a preset value, releasing the storage space occupied by that cached data in the shared cache.
- 7. The method of claim 1, wherein querying the shared cache for the target data according to the data identifier comprises: querying the shared cache for candidate data having the same data identifier as the target data; and comparing the target data with the candidate data field by field, and determining that the candidate data is the target data stored in the shared cache if the target data and the candidate data are identical in every field.
- 8. The method of claim 7, wherein the method further comprises: if the target data and the candidate data differ in any field, storing the target data and the corresponding data identifier in the shared cache as a separate entry.
- 9. The method of claim 1, wherein obtaining the target data and the corresponding data identifier from the system disk comprises: acquiring the target data from the system disk, and calculating a hash value of the target data as the corresponding data identifier; and wherein querying the shared cache for the target data according to the data identifier comprises: querying a hash table of the shared cache for the target data according to the hash value of the target data.
- 10. The method of claim 1, wherein a plurality of versions of the target data are stored in the system disk, each version having a version creation transaction identifier and a version invalidation transaction identifier, and obtaining the target data and the corresponding data identifier from the system disk comprises: acquiring a current transaction list corresponding to the target data; performing a visibility judgment on each version based on the current transaction list, the version creation transaction identifier and the version invalidation transaction identifier, and determining a target version; and acquiring the target data of the target version and the corresponding data identifier.
- 11. The method of claim 10, wherein performing the visibility judgment on each version based on the current transaction list, the version creation transaction identifier and the version invalidation transaction identifier, and determining the target version comprises: for each version, if the corresponding version creation transaction identifier is in the current transaction list and the version invalidation transaction identifier is null or not in the current transaction list, determining that the version is the target version.
- 12. A data processing apparatus based on a shared cache, the apparatus comprising: an acquisition module, configured to acquire target data and a corresponding data identifier from a system disk if a data pointer of the target data is not found in a local cache, wherein the data pointer indicates a storage address of the target data in a shared cache; a query module, configured to query the shared cache for the target data according to the data identifier; and a creation module, configured to create the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache.
- 13. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 11.
- 14. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 11.
- 15. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 11.
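The identifier lookup, field-by-field comparison and reference counting recited in claims 6 to 9 can be sketched together as follows. This is an illustrative sketch only: the class and function names are hypothetical, cached data is modeled as a tuple of fields, the preset value that triggers release is assumed to be 0, and the default identifier is assumed to be a SHA-256 digest of the fields. The hash function is injectable so that an identifier collision can be demonstrated.

```python
import hashlib


def default_hash(fields):
    # Assumed identifier scheme: SHA-256 over a textual form of the fields.
    return hashlib.sha256(repr(fields).encode()).hexdigest()


def same_fields(a, b):
    # Field-by-field comparison of two records.
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


class SharedEntry:
    def __init__(self, fields):
        self.fields = fields
        self.refcount = 0


class SharedCache:
    def __init__(self, hash_fn=default_hash):
        self.hash_fn = hash_fn
        self.buckets = {}   # data identifier -> list of SharedEntry

    def acquire(self, fields):
        """Find or store the data, then increment its reference count."""
        bucket = self.buckets.setdefault(self.hash_fn(fields), [])
        for entry in bucket:
            if same_fields(entry.fields, fields):
                break                       # candidate matches in every field
        else:
            entry = SharedEntry(fields)     # same identifier, different fields:
            bucket.append(entry)            # store as a separate entry
        entry.refcount += 1
        return entry

    def release(self, entry):
        """Decrement the count; free the entry at the assumed preset value 0."""
        entry.refcount -= 1
        if entry.refcount == 0:
            data_id = self.hash_fn(entry.fields)
            self.buckets[data_id].remove(entry)
            if not self.buckets[data_id]:
                del self.buckets[data_id]


class LocalCache:
    def __init__(self, shared):
        self.shared = shared
        self.pointers = {}   # key -> SharedEntry (the "data pointer")

    def point(self, key, fields):
        # Acquire the new target first so re-pointing at the same entry
        # never drops its count to zero in between.
        new = self.shared.acquire(fields)
        old = self.pointers.get(key)
        self.pointers[key] = new
        if old is not None:
            self.shared.release(old)
```

Comparing fields after the identifier match means two different records that happen to share an identifier are never confused, at the cost of one extra comparison per candidate.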
Description
Data processing method, apparatus, computer device, storage medium and program product based on shared cache
Technical Field
The present application relates to the field of data processing technology, and in particular to a data processing method, apparatus, computer device, computer-readable storage medium and computer program product based on a shared cache.
Background
PostgreSQL is a relational database management system with a multi-process architecture: it creates an independent back-end process for each client, and that process handles the client's query requests for metadata in PostgreSQL. In the conventional operating mode, each back-end process independently maintains its own copies of cached data in its local cache, so large amounts of identical data are stored repeatedly in memory, which wastes resources. To reduce this memory redundancy, a shared cache (Global Cache) mechanism may be used to cache data in a shared memory area so that cached data is reused efficiently. However, the multi-version concurrency mechanism of PostgreSQL allows multiple versions of the same data to exist at the same time, and different back-end processes can access different versions of the same data at the same time.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data processing method, apparatus, computer device, computer-readable storage medium and computer program product based on a shared cache, so as to improve PostgreSQL multi-version concurrent access performance based on the shared cache.
In a first aspect, the present application provides a data processing method based on a shared cache, including: if a data pointer of target data is not found in a local cache, acquiring the target data and a corresponding data identifier from a system disk, wherein the data pointer indicates a storage address of the target data in a shared cache; querying the shared cache for the target data according to the data identifier; and creating the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache.
In one embodiment, the method further includes: if the target data is not found in the shared cache, storing the target data and the data identifier in the shared cache in correspondence with each other; and creating the data pointer of the target data in the local cache based on the storage address of the target data in the shared cache.
In one embodiment, storing the target data and the data identifier in the shared cache includes: detecting the space occupancy of the shared cache; if the space occupancy satisfies a storage condition, storing the target data and the data identifier in the shared cache in correspondence with each other; and if the space occupancy does not satisfy the storage condition, storing the target data in the local cache.
In one embodiment, after the data pointer of the target data is created in the local cache, the method further includes: releasing the storage space occupied by the target data in the local cache, the target data having been read from the system disk and written to the local cache.
In one embodiment, the method further includes: if the data pointer of the target data is found in the local cache, reading the target data from the shared cache based on the data pointer.
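The space-occupancy embodiment above can be sketched as follows. The text does not define the storage condition, so this sketch assumes it means "enough free bytes remain for the new data"; the names `SharedCache`, `has_room_for` and `cache_after_miss` are hypothetical, and the data identifier doubles as the storage address for brevity.

```python
class SharedCache:
    """Hypothetical shared cache with a fixed byte budget."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = {}   # data identifier -> data

    def has_room_for(self, data):
        # Assumed storage condition: the data fits in the remaining space.
        return self.used_bytes + len(data) <= self.capacity_bytes

    def store(self, data_id, data):
        self.entries[data_id] = data
        self.used_bytes += len(data)
        return data_id      # stand-in for the storage address


def cache_after_miss(local, shared, key, data_id, data):
    """Decide where data that missed the shared cache should live.

    Returns "shared" if the data was stored in the shared cache and the
    local cache received a pointer, or "local" if the shared cache was
    too full and the local cache keeps its own copy instead.
    """
    if shared.has_room_for(data):
        addr = shared.store(data_id, data)
        local[key] = ("ptr", addr)
        return "shared"
    local[key] = ("copy", data)
    return "local"
```

Falling back to a local copy keeps the back-end process able to serve the data even when the shared region is full, at the cost of reintroducing a per-process copy for that entry.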
In one embodiment, after the data pointer of the target data is created in the local cache, the method further includes: incrementing a reference count of the target data in the shared cache; if any data pointer in the local cache is changed, decrementing the reference count of the cached data in the shared cache that the data pointer pointed to before the change; and when the reference count of any cached data in the shared cache equals a preset value, releasing the storage space occupied by that cached data in the shared cache.
In one embodiment, querying the shared cache for the target data according to the data identifier includes: querying the shared cache for candidate data having the same data identifier as the target data; and comparing the target data with the candidate data field by field, and determining that the candidate data is the target data stored in the shared cache if the target data and the candidate data are identical in every field.
In one embodiment, the method further includes: if the target data and the candidate data differ in any field, storing the target data and the corresponding data identifier in the shared cache as a separate entry.
In one embodiment, obtaining the target data and the corresponding data identifier from the system disk includes: acquiring the target data from the system disk, and calculating a hash value of the target data as the corresponding data identifier; and querying the shared cache for the target data according to the data identifier includes: querying the target data in the hash table of the shar