US-12619540-B2 - Techniques for multi-tiered data storage in multi-tenant caching systems
Abstract
Embodiments of the invention are directed to systems and methods for utilizing a multi-tiered caching architecture in a multi-tenant caching system. A portion of the in-memory cache may be allocated as dedicated shares (e.g., dedicated allocations) that are each dedicated to a particular tenant, while another portion of the in-memory cache (e.g., a shared allocation) can be shared by all tenants in the system. When a threshold period of time has elapsed since data stored in a dedicated allocation was last accessed, the data may be migrated to the shared allocation. If data is accessed from the shared allocation, it may be migrated back to the dedicated allocation. Utilizing this multi-tiered approach in a multi-tenant caching system can increase performance and decrease latency relative to conventional caching systems.
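The demotion and promotion policy described in the abstract can be sketched as follows. This is a minimal, hypothetical Python model; the class name, data layout, and `idle_threshold` parameter are illustrative assumptions, not taken from the patent.

```python
import time

class TieredCache:
    """Toy model of a multi-tenant cache with per-tenant dedicated
    allocations and a single shared allocation."""

    def __init__(self, tenants, idle_threshold):
        self.dedicated = {t: {} for t in tenants}  # tenant -> {key: (value, last_access)}
        self.shared = {}                           # key -> (tenant, value)
        self.idle_threshold = idle_threshold       # seconds before demotion

    def demote_idle(self, now=None):
        """Move entries idle longer than the threshold into the shared allocation."""
        now = time.time() if now is None else now
        for tenant, entries in self.dedicated.items():
            stale = [k for k, (_, ts) in entries.items()
                     if now - ts >= self.idle_threshold]
            for key in stale:
                value, _ = entries.pop(key)
                self.shared[key] = (tenant, value)

    def get(self, tenant, key, now=None):
        """Look up a key; a hit in the shared allocation promotes the entry
        back into the tenant's dedicated allocation."""
        now = time.time() if now is None else now
        if key in self.dedicated[tenant]:
            value, _ = self.dedicated[tenant][key]
            self.dedicated[tenant][key] = (value, now)  # refresh access time
            return value
        if key in self.shared and self.shared[key][0] == tenant:
            _, value = self.shared.pop(key)
            self.dedicated[tenant][key] = (value, now)  # promote on access
            return value
        return None
```

Capacity limits and eviction, which the claims address in detail, are omitted here to keep the demotion/promotion cycle visible.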
Inventors
- Yu Gu
- Hongqin Song
Assignees
- VISA INTERNATIONAL SERVICE ASSOCIATION
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-05-17
Claims (18)
- 1 . A computer-implemented method, comprising: providing, by a computing device, a plurality of dedicated allocations and a shared allocation of a cache, a dedicated allocation of the plurality of dedicated allocations being associated with an entity of a plurality of entities, the shared allocation of the cache being available for storage to each of the plurality of entities; receiving, by the computing device from another computing device associated with the entity, a data request requesting data to be retrieved from the cache; providing, by the computing device from the shared allocation, the data requested by the data request; in response to identifying that the shared allocation and the dedicated allocation are full: evicting, by the computing device, an amount of data from the shared allocation of the cache based on identifying the amount of data as being oldest in the shared allocation of the cache; and migrating, by the computing device, an oldest instance of data from the dedicated allocation associated with the entity to the shared allocation of the cache; and moving, by the computing device, the data requested by the data request from the shared allocation to the dedicated allocation associated with the entity.
- 2 . The computer-implemented method of claim 1 , further comprising: determining that the dedicated allocation of the cache has available space, wherein moving the data requested by the data request is performed when the dedicated allocation of the cache is determined to have available space.
- 3 . The computer-implemented method of claim 1 , further comprising: determining whether the dedicated allocation of the cache has available space; when the dedicated allocation of the cache is full: determining whether the shared allocation of the cache is full; when the shared allocation of the cache is not full, migrating a new oldest instance of data from the dedicated allocation associated with the entity to the shared allocation of the cache; and storing the data requested by the data request in the dedicated allocation associated with the entity.
- 4 . The computer-implemented method of claim 1 , further comprising: determining a plurality of hit ratios corresponding to each entity of the plurality of entities, each hit ratio identifying a percentage of caching requests that result in requested data being found in the cache, wherein evicting the amount of data from the shared allocation of the cache is further based at least in part on the plurality of hit ratios.
- 5 . The computer-implemented method of claim 1 , further comprising: determining whether the data is stored in the dedicated allocation of the cache; when the data is not stored in the dedicated allocation, determining whether the data is stored in the shared allocation; when the data is not stored in the shared allocation, returning an indication that the data was not found.
- 6 . The computer-implemented method of claim 1 , further comprising: determining whether the dedicated allocation of the cache has available space; and when the dedicated allocation of the cache has available space, storing the data requested by the data request in the dedicated allocation associated with the entity.
- 7 . The computer-implemented method of claim 1 , further comprising: receiving, by the computing device from another computing device associated with the entity, a second caching request comprising second data to be cached; and when the shared allocation of the cache is full: evicting, by the computing device, a second amount of data from the shared allocation of the cache based on identifying the second amount of data as being oldest in the shared allocation of the cache; migrating, by the computing device, a new oldest instance of data from the dedicated allocation associated with the entity to the shared allocation of the cache; and storing, by the computing device, the second data received in the second caching request in the dedicated allocation associated with the entity.
- 8 . The computer-implemented method of claim 7 , further comprising: determining whether the dedicated allocation of the cache is full, the dedicated allocation being associated with the entity; and when the dedicated allocation of the cache has available space, storing the second data received in the second caching request in the dedicated allocation associated with the entity.
- 9 . A computing device, comprising: a processor; a cache comprising a plurality of dedicated allocations and a shared allocation, a dedicated allocation of the plurality of dedicated allocations being associated with an entity of a plurality of entities, the shared allocation of the cache being available for storage to each of the plurality of entities; and one or more memories storing executable instructions that, when executed by the processor, cause the computing device to: receive, from a second computing device associated with the entity, a data request requesting data to be retrieved from the cache; provide, from the shared allocation, the data requested by the data request; in response to identifying that the shared allocation and the dedicated allocation are full: evict an amount of data from the shared allocation of the cache based on identifying the amount of data as being oldest in the shared allocation of the cache; and migrate an oldest instance of data from the dedicated allocation associated with the entity to the shared allocation of the cache; and move the data requested by the data request from the shared allocation to the dedicated allocation associated with the entity.
- 10 . The computing device of claim 9 , wherein executing the instructions further causes the computing device to: identify a hit ratio threshold specifying an amount of data requests that are to result in corresponding data being found in the cache; monitor corresponding hit ratios of data requests submitted by the plurality of entities; and provide a notification to the entity when a hit ratio associated with the entity falls below the hit ratio threshold.
- 11 . The computing device of claim 9 , wherein the amount of data of the shared allocation of the cache is evicted further based at least in part on identifying first data corresponding to a first entity with a highest hit ratio of a plurality of hit ratios corresponding to the plurality of entities.
- 12 . The computing device of claim 9 , wherein executing the instructions further causes the computing device to: store the data in the dedicated allocation associated with the entity when the dedicated allocation corresponding to the entity has available space.
- 13 . The computing device of claim 9 , wherein executing the instructions further causes the computing device to: receive, from another computing device associated with the entity of the plurality of entities, a caching request identifying second data to be cached; store the second data to be cached in the dedicated allocation when the dedicated allocation has space.
- 14 . The computing device of claim 13 , wherein executing the instructions further causes the computing device to: determine whether the shared allocation has available space when the dedicated allocation is full; move a corresponding oldest instance of data of the dedicated allocation to the shared allocation based on determining that the dedicated allocation is full and the shared allocation has available space.
- 15 . The computing device of claim 14 , wherein executing the instructions further causes the computing device to evict third data from the shared allocation when the dedicated allocation and the shared allocation are both full.
- 16 . The computing device of claim 9 , wherein utilizing the plurality of dedicated allocations and the shared allocation decreases an overall latency associated with data requests submitted by the plurality of entities.
- 17 . The computing device of claim 9 , wherein the plurality of dedicated allocations are equal in size.
- 18 . The computing device of claim 9 , wherein executing the instructions further causes the computing device to maintain a mapping that maps each of the plurality of entities to a particular corresponding dedicated allocation of the cache.
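The core flow recited in claim 1, i.e., serving a request from the shared allocation and, when both allocations are full, evicting the oldest shared entry and demoting the tenant's oldest dedicated entry before moving the requested data into the dedicated allocation, can be sketched as follows. This is an illustrative Python sketch using insertion-ordered `OrderedDict`s as the allocations; the function name and capacity parameters are assumptions, not claim language.

```python
from collections import OrderedDict

def promote_from_shared(dedicated, shared, key, ded_cap, shared_cap):
    """Sketch of the claim-1 flow for one tenant. `dedicated` and `shared`
    are OrderedDicts ordered oldest-first; capacities are entry counts.
    Moves `key` from the shared allocation into the dedicated allocation,
    making room first when both allocations are full."""
    value = shared.pop(key)                    # the data was served from shared
    if len(dedicated) >= ded_cap:
        if len(shared) >= shared_cap:
            shared.popitem(last=False)         # evict oldest entry from shared
        old_key, old_val = dedicated.popitem(last=False)
        shared[old_key] = old_val              # demote dedicated's oldest to shared
    dedicated[key] = value                     # move requested data to dedicated
    return value
```

For simplicity the requested entry is removed from the shared allocation before the eviction step; the claims do not prescribe that ordering.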
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 17/764,879, filed Mar. 29, 2022, which is the U.S. national stage application of PCT/US2019/054860, filed Oct. 4, 2019, the disclosures of which are incorporated herein by reference in their entirety for all purposes.
BACKGROUND
A variety of systems may utilize large data sets to perform operations. By way of example, a system that processes data transactions may utilize historical transaction data to perform a variety of operations. Some of this data may be cached in local memory of a computing device such that future requests for that data can be served faster than would be possible by accessing the data's primary storage location (e.g., typically located in disk storage). Caching allows a system to increase data retrieval performance by reducing the need to access an underlying slower storage layer. Some in-memory caching systems may be utilized by multiple tenants. Cache management in such systems can be complex. When tenants are provided the ability to store data anywhere in a shared cache, some tenants may utilize more than a fair share of the cache, which can increase latency with respect to the other tenants' access operations. However, providing tenants dedicated shares of the cache for fairness reasons can also lead to inefficient utilization of the cache, as some tenants may not utilize all of their dedicated memory. Accordingly, conventional in-memory caching systems and the management thereof present a number of drawbacks and inefficiencies. Embodiments of this disclosure address these and other problems, individually and collectively.
SUMMARY
One embodiment of the invention is directed to a method. 
The method may comprise providing, by a computing device, a plurality of dedicated allocations of a cache, each dedicated allocation of the cache corresponding to an entity of a plurality of entities, the cache being provided utilizing random access memory. The method may further comprise providing, by the computing device, a shared allocation of the cache, the shared allocation being available for storage to each of the plurality of entities. The method may further comprise identifying, by the computing device, that a threshold period of time has elapsed since data stored in a dedicated allocation of the cache was last accessed. The method may further comprise migrating, by the computing device, the data from the dedicated allocation of the cache to the shared allocation of the cache.

In some embodiments, the method may further comprise, when the dedicated allocation of the cache is full, determining whether a shared allocation of the cache is full. When the shared allocation of the cache is not full, the method may comprise migrating an oldest instance of data from the dedicated allocation associated with the entity to the shared allocation of the cache and storing the data received in the caching request in the dedicated allocation associated with the entity.

In some embodiments, the method may further comprise determining a plurality of hit ratios corresponding to each entity of the plurality of entities, each hit ratio identifying a percentage of caching requests that result in requested data being found in the cache. The method may further comprise adjusting one or more eviction rules of the shared allocation of the cache based at least in part on the plurality of hit ratios.

In some embodiments, the method may further comprise receiving a data request for cached data associated with an entity. The method may further comprise determining whether the cached data is stored in the dedicated allocation associated with the entity. 
The method may further comprise, when the cached data is stored in the dedicated allocation associated with the entity, returning the cached data in response to the data request.

In some embodiments, the method may further comprise, when the cached data is not stored in the dedicated allocation associated with the entity, determining whether the cached data is stored in the shared allocation of the cache. When the cached data is stored in the shared allocation, the method may comprise returning the cached data. When the cached data is not stored in the shared allocation, the method may comprise returning an indication that the cached data was not found.

In some embodiments, the method may further comprise, when the cached data is stored in the shared allocation, determining whether the dedicated allocation associated with the entity is full and migrating the cached data to the dedicated allocation associated with the entity when the dedicated allocation associated with the entity is not full.

In some embodiments, the method may further comprise, when the dedicated allocation associated with the entity is full: determining whether the shared allocation of the cache is full, migrating an oldest instance of data from the dedicated allocation associated with the entity
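The lookup flow described in the summary (check the dedicated allocation first, fall back to the shared allocation, promote a shared hit when the dedicated allocation has room, and otherwise indicate the data was not found) can be sketched as follows. This is an assumed, simplified Python rendering for a single tenant; the function signature and the `(found, value)` return convention are illustrative, not drawn from the patent.

```python
def lookup(dedicated, shared, key, ded_cap):
    """Sketch of the summary's lookup flow for one tenant.
    `dedicated` and `shared` are plain dicts; `ded_cap` is the dedicated
    allocation's capacity in entries. Returns (found, value)."""
    if key in dedicated:
        return True, dedicated[key]            # hit in the dedicated allocation
    if key in shared:
        value = shared[key]
        if len(dedicated) < ded_cap:           # promote only when dedicated has room
            del shared[key]
            dedicated[key] = value
        return True, value                     # hit in the shared allocation
    return False, None                         # indication: data not found
```

When the dedicated allocation is full, the summary's further embodiment would instead demote the dedicated allocation's oldest entry to the shared allocation before promoting; that branch is omitted here for brevity.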