Search

CN-121997363-A - Careless R tree construction method and retrieval method based on trusted execution environment

CN121997363ACN 121997363 ACN121997363 ACN 121997363ACN-121997363-A

Abstract

The invention discloses an unintentional R tree construction method and a retrieval method based on a trusted execution environment. The method comprises the steps of firstly creating a storage structure in the same machine, wherein the storage structure comprises a TEE-based trusted memory and an untrusted memory, the trusted memory comprises an ORAM Client area, an R tree node storage area and a data cache area, the memory comprises an ORAM Server area, data exchange is carried out with the Client area according to an ORAM protocol, an R tree is generated by using an index initialization algorithm, the Client area and the Server area are initialized according to the generated R tree, then probability p is selected according to the data quantity to be stored and the storage space size of the trusted memory and recorded in the trusted memory, and then each node of the R tree is stored in the R tree node storage area according to the probability p, and the rest nodes are stored in the Server area.

Inventors

  • HU CHENGRUI
  • ZHANG MIN
  • CHI JIALIN
  • LIANG GUIZHOU
  • LI HAO

Assignees

  • 中国科学院软件研究所

Dates

Publication Date
20260508
Application Date
20251222

Claims (7)

  1. 1. An unintentional R tree construction method based on a trusted execution environment comprises the following steps: The method comprises the steps of creating a storage structure in the same machine, wherein the storage structure comprises a trusted memory based on a TEE and an untrusted external memory, the trusted memory comprises an ORAM Client area, an R tree node storage area and a data cache area, and the data cache area is used for temporarily storing a temporary storage structure and a return result of data retrieval in the running process of a setting algorithm; Generating an R tree by using an index initialization algorithm, initializing the ORAM Client area and the ORAM Server area according to the generated R tree, selecting a probability p according to the data quantity to be stored and the storage space size of the trusted memory, recording the probability p in the trusted memory, storing each node of the R tree into the node storage area of the R tree according to the probability p, and storing the rest nodes into the ORAM Server area according to an ORAM protocol.
  2. 2. The method of claim 1, wherein the setting algorithm comprises an index initialization algorithm, a node access algorithm, an insert algorithm, a delete algorithm, a data retrieval algorithm.
  3. 3. The method of claim 2, wherein the step of the node access algorithm comprises: Step A1, searching the ID of the current node in the R tree node storage area, and executing a step A2 if the ID of the current node is searched, otherwise, executing a step A3; Step A2, taking out the current node from the R tree node storage area and putting the current node into the data cache area; Step A3, sending a request for reading and deleting the current node to an ORAM Client area, and placing the current node into the data cache area; And step A4, taking out the current node from the data cache area, judging whether the current node is put into a trusted memory according to the probability p, if so, putting the current node into the R tree node storage area, and if not, sending a request for writing the node to the ORAM Client area.
  4. 4. A method according to claim 3, wherein the step of inserting an algorithm comprises: Starting from the root node of the R tree, sequentially selecting the node with the minimum boundary rectangle increment after expansion from all sub nodes of the currently accessed node as a node to be accessed, searching the ID of the node to be accessed in an R tree node storage area of a trusted memory, if the ID of the node to be accessed is searched, taking out the node to be accessed from the R tree node storage area and putting the node to be accessed into a data cache area of the trusted memory, otherwise, sending a request for reading and deleting the node to be accessed to the ORAM Client area, putting the node to be accessed into the data cache area, then inserting data to be stored into the node to be accessed, judging whether the node to be accessed needs splitting or not, and if not, executing the step B4; step B2, splitting the node to be split into two new nodes according to a splitting strategy and deleting the node to be split, judging whether the node is put into a trusted memory according to probability p for each split node, if so, putting the node into the R tree node storage area, otherwise, sending a request for writing the node to the ORAM Client area; Step B3, the father node of the deleted node to be split is taken out from the data cache area, the information of the father node is updated, whether the father node needs to be split or not is judged, if not, the minimum boundary rectangle of the rest nodes in the data cache area is updated, the step B4 is executed, and otherwise, the step B2 is executed in a returning mode; And B4, judging whether each node remained in the data cache area is placed in a trusted memory according to the probability p, if so, placing the node in the R tree node area, and otherwise, sending a request for writing the node to the ORAM Client area.
  5. 5. The method of claim 4, wherein the step of deleting the algorithm comprises: Step C1, starting from a root node of an R tree, sequentially selecting a node to be accessed in the next step according to whether a minimum boundary rectangle of the node contains data to be deleted or not, executing an access node algorithm for removing the step A4, and storing the accessed node in a data cache area; Step C2, taking out all the items contained in the node to be deleted, deleting the node, and then executing an insertion algorithm containing the steps B1-B4 to reinsert the taken out items into the R tree; Step C3, the father node of the deleted node is taken out from the data cache area, the information of the father node of the node is updated, whether the father node of the deleted node needs to be deleted or not is judged, if not, the step C4 is executed, otherwise, the step C2 is returned; and step C4, for each node left in the data cache area, executing a step A4 of the node access algorithm.
  6. 6. The method of claim 1, wherein the ORAM protocol includes, but is not limited to, a Path ORAM protocol, a Partition ORAM protocol.
  7. 7. A search method based on the unintentional R tree according to claim 1, characterized in that, Step D1, creating an empty result array and an empty queue in a data cache area; Step D2, accessing the root node of the R tree and adding the root node into a queue; Step D3, taking out the node at the head of the queue, judging the node type, adding the data meeting the retrieval condition into the result array if the node is a leaf node, traversing the child node if the node is not a leaf node, and adding the child node with the minimum boundary rectangle and the retrieval condition in the overlapping area to the tail end of the queue; Step D4, repeatedly executing the step D3 until the queue is empty; and D5, returning a result array.

Description

Careless R tree construction method and retrieval method based on trusted execution environment Technical Field The invention belongs to the field of dense data retrieval, relates to an efficient careless R tree indexing method, and particularly relates to a careless R tree construction method and a retrieval method based on a trusted execution environment. Background The R tree is a classical spatial index structure, and is widely applied to various fields such as a geographic information system, a database, computer vision and the like due to the fact that efficient support of spatial operations such as range query, KNN query and the like is achieved. However, the spatial data stored in the R tree generally includes privacy information such as a real behavior pattern and identity characteristics of the user, and the range size of the node boundary of the R tree also reflects the distribution range and density of the data in the space, so that the disclosure of the node boundary may cause privacy disclosure of the data owner. Therefore, the protection of the sensitive data of the R tree is not only the requirement of theoretical research, but also a key link for realizing the safe sharing and the trusted computing of the spatial data. In response to this need, an effective technique is a trusted execution environment (Trusted execution environment, TEE) technique that deploys an isolated trusted memory, where data, code, and computations performed are not visible to the outside, thereby providing a basic security guarantee for the confidentiality, integrity, and integrity of the data in the trusted memory. However, due to storage space limitations, the large number of data storage media available, and the like, it is desirable to store a portion of the data in an untrusted external memory. At this time, it is not enough to encrypt the contents of the nodes of the R tree, and in the process of storing and retrieving based on the R tree, the access modes such as the nodes, the sequence, the result scale and the like of the access may be observed and inferred by adversary, thereby causing serious risk of revealing private data. The inadvertent random access machine (Oblivious Random ACCESS MACHINE, ORAM) technology is a technology capable of effectively protecting the safety of external data, and the technology protects the data privacy and access mode in the external memory at the cost of reducing the space and time efficiency by storing redundant data and performing redundant operation. Considering that the ORAM technology retrieval efficiency is far lower than the TEE-based scheme, when combining the two technologies, the data and retrieval process should be put in a trusted memory as much as possible to improve the efficiency. A method for simply combining TEE and ORAM technology is to directly store several index nodes from root node to leaf node in the buffer of trusted memory, and the index in the buffer can be directly read when searching, otherwise, searching in external memory. This approach can achieve maximum utilization of trusted memory space, but there is a risk of leakage of access patterns, because the number of accesses to external memory is different for data in the cache than for data not in the cache, and thus the approach does not conform to the inadvertent characteristics. Further, even if a non-random scheduling manner such as LRU is used for the cache, the scheme still has security holes, and for a specific non-random scheduling manner, there must be a search order to enable the cache to reach a certain state, so that the security of the method is equivalent to that of the method without scheduling, and thus, the method also does not conform to the careless characteristics. However, if other non-R tree TEE-based schemes are simply migrated to the R tree, the efficiency of the migration is low because the space utilization of the trusted memory by other schemes is insufficient, resulting in more data exchange between the trusted memory and the external memory, and research has found that the overhead of such data exchange is much greater than the computational overhead in the trusted memory. Therefore, how to realize an efficient index structure under the premise of ensuring the security is a problem to be solved. Disclosure of Invention Aiming at the technical problems, the invention aims to provide an unintentional R tree construction method and a search method based on a trusted execution environment, which realize efficient and safe data storage, updating, single search and space search methods. The scheme is to manage the R tree by taking the node as a unit, store part of the nodes in a trusted memory, store part of the nodes in an external memory, and randomly write the nodes into the trusted memory or the external memory after each time of reading the nodes, so that the careless characteristic of node access is ensured, and a safe data updating, single searching and space search