CN-115438146-B - Query method and device for multi-version unstructured data semantic information

Abstract

The invention discloses a query method and device for semantic information of multi-version unstructured data. The method comprises: obtaining unstructured data of each version; for the unstructured data of each version, obtaining the version of the semantic information based on the artificial intelligence model used to extract the semantic information; constructing and storing a semantic information version tree of the unstructured data based on the modification relation among the artificial intelligence models; establishing an expression mode between parent nodes and child nodes in the semantic information version tree according to the storage mode of the tree; and, when the semantic information of unstructured data is queried, searching the semantic information version tree based on query filtering conditions and the expression mode to obtain a query result. The invention thereby enables semantic information of multi-version unstructured data to be managed and queried.

Inventors

  • SHEN ZHIHONG
  • ZHAO ZIHAO
  • LU CHANGFA

Assignees

  • Computer Network Information Center, Chinese Academy of Sciences (中国科学院计算机网络信息中心)

Dates

Publication Date
2026-05-12
Application Date
2022-07-22

Claims (7)

  1. A query method for semantic information of multi-version unstructured data, the method comprising:
     obtaining unstructured data of each version;
     for the unstructured data of each version, obtaining the version of the semantic information based on the artificial intelligence model used to extract the semantic information;
     constructing and storing a semantic information version tree of the unstructured data based on the modification relation among the artificial intelligence models, wherein the nodes in the semantic information version tree comprise semantic information versions and the versions of the artificial intelligence models used to extract the semantic information;
     establishing an expression mode between parent nodes and child nodes in the semantic information version tree according to the storage mode of the semantic information version tree; and
     when the semantic information of unstructured data is queried, searching the semantic information version tree based on query filtering conditions and the expression mode to obtain a query result of the semantic information;
     wherein constructing and storing the semantic information version tree of the unstructured data based on the modification relation among the artificial intelligence models comprises:
     for the unstructured data, obtaining a semantic information version v-sp1 extracted by an artificial intelligence model with version number v-m1, taking the artificial intelligence model version number v-m1 as the root node of the semantic information version tree, and taking the semantic information version v-sp1 as a child node of the root node;
     for the unstructured data, obtaining a semantic information version v-sp2 extracted by an artificial intelligence model with version number v-m2, taking the artificial intelligence model version number v-m2 as a child node of the semantic information version v-sp1, and taking the semantic information version v-sp2 as a child node of the artificial intelligence model version number v-m2, wherein the parent version of the artificial intelligence model with version number v-m2 is the artificial intelligence model with version number v-m1; and
     storing the semantic information version tree on a hard disk;
     wherein, when the semantic information version tree is stored on a hard disk, establishing the expression mode between parent nodes and child nodes according to the storage mode of the semantic information version tree comprises:
     maintaining the association between a parent node and its child nodes using a three-way pointer, wherein the three-way pointer comprises a pointer to the parent version, a pointer to the actual content of the semantic information, and pointers to all child nodes;
     wherein the query filtering conditions comprise one or more of: the id of the unstructured data object corresponding to the semantic information to be queried, the version number of the unstructured data object corresponding to the semantic information to be queried, the name of the semantic information to be queried, and the actual content of the semantic information to be queried; and
     wherein, when the semantic information version tree is stored on a hard disk, searching the semantic information version tree based on the query filtering conditions to obtain the query result of the semantic information comprises:
     obtaining a pointer to the three-way pointer of the semantic information to be queried, based on the id of the unstructured data object corresponding to the semantic information to be queried, the version number of the unstructured data object corresponding to the semantic information to be queried, the name of the semantic information to be queried, or the semantic information to be queried;
     obtaining the actual content of the semantic information of the target node via the pointer to the actual content of the semantic information;
     using the pointer to the parent version and/or the pointers to all child nodes in the target node to obtain the corresponding parent nodes and/or child nodes according to the hierarchy level to be queried;
     obtaining the actual content of the semantic information of the parent nodes and/or child nodes via the pointers to the actual content of the semantic information in those nodes; and
     combining the obtained actual content of the semantic information to obtain the query result of the semantic information.
  2. The method of claim 1, wherein the storing further comprises storing the semantic information version tree in a Key-Value database.
  3. The method of claim 2, wherein, when the semantic information version tree is stored in a Key-Value database, establishing the expression mode between parent nodes and child nodes in the semantic information version tree according to the storage mode of the semantic information version tree comprises: storing nodes whose semantic information names are of the same type in one Key-Value database; and obtaining the Key values of all versions in any Key-Value database based on the association between parent nodes and child nodes.
  4. The method of claim 3, wherein, when the semantic information version tree is stored in a Key-Value database, searching the semantic information version tree based on the query filtering conditions to obtain the query result of the semantic information comprises: finding the version of the semantic information that satisfies the filtering conditions, and obtaining the actual content of the semantic information of that version; obtaining the versions of the parent nodes and/or child nodes according to the Key value of the version and the hierarchy level to be queried; obtaining the actual content of the semantic information of the versions of the parent nodes and/or child nodes; and combining the obtained actual content of the semantic information to obtain the query result of the semantic information.
  5. The method of any of claims 1-4, wherein deleting a version of semantic information in the semantic information version tree comprises: deleting the node corresponding to the version, together with all of its sub-versions, from the semantic information version tree; or, alternatively, deleting the node corresponding to the version from the semantic information version tree and setting the parent node of that node as the parent node of the immediate child nodes of the version.
  6. A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of claims 1-5 when run.
  7. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method of any of claims 1-5.
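
The tree construction, three-way pointer, hierarchical query, and the two deletion options recited in claims 1 and 5 can be illustrated with a minimal in-memory Python sketch. It is not the patented implementation: the class `Node`, the functions `build_example_tree`, `query`, `delete_subtree`, and `delete_and_reparent`, the lookup `index` keyed by (object id, semantic version label), and the sample contents are all hypothetical names chosen for illustration, and ordinary object references stand in for on-disk pointers.

```python
class Node:
    """A tree node carrying a 'three-way pointer': parent, content, children."""

    def __init__(self, label, content=None, parent=None):
        self.label = label        # e.g. "v-m1" (model) or "v-sp1" (semantic info)
        self.content = content    # stand-in for the pointer to the actual content
        self.parent = parent      # pointer to the parent version
        self.children = []        # pointers to all child nodes

    def add_child(self, label, content=None):
        child = Node(label, content, parent=self)
        self.children.append(child)
        return child


def build_example_tree():
    """Build v-m1 -> v-sp1 -> v-m2 -> v-sp2 and a hypothetical lookup index."""
    root = Node("v-m1")                                   # model version as root
    sp1 = root.add_child("v-sp1", content="face: Alice")
    m2 = sp1.add_child("v-m2")                            # v-m2 derives from v-m1
    sp2 = m2.add_child("v-sp2", content="face: Alice, age: 30")
    # Assumed index: (object id, semantic version label) -> node.
    index = {("img-001", "v-sp1"): sp1, ("img-001", "v-sp2"): sp2}
    return root, index


def query(index, key, up=0, down=0):
    """Collect content of the target plus `up` ancestor and `down` descendant levels."""
    target = index[key]
    found = [target]
    node = target.parent
    for _ in range(up):                  # follow parent pointers
        if node is None:
            break
        found.append(node)
        node = node.parent
    frontier = [target]
    for _ in range(down):                # follow child pointers level by level
        frontier = [c for n in frontier for c in n.children]
        found.extend(frontier)
    # Model-version nodes hold no semantic content, so they are skipped.
    return [(n.label, n.content) for n in found if n.content is not None]


def delete_subtree(node):
    """Option 1: remove the node together with all of its sub-versions."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = None


def delete_and_reparent(node):
    """Option 2: remove only this node; its children move to its parent."""
    parent = node.parent
    if parent is None:
        raise ValueError("this sketch does not re-parent children of the root")
    parent.children.remove(node)
    for child in node.children:
        child.parent = parent
        parent.children.append(child)
    node.children = []
    node.parent = None


root, index = build_example_tree()
print(query(index, ("img-001", "v-sp2"), up=2))
```

In this sketch a query for v-sp2 with `up=2` walks through the model node v-m2 to v-sp1 and returns the content of both semantic versions. Entries in `index` that refer to deleted nodes are not cleaned up here; a real store would need to keep the index consistent with the tree.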

Description

Query method and device for multi-version unstructured data semantic information

Technical Field

The invention relates to the fields of unstructured data, artificial intelligence, query languages, databases and the like, and provides a query method and device for semantic information of multi-version unstructured data.

Background

Unstructured data generally refers to data without a predefined structure, such as long text, pictures, video, and audio. Such data is typically stored in a computer system as a string of binary code; it tends to be large and is not directly interpretable by the computer. Users' query needs for unstructured data focus mainly on the information contained in it, that is, semantically meaningful, understandable information, which is called semantic information.

Traditional data query technology cannot query the semantic information of unstructured data, but the development of artificial intelligence has opened new directions for the analysis and querying of unstructured data. Existing artificial intelligence technology can perform tasks such as face recognition, object recognition, speech recognition, and sentiment analysis with relatively high accuracy. The information in unstructured data can therefore be obtained through artificial intelligence, which further promotes the development of unstructured data query technology.

A semantic information query on unstructured data is essentially a query, under certain rules, about a certain state of a certain object. Unstructured data is essentially a description of a certain state of an object; when the state of the object changes, the content of the unstructured data also changes, which can be regarded as a version change of the unstructured data object.
Meanwhile, a change in the semantic information extraction rules also causes a version change of the semantic information. The prior art can only query some semantic information of some unstructured data objects; it cannot query a designated version of an unstructured data object or of its semantic information. Because the semantic information depends not only on the unstructured data itself but also on the extraction rules, even more versions can be generated. Version queries on the semantic information of unstructured data also face problems such as complex relationships among versions and associated queries across versions, which pose great challenges. It is therefore important to develop a method for managing and querying the semantic information of multi-version unstructured data.

Disclosure of Invention

In view of the above problems, the invention discloses a query method and device for semantic information of multi-version unstructured data. Building on an existing graph database, the invention enables the management and querying of multi-version unstructured data semantic information through the research, design, and implementation of technologies such as a query language, a method for obtaining the semantic information of a designated version of an unstructured data object, caching and indexing of multi-version semantic information, and version computation of semantic information.
The technical content of the invention comprises:

A query method for semantic information of multi-version unstructured data, the method comprising:

  • obtaining unstructured data of each version;
  • for the unstructured data of each version, obtaining the version of the semantic information based on the artificial intelligence model used to extract the semantic information;
  • constructing and storing a semantic information version tree of the unstructured data based on the modification relation among the artificial intelligence models, wherein each node in the semantic information version tree represents the semantic information of one version;
  • establishing an expression mode between parent nodes and child nodes in the semantic information version tree according to the storage mode of the semantic information version tree; and
  • when the semantic information of unstructured data is queried, searching the semantic information version tree based on query filtering conditions and the expression mode to obtain a query result of the semantic information.

Further, the query filtering conditions comprise one or more of: the id of the unstructured data object corresponding to the semantic information to be queried, the version number of the unstructured data object corresponding to the semantic information to be queried, the name of the semantic information to be queried, and the actual content of the semantic information to be queried.

Further, the storage mode comprises storing the semantic information version tree on a hard disk or storing the semantic information version tree in a Key-Value database.

Further, under the condition that the s