DE-102024210902-A1 - Device, data structure and computer-implemented method for constructing a triple of a knowledge graph
Abstract
A device, a data structure, and a method are described for the automated construction (210) of triples for knowledge graphs. Starting from input data (an item, a property, and headers for the item and the property), a semantic description of the headers (204), a suitable ontology (206), and a mapping specification (208), which defines the relationship between the headers, are determined using large language models (LLMs). The resulting triple represents the relationship between the item and the property. Human experts or automated validation processes can check the outputs generated by the LLMs and thus improve the quality of the triple construction.
Inventors
- Marvin Schiller
- Cuong Xuan Chu
- Manuel Fischer
- Mohamed Gad-Elrab
- Trung Kien TRAN
Assignees
- Robert Bosch Gesellschaft mit beschränkter Haftung
Dates
- Publication Date
- 20260513
- Application Date
- 20241113
Claims (14)
- A computer-implemented method for constructing a triple of a knowledge graph, characterized in that the method comprises providing (202) input data, wherein the input data comprises an item and a property of the item and a header for the item and a header for the property of the item, wherein the method comprises determining (204) a semantic description of the headers depending on the input data, determining (206) an ontology that defines a property for the header of the property of the item depending on the semantic description, determining (208) a mapping specification depending on the semantic description and the ontology, wherein the mapping specification defines a relationship between the header for the item and the header for the property of the item, and wherein the method comprises constructing (210) the triple that comprises the relationship between the item and the property depending on the mapping specification and the input data, and wherein the method comprises at least one of the following: determining the semantic description by requesting a first large language model to produce an output that semantically describes the input data, in particular requesting a human expert or an automated machine validation to evaluate the output that semantically describes the input data, receiving a result of the evaluation of that output, and determining the semantic description depending on that output and the result of its evaluation; and determining the ontology by requesting the first or a second large language model to produce an output that ontologically describes the semantic description, in particular requesting a human expert or an automated machine validation to evaluate the output that describes the semantic description, receiving a result of the evaluation of that output, and determining the ontology depending on that output and the result of its evaluation; and determining the mapping specification by requesting the first, the second or a third large language model to produce an output mapping for the semantic description and the ontology, in particular requesting a human expert or an automated machine validation to evaluate the output mapping, receiving a result of the evaluation of the output mapping, and determining the ontology depending on the output mapping and the result of the evaluation of the output mapping.
- Method according to claim 1, characterized in that providing (202) the input data comprises providing a table with columns and rows, wherein the header for the item identifies the column containing the item, wherein the header for the property identifies the column containing the property, and wherein the table includes the item and the property in the same row.
- Method according to one of the preceding claims, characterized in that determining (204) the semantic description of the headers comprises: determining a structured output that associates the header of the item with a description of the item, a semantics of the item and a data type of the item, and determining a structured output that associates the header of the property with a description of the property, a semantics of the property and a data type of the property.
- Method according to one of the preceding claims, characterized in that determining (206) the ontology comprises: determining the property for the header of the property of the item such that it includes the header of the property of the item as an identifier of the property for the header of the property of the item.
- Method according to claim 4, characterized in that determining (208) the mapping specification defining the relationship between the header for the item and the header for the property of the item comprises: providing a first mapping to determine a subject of the triple, wherein the first mapping comprises a template including the header of the item, and providing a second mapping to determine a predicate and an object of the triple, wherein the second mapping comprises the relationship and a template including the header of the property of the item.
- Method according to claim 5, characterized in that constructing (210) the triple includes: providing the item as the subject of the triple, providing the relationship as the predicate of the triple, and determining the object of the triple by finding the first mapping depending on the header of the item, finding the second mapping depending on the relationship, and finding the property of the item as the object depending on the second mapping as defined by the mapping specification.
- Method according to claim 6, characterized in that the method comprises: providing (202) properties associated with the header of the property of the item, wherein the properties include the property of the item; and providing (202) properties associated with another header, wherein finding (210) the property of the item as the object depending on the second mapping includes determining a statement to find the properties associated with the header of the property of the item depending on the second mapping, in particular to search for the property of the item only in the properties associated with the header of the property of the item in the second mapping.
- Method according to one of claims 4 to 7, characterized in that determining (206) the ontology includes: determining a class definition which includes an identifier for the class, and determining the property for the header of the property of the item such that it includes the identifier for the class.
- Method according to one of claims 4 to 8, characterized in that determining (206) the ontology includes: determining the property for the header of the property of the item such that it includes a scope and a data type of the second property.
- A method according to any of the preceding claims, characterized in that the method comprises: validating the knowledge graph comprising the constructed triple; upon successful validation of the knowledge graph, constructing another triple of the knowledge graph depending on the input and on the ontology, the semantic description and the mapping specification determined during the construction of the constructed triple; and otherwise removing the constructed triple and the ontology, the semantic description and the mapping specification determined for the constructed triple before determining another triple depending on the input.
- A method according to one of the preceding claims, characterized in that the method comprises: constructing another triple of the knowledge graph depending on another input and the ontology, semantic description and mapping specification determined during the construction of the constructed triple, or determining multiple triples depending on the mapping specification.
- Device (100) for constructing a triple of a knowledge graph, characterized in that the device comprises at least one processor (102) and at least one memory (104), wherein the at least one memory (104) stores instructions executable by the at least one processor (102), which, when executed by the at least one processor (102), cause the device (100) to execute a method according to one of claims 1 to 11.
- A computer program product for constructing a triple of a knowledge graph, characterized in that the computer program product comprises computer-readable instructions which, when executed by a computer, cause the computer to perform a method according to one of claims 1 to 11.
- Data structure (300) for constructing a triple of a knowledge graph, characterized in that the data structure (300) comprises at least one data field (302) for input data, wherein the input data comprises an item and a property of the item and a header for the item and a header for the property of the item, wherein the data structure (300) comprises at least one data field (302) for a semantic description of the headers, determined in particular with a large language model depending on the input data, wherein the data structure (300) comprises at least one data field (302) for an ontology that defines a property for the header of the property of the item, wherein the ontology is determined in particular with a large language model depending on the semantic description, wherein the data structure (300) comprises at least one data field (302) for a mapping specification, wherein the mapping specification defines a relationship between the header for the item and the header for the property of the item, wherein the mapping specification is determined in particular with a large language model depending on the semantic description and the ontology, and wherein the data structure (300) includes at least one data field (302) for the triple encompassing the relationship between the item and the property, wherein the triple is constructed depending on the mapping specification.
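As a non-authoritative illustration of the claimed data fields and of the template-based construction in the claims on the mapping specification, the following minimal Python sketch may help. The class names `Mapping` and `TripleData`, the function `construct_triples`, the column headers `part` and `weight_kg`, and the `http://example.org/` IRIs are all assumptions made for this example; the claims do not prescribe any of them, and a real mapping specification would more likely be expressed in a mapping language than in Python.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Mapping:
    """Hypothetical mapping specification: a subject template plus
    (relationship, object template) pairs."""
    subject_template: str                     # template including the item header
    predicate_objects: list[tuple[str, str]]  # (relationship IRI, template including the property header)


@dataclass
class TripleData:
    """Hypothetical container with one field per claimed data field."""
    rows: list[dict[str, str]]                       # input data: header -> cell, one dict per table row
    semantic_description: dict[str, dict[str, str]]  # per header: description, semantics, data type
    ontology: dict[str, dict[str, str]]              # per property header: class and data type
    mapping: Mapping
    triples: list[tuple[str, str, str]] = field(default_factory=list)


def construct_triples(data: TripleData) -> list[tuple[str, str, str]]:
    """Apply the mapping row by row; item and property come from the same row."""
    data.triples = [
        (data.mapping.subject_template.format_map(row), predicate, template.format_map(row))
        for row in data.rows
        for predicate, template in data.mapping.predicate_objects
    ]
    return data.triples


# Illustrative values only; none of them come from the patent.
example = TripleData(
    rows=[{"part": "P-100", "weight_kg": "2.5"}, {"part": "P-200", "weight_kg": "4.0"}],
    semantic_description={
        "part": {"description": "part identifier", "semantics": "identifier", "datatype": "string"},
        "weight_kg": {"description": "weight in kilograms", "semantics": "mass", "datatype": "decimal"},
    },
    ontology={"weight_kg": {"class": "Part", "datatype": "decimal"}},
    mapping=Mapping(
        subject_template="http://example.org/part/{part}",
        predicate_objects=[("http://example.org/onto#weight_kg", "{weight_kg}")],
    ),
)

for triple in construct_triples(example):
    print(triple)
```

In this sketch the property header `weight_kg` doubles as the identifier of the ontology property, and each table row yields one triple per predicate-object pair, so a single mapping specification can be reused for every row.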
Description
State of the art
The invention relates to a device, a data structure and a computer-implemented method for constructing a triple of a knowledge graph.
Description of the invention
A computer-implemented method for constructing a triple of a knowledge graph comprises providing input data, wherein the input data comprises an item, a property of the item, a header for the item and a header for the property of the item. The method comprises determining a semantic description of the headers depending on the input data; determining an ontology that defines a property for the header of the item's property, depending on the semantic description; determining a mapping specification depending on the semantic description and the ontology, wherein the mapping specification defines a relationship between the header for the item and the header for the item's property; and constructing the triple that includes the relationship between the item and the property, depending on the mapping specification and the input data. The method further comprises at least one of the following: determining the semantic description by requesting a first large language model to produce an output that semantically describes the input data, in particular requesting a human expert or an automated machine validation to evaluate that output, receiving a result of the evaluation, and determining the semantic description depending on the output and the result of the evaluation; and determining the ontology by requesting the first or a second large language model to produce an output that ontologically describes the semantic description, in particular requesting a human expert or an automated machine validation to evaluate that output, receiving a result of the evaluation, and determining the ontology depending on the output and the result of the evaluation; and determining the mapping specification by requesting the first, the second or a third large language model to produce an output mapping for the semantic description and the ontology, in particular requesting a human expert or an automated machine validation to review the output mapping, receiving a result of the review, and determining the ontology depending on the output mapping and the result of the review.
The method automates the construction of the knowledge graph, either with instructions from a human expert or, without such instructions, with automated machine validation regarding syntax, completeness and consistency. The method proceeds step by step through several stages of knowledge graph construction. An example sequence is: determining a semantic description, determining an ontology, determining a mapping, and executing the mapping.
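The staged sequence described above (request a model output, have it evaluated, then derive the stage result from the output and the evaluation) could be sketched as follows. This is a minimal sketch under stated assumptions: `Evaluation`, `determine`, `build_artifacts`, `stub_model` and `non_empty` are hypothetical names, the model is represented by a plain callable rather than any real LLM API, and the rule "use the reviewer's revision if one is given, otherwise the accepted output" is one possible reading of how the evaluation result is taken into account.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Evaluation:
    """Result returned by a human expert or an automated validation (hypothetical shape)."""
    accepted: bool
    revision: Optional[str] = None  # replacement text supplied by the reviewer, if any


Model = Callable[[str], str]                  # assumed interface: prompt in, text out
Evaluator = Callable[[str, str], Evaluation]  # (stage name, model output) -> evaluation


def determine(stage: str, prompt: str, model: Model, evaluate: Evaluator) -> str:
    """Run one stage: request an output, have it evaluated, derive the stage result."""
    output = model(prompt)
    result = evaluate(stage, output)
    if result.revision is not None:
        return result.revision
    if result.accepted:
        return output
    raise ValueError(f"{stage}: output rejected and no revision supplied")


def build_artifacts(table_preview: str, model: Model, evaluate: Evaluator) -> Dict[str, str]:
    """Chain the stages so that each one depends on the results of the previous ones."""
    description = determine(
        "semantic description", "Describe these headers:\n" + table_preview, model, evaluate)
    ontology = determine(
        "ontology", "Propose an ontology for:\n" + description, model, evaluate)
    mapping = determine(
        "mapping specification",
        "Propose a mapping for:\n" + description + "\n" + ontology, model, evaluate)
    return {
        "semantic_description": description,
        "ontology": ontology,
        "mapping_specification": mapping,
    }


def stub_model(prompt: str) -> str:
    # Stand-in for a real LLM call; echoes the first line of the prompt.
    return "draft for: " + prompt.splitlines()[0]


def non_empty(stage: str, output: str) -> Evaluation:
    # Minimal automated validation: accept any non-blank output.
    return Evaluation(accepted=bool(output.strip()))


artifacts = build_artifacts("part,weight_kg", stub_model, non_empty)
```

Passing the same callable for every stage corresponds to using one model throughout; separate callables per stage would correspond to a first, second and third model. Executing the resulting mapping against the input data is left out of this sketch.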
In each step, a user, such as a human expert or automated machine validation, can review and/or revise the intermediate results (the semantic description, the ontology, and the mapping) to refine the output of the large language model. The user can provide additional input, such as an existing ontology, based on which the process constructs a new, extended ontology.
The knowledge graph can encompass complex and heterogeneous data. It is based on semantic technologies, meaning it describes the data unambiguously and in a semantic language that is interpretable by both humans and machines. This semantic language can be standardized. The knowledge graph supports interoperability and knowledge sharing. It is designed for storing and discovering highly interconnected data and supports multi-step reasoning.
The method is based on formal languages and semantics. The ontology, for example, is based on an ontology language, such as the Web Ontology Language OWL. The mapping specification is based, for example, on the RDF mapping language RML, which is based on the W3C standard RDF (Resource Description Framework). The semantic description and/or the mapping specification can be based on a constraint language, such as Shapes Constraint Language (SHACL).
The knowledge graph comprises knowledge graph data. A validation task can be performed on this knowledge graph data. This validation task can include a completeness check, a consistency check, and/or verification of compliance with standards. The knowledge graph provides a foundation for retrieving, discovering, and analyzing complex data and enables data analytics and decision-making in application areas such as finance, supply chain management, healthcare, and biotechno