CN-122019512-A - Digital core library construction method based on data association
Abstract
The invention discloses a method for constructing a digital core library based on data association, relating to the technical field of core library digitization. The method comprises: collecting core sample data and distinguishing the collected core sample data by dimension; processing the core sample data of different dimensions to generate a spatiotemporal data sequence for each dimension; constructing a data association map among the core sample data of different dimensions according to their spatiotemporal data sequences; generating a corresponding data index rule according to the data association map; and completing construction of the digital core library based on the data index rule and the collected core sample data. By distinguishing multi-dimensional data and constructing spatiotemporal sequences, the method uses the association map to quantify the same-dimension and cross-dimension associations among data of different dimensions, fuses the data of different dimensions into a single data association map, and automatically generates multi-level index rules from that map, thereby markedly improving the depth, breadth and intelligence of data retrieval.
Inventors
- WU ZHANGLEI
- XIAO HUABO
- HUANG YINHAO
- LI FENGLIN
- DONG XIUJUN
- DENG BO
- XING SHAOJUN
- XIANG SHASHAN
Assignees
- PowerChina Chengdu Engineering Corporation Limited (中国电建集团成都勘测设计研究院有限公司)
- Chengdu University of Technology (成都理工大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260410
Claims (7)
- 1. A method for constructing a digital core library based on data association, characterized by comprising the following steps: collecting core sample data, and distinguishing the collected core sample data by dimension; processing the core sample data of different dimensions to generate a spatiotemporal data sequence for each dimension; constructing a data association map among the core sample data of different dimensions according to the spatiotemporal data sequences corresponding to the core sample data of different dimensions; and generating a corresponding data index rule according to the data association map, and completing construction of the digital core library based on the data index rule and the collected core sample data.
- 2. The method for constructing a digital core library based on data association according to claim 1, wherein collecting core sample data and distinguishing the collected core sample data by dimension comprises: the core sample data consists of data of a plurality of data types, including description-type data, analysis-and-test-type data and image-type data; splitting the core sample data by data type to obtain the data items belonging to each data type; taking each data type as one data dimension, denoted a first data dimension; and taking each data item under each data type as one data dimension, denoted a second data dimension.
- 3. The method for constructing a digital core library based on data association according to claim 2, wherein processing the core sample data of different dimensions to generate the spatiotemporal data sequence of each dimension comprises: traversing the core sample data in each first data dimension, marking the time-related data in the core sample data and generating corresponding time tags, and marking the space-related data and generating corresponding space tags; and aggregating the time-related data of the time tags, the space-related data of the space tags, and the second data dimensions to which they correspond, thereby obtaining the spatiotemporal data sequence corresponding to each first data dimension.
- 4. The method for constructing a digital core library based on data association according to claim 3, wherein constructing the data association map among the core sample data of different dimensions according to the spatiotemporal data sequences corresponding to the core sample data of different dimensions comprises: aggregating all time tags and space tags belonging to the same data dimension, and obtaining same-dimension association coefficients between different tags from the aggregated time tags and space tags; performing similarity matching between the time tags and space tags of different data dimensions, and obtaining cross-dimension association coefficients between different tags from the similarity matching results; setting a plurality of same-dimension association threshold ranges and cross-dimension association threshold ranges, each of which corresponds to an association level; matching each obtained same-dimension association coefficient against the same-dimension association threshold ranges, and, when the coefficient falls within one of the ranges, setting the association level of that range as the same-dimension association level between the tags to which the coefficient corresponds; matching each obtained cross-dimension association coefficient against the cross-dimension association threshold ranges, and, when the coefficient falls within one of the ranges, setting the association level of that range as the cross-dimension association level between the tags to which the coefficient corresponds; and generating a topology node for each tag, and generating topology routes and topology route weights between different tags according to the same-dimension association levels between tags of same-dimension data and the cross-dimension association levels between tags of different-dimension data, thereby completing construction of the data association map.
- 5. The method for constructing a digital core library based on data association according to claim 4, wherein the same-dimension association coefficients between different tags are obtained as follows: selecting any second data dimension, obtaining the related data corresponding to each tag in that second data dimension, and setting a corresponding comparison standard according to the content of the related data; comparing, in turn, the related content of each tag with the related content of another tag to obtain a corresponding association degree; and obtaining the same-dimension association coefficient between the tags according to the corresponding comparison standard and the obtained association degree.
- 6. The method for constructing a digital core library based on data association according to claim 5, wherein the cross-dimension association coefficients between different tags are obtained as follows: selecting any tag as a reference tag and selecting another tag as a comparison tag, wherein the reference tag and the comparison tag do not belong to the same second data dimension; and obtaining the association degree between the related content of the reference tag and that of the comparison tag, and obtaining the cross-dimension association coefficient between the reference tag and the comparison tag according to the corresponding comparison standard and the obtained association degree.
- 7. The method for constructing a digital core library based on data association according to claim 6, wherein generating the corresponding data index rule according to the data association map comprises: extracting text features from the related content of the tag corresponding to each topology node, wherein, if the related content of the tag is image content, the corresponding image features are first converted into text content and the text features are then extracted from that text content; associating the extracted text features with the topology node; grouping identical text features and collecting the topology nodes corresponding to each text feature, to obtain a topology node set for that text feature; generating an index link to the topology node set, associating the index link with the text feature, and collecting the generated index links as primary index links; for each topology node in the topology node set corresponding to each primary index link, selecting other topology nodes according to the same-dimension association level or cross-dimension association level between that topology node and the other topology nodes, and marking the selected nodes as secondary topology nodes; collecting the selected secondary topology nodes into a secondary topology node set, and generating a secondary index link; and associating all secondary index links corresponding to the selected topology nodes with the primary index link, thereby completing construction of the data index rule, and retrieving core sample data according to the data index rule.
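The two-level index described in claim 7 can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the claimed implementation: the function names `extract_text_features`, `build_index` and `search`, the whitespace tokenizer standing in for text feature extraction, the `min_level` cutoff used to pick secondary topology nodes, and the sample tags are all hypothetical. Image content is assumed to have been converted to text already.

```python
from collections import defaultdict


def extract_text_features(content):
    """Stand-in feature extractor (assumption): lowercase whitespace tokens."""
    return set(content.lower().split())


def build_index(node_content, routes, min_level=2):
    """Build primary and secondary index links.

    node_content: {node: text content of the tag behind that topology node}
    routes: {node: {neighbor: association level}} taken from the association map
    min_level: hypothetical cutoff for selecting secondary topology nodes
    """
    # Primary index links: text feature -> set of topology nodes carrying it.
    primary = defaultdict(set)
    for node, content in node_content.items():
        for feature in extract_text_features(content):
            primary[feature].add(node)

    index = {}
    for feature, nodes in primary.items():
        # Secondary index links: for each primary node, the neighbors whose
        # association level reaches the cutoff.
        secondary = {
            node: {
                other
                for other, level in routes.get(node, {}).items()
                if level >= min_level
            }
            for node in nodes
        }
        index[feature] = {"primary": set(nodes), "secondary": secondary}
    return index


def search(index, feature):
    """Return (primary nodes, related secondary nodes) for a text feature."""
    entry = index.get(feature.lower())
    if entry is None:
        return set(), set()
    related = set().union(*entry["secondary"].values())
    # Sketch-level choice: do not report primary hits again as related hits.
    return entry["primary"], related - entry["primary"]


# Hypothetical tags and association levels.
node_content = {
    "t1": "gray sandstone",
    "t2": "fine sandstone",
    "t3": "porosity 18 percent",
}
routes = {"t1": {"t2": 3, "t3": 2}, "t2": {"t1": 3}, "t3": {"t1": 2}}

index = build_index(node_content, routes)
primary_hits, related_hits = search(index, "sandstone")
print(sorted(primary_hits), sorted(related_hits))  # ['t1', 't2'] ['t3']
```

A query first resolves through the primary index link for its text feature and then widens through the secondary index links, so a search for a lithology term can also surface associated test data.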
Description
Digital core library construction method based on data association
Technical Field
The invention relates to the technical field of core library digitization, and in particular to a method for constructing a digital core library based on data association.
Background
Existing digital core libraries focus on the simple storage and management of data. They lack in-depth mining of the intrinsic spatiotemporal associations and cross-relationships among multi-source, multi-type core data such as description data, analysis and test data, and image data. As a result, data resources are scattered, retrieval is inefficient, knowledge discovery is difficult, and the needs of fine-grained geological research and intelligent analysis are hard to meet.
Disclosure of Invention
The invention aims to provide a method for constructing a digital core library based on data association. This aim is achieved by the following technical scheme: a method for constructing a digital core library based on data association, comprising the following steps: collecting core sample data, and distinguishing the collected core sample data by dimension; processing the core sample data of different dimensions to generate a spatiotemporal data sequence for each dimension; constructing a data association map among the core sample data of different dimensions according to the spatiotemporal data sequences corresponding to the core sample data of different dimensions; and generating a corresponding data index rule according to the data association map, and completing construction of the digital core library based on the data index rule and the collected core sample data.
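As an illustration of the third step, the following Python sketch shows one way a data association map could be assembled once association coefficients between tags are available, following the threshold-range matching set out in claim 4. It is a minimal sketch under stated assumptions, not the claimed implementation: the numeric ranges, the names `SAME_DIM_RANGES`, `CROSS_DIM_RANGES`, `association_level`, `AssociationMap` and `build_map`, and the use of the association level itself as the route weight are hypothetical, since the patent gives neither numeric ranges nor a weighting formula.

```python
from dataclasses import dataclass, field

# Hypothetical threshold ranges as (lower bound, upper bound, association level).
# Highest level first, so a coefficient on a shared boundary gets the higher level.
SAME_DIM_RANGES = [(0.8, 1.0, 3), (0.5, 0.8, 2), (0.2, 0.5, 1)]
CROSS_DIM_RANGES = [(0.7, 1.0, 3), (0.4, 0.7, 2), (0.1, 0.4, 1)]


def association_level(coefficient, ranges):
    """Return the level of the first range containing the coefficient, else None."""
    for low, high, level in ranges:
        if low <= coefficient <= high:
            return level
    return None


@dataclass
class AssociationMap:
    # Topology node -> {neighbor node: route weight}.
    routes: dict = field(default_factory=dict)

    def add_node(self, tag):
        self.routes.setdefault(tag, {})

    def add_route(self, tag_a, tag_b, weight):
        # Routes are stored in both directions (undirected map, an assumption).
        self.add_node(tag_a)
        self.add_node(tag_b)
        self.routes[tag_a][tag_b] = weight
        self.routes[tag_b][tag_a] = weight


def build_map(tags, coefficients):
    """Build the association map.

    tags: {tag id: second data dimension of the tag}
    coefficients: {(tag a, tag b): association coefficient}
    """
    graph = AssociationMap()
    for tag in tags:
        graph.add_node(tag)  # one topology node per tag
    for (tag_a, tag_b), coefficient in coefficients.items():
        same_dimension = tags[tag_a] == tags[tag_b]
        ranges = SAME_DIM_RANGES if same_dimension else CROSS_DIM_RANGES
        level = association_level(coefficient, ranges)
        if level is not None:
            graph.add_route(tag_a, tag_b, level)  # weight = level (assumption)
    return graph


# Hypothetical tags and coefficients.
tags = {"t1": "lithology", "t2": "lithology", "t3": "porosity"}
coefficients = {("t1", "t2"): 0.85, ("t1", "t3"): 0.45, ("t2", "t3"): 0.05}
graph = build_map(tags, coefficients)
print(graph.routes)
# {'t1': {'t2': 3, 't3': 2}, 't2': {'t1': 3}, 't3': {'t1': 2}}
```

Pairs whose coefficient falls outside every range receive no route, which keeps weakly related tags out of the map.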
Further, collecting core sample data and distinguishing the collected core sample data by dimension comprises the following steps: the core sample data consists of data of a plurality of data types, including description-type data, analysis-and-test-type data and image-type data; the core sample data is split by data type to obtain the data items belonging to each data type; each data type is taken as one data dimension, denoted a first data dimension; and each data item under each data type is taken as one data dimension, denoted a second data dimension.
Further, processing the core sample data of different dimensions to generate the spatiotemporal data sequence of each dimension comprises the following steps: traversing the core sample data in each first data dimension, marking the time-related data in the core sample data and generating corresponding time tags, and marking the space-related data and generating corresponding space tags; and aggregating the time-related data of the time tags, the space-related data of the space tags, and the second data dimensions to which they correspond, thereby obtaining the spatiotemporal data sequence corresponding to each first data dimension.
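The dimension split and the spatiotemporal sequence described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the record fields `item`, `date` and `depth_m`, the mapping `ITEM_TO_TYPE`, the function names, and the ordering of each sequence by time and then depth are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical mapping of data items (second data dimension)
# to data types (first data dimension).
ITEM_TO_TYPE = {
    "lithology": "description",
    "porosity": "analysis_test",
    "core_photo": "image",
}


def split_dimensions(records):
    """Group records as {first dimension: {second dimension: [records]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        first_dim = ITEM_TO_TYPE[record["item"]]
        grouped[first_dim][record["item"]].append(record)
    return grouped


def spatiotemporal_sequences(grouped):
    """Build one sequence of (time tag, space tag, second dimension) per first dimension."""
    sequences = {}
    for first_dim, items in grouped.items():
        entries = []
        for second_dim, item_records in items.items():
            for record in item_records:
                time_tag = record["date"]      # time-related field (ISO date assumed)
                space_tag = record["depth_m"]  # space-related field (depth in metres assumed)
                entries.append((time_tag, space_tag, second_dim))
        # Ordering by time, then depth, is an assumption of this sketch.
        sequences[first_dim] = sorted(entries)
    return sequences


# Hypothetical core sample records.
records = [
    {"item": "porosity", "date": "2025-03-02", "depth_m": 120.5, "value": 0.18},
    {"item": "lithology", "date": "2025-03-01", "depth_m": 120.0, "value": "sandstone"},
    {"item": "lithology", "date": "2025-03-01", "depth_m": 95.0, "value": "mudstone"},
    {"item": "core_photo", "date": "2025-03-03", "depth_m": 120.0, "value": "box_12.jpg"},
]

sequences = spatiotemporal_sequences(split_dimensions(records))
print(sequences["description"])
# [('2025-03-01', 95.0, 'lithology'), ('2025-03-01', 120.0, 'lithology')]
```

Each entry keeps its second data dimension alongside the time and space tags, so later steps can tell whether two tags share a dimension.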
Further, constructing the data association map among the core sample data of different dimensions according to the spatiotemporal data sequences corresponding to the core sample data of different dimensions comprises the following steps: aggregating all time tags and space tags belonging to the same data dimension, and obtaining same-dimension association coefficients between different tags from the aggregated time tags and space tags; performing similarity matching between the time tags and space tags of different data dimensions, and obtaining cross-dimension association coefficients between different tags from the similarity matching results; setting a plurality of same-dimension association threshold ranges and cross-dimension association threshold ranges, each of which corresponds to an association level; matching each obtained same-dimension association coefficient against the same-dimension association threshold ranges, and, when the coefficient falls within one of the ranges, setting the association level of that range as the same-dimension association level between the tags to which the coefficient corresponds; matching each obtained cross-dimension association coefficient against the cross-dimension association threshold ranges, and, when the coefficient falls within one of the ranges, setting the association level of that range as the cross-dimension association level between the tags to which the coefficient corresponds; generating a topology node for each tag, and generating topology routes and topology route weights among differe