EP-4738211-A1 - ANALYSIS PROGRAM, ANALYSIS DEVICE, AND ANALYSIS METHOD
Abstract
An analysis program that causes a computer to execute a process includes first extracting (S201), from a set of data including a value of each of a plurality of explanatory variables and a value of an objective variable, a specific condition having a specific correlation with the objective variable among conditions for at least a part of the plurality of explanatory variables, second extracting (S202) a set of combinations of a value of a predetermined explanatory variable and a value of the objective variable indicated by predetermined variable relationship information from the set of data based on the specific condition, dividing (S203) the set of combinations into a first group and a second group, and comparing (S204) positive and negative signs of a first coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the first group with positive and negative signs of a second coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the second group, wherein the predetermined variable relationship information indicates a relationship between the predetermined explanatory variable and the objective variable in data processing similar to data processing on the set of data.
Inventors
- HIGUCHI, HIROYUKI
Assignees
- FUJITSU LIMITED
Dates
- Publication Date
- 20260506
- Application Date
- 20251024
Claims (13)
- An analysis program that causes a computer to execute a process comprising: first extracting (S201), from a set of data including a value of each of a plurality of explanatory variables and a value of an objective variable, a specific condition having a specific correlation with the objective variable among conditions for at least a part of the plurality of explanatory variables; second extracting (S202) a set of combinations of a value of a predetermined explanatory variable and a value of the objective variable indicated by predetermined variable relationship information from the set of data based on the specific condition; dividing (S203) the set of combinations into a first group and a second group; and comparing (S204) positive and negative signs of a first coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the first group with positive and negative signs of a second coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the second group; wherein the predetermined variable relationship information indicates a relationship between the predetermined explanatory variable and the objective variable in data processing similar to data processing on the set of data.
- The analysis program according to claim 1, wherein the process further includes generating (S1113) specific variable relationship information indicating a relationship between the predetermined explanatory variable and the objective variable in the data processing on the set of data by using the set of combinations when the positive and negative signs of the first coefficient are different from the positive and negative signs of the second coefficient.
- The analysis program according to claim 2, wherein the process further includes generating (S1113) explanatory information on a difference between the specific variable relationship information and the predetermined variable relationship information.
- The analysis program according to claim 2, wherein the predetermined variable relationship information is a volcano plot or a conditional causal graph, and the specific variable relationship information is a volcano plot.
- The analysis program according to any one of claims 1 to 4, wherein the specific correlation indicates that an index indicating strength of a causal relationship between any of the plurality of explanatory variables and the objective variable is equal to or greater than a reference value in a causal graph generated from a condition for at least a part of the plurality of explanatory variables.
- The analysis program according to any one of claims 1 to 4, wherein the second extracting (S202) includes: extracting, from the set of data, data that satisfies a condition for an explanatory variable other than the predetermined explanatory variable among the explanatory variables included in the specific condition; and extracting a combination of a value of the predetermined explanatory variable and a value of the objective variable from data that satisfies the condition for the explanatory variable other than the predetermined explanatory variable.
- The analysis program according to any one of claims 1 to 4, wherein the process further includes selecting (S1105) the predetermined variable relationship information from a plurality of pieces of variable relationship information, wherein each of the plurality of pieces of variable relationship information is generated from the set of data including the value of each of the plurality of explanatory variables and the value of the objective variable, and represents a relationship between any of the explanatory variables and the objective variable.
- The analysis program according to any one of claims 1 to 4, wherein the process further includes updating (S1106, S1116) the set of data so that data including the value of each of the plurality of explanatory variables, the value of the objective variable, and the value of the predetermined explanatory variable is included in the set of data when the value of the predetermined explanatory variable is not included in the set of data, wherein the second extracting (S202) includes extracting the set of combinations from the updated set of data.
- The analysis program according to any one of claims 1 to 4, wherein the predetermined variable relationship information includes a first threshold for the predetermined explanatory variable, and the first extracting (S201) includes extracting a condition for at least a part of the plurality of explanatory variables from the set of data based on a second threshold for the predetermined explanatory variable, and when a difference between the second threshold and the first threshold is larger than a predetermined value, the computer is further caused to execute processing of changing the second threshold so that the difference between the second threshold and the first threshold is smaller than the predetermined value.
- An analysis device (101, 103) comprising: a condition extraction unit (111) configured to extract, from a set of data including a value of each of a plurality of explanatory variables and a value of an objective variable, a specific condition having a specific correlation with the objective variable among conditions for at least a part of the plurality of explanatory variables; a data extraction unit (112) configured to extract a set of combinations of a value of a predetermined explanatory variable and a value of the objective variable indicated by predetermined variable relationship information from the set of data based on the specific condition; and a comparison unit (113) configured to divide the set of combinations into a first group and a second group, and compare positive and negative signs of a first coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the first group with positive and negative signs of a second coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the second group; wherein the predetermined variable relationship information indicates a relationship between the predetermined explanatory variable and the objective variable in data processing similar to data processing on the set of data.
- The analysis device (101, 103) according to claim 10, wherein the comparison unit (113) is further configured to generate specific variable relationship information indicating a relationship between the predetermined explanatory variable and the objective variable in data processing on the set of data by using the set of combinations when the positive and negative signs of the first coefficient are different from the positive and negative signs of the second coefficient.
- An analysis method carried out by a computer, comprising: first extracting (S201), from a set of data including a value of each of a plurality of explanatory variables and a value of an objective variable, a specific condition having a specific correlation with the objective variable among conditions for at least a part of the plurality of explanatory variables; second extracting (S202) a set of combinations of a value of a predetermined explanatory variable and a value of the objective variable indicated by predetermined variable relationship information from the set of data based on the specific condition; dividing (S203) the set of combinations into a first group and a second group; and comparing (S204) positive and negative signs of a first coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the first group with positive and negative signs of a second coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the second group; wherein the predetermined variable relationship information indicates a relationship between the predetermined explanatory variable and the objective variable in data processing similar to data processing on the set of data.
- The analysis method according to claim 12, wherein the analysis method includes generating (S1113) specific variable relationship information indicating a relationship between the predetermined explanatory variable and the objective variable in data processing on the set of data by using the set of combinations when the positive and negative signs of the first coefficient are different from the positive and negative signs of the second coefficient.
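The second extracting (S202, as detailed in claim 6), the dividing (S203), and the sign comparison (S204) recited in the claims above can be illustrated with a minimal Python sketch. It rests on assumptions that the claims do not specify: the coefficient is taken to be a least-squares slope, the two groups are formed by splitting the combinations at the midpoint after sorting by the explanatory variable, and the field names `x1`, `x2`, and `y` as well as all function names are hypothetical.

```python
from statistics import mean


def slope(pairs):
    """Least-squares slope of y on x (assumed stand-in for the claimed coefficient)."""
    if len(pairs) < 2:
        return 0.0
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in pairs) / sxx


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def extract_pairs(rows, x_name, y_name, condition):
    """S202: keep rows satisfying the condition on the other variables,
    then take the (explanatory value, objective value) combinations."""
    return [(r[x_name], r[y_name]) for r in rows if condition(r)]


def signs_differ(pairs):
    """S203/S204: split at the midpoint of the x-sorted pairs (an assumption)
    and report whether the two slopes have opposite, non-zero signs."""
    ordered = sorted(pairs)
    mid = len(ordered) // 2
    first_group, second_group = ordered[:mid], ordered[mid:]
    s1, s2 = sign(slope(first_group)), sign(slope(second_group))
    return s1 != 0 and s2 != 0 and s1 != s2


# Hypothetical data set: x1 is the predetermined explanatory variable,
# x2 is another explanatory variable used in the specific condition.
rows = [
    {"x1": 1.0, "x2": "a", "y": 1.0},
    {"x1": 2.0, "x2": "a", "y": 2.0},
    {"x1": 3.0, "x2": "a", "y": 3.0},
    {"x1": 4.0, "x2": "a", "y": 3.0},
    {"x1": 5.0, "x2": "a", "y": 2.0},
    {"x1": 6.0, "x2": "a", "y": 1.0},
    {"x1": 3.0, "x2": "b", "y": 9.0},  # excluded by the condition below
]

pairs = extract_pairs(rows, "x1", "y", lambda r: r["x2"] == "a")
print(signs_differ(pairs))  # True: rising first group, falling second group
```

Opposite signs in the two groups suggest a mountain-shaped or valley-shaped relationship, which is the case in which claim 2 proceeds to generate the specific variable relationship information.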
Description
FIELD
The embodiments discussed herein are related to an analysis program, an analysis device, and an analysis method.
BACKGROUND
In a material search for an optimal material to be used for production of a product, a scatter diagram called a volcano plot may be used. The horizontal and vertical axes of the volcano plot represent properties of the material. As an example of material search using a volcano plot, research on a catalyst for electrochemical ammonia synthesis is known (see, for example, E. Drazevic et al., "Are There Any Overlooked Catalysts for Electrochemical NH3 Synthesis-New Insights from Analysis of Thermochemical Data", iScience, Volume 23, Issue 12, 33 pages, December 18, 2020).
The horizontal axis of a volcano plot corresponds to an explanatory variable, and the vertical axis corresponds to an objective variable. The graph of a volcano plot has a mountain shape or a valley shape, and the relationship between the explanatory variable and the objective variable changes significantly at the vertex of the graph. In material search, a volcano plot is often used because a feature of a desired material can be objectively understood from the values of the explanatory variable and the objective variable at the vertex of the graph.
However, when a volcano plot is generated from data including a combination of many properties related to a material, the selection of a property to be used as an explanatory variable largely depends on the knowledge and experience of experts. In addition, it is not necessarily easy to determine, from a scatter diagram in which explanatory variables and objective variables are plotted, whether a volcano plot can be generated.
Note that such a problem occurs not only when a set of data is analyzed for material search but also when a set of data is analyzed for various other purposes.
Accordingly, it is an object in one aspect of an embodiment of the invention to provide an analysis program, an analysis device, and an analysis method that efficiently determine a relationship between two variables from a set of data including values of a plurality of variables.
SUMMARY
According to an aspect of an embodiment, an analysis program causes a computer to execute a process including: extracting, from a set of data including a value of each of a plurality of explanatory variables and a value of an objective variable, a specific condition having a specific correlation with the objective variable among conditions for at least a part of the plurality of explanatory variables; extracting a set of combinations of a value of a predetermined explanatory variable and a value of the objective variable indicated by predetermined variable relationship information from the set of data based on the specific condition; dividing the set of combinations into a first group and a second group; and comparing positive and negative signs of a first coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the first group with positive and negative signs of a second coefficient indicating a relationship between the predetermined explanatory variable and the objective variable represented by a combination of the second group, wherein the predetermined variable relationship information indicates a relationship between the predetermined explanatory variable and the objective variable in data processing similar to data processing on the set of data.
BRIEF DESCRIPTION OF DRAWINGS
- FIG. 1 is a functional configuration diagram of an analysis device according to an embodiment;
- FIG. 2 is a flowchart of first analysis processing;
- FIG. 3 is a functional configuration diagram illustrating a specific example of the analysis device;
- FIGS. 4A and 4B are diagrams illustrating first variable relationship information;
- FIGS. 5A and 5B are diagrams illustrating second variable relationship information;
- FIG. 6 is a diagram illustrating second analysis processing;
- FIG. 7 is a diagram illustrating an initial atomic structure;
- FIG. 8 is a diagram illustrating a data set;
- FIG. 9 is a diagram illustrating a volcano plot;
- FIG. 10 is a diagram illustrating a causal graph;
- FIG. 11A is a flowchart (part 1) of the second analysis processing;
- FIG. 11B is a flowchart (part 2) of the second analysis processing; and
- FIG. 12 is a hardware configuration diagram of an information processing device.
DESCRIPTION OF EMBODIMENTS
Preferred embodiments of the present invention will be explained with reference to the accompanying drawings. Note that the analysis program, the analysis device, and the analysis method disclosed in the present application are not limited to the following examples.
FIG. 1 illustrates a functional configuration example of an analysis device according to an embodiment. An analysis device 101 of FIG. 1 includes a condition extraction unit 111, a data extraction unit 112, and a comparison unit 113. FIG. 2 is a flowchart illustrati