CN-122029550-A - Time-series data processing device and time-series data processing method
Abstract
The time-series data processing device includes: a data input unit (110) that acquires a 1st time-series data set containing a plurality of time-series data as explanatory variable candidates; a preprocessing unit (120) that converts the plurality of time-series data contained in the acquired 1st time-series data set into a form in which their relatedness can be calculated, and generates a 2nd time-series data set containing the converted time-series data; a relatedness calculation unit (140) that calculates the waveform semantic relatedness among the plurality of time-series data contained in the 2nd time-series data set; a layering unit (150) that obtains the temporal order relationship among the plurality of time-series data contained in the 2nd time-series data set, layers those time-series data according to the obtained temporal order relationship, and outputs a layering result; and a visualization unit (160) that generates a visualization graph visualizing the plurality of time-series data contained in the 2nd time-series data set, based on the calculated waveform semantic relatedness and the output layering result.
Inventors
- Land with water
Assignees
- 三菱电机株式会社
Dates
- Publication Date
- 20260512
- Application Date
- 20231020
Claims (14)
- 1. A time-series data processing apparatus, comprising: a data input unit that acquires a 1st time-series data set including a plurality of time-series data as explanatory variable candidates; a preprocessing unit that converts the plurality of time-series data included in the acquired 1st time-series data set into a form in which their relatedness can be calculated, and generates a 2nd time-series data set including the converted plurality of time-series data; a relatedness calculation unit that calculates the waveform semantic relatedness among the plurality of time-series data included in the 2nd time-series data set; a layering unit that obtains a temporal order relationship among the plurality of time-series data included in the 2nd time-series data set, layers the plurality of time-series data included in the 2nd time-series data set according to the obtained temporal order relationship, and outputs a layering result; and a visualization unit that generates a visualization graph visualizing the plurality of time-series data included in the 2nd time-series data set, based on the calculated waveform semantic relatedness and the output layering result.
- 2. The time-series data processing apparatus according to claim 1, wherein the layering unit obtains the temporal order relationship among the plurality of time-series data included in the 2nd time-series data set based on domain knowledge defined by user input.
- 3. The time-series data processing apparatus according to claim 2, wherein the domain knowledge includes a plurality of words and definitions of the temporal order relationships between the plurality of words.
- 4. The time-series data processing apparatus according to any one of claims 1 to 3, wherein the layering unit time-shifts the plurality of time-series data included in the 2nd time-series data set so that the correlation coefficient between the time-series data becomes highest, obtains the resulting shift width, and obtains the temporal order relationship among the plurality of time-series data included in the 2nd time-series data set according to the obtained shift width.
- 5. The time-series data processing apparatus according to any one of claims 1 to 4, wherein the waveform semantic relatedness is a cross-correlation between the plurality of time-series data included in the 2nd time-series data set.
- 6. The time-series data processing apparatus according to any one of claims 1 to 5, wherein the visualization graph is a graph in which each of the plurality of time-series data included in the 2nd time-series data set is a vertex and the waveform semantic relatedness between time-series data is an edge.
- 7. The time-series data processing apparatus according to claim 6, wherein the edges can differ in line thickness, line color, or line type (solid or broken) according to the waveform semantic relatedness among the plurality of time-series data included in the 2nd time-series data set.
- 8. The time-series data processing apparatus according to any one of claims 1 to 7, further comprising a grouping unit that classifies the plurality of time-series data included in the 2nd time-series data set into groups according to a property of the data, and generates a representative value for each group.
- 9. The time-series data processing apparatus according to claim 8, wherein the grouping unit groups the plurality of time-series data included in the 2nd time-series data set according to the similarity of their waveforms.
- 10. The time-series data processing apparatus according to claim 8 or 9, wherein the grouping unit generates, for each group, a representative value reflecting the characteristics of the data included in that group.
- 11. The time-series data processing apparatus according to any one of claims 1 to 10, further comprising a recalculation unit that receives additional time-series data and performs a recalculation for adding the received additional time-series data to the visualization graph.
- 12. The time-series data processing apparatus according to any one of claims 8 to 11, wherein the recalculation unit obtains a representative value of the received additional time-series data, calculates a correlation coefficient between that representative value and the generated representative value of each group, and assigns the received additional time-series data to the group with the largest calculated correlation coefficient.
- 13. The time-series data processing apparatus according to any one of claims 8 to 11, wherein the recalculation unit obtains a representative value of the received additional time-series data, calculates correlation coefficients between that representative value and the generated representative values of the groups, generates a new group including the received additional time-series data when the calculated correlation coefficients are smaller than a predetermined threshold value, and calculates the waveform semantic relatedness between the generated new group and the other groups.
- 14. A time-series data processing method executed by a time-series data processing device provided with a data input unit, a preprocessing unit, a relatedness calculation unit, a layering unit, and a visualization unit, the method comprising the following steps: the data input unit acquires a 1st time-series data set including a plurality of time-series data as explanatory variable candidates; the preprocessing unit converts the plurality of time-series data included in the acquired 1st time-series data set into a form in which their relatedness can be calculated, and generates a 2nd time-series data set including the converted plurality of time-series data; the relatedness calculation unit calculates the waveform semantic relatedness among the plurality of time-series data included in the 2nd time-series data set; the layering unit obtains a temporal order relationship among the plurality of time-series data included in the 2nd time-series data set, layers the plurality of time-series data included in the 2nd time-series data set according to the obtained temporal order relationship, and outputs a layering result; and the visualization unit generates a visualization graph visualizing the plurality of time-series data included in the 2nd time-series data set, based on the calculated waveform semantic relatedness and the output layering result.
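Claim 4 derives the temporal order relationship from the shift width at which the correlation coefficient between two series is highest. The following is a minimal sketch of that idea in Python with NumPy, not the claimed implementation. The function names `best_lag` and `layer_by_lag`, the `max_lag` search bound, the use of the Pearson coefficient, and the choice to layer every series relative to a single reference series are all assumptions made for illustration.

```python
import numpy as np


def best_lag(x, y, max_lag):
    """Return (lag, r) for the shift that maximizes the Pearson correlation.

    x[t] is compared with y[t + lag], so a positive lag means x leads y.
    max_lag is a hypothetical search bound, in samples.
    """
    n = len(x)
    best = (0, -np.inf)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        r = np.corrcoef(a, b)[0, 1]
        # A constant segment yields NaN, which never wins this comparison.
        if r > best[1]:
            best = (lag, r)
    return best


def layer_by_lag(series, reference, max_lag):
    """Layer series names by how many samples they lead the reference series.

    Assumption: layers are defined relative to one reference series.
    Returns a list of layers, largest lead first.
    """
    ref = series[reference]
    layers = {}
    for name, values in series.items():
        lag, _ = best_lag(values, ref, max_lag)
        layers.setdefault(lag, []).append(name)
    return [layers[lag] for lag in sorted(layers, reverse=True)]


if __name__ == "__main__":
    # Synthetic data: three windows cut from one waveform at different offsets.
    n = 200
    t = np.arange(n + 20)
    base = np.sin(t / 7.0) + 0.5 * np.sin(t / 3.0)
    data = {
        "target": base[10:10 + n],
        "leader": base[13:13 + n],    # leads the target by 3 samples
        "follower": base[8:8 + n],    # lags the target by 2 samples
    }
    print(layer_by_lag(data, "target", max_lag=5))
```

Layering against a single reference keeps the sketch short. Obtaining shift widths for every pair of series and ordering them jointly would be closer to a relationship "among the plurality of time-series data", at the cost of having to resolve inconsistent pairwise orderings.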
Description
Time-series data processing device and time-series data processing method
Technical Field
The present disclosure relates to time-series data processing techniques.
Background
In time-series data processing based on a prediction model, the selection of explanatory variables is extremely important and greatly affects prediction accuracy. However, it is difficult to manually select the best explanatory variables from among a plurality of explanatory variable candidates. For this reason, a technique of automatically selecting explanatory variables for prediction based on a predetermined algorithm has been proposed (for example, patent document 1). In the technique of patent document 1, the larger the absolute value of the regression coefficient between the target variable and an explanatory variable, the more appropriate that explanatory variable is considered to be, and explanatory variables are selected automatically by comparing the regression coefficient with a threshold value (claim 4 and paragraph 0093 of patent document 1).
Patent document 1: International Publication No. 2013/187295
The technique of automatically selecting explanatory variables disclosed in patent document 1 has the problem that only explanatory variables having a pseudo-correlation with the target variable may be selected. That is, variables that cannot be said to have a causal relationship with the target variable may, for some reason, be judged to have one, and only such variables may be selected.
Disclosure of Invention
The present disclosure has been made to solve this problem, and its object is to provide a time-series data processing technique that allows explanatory variables having only a pseudo-correlation to be left unselected.
One aspect of the time-series data processing apparatus according to an embodiment of the present disclosure includes: a data input unit that acquires a 1st time-series data set including a plurality of time-series data as explanatory variable candidates; a preprocessing unit that converts the plurality of time-series data included in the acquired 1st time-series data set into a form in which their relatedness can be calculated, and generates a 2nd time-series data set including the converted plurality of time-series data; a relatedness calculation unit that calculates the waveform semantic relatedness among the plurality of time-series data included in the 2nd time-series data set; a layering unit that obtains the temporal order relationship among the plurality of time-series data included in the 2nd time-series data set, layers those time-series data according to the obtained temporal order relationship, and outputs a layering result; and a visualization unit that generates a visualization graph visualizing the plurality of time-series data included in the 2nd time-series data set, based on the calculated waveform semantic relatedness and the output layering result.
According to the time-series data processing apparatus according to the embodiment of the present disclosure, because the plurality of explanatory variable candidates are presented, explanatory variables having only a pseudo-correlation can be left unselected.
Drawings
Fig. 1 is a diagram showing a configuration example of a time-series data processing apparatus and a time-series data processing system.
Fig. 2 is a schematic diagram showing the processing performed by the grouping unit.
Fig. 3 is a diagram showing an outline of the relatedness calculation performed by the relatedness calculation unit.
Fig. 4 is a diagram showing an outline of the layering performed by the layering unit.
Fig. 5 is a diagram showing a variation of layering.
Specifically, it shows a graph of an example of correlation under a monthly shift. Fig. 5A is a diagram showing the original waveforms. Fig. 5B is a diagram of the case where one waveform is shifted by 1 month.
Fig. 6 is a diagram showing a variation of layering. Specifically, it shows a concrete example in which layering is performed according to a predetermined temporal order relationship.
Fig. 7A is a diagram showing a hardware configuration example of the time-series data processing apparatus.
Fig. 7B is a diagram showing a hardware configuration example of the time-series data processing apparatus.
Fig. 8 is a flowchart of the time-series data processing method.
Fig. 9 is a diagram showing an example of a visualization graph.
Fig. 10 is a diagram showing an outline of the processing performed by the recalculation unit.
Detailed Description
Hereinafter, various embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the drawings, constituent elements denoted by the same or similar reference numerals have the same or similar structures or functions, and overlapping descriptions of such constituent elements are omitted. In addition, in