CN-121997612-A - Big data simulation method, simulation platform and device
Abstract
The invention provides a big data simulation method, a simulation platform and a device, relating to the technical field of simulation computing. The method comprises the following steps: step 1, constructing a reusable operator component library; step 2, constructing a simulation task flow through a graphical interface, in which the user's operations on operator components are received through the graphical interface to build the simulation task flow; step 3, associating the individual data of products across multiple data sources based on the product ID, in which the simulation task flow is edited so that data from the multiple data sources are loaded in association and output in a unified form; step 4, performing parallel task scheduling and distributed computation; and step 5, aggregating the computation results and performing post-processing. The method can greatly lower the technical threshold and the cost of use, enables accurate batch modeling and simulation of a large number of individual products across multiple data sources, provides efficient cloud-based parallel computing capability, and improves the efficiency of data simulation modeling and computation.
Inventors
- XU WU
- LI JIQIANG
- LI ZHIZHUO
- LI HUANHUAN
- SHENG GUOJIN
Assignees
- 上海数匙科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260209
Claims (10)
- 1. A big data simulation method, characterized by comprising the following steps: step 1, constructing a reusable operator component library, wherein the reusable operator component library comprises a data loading operator component for loading data, a data generating operator component for generating data, a data matching operator component for associating data from multiple data sources, a data calculating operator component for performing data calculation, a result storing operator component for storing results, and a result displaying operator component for displaying results in charts; step 2, constructing a simulation task flow through a graphical interface, wherein the user's dragging and connecting operations on a plurality of operators from the operator component library are received at the graphical interface, the connecting operations defining the data flow directions among the operators; step 3, associating the individual data of products across multiple data sources based on the product ID, wherein, when the simulation task flow is edited, each of a plurality of data loading operators or data generating operators serves as an independent data source, and a product ID matching rule among the data sources is established through a data matching operator, so that data from the multiple data sources are loaded in association and output in a unified form; step 4, performing parallel task scheduling and distributed computing, wherein the simulation task flow is split into a plurality of subtasks according to a preset parallel computing strategy and the subtasks are distributed to a plurality of computing nodes for computation; and step 5, aggregating and post-processing the calculation results, wherein the results of the computing nodes are collected and processed, and post-processing is performed.
- 2. The big data simulation method of claim 1, wherein the data loading operator component in step 1 comprises a product operation big data operator for loading signal data and parameter data from a database, a product characteristic big data operator, and a local data uploading operator for reading local data.
- 3. The big data simulation method of claim 1, wherein the data loading operator component in step 1 screens or samples data, and the screening or sampling criteria include the product ID, signal, parameter and date of the data.
- 4. The big data simulation method according to claim 1, wherein the data generation operator component in step 1 supports customizing the number of product IDs, the product parameter names, the statistical distribution type of each parameter and the basic parameter values of the distribution; the system first generates, for each product parameter, as many values as there are product IDs according to the statistical distribution type and the corresponding basic parameter values, and then associates the parameter values with the product IDs according to a preset combination rule to generate the final data set.
- 5. The big data simulation method of claim 4, wherein the preset combination rule comprises random matching among parameters, matching by sorted magnitude, and matching by a specified list of data pairs, and the statistical distribution type comprises the normal distribution, lognormal distribution, Weibull distribution, exponential distribution and uniform distribution.
- 6. The big data simulation method according to claim 1, wherein the product ID matching rule between the data sources established by the data matching operator in step 3 includes matching identical product IDs, random matching between product IDs, sequential matching between product IDs, or matching by a user-specified list of product ID pairs.
- 7. The big data simulation method according to claim 1, wherein the parallel computing strategy in step 4 includes splitting by product ID, splitting by time period of the product operation data, splitting by operator in the simulation task flow, or a combination thereof.
- 8. A big data simulation platform, comprising: a data loading module for performing product ID screening and data sampling on product operation big data and product characteristic big data that are associated with product IDs and stored in a database or locally, so as to generate a data loading operator; a data generation module for defining the number of product IDs, the product parameter names, the statistical distribution type of each parameter and the basic parameter values of the distribution, so as to generate a data generation operator; a model development module for uploading calculation models and data processing models and for writing programs in various languages, so as to form a data calculation operator; an integrated modeling module for providing a graphical interface in which data loading operators, data generation operators, data calculation operators and data matching operators are connected and arranged and data IDs are associated between data sources, so as to construct a simulation task flow; a calculation setting module for configuring computing resources for the simulation task flow and setting a parallel computing mode; and a post-processing module for screening, statistically analyzing and visually displaying the simulation calculation results.
- 9. The big data simulation platform according to claim 8, wherein the computing resources configured by the calculation setting module include a central processing unit, a graphics processing unit and memory, and the parallel computing mode includes parallelism by product ID, by time period, by operator, or a combination thereof.
- 10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the big data simulation method according to any of claims 1 to 9 when executing the program.
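The two-stage data generation described in claims 4 and 5 (first sample as many values per parameter as there are product IDs, then attach them to the IDs under a combination rule) can be illustrated with a minimal sketch. It is not the patented implementation: the function name `generate_dataset`, the specification keys (`dist`, `mean`, `std`, and so on), the ID format `P0001` and the reading of "sorted" as ascending order are all assumptions made for illustration, and only the random and sorted rules are shown.

```python
import random

# Hypothetical mapping from distribution name to a standard-library sampler.
# Key names such as "mean" or "scale" are illustrative assumptions.
SAMPLERS = {
    "normal": lambda rng, p: rng.gauss(p["mean"], p["std"]),
    "lognormal": lambda rng, p: rng.lognormvariate(p["mu"], p["sigma"]),
    "weibull": lambda rng, p: rng.weibullvariate(p["scale"], p["shape"]),
    "exponential": lambda rng, p: rng.expovariate(1.0 / p["mean"]),
    "uniform": lambda rng, p: rng.uniform(p["low"], p["high"]),
}


def generate_dataset(n_products, param_specs, rule="random", seed=0):
    """Return one record per product ID with a sampled value per parameter."""
    rng = random.Random(seed)
    product_ids = [f"P{i:04d}" for i in range(1, n_products + 1)]
    columns = {}
    for name, spec in param_specs.items():
        sampler = SAMPLERS[spec["dist"]]
        # Stage 1: draw exactly as many values as there are product IDs.
        values = [sampler(rng, spec) for _ in range(n_products)]
        # Stage 2: order the values according to the combination rule.
        if rule == "sorted":
            values.sort()  # i-th smallest values of all parameters go together
        elif rule == "random":
            rng.shuffle(values)  # parameters are paired at random
        else:
            raise ValueError(f"unsupported combination rule: {rule}")
        columns[name] = values
    # Attach the i-th value of every parameter to the i-th product ID.
    return [
        {"product_id": pid, **{name: columns[name][i] for name in columns}}
        for i, pid in enumerate(product_ids)
    ]


specs = {
    "diameter": {"dist": "normal", "mean": 10.0, "std": 0.1},
    "hardness": {"dist": "uniform", "low": 50.0, "high": 60.0},
}
dataset = generate_dataset(5, specs, rule="sorted")
```

With the sorted rule, the product with the smallest diameter also receives the smallest hardness, which is one plausible reading of "size sorting matching"; the random rule instead decouples the parameters.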
Description
Big data simulation method, simulation platform and device
Technical Field
The present invention relates to the field of simulation computing technologies, and in particular to a big data simulation method, a simulation platform, and a device.
Background
In fields such as intelligent manufacturing and digital twins, industrial simulation software is a core tool for product design and performance evaluation. However, current industrial simulation software can only run a simulation on a single segment of operation data from a single product; it cannot connect directly to a large database to run big data simulations covering a large number of individual products over long operating periods. Existing big data computing platforms are mostly used for tasks such as statistical analysis and machine learning; they lack support for integrating physical simulation models and for scheduling multidisciplinary solvers, and therefore struggle to meet these business requirements.
Product big data comprises two kinds of data: product operation signal responses, such as voltage and current signals, and product characteristics, such as diameter, hardness and preload. Within an enterprise, product big data is usually stored across many database tables. A tool for big data simulation must therefore be able to screen or sample the multi-source data to obtain sub-datasets, and then associate and match the data in those sub-datasets by the product ID to which they belong before unified modeling and simulation calculation.
A big data simulation method, simulation platform and device that solve the above problems and are convenient to use and operate are therefore needed.
Disclosure of Invention
To achieve the aim of the invention, the invention adopts the following technical scheme:
Step 1, constructing a reusable operator component library, wherein the reusable operator component library comprises a data loading operator component for loading data, a data generating operator component for generating data, a data matching operator component for associating data from multiple data sources, a data calculating operator component for performing data calculation, a result storing operator component for storing results, and a result displaying operator component for displaying results in charts;
Step 2, constructing a simulation task flow through a graphical interface, wherein the user's dragging and connecting operations on a plurality of operators from the operator component library are received through the graphical interface, the connecting operations defining the data flow directions among the operators;
Step 3, associating the individual data of products across multiple data sources based on the product ID, wherein, when the simulation task flow is edited, each of a plurality of data loading operators or data generating operators serves as an independent data source, and a product ID matching rule among the data sources is established through a data matching operator, so that data from the multiple data sources are loaded in association and output in a unified form;
Step 4, performing parallel task scheduling and distributed computing, wherein the simulation task flow is split into a plurality of subtasks according to a preset parallel computing strategy and the subtasks are distributed to a plurality of computing nodes for computation;
Step 5, aggregating and post-processing the calculation results, wherein the results of the computing nodes are collected and processed, and post-processing is performed.
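Steps 3 to 5 above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the two in-memory dictionaries stand in for data sources, `simulate` is a placeholder for a data calculating operator, a local thread pool stands in for the distributed computing nodes, and all names (`match_by_product_id`, `run_flow`, `gain`) are hypothetical. Only the identical-ID matching rule and the split-by-product-ID strategy are shown.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Two hypothetical data sources keyed by product ID.
operation_data = {
    "P1": [1.0, 2.0, 3.0],
    "P2": [2.0, 4.0],
    "P3": [5.0],
}
feature_data = {"P1": {"gain": 2.0}, "P2": {"gain": 0.5}, "P4": {"gain": 1.0}}


def match_by_product_id(signals, features):
    """Step 3 (identical-ID rule): keep products present in both sources."""
    common = sorted(signals.keys() & features.keys())
    return {pid: {"signal": signals[pid], **features[pid]} for pid in common}


def simulate(item):
    """Placeholder calculation operator: scale the mean signal by the gain."""
    pid, record = item
    return pid, record["gain"] * mean(record["signal"])


def run_flow(signals, features, workers=2):
    matched = match_by_product_id(signals, features)
    # Step 4: one subtask per product ID, executed by a pool of workers
    # (a local stand-in for distributed computing nodes).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(simulate, matched.items()))
    # Step 5: collect the per-product results and post-process them.
    return {"per_product": results, "overall_mean": mean(results.values())}


summary = run_flow(operation_data, feature_data)
```

Products P3 and P4 appear in only one source each, so the identical-ID rule drops them; the other matching rules named in the document (random, sequential, user-specified pairs) would replace `match_by_product_id` without changing the rest of the flow.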
The data loading operator component in step 1 comprises a product operation big data operator for loading signal data and parameter data from a database, a product characteristic big data operator, and a local data uploading operator for reading local data. As an improvement, the data loading operator component in step 1 can screen or sample the data, wherein the screening or sampling criteria comprise the product ID, signal, parameter and data date. As an improvement, the data generation operator component in step 1 supports customizing the number of product IDs, the product parameter names, the statistical distribution type of each parameter and the basic parameter values of the distribution. The system first generates, for each product parameter, as many values as there are product IDs according to the statistical distribution type and the corresponding basic parameter values, and then associates the parameter values with the product IDs according to a preset combination rule to generate the final data set. The preset combination rule comprises random matching among parameters, matching by sorted magnitude, and matching by a specified list of data pairs, and the statistical distribution type comprises normal distribution, lognormal distribution, weibu