CN-122022208-A - Engineering cost data processing method and system based on big data analysis
Abstract
The application relates to the technical field of data processing, and in particular to a method and a system for processing engineering cost data based on big data analysis. The scheme constructs a massive multi-source database and a dynamic knowledge base and processes the massive data efficiently on a big data framework, thereby integrating and correlating the multi-source data, generating multi-dimensional dynamic cost knowledge information, and performing rough estimation and verification of engineering cost data efficiently and accurately. Furthermore, by mining deep association rules and risk conduction paths and using a stream processing engine to process business data, market data, and similar information in real time, the scheme identifies risk events and issues early warnings. Based on the mined dynamic knowledge information, the scheme can carry out rough estimation and auditing of engineering cost efficiently and accurately, greatly shorten the process cycle, and improve the accuracy and processing efficiency of the system.
Inventors
- LIU TAO
- Huang Runlan
- CHEN YAOSEN
- ZHU JUN
- Li Nengmiao
- ZHANG GUANGXING
- WU ZHUANGHAI
- Yang Qiujia
- ZENG WENZE
- CHEN ZHENGWEI
- CHEN YAO
Assignees
- 中水珠江规划勘测设计有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260415
Claims (10)
- 1. A method for processing engineering cost data based on big data analysis, the method comprising: constructing a massive multi-source database and a dynamic knowledge base; performing, based on a multi-task batch processing engine, data mining on the multi-source data at a preset period to generate dynamic cost knowledge information, and storing and/or updating the dynamic knowledge base; monitoring preset business operation events or market data in real time; and, based on a stream processing engine, analyzing the corresponding business/market data in real time, matching and querying the dynamic knowledge base, performing rough estimation, checking, and/or analysis on engineering data, and generating rough estimation and checking results and/or early warning information.
- 2. The method according to claim 1, wherein the method further comprises: starting the batch processing engine on a schedule to read the various massive multi-source data, and preprocessing the multi-source data and mapping it to a standardized form; extracting preset fields from each multi-source data set to obtain data information, and generating a corresponding data association feature table; starting a plurality of parallel mining tasks to perform dynamic price benchmark generation and association rule mining, thereby obtaining the dynamic cost knowledge information; and serializing the dynamic cost knowledge information and then storing it in and/or updating the dynamic knowledge base.
- 3. The method according to claim 2, wherein the method further comprises: constructing, with engineering items as objects, a joint feature vector covering the spatio-temporal, engineering type, resource, and price dimensions; performing coarse-grained clustering on the joint feature vectors based on the spatio-temporal dimension features to form a plurality of spatio-temporal clusters; within each spatio-temporal cluster, performing secondary fine-grained clustering based on engineering type and resource features to form a plurality of engineering-resource sub-clusters; extracting the prices corresponding to each engineering-resource sub-cluster and calculating a weighted kernel density estimate; and calculating the corresponding cumulative price distribution from the weighted kernel density estimate, and determining the price benchmark and corresponding confidence interval of each sub-cluster to obtain the dynamic price benchmark of the corresponding engineering item.
- 4. The method according to claim 2, wherein the method further comprises: acquiring preset change item information or market fluctuation information and generating a corresponding first event; generating a corresponding second event based on the execution result or cost change of each stage of the project; collecting all events at a preset transaction granularity to construct a transaction data set, and generating initial association relations based on the FP-Growth algorithm; and screening the initial association relations and/or integrating paths according to time order, confidence, and conduction information to obtain the association rules.
- 5. The method according to claim 2, wherein the method further comprises: constructing a Flink stream processing engine to analyze the corresponding business data and/or market data and acquire the corresponding target features; and matching and querying the dynamic knowledge base based on the target feature information, and performing rough estimation, checking, and/or early warning on the current engineering data.
- 6. The method according to claim 5, wherein the method further comprises: acquiring preset target features of each item of engineering data, and matching and querying the corresponding dynamic price benchmark in the dynamic knowledge base based on similarity calculation and/or key fields of the preset target features; performing rough estimation and checking on the current engineering data according to the queried dynamic price benchmark to generate rough estimation and checking results; and/or detecting, based on market data and/or business data, whether the antecedent event of each association rule has occurred and, if so, creating a corresponding monitoring task to warn of or prompt the target conduction item of that association rule.
- 7. The method according to any one of claims 1-6, further comprising: monitoring, by the stream processing engine, abnormal processing information in real time, and sending preset trigger information if abnormal processing information of the same kind occurs more than a preset number of times; and, by the batch processing engine in response to the trigger information, adjusting the calculation and analysis of the related historical data and updating the dynamic knowledge base.
- 8. An engineering cost data processing system based on big data analysis, the system comprising: a data construction module for constructing a massive multi-source database and a dynamic knowledge base; a data batch processing module for performing, based on a multi-task batch processing engine, data mining on the multi-source data at a preset period, generating dynamic cost knowledge information, and storing and/or updating the dynamic knowledge base; a stream data acquisition module for monitoring preset business operation events or market data in real time; a stream data processing module for analyzing the corresponding business/market data in real time, matching and querying the dynamic knowledge base, and performing rough estimation, checking, and/or analysis on engineering data; and a result generation module for generating and displaying rough estimation and checking results and/or early warning information.
- 9. A computer readable storage medium storing a computer program, which when executed by a processor causes the processor to perform the steps of the method according to any one of claims 1 to 7.
- 10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the computer program, when executed by the processor, causes the processor to perform the steps of the method according to any of claims 1 to 7.
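The price benchmark step of claim 3 can be illustrated with a minimal Python sketch. It rests on several assumptions that the claims do not state: the two clustering stages are replaced by exact-key grouping, the kernel is Gaussian with a fixed bandwidth, the benchmark is taken as the median of the weighted cumulative distribution, and the field names (`region`, `year`, `eng_type`, `resource`, `price`, `weight`) and sample values are hypothetical.

```python
import math
from collections import defaultdict

def group_two_level(records):
    """Group by (region, year), then by (eng_type, resource).

    Exact-key grouping is a simplification standing in for the coarse and
    fine clustering steps; all field names are hypothetical.
    """
    clusters = defaultdict(lambda: defaultdict(list))
    for r in records:
        clusters[(r["region"], r["year"])][(r["eng_type"], r["resource"])].append(r)
    return clusters

def weighted_kde_cdf(x, prices, weights, bandwidth):
    """CDF at x of a weighted Gaussian kernel density estimate."""
    total = sum(weights)
    acc = 0.0
    for p, w in zip(prices, weights):
        z = (x - p) / (bandwidth * math.sqrt(2.0))
        acc += w * 0.5 * (1.0 + math.erf(z))  # CDF of one Gaussian kernel
    return acc / total

def quantile(prices, weights, bandwidth, q, steps=2000):
    """Smallest grid price whose cumulative probability reaches q."""
    lo = min(prices) - 4.0 * bandwidth
    hi = max(prices) + 4.0 * bandwidth
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        if weighted_kde_cdf(x, prices, weights, bandwidth) >= q:
            return x
    return hi

def price_benchmark(prices, weights, bandwidth, level=0.90):
    """Median as the benchmark, central `level` mass as the interval (assumed)."""
    tail = (1.0 - level) / 2.0
    return {
        "benchmark": quantile(prices, weights, bandwidth, 0.5),
        "interval": (quantile(prices, weights, bandwidth, tail),
                     quantile(prices, weights, bandwidth, 1.0 - tail)),
    }

# Hypothetical sample records for one engineering-resource sub-cluster.
records = [
    {"region": "A", "year": 2025, "eng_type": "dam", "resource": "concrete",
     "price": price, "weight": 1.0}
    for price in (100.0, 110.0, 120.0)
]

benchmarks = {}
for st_key, sub_clusters in group_two_level(records).items():
    for sub_key, rows in sub_clusters.items():
        benchmarks[(st_key, sub_key)] = price_benchmark(
            [r["price"] for r in rows], [r["weight"] for r in rows], bandwidth=5.0)
```

Raising the weight of a record, for example to favor recent prices, shifts the benchmark toward that record's price, which is one plausible reading of why the estimate is weighted.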
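The association rule step of claim 4 can be sketched in a similar way. The sketch below does not implement FP-Growth: it enumerates itemsets by brute force, which for a given minimum support yields the same frequent itemsets on small data but does not scale. The time-order screen shown (every antecedent event must occur no later than every consequent event) is an assumed interpretation, and the event names and day numbers are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force stand-in for FP-Growth: support of every small itemset."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def candidate_rules(freq, min_confidence):
    """Split each frequent itemset into antecedent -> consequent rules."""
    rules = []
    for itemset, support in freq.items():
        for k in range(1, len(itemset)):
            for ante in combinations(itemset, k):
                cons = tuple(i for i in itemset if i not in ante)
                confidence = support / freq[ante]
                if confidence >= min_confidence:
                    rules.append((ante, cons, support, confidence))
    return rules

def respects_time_order(rule, transactions):
    """Keep a rule only if antecedents never occur after consequents."""
    ante, cons = rule[0], rule[1]
    for t in transactions:
        if all(e in t for e in ante + cons):
            if max(t[e] for e in ante) > min(t[e] for e in cons):
                return False
    return True

# Hypothetical transactions: event name -> day on which the event occurred.
transactions = [
    {"steel_price_up": 1, "design_change": 2, "cost_overrun": 5},
    {"steel_price_up": 1, "cost_overrun": 4},
    {"steel_price_up": 2, "cost_overrun": 3},
    {"design_change": 1},
]

freq = frequent_itemsets(transactions, min_support=0.5)
kept = [r for r in candidate_rules(freq, min_confidence=0.8)
        if respects_time_order(r, transactions)]
```

With this data, both directions of the steel price and cost overrun pair pass the confidence threshold, and only the direction consistent with the event times survives the screen.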
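The checking step of claim 6 and the feedback trigger of claim 7 can be sketched together. This is plain Python, not the Flink API; the class name `StreamChecker`, the key-field lookup, the interval test, and the counter reset after a trigger are all assumptions made for illustration.

```python
from collections import Counter

class StreamChecker:
    """Checks incoming prices against benchmarks held in a knowledge base."""

    def __init__(self, knowledge_base, max_anomalies=3):
        self.kb = knowledge_base            # key fields -> benchmark entry
        self.max_anomalies = max_anomalies  # the "preset number of times"
        self.anomaly_counts = Counter()
        self.triggers = []                  # messages for the batch engine

    def check(self, key, price):
        entry = self.kb.get(key)            # match by key fields (assumed)
        if entry is None:
            return "no_benchmark"
        low, high = entry["interval"]
        if low <= price <= high:
            return "ok"
        # Price falls outside the benchmark interval: count the anomaly.
        self.anomaly_counts[key] += 1
        if self.anomaly_counts[key] > self.max_anomalies:
            # Same kind of anomaly exceeded the limit: ask for a re-mine.
            self.triggers.append(key)
            self.anomaly_counts[key] = 0
        return "warning"

# Hypothetical knowledge base entry and incoming price events.
kb = {("dam", "concrete"): {"benchmark": 110.0, "interval": (95.0, 125.0)}}
checker = StreamChecker(kb, max_anomalies=2)
results = [checker.check(("dam", "concrete"), p)
           for p in (100.0, 130.0, 131.0, 132.0)]
```

In a full system, the batch processing engine would consume `triggers` and recompute the benchmarks for the affected keys; here the list only records that the trigger fired.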
Description
Engineering cost data processing method and system based on big data analysis
Technical Field
The application relates to the technical field of data processing, and in particular to a method and a system for processing engineering cost data based on big data analysis.
Background
With the rapid shift of engineering cost management toward digitization and intelligent operation, the field faces rapidly growing data volumes, scattered data sources, and high real-time requirements. Across the whole life cycle of engineering cost (investment estimate, rough estimate, budget, and settlement), existing schemes lack big-data-based intelligent analysis and checking functions, so cost personnel must assign list items to the corresponding subjects through a large amount of repetitive, tedious manual work. In addition, existing cost management relies on staged manual accounting and static indexes, and suffers from data silos and lag, passive risk perception, insufficient dynamic adjustment capability, and dependence on static experience-based estimation.
Patent document 1 (CN 115423331B) discloses a project cost index determination scheme that supports enterprise-defined configuration and automatic aggregation calculation. It establishes an association between enterprise index subjects and the enterprise standard list, configures the numerator/denominator calculation logic of each index through macro code identifiers, supports an enterprise-specific index calculation basis, and automates data aggregation and index generation. However, the scheme processes only the standard list and index subject data uploaded by the enterprise, and its index calculation rules remain statically configured; they cannot be optimized automatically according to engineering progress, market changes, or project deviations.
In addition, the prior art also analyzes and processes construction cost using neural networks or deep learning. For example, patent document 2 (CN 121327340 A) discloses a real-time construction cost assessment and early warning system based on multi-source data fusion and dynamic learning. However, such approaches depend on complex neural networks, which are 'black box' models. When an evaluation result is abnormal or needs adjustment, the system cannot give a clear, understandable business attribution. The underlying rules are hidden in the network parameters and are difficult to extract and reuse. The model must be fine-tuned for each new project or new environment, which consumes substantial computing resources and rules out lightweight, high-frequency rule iteration. Potential conduction paths, cause analysis, and the like are also unavailable when the system issues a warning.
Disclosure of Invention
Based on the above, the application provides a method and a system for processing engineering cost data based on big data analysis, aiming to fuse multi-source dynamic data intelligently and efficiently through big data processing technology and intelligent algorithms, to perform deep mining and real-time business response on the multi-source data, and to improve the accuracy and processing efficiency of the system.
The application provides an engineering cost data processing method based on big data analysis, which comprises the following steps: constructing a massive multi-source database and a dynamic knowledge base; performing, based on a multi-task batch processing engine, data mining on the multi-source data at a preset period to generate dynamic cost knowledge information, and storing and/or updating the dynamic knowledge base; monitoring preset business operation events or market data in real time; and, based on a stream processing engine, analyzing the corresponding business/market data in real time, matching and querying the dynamic knowledge base, performing rough estimation, checking, and/or analysis on engineering data, and generating rough estimation and checking results and/or early warning information.
Further, the method further comprises: starting the batch processing engine on a schedule to read the various massive multi-source data, and preprocessing the multi-source data and mapping it to a standardized form; extracting preset fields from each multi-source data set to obtain data information, and generating a corresponding data association feature table; starting a plurality of parallel mining tasks to perform dynamic price benchmark generation and association rule mining, thereby obtaining the dynamic cost knowledge information; and serializing the dynamic cost knowledge information and then storing it in and/or updating the dynamic knowledge base.
Preferably, the method comprises: Constructing a joint feature vector containing space-time, engineering type, resource and price multi-dimension by taking engineering item