CN-121980962-A - Dynamic decision method, device and system for cooperative optimization of long-term and short-term benefits of construction organization

Abstract

The invention discloses a dynamic decision method, device and system for collaborative optimization of the long-term and short-term benefits of construction organization. A high-fidelity virtual simulation environment for tunnel construction is constructed; an intelligent decision model based on Soft Actor-Critic (SAC) is constructed to sense the environment state and make optimal decisions; and virtual training and real-time inference are carried out with the intelligent decision model to realize dynamic decisions that collaboratively optimize the long-term and short-term benefits of construction organization.

Inventors

  • WEI JIANYING
  • WANG GAN
  • YU XIAOQUAN
  • WANG YIXIAN
  • GUO PANPAN
  • ZHANG WENYAN
  • CAI QI

Assignees

  • Hefei University of Technology (合肥工业大学)

Dates

Publication Date
20260505
Application Date
20260313

Claims (7)

  1. A dynamic decision method for collaborative optimization of long-term and short-term benefits of construction organization, characterized by comprising the following steps: S1, constructing a high-fidelity virtual simulation environment for tunnel construction; S2, constructing an intelligent decision model based on Soft Actor-Critic (SAC) for sensing the environment state and making optimal decisions; and S3, performing virtual training and real-time inference with the intelligent decision model to realize dynamic decisions that collaboratively optimize the long-term and short-term benefits of construction organization.
  2. The dynamic decision method for collaborative optimization of long-term and short-term benefits of construction organization according to claim 1, wherein in step S1 the high-fidelity virtual simulation environment for tunnel construction is constructed from the mileage coordinates of all active working faces along the tunnel, the current surrounding-rock grade and the state of the global resource pool, and is based on nonlinear progress calculation under material constraints, dynamic topology evolution determination, and automatic breakthrough with resource release.
  3. The dynamic decision method for collaborative optimization of long-term and short-term benefits of construction organization according to claim 2, wherein step S3 comprises: the virtual environment encapsulates the current mileage coordinates, remaining resources and geological forecast information into a high-dimensional state vector and inputs it to the SAC intelligent decision model; the SAC intelligent decision model, according to the state vector, outputs through its Actor network a decision action comprising working-face opening, tunneling direction and construction-method selection; the virtual environment receives and executes the action, advancing the simulation time; the virtual environment feeds an immediate reward back to the agent according to the execution result, forming the positive and negative feedback mechanism of reinforcement learning; and the four-element tuple of interaction data from each step is stored in an experience replay pool, from which the agent randomly samples historical data during training to perform gradient-descent training, so that the policy is continuously iterated and optimized.
  4. A dynamic decision device for collaborative optimization of long-term and short-term benefits of construction organization, characterized by comprising: a first processing module for constructing a high-fidelity virtual simulation environment for tunnel construction; a second processing module for constructing an intelligent decision model based on Soft Actor-Critic (SAC), the model being used for sensing the environment state and making optimal decisions; and a third processing module for performing virtual training and real-time inference with the intelligent decision model to realize dynamic decisions that collaboratively optimize the long-term and short-term benefits of construction organization.
  5. The dynamic decision device for collaborative optimization of long-term and short-term benefits of construction organization according to claim 4, wherein the first processing module constructs the high-fidelity virtual simulation environment for tunnel construction from the mileage coordinates of all active working faces along the tunnel, the current surrounding-rock grade and the state of the global resource pool, based on nonlinear progress calculation under material constraints, dynamic topology evolution determination, and automatic breakthrough with resource release.
  6. The dynamic decision device for collaborative optimization of long-term and short-term benefits of construction organization according to claim 5, wherein the third processing module comprises: a state sensing unit for causing the virtual environment to encapsulate the current mileage coordinates, remaining resources and geological forecast information into a high-dimensional state vector and input it to the SAC intelligent decision model; a decision execution unit for causing the SAC intelligent decision model, according to the state vector, to output through its Actor network a decision action comprising working-face opening, tunneling direction and construction-method selection, the virtual environment receiving and executing the action to advance the simulation time; a closed-loop feedback unit for causing the virtual environment to feed an immediate reward back to the agent according to the execution result, forming the positive and negative feedback mechanism of reinforcement learning; and an experience replay unit for storing the four-element tuple of interaction data from each step in an experience replay pool, from which the agent randomly samples historical data during training to perform gradient-descent training, so that the policy is continuously iterated and optimized.
  7. A dynamic decision system for collaborative optimization of long-term and short-term benefits of construction organization, comprising a memory and a processor, wherein the memory stores a computer program for execution by the processor, and the computer program, when executed by the processor, performs the dynamic decision method for collaborative optimization of long-term and short-term benefits of construction organization according to any one of claims 1-3.
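
The interaction and experience-replay loop recited in claims 3 and 6 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names `ToyTunnelEnv` and `ReplayPool`, the two-component state, the action encoding, the reward formula and every numeric constant are hypothetical, and a random policy stands in for the SAC Actor network. The sketch only shows how a four-element tuple (state, action, reward, next state) could be stored at each step and later sampled at random for training.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

State = Tuple[float, float]      # (mileage, remaining resources): hypothetical
Action = Tuple[int, int, int]    # (open face, direction, method): hypothetical


@dataclass(frozen=True)
class Transition:
    """Four-element tuple of interaction data for one simulation step."""
    state: State
    action: Action
    reward: float
    next_state: State


class ReplayPool:
    """Fixed-capacity experience replay pool with uniform random sampling."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        size = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), size)

    def __len__(self) -> int:
        return len(self._buffer)


class ToyTunnelEnv:
    """Hypothetical stand-in for the virtual simulation environment."""

    def __init__(self, length_m: float = 100.0, resources: float = 50.0) -> None:
        self.length_m = length_m
        self.mileage = 0.0
        self.resources = resources

    def observe(self) -> State:
        return (self.mileage, self.resources)

    def step(self, action: Action) -> Tuple[State, float, bool]:
        _open_face, _direction, method = action
        cost = 1.0 + method                  # faster methods use more material
        if self.resources >= cost:
            advance = 2.0 + 2.0 * method     # metres advanced this step
            self.resources -= cost
        else:
            advance = 0.5                    # material shortage slows progress
        self.mileage = min(self.length_m, self.mileage + advance)
        done = self.mileage >= self.length_m
        reward = advance - 1.0               # progress minus a time penalty
        return self.observe(), reward, done


def random_policy(rng: random.Random) -> Action:
    """Placeholder for the Actor network: picks an action at random."""
    return (rng.randint(0, 1), rng.randint(0, 1), rng.randint(0, 2))


def collect(env: ToyTunnelEnv, pool: ReplayPool, rng: random.Random,
            max_steps: int = 200) -> int:
    """Run one episode, storing every transition; return the step count."""
    state = env.observe()
    for step in range(1, max_steps + 1):
        action = random_policy(rng)
        next_state, reward, done = env.step(action)
        pool.store(Transition(state, action, reward, next_state))
        state = next_state
        if done:
            return step
    return max_steps


rng = random.Random(0)
pool = ReplayPool(capacity=1000)
steps = collect(ToyTunnelEnv(), pool, rng)
batch = pool.sample(8, rng)   # mini-batch that a gradient step would consume
```

In a real SAC setup the sampled `batch` would feed the critic and actor updates; that part is omitted here because the patent text available does not specify network structure or hyperparameters.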

Description

Dynamic decision method, device and system for collaborative optimization of long-term and short-term benefits of construction organization

Technical Field

The invention belongs to the technical field of intelligent construction in civil engineering, and in particular relates to a dynamic decision method, device and system for collaborative optimization of the long-term and short-term benefits of construction organization.

Background

In modern transportation infrastructure construction, building an extra-long tunnel (e.g., a railway or highway tunnel exceeding 20 km) is an extremely complex systems-engineering task. To shorten the construction period, a "long tunnel, short drives" strategy is generally adopted: a parallel pilot tunnel (the "flat pilot" for short) is driven in advance, and transverse connecting passages ("transverse channels" for short) are opened at specific positions, adding new tunneling working faces in the main tunnel so that multiple working faces can operate in parallel.

However, existing tunnel construction organization design and schedule optimization techniques face serious challenges and struggle to meet actual engineering requirements:

Existing simulation technology has insufficient fidelity and lacks a dynamic coupling mechanism. Conventional scheduling software (e.g., P6, Project) is based on static network diagrams (CPM/PERT) and cannot simulate dynamic interactions during construction. For example, when the supply of materials (e.g., concrete, steel) suddenly falls short at a worksite, a conventional model cannot automatically calculate how the shortage nonlinearly affects the tunneling speed at all worksites.

In addition, existing simulation lacks an endogenous modeling mechanism for a topology that evolves dynamically over time, namely the process in which the flat pilot, on reaching a specific position, triggers the opening of a transverse channel, which in turn triggers a new working face in the main tunnel; the plan often has to be adjusted manually.

Existing optimization algorithms are short-sighted and inefficient. Heuristic search algorithms in common use, such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), can search for an optimal solution to some extent, but have two fundamental defects:

Low computational efficiency and open-loop decision-making. Faced with a massive number of decision combinations (dozens of transverse channels, each with its own start time, direction and construction method), the search space is huge and convergence takes an extremely long time. Moreover, once the geology changes abruptly during actual construction, the previously optimal solution immediately becomes invalid, a time-consuming global computation has to be rerun, and real-time response is impossible.

Lack of long-term value assessment (short-sightedness). Heuristic algorithms typically evaluate solutions based on the current or next-stage state, and it is difficult for them to learn complex "delayed gratification" strategies. For example, to minimize the overall construction period, the best strategy at a given moment may be to temporarily sacrifice fast tunneling of the flat pilot and redirect resources to first open a transverse channel on a critical path. Conventional algorithms struggle to capture causal chains that span such long periods.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a dynamic decision method, device, system and storage medium for collaborative optimization of the long-term and short-term benefits of construction organization.

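
The "delayed gratification" point raised in the Background can be made concrete with a small Python sketch. Reinforcement-learning methods such as SAC score a decision by its discounted cumulative return (SAC additionally adds an entropy bonus, omitted here) rather than by the immediate reward alone. The two reward sequences, the plan names and the discount factor below are invented purely for illustration and are not taken from the patent.

```python
from typing import Dict, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**k * rewards[k], accumulated from the last step backward."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step rewards for two candidate plans.
plans: Dict[str, Sequence[float]] = {
    # Keep driving the flat pilot: steady but modest progress.
    "flat_pilot": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    # Divert resources to open a transverse channel first: two slow steps,
    # then faster progress once the extra working face is available.
    "transverse_channel": [-1.0, -1.0, 3.0, 3.0, 3.0, 3.0],
}

gamma = 0.95  # assumed discount factor

# A myopic evaluator looks only at the first-step reward.
myopic_choice = max(plans, key=lambda name: plans[name][0])

# A long-horizon evaluator compares discounted returns over the whole plan.
long_horizon_choice = max(
    plans, key=lambda name: discounted_return(plans[name], gamma)
)
```

With these invented numbers the myopic evaluator prefers to keep driving the flat pilot, while the long-horizon evaluator prefers opening the transverse channel first, which is the kind of trade-off the Background says conventional heuristics tend to miss.
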
In order to achieve the above object, the present invention provides the following solution: a dynamic decision method for collaborative optimization of long-term and short-term benefits of construction organization, comprising the following steps: S1, constructing a high-fidelity virtual simulation environment for tunnel construction; S2, constructing an intelligent decision model based on Soft Actor-Critic (SAC) for sensing the environment state and making optimal decisions; and S3, performing virtual training and real-time inference with the intelligent decision model to realize dynamic decisions that collaboratively optimize the long-term and short-term benefits of construction organization.

In step S1, the high-fidelity virtual simulation environment for tunnel construction is constructed from the mileage coordinates of all active working faces along the tunnel, the current surrounding-rock grade and the state of the global resource pool, based on nonlinear progress calculation under material constraints, dynamic topology evolution determination, and automatic breakthrough with resource release.

Preferably, step S3 includes: the virtual environment encapsulates the current mileage coordinates, remaining resources and geological forecast information into a high-di