
CN-122021369-A - Structural grid quality optimization method based on deep reinforcement learning

CN 122021369 A

Abstract

The invention discloses a structural grid quality optimization method based on deep reinforcement learning, relating to the fields of computer-aided engineering and computational fluid dynamics preprocessing. The method models the position optimization of singular points in a structural grid as a sequential decision process and constructs an interaction framework between a reinforcement learning agent and a grid environment: grid characteristics are extracted by a state perception module, the agent's policy network outputs singular point displacement actions, the environment executes them and updates the grid, a reward value is calculated from the quality change, and the policy network is updated after the interaction experience is stored. A pre-trained expert policy network is introduced to accelerate training convergence, and its suggested actions are introduced into the reward function to guide the agent's exploration direction. After training is completed, the fixed policy network can be rapidly deployed to optimize new geometric models. Automatic global optimization of the singular point layout is thus realized, grid quality and optimization efficiency are significantly improved, and the trained model has good generalization capability and can be widely applied to structural grid generation.
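The agent–environment interaction loop summarized in the abstract can be sketched as follows. This is a minimal plain-Python illustration, not the patented implementation: `grid_quality`, `extract_state`, and `policy` are hypothetical stand-ins (the real method uses a learned policy network and mesh-based quality metrics such as orthogonality), and the "keep only improving moves" rule below is a greedy placeholder for policy learning.

```python
import random

def grid_quality(coords):
    # Stand-in quality metric: higher when singular points sit near the
    # domain centre. A real metric would use orthogonality/smoothness.
    return -sum(x * x + y * y for x, y in coords)

def extract_state(coords):
    # State vector: here simply the flattened singular-point coordinates.
    return [c for xy in coords for c in xy]

def policy(state):
    # Stub policy: a small random displacement per singular point.
    n = len(state) // 2
    return [(random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1))
            for _ in range(n)]

def step(coords, action):
    # Environment step: apply the displacement action to every singular point.
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(coords, action)]

def optimize(coords, iterations=50, seed=0):
    random.seed(seed)
    experience = []                      # experience pool of (s, a, r, s')
    for _ in range(iterations):
        s = extract_state(coords)
        a = policy(s)
        new_coords = step(coords, a)
        r = grid_quality(new_coords) - grid_quality(coords)  # reward = quality change
        s_next = extract_state(new_coords)
        experience.append((s, a, r, s_next))
        if r > 0:                        # greedy stand-in for a learned policy:
            coords = new_coords          # keep only improving moves
    return coords, experience
```

Because only improving moves are kept, the stand-in quality is monotone non-decreasing over iterations; in the actual method the stored experience would instead drive policy-network updates.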

Inventors

  • LIU YANG
  • HAN FANG
  • Tang Menzong
  • ZHANG QINGDONG
  • Wang Shajie
  • LIU WENXIN
  • PANG YUFEI

Assignees

  • 中国空气动力研究与发展中心计算空气动力研究所 (Computational Aerodynamics Institute, China Aerodynamics Research and Development Center)

Dates

Publication Date
2026-05-12
Application Date
2026-04-15

Claims (10)

  1. A structural grid quality optimization method based on deep reinforcement learning, characterized by comprising the following steps: Step S1, acquiring an initial structural grid to be optimized, and determining the singular points in the initial structural grid and their initial coordinates; Step S2, extracting state characteristics of the current structural grid and constructing a state vector s_t, wherein the state vector s_t comprises the current coordinates of the singular points in the current structural grid, the current coordinates being the initial coordinates in the first iteration; Step S3, inputting the state vector s_t into the policy network of a pre-trained or initialized reinforcement learning agent, and outputting a displacement action a_t for the singular points, wherein the reinforcement learning agent comprises a policy network and a value network; Step S4, moving the singular points according to the displacement action a_t, updating the current coordinates of the singular points to the new coordinates, and regenerating an updated structural grid based on the updated singular point coordinates; Step S5, calculating the quality index of the updated structural grid, and calculating a reward value r_t from the change between this quality index and the quality index of the structural grid before the movement; Step S6, re-extracting the state vector based on the updated structural grid, recording it as s_{t+1}, and storing the interaction experience of the current iteration round (s_t, a_t, r_t, s_{t+1}) as a set of experience data in an experience pool; Step S7, updating the parameters of the policy network and the value network of the reinforcement learning agent with a reinforcement learning algorithm, based on the interaction experience data stored in the experience pool; Step S8, taking s_{t+1} as the current state vector of the next iteration round and repeating steps S2 to S7 until a preset termination condition is met, so as to obtain the optimized structural grid.
  2. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the state vector s_t constructed in step S2 comprises at least one of the following features: normalized spatial coordinates of the singular points; an orthogonality index of the grid cells in the neighborhood of the singular points; smoothness statistics of the full-field grid cells.
  3. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the displacement action a_t output in step S3 uses a continuous action space representation, specifically the displacement of each singular point in the x, y and z directions.
  4. The method according to claim 1, wherein, when the updated structural grid is regenerated in step S4, a grid generation algorithm is called based on the updated singular point coordinates and the coordinates of all grid nodes are recalculated, the updated structural grid containing all nodes of the full-field grid.
  5. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the reward value r_t is calculated in the following form: r_t = w1·(Q_{t+1} − Q_t) − w2·‖a_t − â_t‖ − w3·P; wherein w1, w2 and w3 are weight coefficients, Q_t and Q_{t+1} are respectively the quality indexes of the structural grid before and after the singular point movement, â_t is the action suggested by the expert policy network, and P is a penalty term applied when the grid is invalid.
  6. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the method further comprises an expert network guidance mechanism: pre-training an expert policy network, wherein the expert policy network is trained by supervised learning, and its training data are based on an expert data set generated after applying perturbations to the singular point coordinates; during the training of the reinforcement learning agent, the expert policy network outputs a suggested action â_t according to the state vector s_t, which is introduced into the reward function to guide the exploration direction of the reinforcement learning agent.
  7. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the reinforcement learning algorithm adopted in step S7 is the PPO algorithm, and the policy network and the value network of the reinforcement learning agent are jointly updated by sampling data from the experience pool.
  8. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, wherein the termination condition in step S8 includes at least one of the following: reaching a preset maximum number of iterations; the improvement in the quality index of the structural grid over a preset number of consecutive rounds being smaller than a preset threshold; the reward value r_t converging to a stable range.
  9. The structural grid quality optimization method based on deep reinforcement learning according to claim 1, further comprising a step S9, a policy deployment step, comprising: step S91, obtaining the trained and fixed policy network of the reinforcement learning agent, to obtain a fixed policy network; step S92, acquiring a first structural grid to be optimized, and determining the singular points and their initial coordinates in the first structural grid; step S93, obtaining a current structural grid based on the first structural grid, extracting its state characteristics, and constructing a first state vector; step S94, inputting the first state vector into the fixed policy network, and outputting a first displacement action for the singular points; step S95, moving the singular points according to the first displacement action to obtain updated singular point coordinates, and generating a second structural grid based on the updated singular point coordinates; step S96, extracting a second state vector based on the second structural grid, taking the second state vector as the state vector of the next iteration and the second structural grid as the current structural grid of the next iteration; step S97, repeating steps S93 to S96 until a preset deployment termination condition is met, and outputting the optimized structural grid.
  10. The deep reinforcement learning based structural grid quality optimization method of claim 9, wherein the deployment termination condition in step S97 comprises at least one of: reaching a preset maximum number of deployment iterations; the quality index of the second structural grid reaching a preset threshold; the improvement in the quality index of the second structural grid over a preset number of consecutive rounds being smaller than a preset threshold.
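The guided reward of claim 5 can be illustrated as a weighted combination of the quality change, a closeness term to the expert-suggested action, and an invalid-grid penalty. The weight values and the Euclidean distance measure below are illustrative assumptions, not values fixed by the patent.

```python
import math

def reward(q_before, q_after, action, expert_action,
           grid_valid, w1=1.0, w2=0.5, w3=10.0):
    """Illustrative reward in the form of claim 5:
    quality improvement, minus distance to the expert policy's
    suggested action, minus a penalty when the grid is invalid."""
    quality_term = w1 * (q_after - q_before)            # w1 * (Q_{t+1} - Q_t)
    guide_term = -w2 * math.dist(action, expert_action)  # -w2 * ||a_t - a_hat_t||
    penalty = -w3 if not grid_valid else 0.0             # -w3 * P
    return quality_term + guide_term + penalty
```

With this shaping, an action that improves quality and agrees with the expert suggestion scores highest, while any action that invalidates the grid is heavily penalized regardless of local quality gains.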
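Claim 7 adopts the PPO algorithm, whose core is the clipped surrogate objective applied to minibatches sampled from the experience pool. A minimal sketch of that objective for scalar samples (the full algorithm additionally trains the value network and an entropy bonus, omitted here):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the new/old policy probability ratio and A the advantage."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)

def batch_objective(ratios, advantages, eps=0.2):
    # Average over a minibatch sampled from the experience pool.
    return sum(ppo_clip_objective(r, a, eps)
               for r, a in zip(ratios, advantages)) / len(ratios)
```

The clip caps how much a single update can exploit a large probability ratio, which is what makes PPO stable enough for the many small singular-point displacement updates this method performs.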

Description

Structural grid quality optimization method based on deep reinforcement learning

Technical Field

The invention relates to the field of computer-aided engineering and computational fluid dynamics preprocessing, in particular to a structural grid quality optimization method based on deep reinforcement learning.

Background

In the preprocessing stage of Computational Fluid Dynamics (CFD) and Computer-Aided Engineering (CAE), the quality of the structural grid directly affects the accuracy, stability and convergence speed of the simulation. Structural grids are generally generated by a partitioned topology method: a complex computational domain is divided into a number of quadrilateral (2D) or hexahedral (3D) topology blocks, and the intersection points of the topology blocks are the singular points. A singular point is a node whose number of connected topology blocks is not equal to 4 (2D) or 6 (3D); its position and distribution determine the direction, orthogonality and smoothness of the grid lines, which are key factors affecting grid quality.

Currently, the placement and optimization of singular points depends heavily on the manual experience of engineers. The typical workflow is: the engineer manually presets the initial positions of the singular points according to the geometry; after the initial grid is generated, regions of poor quality are identified with quality inspection tools (using indexes such as the Jacobian, orthogonality and aspect ratio); and the singular point coordinates are then adjusted manually and repeatedly until the grid quality reaches an acceptable standard.
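The singular points described above are nodes whose valence differs from the regular value (4 for an interior node of a 2-D quad block layout). A small sketch that finds such nodes from a quad connectivity list; the quad-list input format and the function name are illustrative assumptions:

```python
from collections import Counter

def find_singular_points(quads, regular_valence=4):
    """Return nodes whose number of incident quads differs from the
    regular valence (4 in 2-D); these are the singular points whose
    placement drives grid-line orthogonality and smoothness.
    Note: boundary nodes naturally have lower valence; a real
    implementation would exclude them before testing."""
    valence = Counter(node for quad in quads for node in quad)
    return sorted(n for n, v in valence.items() if v != regular_valence)
```

For example, in a regular 2x2 patch of four quads the central node touches exactly four cells and is not reported, while in an unstructured block layout a node shared by three or five blocks would be.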
This process has the following notable disadvantages. (1) High-dimensional nonlinear optimization: the movements of multiple singular points influence one another; adjusting one point may improve local quality while degrading other regions, forming a high-dimensional, strongly coupled, non-convex optimization problem whose global optimum is difficult to find by manual adjustment. (2) Dependence on expert experience and trial-and-error: even for experienced engineers, new configurations or complex regions still require a large number of trials; the process is time-consuming and tedious, the results vary from person to person, and standardization is lacking. (3) Lack of systematic automated methods: the automated optimization functions of existing commercial software (e.g., Laplace smoothing) mainly target interior grid nodes and generally avoid moving the singular points at the root of the topology, so as not to cause the topology to collapse or to create negative-volume cells. Existing AI techniques for grid optimization concentrate on grid smoothing; no intelligent decision system is known that can understand the topological structure, perceive the geometric environment, and take grid quality as a direct optimization target.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a structural grid quality optimization method based on deep reinforcement learning.
The core purpose of the method is to model the singular point position optimization problem as a sequential decision process, to use the autonomous exploration mechanism of reinforcement learning to learn an optimal movement strategy, and to introduce an expert network built by deep learning to provide directional guidance, so as to greatly reduce invalid exploration, shorten training time, and realize automated, intelligent and global optimization of structural grid quality.

In order to achieve the above object, the present invention provides a structural grid quality optimization method based on deep reinforcement learning, the method comprising: Step S1, acquiring an initial structural grid to be optimized, and determining the singular points in the initial structural grid and their initial coordinates; Step S2, extracting state characteristics of the current structural grid and constructing a state vector s_t, wherein the state vector s_t comprises the current coordinates of the singular points in the current structural grid, the current coordinates being the initial coordinates in the first iteration; Step S3, inputting the state vector s_t into the policy network of a pre-trained or initialized reinforcement learning agent, and outputting a displacement action a_t for the singular points, wherein the reinforcement learning agent comprises a policy network and a value network; Step S4, moving the singular points according to the displacement action a_t, updating the current coordinates of the singular points to the new coordinates