CN-121997631-A - PD-FEM coupling implicit solving method based on GPU parallelism
Abstract
The invention is suitable for the technical field of intersection of computational mechanics and high-performance computation, and relates to a PD-FEM coupling implicit solving method based on GPU parallelism, which comprises the steps of S10, model preprocessing and data initialization, S20, overall stiffness matrix assembly based on cooperation of a CPU and a GPU, S30, parallel solving of a system linear equation set based on the GPU, S40, parallel computation of damage evolution based on the GPU, S50, parallel updating of damage iteration control and stiffness matrix, S60, result feedback and post-processing. The invention has simple flow and convenient operation, not only achieves breakthrough progress in calculation speed, but also has obvious advantages in simulation precision, numerical stability and application range, and provides a numerical simulation scheme with high efficiency, reliability and powerful function for solving the long-standing problem of complex structure fracture analysis in the engineering field.
Inventors
- SHI ZHICHUANG
- GUO CHENGCHAO
- QIN LEI
- SHI FEIFAN
- ZHU YUHUI
Assignees
- 中山大学
Dates
- Publication Date
- 20260508
- Application Date
- 20251222
Claims (10)
- 1. The PD-FEM coupling implicit solving method based on GPU parallelism is characterized by comprising the following steps of: S10, defining the geometric and material properties of the model, discretizing and dividing the model into PD and FEM subregions, constructing a key connection relation between PD material points, and transmitting all basic data structures required by calculation from a host memory to a GPU device memory; s20, constructing a system overall stiffness matrix K in an initial state, wherein stiffness contributions of FEM units and interface coupling units with smaller calculation amount are calculated on a CPU, and stiffness contributions of all keys in PD regions with huge calculation amount are calculated and assembled in a large-scale parallel manner by starting a GPU kernel, and finally, a complete stiffness matrix is formed in a memory of the GPU equipment; S30, solving a large sparse linear equation set Ku=F stored in a memory of the GPU equipment by using a parallelized iteration solver to obtain displacement vectors u of all FE nodes and PD object particles in the current system state; S40, parallelly calculating the elongation of each bond by starting the GPU kernel, comparing the elongation with the breaking criterion of the material, and marking the bond as broken in a global bond state array in the memory of the GPU equipment if the breaking condition is met; S50, the host CPU checks whether new damage is generated in the S40, if so, the rigidity of the system changes, at the moment, the GPU kernel is started, the rigidity matrix K is reassembled in parallel according to the information in the key state array, after updating is finished, the step S30 is returned to use the new rigidity matrix to solve again, if no new damage is generated, the convergence judgment is further carried out on the residual force of the current iteration, if the residual force meets the preset threshold, the system is considered to reach balance and iteration is ended, otherwise, the step S30 is returned to continue iteration solution; And S60, after the damage iteration cycle is finished, transmitting the final calculation result, including the displacement field and the damage distribution, from the memory of the GPU equipment back to the memory of the host computer for subsequent visual analysis and data archiving.
- 2. The GPU-parallel-based PD-FEM coupling implicit solving method according to claim 1, wherein the specific steps of S10 are as follows: S101, defining a geometric model, material properties, boundary constraint conditions and external load working conditions of an analysis object, performing space discretization on the geometric model, dividing the model into FEM subareas and PD subareas, generating a standard finite element grid in the FEM subareas, including nodes and units, and generating discrete substance points in the PD subareas; S102, defining an interface area at the junction of the FEM sub-area and the PD sub-area, and establishing a coupling connection relation between the FEM node and the PD substance point in the area; S103, traversing each substance point i in the PD subregion, searching all other substance points in the near-field neighborhood of the substance points, and storing all searched substance point pairs as a global key list; S104, constructing all necessary data arrays in a host memory according to discretization and neighborhood search results, wherein the data arrays comprise a coordinate array of a node and an object point, an FEM unit connection relation array, an interface coupling unit connection relation array and the PD key list generated in the S103, and distributing continuous memory space for parallel computation in a GPU (graphics processing unit) equipment memory for storing an overall rigidity matrix K, a displacement vector u, a load vector F, a global key state array, material parameters and coordinates; s105, performing a one-time data copying operation from a host memory to a GPU device memory, and transmitting the coordinate array, all the connection relation arrays, the material parameters and the initialized load vector F constructed in the S104 from a CPU (central processing unit) end to a memory space corresponding to the GPU end.
- 3. The GPU-parallel-based PD-FEM coupling implicit solving method according to claim 2, wherein S102 specifically comprises: s121, in a discretized model, defining an FEM subregion and a PD subregion clearly, identifying an interface positioned at the junction of the two subregions, and determining an FE node and a PD object point on the interface; s122, introducing truss units to bridge the near-field dynamics subarea and the finite element subarea; s123, calculating the unit stiffness matrix of each truss unit according to a standard finite element method 。
- 4. A GPU-parallel based implicit solving method for PD-FEM coupling according to claim 3, wherein the specific steps of S20 are as follows: S201, calculating rigidity contributions of FEM subareas and interface areas on a CPU: FEM subarea rigidity, namely traversing all FEM units by a CPU, and calculating a unit rigidity matrix K fem of each unit; The interface coupling rigidity is that the CPU traverses all the interface truss units established in the S122, and calculates the global rigidity matrix K t of each coupling unit according to the method of the S123; Storing all the calculated K fem and K t and the index position information thereof in the integral rigidity matrix K in a temporary buffer area of a CPU memory, and preparing for subsequent transmission to the GPU; S202, on the GPU equipment, the rigidity contribution of the PD subregion is calculated and assembled in parallel, the PD subregion is realized by starting a GPU kernel, and a GPU thread is distributed for each key of the key list in the S103; S203, the rigidity contribution data K fem and K t calculated by the CPU in S201 are transmitted to the memory of the GPU equipment from the memory of the CPU host, a GPU kernel function is started, the rigidity contribution from the CPU is added to the integral rigidity matrix K in the memory of the GPU equipment through atomic operation, and final assembly is completed.
- 5. The GPU-parallel-based PD-FEM coupling implicit solving method according to claim 4, wherein the specific steps of S30 are as follows: S301, initializing vectors required by solving on the GPU, wherein the vectors comprise an initial displacement vector u 0 , a residual vector r 0 = F - K u0 and an initial search direction vector p 0 = r 0 ; s302, performing iterative loop on the GPU until convergence conditions are met; and S303, after iteration convergence, the finally obtained global displacement vector u is stored in the memory of the GPU equipment.
- 6. The GPU-parallel based PD-FEM coupling implicit solution method of claim 5, wherein in S302, each iteration loop includes the following GPU-parallel computations: sparse matrix-vector multiplication parallel computing This operation concurrently performs multiply and accumulate operations through a large number of cores of the GPU using sparse matrices and vectors p k stored in the GPU memory; parallel reduction by GPU parallel reduction algorithm, calculating inner product to determine step size ; Vector update-updating of the displacement solution vector and residual vector in parallel, each thread being responsible for one or more components, implementing And ; Convergence judgment, parallel computing norm of new residual error, through another parallel protocol computation Comparing the error with a preset convergence tolerance, and stopping iteration if the error is smaller than the tolerance; Updating direction vector, if not, calculating and updating the search direction of the next iteration through parallel reduction sum AXPY operation Wherein 。
- 7. The GPU-parallel-based PD-FEM coupling implicit solving method according to claim 6, wherein the specific steps of S40 are as follows: S401, starting a GPU-kernel configured with N bonds threads by a host CPU according to the total length N bonds of the key list in the S103, wherein each GPU thread is uniquely mapped to a PD key; S402, on the GPU, all N bonds threads are executed simultaneously and parallelly, each thread k reads the displacements u i and u j of the object points i and j at the two ends of the corresponding key from the global displacement vector u of the memory of the GPU equipment, and then the threads are executed according to the initial relative positions of the object points And the current relative displacement Calculate the relative elongation of the bond ; S403, the thread k compares the calculated elongation S with the critical elongation S 0 of the material, and if S > S 0 and the current state of the key in the global key state array is still intact, the thread updates the state of the key to broken.
- 8. The GPU-parallel-based PD-FEM coupling implicit solving method according to claim 7, wherein the specific steps of S50 are as follows: S501, executing one parallel protocol operation by the GPU, rapidly counting the number of keys with changed states in the global key state array to obtain a newly added damage count value, and retrieving the newly added damage count value from the GPU by a host CPU; s502, the host CPU judges the newly added damage count: if the newly increased damage count is greater than 0, the system is indicated to change the rigidity due to the damage of the material, the current displacement solution u is not valid any more, the system must update the rigidity matrix and solve again, and the step S503 is executed; if the newly increased damage count is= 0, and after the convergence of the residual force of the current iteration is further determined, the residual force still meets the preset threshold, which indicates that the damage of the material is not expanded under the current load, the system reaches the equilibrium state, the implicit iteration cycle is ended, and the step S60 is executed; s503, updating the rigidity matrix in parallel, and reconstructing a system integral rigidity matrix K; And S504, after the GPU in S503 completes updating the rigidity matrix K, returning the program control flow to S30, carrying out parallel solving steps on the system linear equation set based on the GPU, and re-solving Ku=F by combining the updated K matrix with the original load vector F by the system so as to obtain a new displacement field u new , and returning to S40.
- 9. The GPU-parallel-based PD-FEM coupling implicit solving method of claim 8, wherein S503 specifically comprises: S531, before reassembling, clearing all rigidity contribution parts related to PD keys in an overall rigidity matrix K stored in a memory of the GPU equipment; S532, the host CPU starts a GPU kernel and allocates a GPU thread to each key in the key list, in the kernel, each thread k corresponds to a key (i, j), and a conditional assembly logic is executed: thread k reads its state value in the global key state array from GPU global memory first; If the state is good, the thread K calculates the global rigidity matrix K bond of the key, and adds 36 components of the global rigidity matrix K to the corresponding position of the rigidity matrix K by using atomicAdd () function; If the state is broken, the thread K skips all subsequent rigidity calculation and atomicAdd () assembly operations, directly ends the task, and after the process is completed, the K matrix in the memory of the GPU device is updated to a new rigidity matrix only comprising all intact keys, FEM units and interface units.
- 10. The method for implicit solving of PD-FEM coupling based on GPU parallelism according to claim 1, wherein the step S60 specifically comprises the following steps: s601, a host CPU initiates a data transmission operation, key result data stored in a memory of a GPU device is copied back to the memory of the CPU host, and the data comprises a final global displacement vector u and a final global key state array; S602, after receiving data, a host CPU executes post-processing operation, including storing displacement and damage data to a disk file, calling a visual program library to graphically display a displacement field and a global key state array, and extracting key data points according to requirements for quantitative analysis; and S603, after finishing the post-processing, ending the whole simulation flow.
Description
PD-FEM coupling implicit solving method based on GPU parallelism Technical Field The invention belongs to the technical field of intersection of computational mechanics and high-performance computation, and particularly relates to a PD-FEM coupling implicit solving method based on GPU parallelism. Background In the safety assessment and life prediction of engineering structures, accurate simulation of crack growth is crucial. At present, the mainstream numerical methods mainly comprise a Finite Element Method (FEM) and a near field dynamics (PD) method, and the finite element method is used as a mature method based on a continuous medium mechanics theory, has the characteristics of high efficiency and accuracy when dealing with the continuity problems such as elastic deformation and the like, and has been widely applied in the engineering field. However, since the theoretical basis depends on the spatial partial derivative of the displacement field, when the discontinuous surfaces such as crack tips are faced, the problem of singularity can be generated, and the crack propagation can be simulated by means of an external fracture criterion and a complex grid reconstruction technology, so that the algorithm complexity is increased, human errors can be introduced, and the crack path prediction accuracy is affected. In contrast, near field dynamics is an integral form theory based on a non-local idea, which describes the interaction force by means of "bonds" between pairs of points, avoiding the dependence on spatial derivatives, thus enabling natural, spontaneous simulation of crack initiation and propagation without complex fracture criteria. However, the non-local nature of the near field dynamics approach requires high density discretization of the model, resulting in far more computational effort than the traditional finite element approach, which is computationally expensive and computationally efficient when applied to the overall analysis of large structures. The patent application with the publication number of CN119864107A provides a method for predicting crack growth of a brittle material based on key-shaped near-field dynamics, which comprises the following steps of S1, establishing a geometric model of the brittle material structure, setting boundary conditions as a preprocessing input file, S2, establishing a balance equation on a unit based on a finite element method, S3, solving the balance equation, calculating damage of a brittle material matrix, and S4, displaying a calculation result through a post-processing platform. In the patent application, the simulation of the whole process of crack initiation, propagation and intersection of the elastic and brittle material is realized through key-shaped near-field dynamics, the calculated amount is large, the efficiency is low, and the defects same as the defects in the prior art exist. Therefore, how to provide a numerical analysis method with the combination of calculation accuracy, numerical stability and calculation efficiency is a problem to be solved by those skilled in the art. Disclosure of Invention Aiming at the defects of the prior art, the invention aims to provide a PD-FEM coupling implicit solving method based on GPU parallelism, so as to solve the problems of lower calculation accuracy and lower calculation efficiency of a numerical analysis method in the prior art. In order to solve the technical problems, the invention adopts the following technical scheme: The invention provides a PD-FEM coupling implicit solving method based on GPU parallelism, which comprises the following steps: S10, defining the geometric and material properties of the model, discretizing and dividing the model into PD and FEM subregions, constructing a key connection relation between PD material points, and transmitting all basic data structures required by calculation from a host memory to a GPU device memory; s20, constructing a system overall stiffness matrix K in an initial state, wherein stiffness contributions of FEM units and interface coupling units with smaller calculation amount are calculated on a CPU, and stiffness contributions of all keys in PD regions with huge calculation amount are calculated and assembled in a large-scale parallel manner by starting a GPU kernel, and finally, a complete stiffness matrix is formed in a memory of the GPU equipment; S30, solving a large sparse linear equation set Ku=F stored in a memory of the GPU equipment by using a parallelized iteration solver to obtain displacement vectors u of all FE nodes and PD object particles in the current system state; S40, parallelly calculating the elongation of each bond by starting the GPU kernel, comparing the elongation with the breaking criterion of the material, and marking the bond as broken in a global bond state array in the memory of the GPU equipment if the breaking condition is met; S50, the host CPU checks whether new damage is generated in the S40, if so, the rigidity