CN-122023436-A - Fabric cutting optimization method and device based on deep reinforcement learning
Abstract
The invention provides a fabric cutting optimization method and device based on deep reinforcement learning, relating to the technical field of fabric cutting optimization. The method initializes the geometric parameters of all parts to be arranged and establishes a parameterized geometric mapping field whose control parameters cooperatively define the initial layout of all parts. Geometric characteristic indexes are calculated from the part layout generated by the current mapping field, and a deep reinforcement learning controller outputs adjustments to the control parameters of the mapping field according to these indexes. The positions and rotation angles of all parts are then updated synchronously according to the adjusted control parameters, producing the layout of a new design iteration, whose geometric characteristic indexes are recalculated. An optimized gain value is calculated by comparing the old and new indexes and is used to update the internal parameters of the controller. The optimization process repeats until the layout has zero contour overlap and the space utilization rate reaches a preset threshold, at which point the optimal cutting scheme is output.
Inventors
- SHEN HAISHENG
- WU TONG
- QIU HUA
Assignees
- Nantong Saihui Technology Development Co., Ltd. (南通赛晖科技发展股份有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (8)
- 1. A fabric cutting optimization method based on deep reinforcement learning, characterized by comprising the following steps: step 1, setting an initial part layout in a two-dimensional design area, extracting geometric parameters of all parts, and establishing a parameterized geometric mapping field covering the design area, wherein the control parameters of the mapping field cooperatively define the part layout of all parts, and the part layout comprises the positions and rotation angles of all parts in the design area; step 2, calculating a group of geometric characteristic indexes based on the part layout generated by the mapping field in the current design iteration, wherein the geometric characteristic indexes comprise a space filling density distribution, a contour overlapping depth function and a layout inclusion metric, and constructing a deep reinforcement learning controller that receives the geometric characteristic indexes as input and outputs an action used to adjust the control parameters of the mapping field; step 3, synchronously updating the positions and rotation angles of all parts based on the adjusted control parameters, generating the part layout of a new design iteration, and calculating the geometric characteristic indexes of the new design iteration; step 4, comparing the new geometric characteristic indexes with the old ones, calculating an optimized gain value, updating the network parameters of the deep reinforcement learning controller based on the optimized gain value, and repeating steps 2 to 3 until the generated part layout simultaneously satisfies the convergence conditions that the contour overlapping depth function value is zero and the space utilization rate reaches a preset threshold, and then outputting the final part layout scheme for fabric cutting.
- 2. The fabric cutting optimization method based on deep reinforcement learning of claim 1, wherein the parameterized geometric mapping field is realized by a deformation model based on thin plate splines, specifically: the deformation model defines a continuous deformation of the whole design area through the displacements of a group of mapping field control points; a rectangular coordinate system is established with the center of the design area as the origin, and a plurality of mapping field control points are generated in the design area, uniformly distributed on its boundary and in its interior; in each design iteration, the deformation model updates the coordinates of the mapping field control points according to the change of the part layout, so as to ensure temporal consistency and adapt to the adjustment of part positions; the mapping field outputs the mapped coordinates of any point in the design area, thereby cooperatively controlling the positions and rotation angles of all parts; for the initial part layout, the displacements of the mapping field control points are set to zero and the mapping field is the identity transformation; as the optimization proceeds, the mapping field control points are gradually adjusted according to the instructions of the deep reinforcement learning controller, producing a nonlinear deformation that gradually optimizes the part layout.
- 3. The fabric cutting optimization method based on deep reinforcement learning of claim 2, wherein the space filling density distribution is obtained by systematically setting sampling points in the design area and evaluating the density of the parts distributed around them, specifically: a uniform grid is established in the design area as the set of sampling points, and the number of parts within a preset radius around each sampling point is calculated to form the space filling density distribution, wherein the radius is adaptively adjusted according to the ratio of the total area of the design area to the number of parts, and the space filling density distribution quantifies the spatial uniformity of the part distribution and provides distribution balance information to the deep reinforcement learning controller; the contour overlapping depth function evaluates the degree of overlap by calculating the penetration distance between part contours, wherein the smaller the value is, the more serious the overlapping is, and is realized based on signed distance functions, specifically: for any pair of parts with geometric overlap, the signed distance functions of their contours are calculated respectively, and the overlapping area, where the values are negative, is identified by comparing the two signed distance functions; the local penetration depths at all overlapping points of the pair are accumulated to obtain the total overlapping depth of the pair, and the total overlapping depths of all overlapping pairs are then summed to obtain the contour overlapping depth function value of the current overall layout; the deep reinforcement learning controller systematically reduces and finally eliminates the overlap among parts by driving the layout optimization to minimize the contour overlapping depth function.
- 4. The fabric cutting optimization method based on deep reinforcement learning of claim 3, wherein the layout inclusion metric is used to evaluate how well the part layout fits within the boundary of the design area; its expression (missing from the source text) is defined in terms of the layout inclusion metric, the total number of parts, the shortest distance from the center of each part to the boundary of the design area, and a reference distance used for distance normalization; the layout inclusion metric is calculated in each design iteration and used for the decisions of the deep reinforcement learning controller.
- 5. The fabric cutting optimization method based on deep reinforcement learning of claim 4, wherein the deep reinforcement learning controller adopts a reinforcement learning algorithm with an Actor-Critic architecture; the state space of the controller comprises the vectorized form of the current geometric characteristic indexes, and the action space of the controller is the adjustment of the control parameters of the parameterized geometric mapping field; the reward function of the controller is based on the optimized gain value, whose expression (missing from the source text) is defined in terms of the optimized gain value, the contour overlapping depth function value, a preset maximum overlapping depth reference value, the space utilization rate, the layout inclusion metric, the L2 norm of the adjustment of the mapping field control parameters, and four weight coefficients; the optimized gain value is used to update the network parameters of the deep reinforcement learning controller, and the iterative process of steps 2 to 3 is repeated until the generated part layout simultaneously has a contour overlapping depth function value below a preset threshold and a space utilization rate that reaches a preset target value, whereupon the final part layout is output as the fabric cutting scheme.
- 6. The fabric cutting optimization method based on deep reinforcement learning of claim 5, wherein updating the network parameters of the deep reinforcement learning controller is realized based on an experience replay mechanism and the Actor-Critic framework, specifically: after each design iteration, the current geometric characteristic index vector is taken as the state, the adjustment of the control parameters as the action, the optimized gain value as the immediate reward, and the newly generated geometric characteristic index vector as the new state, forming an experience tuple that is stored in an experience replay buffer; a batch of experience tuples is sampled from the experience replay buffer, and a temporal-difference error is calculated to update the parameters of the policy network and the value evaluation network, wherein the temporal-difference error is the difference between a target value and the value predicted by the value evaluation network for the current state, and the target value is the immediate reward plus the product of a discount factor and the value predicted by the value evaluation network for the new state; the parameters of the policy network are updated by a policy gradient algorithm based on the temporal-difference error, and the parameters of the value evaluation network are updated synchronously, so that the decision policy of the controller is continuously optimized.
- 7. The fabric cutting optimization method based on deep reinforcement learning of claim 6, wherein synchronously updating the positions and rotation angles of all parts specifically comprises: calculating, based on the updated parameterized geometric mapping field, the local Jacobian matrix of the mapping field at the geometric center of each part, and extracting a rotation component and a translation component from the local Jacobian matrix; and synchronously applying the corresponding transformations to all parts through parallel matrix operations, in the order of rotation followed by translation, so as to generate the part layout of a new design iteration.
- 8. A fabric cutting optimization device based on deep reinforcement learning, characterized in that the device is used for executing the fabric cutting optimization method based on deep reinforcement learning and comprises the following modules: an initial design module, used for setting an initial part layout in a two-dimensional design area, extracting geometric parameters of all parts, and establishing a parameterized geometric mapping field covering the design area, wherein the control parameters of the mapping field cooperatively define the part layout of all parts, and the part layout comprises the positions and rotation angles of all parts in the design area; an optimization adjustment module, used for calculating a group of geometric characteristic indexes based on the part layout generated by the mapping field in the current design iteration, wherein the geometric characteristic indexes comprise a space filling density distribution, a contour overlapping depth function and a layout inclusion metric, and for constructing a deep reinforcement learning controller that receives the geometric characteristic indexes as input and outputs an action used to adjust the control parameters of the mapping field; a parameter updating module, used for synchronously updating the positions and rotation angles of all parts based on the adjusted control parameters, generating the part layout of a new design iteration, and calculating the geometric characteristic indexes of the new design iteration; and an iteration convergence module, used for comparing the new geometric characteristic indexes with the old ones, calculating an optimized gain value, updating the network parameters of the deep reinforcement learning controller based on the optimized gain value, and repeating steps 2 to 3 until the generated part layout simultaneously satisfies the convergence conditions that the contour overlapping depth function value is zero and the space utilization rate reaches a preset threshold, and then outputting the final part layout scheme for fabric cutting.
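As an illustration of the thin plate spline mapping field described in claim 2, the following is a minimal Python/NumPy sketch, not the patent's implementation. It assumes the standard 2D thin plate spline kernel U(r) = r^2 log r; the function names (`make_control_grid`, `fit_tps`, `apply_tps`) and the grid size are hypothetical. With zero control-point displacements the fitted map reduces to the identity transformation, which matches the initial layout condition stated in the claim.

```python
import numpy as np

def make_control_grid(width, height, nx, ny):
    """Uniform control points on the boundary and interior, origin at the area center."""
    xs = np.linspace(-width / 2, width / 2, nx)
    ys = np.linspace(-height / 2, height / 2, ny)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])

def tps_kernel(r):
    """Thin plate spline radial basis U(r) = r^2 log r, with U(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(control_points, displacements):
    """Solve for spline weights and the affine part so that each control
    point p is mapped exactly to p + displacement."""
    n = control_points.shape[0]
    dists = np.linalg.norm(
        control_points[:, None, :] - control_points[None, :, :], axis=-1)
    P = np.hstack([np.ones((n, 1)), control_points])  # rows [1, x, y]
    system = np.zeros((n + 3, n + 3))
    system[:n, :n] = tps_kernel(dists)
    system[:n, n:] = P
    system[n:, :n] = P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = control_points + displacements
    params = np.linalg.solve(system, rhs)
    return params[:n], params[n:]  # (weights, affine)

def apply_tps(points, control_points, weights, affine):
    """Map arbitrary points of the design area through the fitted field."""
    dists = np.linalg.norm(
        points[:, None, :] - control_points[None, :, :], axis=-1)
    P = np.hstack([np.ones((points.shape[0], 1)), points])
    return P @ affine + tps_kernel(dists) @ weights
```

In this sketch the controller's action would be the `displacements` array; refitting after each adjustment yields a smooth deformation of the whole design area.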
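The two grid-based indexes of claim 3 can be sketched as follows. This is a simplified illustration under several assumptions that are not in the source: parts are approximated by circles so that the signed distance function has a closed form, the adaptive radius is taken as the square root of the area-to-part-count ratio, and the local penetration depth at a sample point is taken as the negated larger of the two signed distances. All function names are hypothetical.

```python
from itertools import combinations

import numpy as np

def sampling_grid(width, height, nx, ny):
    """Uniform grid of sampling points, origin at the design-area center."""
    xs = np.linspace(-width / 2, width / 2, nx)
    ys = np.linspace(-height / 2, height / 2, ny)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])

def packing_density(centers, width, height, nx=8, ny=8):
    """Count the part centers within an adaptive radius of each sampling point."""
    grid = sampling_grid(width, height, nx, ny)
    # Assumed adaptive rule: radius scales with sqrt(area / number of parts).
    radius = np.sqrt(width * height / len(centers))
    dists = np.linalg.norm(grid[:, None, :] - centers[None, :, :], axis=-1)
    return (dists <= radius).sum(axis=1)

def circle_sdf(points, center, radius):
    """Signed distance to a circle: negative inside, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius

def overlap_depth(centers, radii, width, height, nx=64, ny=64):
    """Sum of local penetration depths over all overlapping part pairs."""
    grid = sampling_grid(width, height, nx, ny)
    total = 0.0
    for i, j in combinations(range(len(centers)), 2):
        sdf_i = circle_sdf(grid, centers[i], radii[i])
        sdf_j = circle_sdf(grid, centers[j], radii[j])
        both = np.maximum(sdf_i, sdf_j)  # negative only where both are negative
        inside = both < 0                # sample points in the overlap region
        total += float(-both[inside].sum())
    return total
```

For real garment pieces the circle signed distance function would be replaced by a polygon signed distance function; the accumulation over pairs stays the same.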
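The experience replay and temporal-difference computation of claim 6 can be sketched as below. This is an assumption-laden illustration rather than the patented controller: the value evaluation network is replaced by a linear function of the state so the example stays self-contained, the policy network update is omitted, and the names `ReplayBuffer`, `td_errors` and `critic_step` are hypothetical. The target value follows the claim: immediate reward plus discount factor times the predicted value of the new state.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest tuples are dropped first

    def store(self, state, action, reward, next_state):
        self.buffer.append((np.asarray(state, dtype=float),
                            np.asarray(action, dtype=float),
                            float(reward),
                            np.asarray(next_state, dtype=float)))

    def sample(self, batch_size, rng=random):
        batch = rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))
        states, actions, rewards, next_states = zip(*batch)
        return (np.stack(states), np.stack(actions),
                np.array(rewards), np.stack(next_states))

def td_errors(value_fn, states, rewards, next_states, gamma=0.99):
    """TD error = (reward + gamma * V(next_state)) - V(state)."""
    targets = rewards + gamma * value_fn(next_states)
    return targets - value_fn(states)

def critic_step(weights, states, rewards, next_states, gamma=0.99, lr=0.01):
    """One semi-gradient TD(0) update of a linear critic V(s) = s @ weights."""
    def value_fn(s):
        return s @ weights
    delta = td_errors(value_fn, states, rewards, next_states, gamma)
    new_weights = weights + lr * (delta[:, None] * states).mean(axis=0)
    return new_weights, delta
```

In an Actor-Critic setup the same `delta` would also weight the policy gradient of the actor; that step depends on the policy parameterization and is left out here.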
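The synchronous update of claim 7 can be illustrated with the sketch below. It makes assumptions beyond the source: the local Jacobian is estimated by central finite differences, the rotation component is taken as the rotation closest to the Jacobian (the rotational factor of its polar decomposition, which in 2D has a closed-form angle), the translation component is the displacement of the part center under the mapping, and all parts are stored with the same number of contour vertices. The names `local_jacobians` and `update_parts` are hypothetical, and `mapping` stands for any callable that maps an (n, 2) array of points to an (n, 2) array.

```python
import numpy as np

def local_jacobians(mapping, centers, eps=1e-5):
    """Central-difference Jacobian of the mapping at each part center.

    Returns an array of shape (n, 2, 2) with J[k, i, j] = d f_i / d x_j.
    """
    dx = np.array([eps, 0.0])
    dy = np.array([0.0, eps])
    col_x = (mapping(centers + dx) - mapping(centers - dx)) / (2 * eps)
    col_y = (mapping(centers + dy) - mapping(centers - dy)) / (2 * eps)
    return np.stack([col_x, col_y], axis=-1)

def update_parts(mapping, parts, centers):
    """Rotate each part about its center, then translate it, all at once.

    parts has shape (n, m, 2): n parts with m contour vertices each.
    """
    J = local_jacobians(mapping, centers)
    # Angle of the rotation closest to J in the Frobenius norm (2D closed form).
    theta = np.arctan2(J[:, 1, 0] - J[:, 0, 1], J[:, 0, 0] + J[:, 1, 1])
    cos, sin = np.cos(theta), np.sin(theta)
    R = np.stack([np.stack([cos, -sin], axis=-1),
                  np.stack([sin, cos], axis=-1)], axis=-2)  # (n, 2, 2)
    new_centers = mapping(centers)              # translation component
    local = parts - centers[:, None, :]         # vertices relative to centers
    rotated = np.einsum('nij,nmj->nmi', R, local)
    return rotated + new_centers[:, None, :], theta
```

Because each part is moved rigidly, its shape is preserved even when the mapping field itself is nonlinear; only the center position and the rotation angle are read from the field.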
Description
Fabric cutting optimization method and device based on deep reinforcement learning
Technical Field
The invention relates to the technical field of fabric cutting optimization, in particular to a fabric cutting optimization method and device based on deep reinforcement learning.
Background
In industrial fields such as clothing manufacturing and aerospace composite material cutting, efficiently arranging a large number of irregularly shaped parts on a given two-dimensional fabric is a critical optimization problem. The goal is to maximize material utilization while ensuring that the parts do not overlap and do not exceed the material boundary, thereby reducing waste and saving cost. Traditional automatic cutting layout methods mostly rely on sequential placement strategies or swarm-intelligence-based optimization algorithms. They generally treat each part as an independent individual, adjust the position and angle of each part sequentially or in parallel by calculating the collision relation between pairs of parts or the position of each part relative to the material boundary, and search for a feasible arrangement by means of heuristic rules.
In the prior art, mainstream automatic nesting methods rely mainly on two approaches. The first is heuristic algorithms based on geometric rules, which place parts one by one according to predefined priority rules but can hardly perform collaborative optimization from a global view. The second is deep reinforcement learning methods based on a discrete action space, which model the nesting process as a sequential decision problem and decide the placement position of the next part in turn. Both approaches are essentially greedy: each decision focuses only on the locally optimal solution for the current single part and cannot adjust the positions and orientations of all parts globally and synchronously. This inherent locality makes the optimization process prone to becoming trapped in local optima and makes it difficult to generate a highly compact global nesting scheme.
The shortcoming of the prior art is that the decision process is not collaborative. Because the parts are placed one by one and independently, the layout decisions for subsequent parts cannot correct or cooperatively adjust the positions of parts already placed, so errors made by early decisions accumulate and are amplified during the layout process, and the system lacks an effective mechanism to overturn and reshape the overall layout. In addition, the discrete decision mode can hardly handle complex interrelationships among parts directly, for example allowing reasonable overlap among parts in the initial stage of optimization to explore better layouts and then gradually eliminating the overlap through cooperative deformation. A new paradigm that can optimize all parts synchronously and cooperatively as a whole is therefore urgently needed to break through the bottleneck of the prior art in global optimization capability and solution quality.
The above information disclosed in the background section is only for enhancement of understanding of the background of the disclosure, and it may therefore include information that does not form prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a fabric cutting optimization method and device based on deep reinforcement learning, so as to solve the problems described in the background above. To achieve this purpose, the invention provides the following technical solution: a fabric cutting optimization method based on deep reinforcement learning, comprising the following steps: step 1, setting an initial part layout in a two-dimensional design area, extracting geometric parameters of all parts, and establishing a parameterized geometric mapping field covering the design area, wherein the control parameters of the mapping field cooperatively define the part layout of all parts, and the part layout comprises the positions and rotation angles of all parts in the design area; step 2, calculating a group of geometric characteristic indexes based on the part layout generated by the mapping field in the current design iteration, wherein the geometric characteristic indexes comprise a space filling density distribution, a contour overlapping depth function and a layout inclusion metric, and constructing a deep reinforcement learning controller that receives the geometric characteristic indexes as input and outputs an action used to adjust the control parameters of the mapping field; step 3, synchronously updating the pos