CN-121981061-A - Time sequence driven macro module layout method

CN121981061ACN 121981061 ACN121981061 ACN 121981061ACN-121981061-A

Abstract

The invention discloses a time sequence driven macro module layout method which comprises the steps of clustering standard units and macro modules in a circuit netlist to generate a clustering level tree, abstracting leaf node clusters of the clustering level tree into pseudo-macros with specific areas and shapes to generate global initial layout positions of the pseudo-macros, constructing a virtual connection weight model in each cluster, solving the relative position relation of the macro modules in the cluster, regarding the macro modules in the cluster with the determined relative positions as macro groups, performing macro group position optimization, completing global macro layout, screening out key macro modules according to a time sequence analysis report, classifying and identifying the key macro modules as array macros or free macros, respectively performing position update, performing direction optimization, and outputting a final macro module layout result. The invention can effectively solve the layout problem of a large-scale mixed size circuit, remarkably improve worst negative relaxation and total negative relaxation while maintaining layout regularity, and promote timing sequence convergence of physical design.

Inventors

ZHU ZIRAN
FU WEI

Assignees

东南大学

Dates

Publication Date: 20260505
Application Date: 20260107

Claims (10)

1. A time sequence driven macro module layout method is characterized by comprising the following steps: step one, clustering standard units and macro modules in a circuit netlist to generate a clustering level tree by adopting a logic level-based multi-level clustering method; step two, abstracting leaf node clusters of the cluster level tree into pseudo-macros with specific areas and shapes, calculating expansion areas based on connection densities, and generating global initial layout positions of the pseudo-macros by utilizing a nonlinear analytic layout algorithm; thirdly, constructing a virtual connection weight model based on time sequence association degree aiming at the inside of each cluster, and constructing an integer linear programming model to solve the relative position relation of macro modules in the clusters; Taking the intra-cluster macro module with the determined relative position as a macro group, performing macro group position optimization based on boundary sensing, and performing macro combination normalization on the basis of maintaining the current relative position relationship to complete global macro layout; step five, standard cell layout is executed, static time sequence analysis is carried out, and a key macro module is obtained through screening according to a time sequence analysis report; Step six, classifying the key macro modules, identifying the key macro modules as array macro or free macro, and respectively updating the positions; Extracting physical slots of the array macro, and optimizing the positions of the array macro in the array by adopting a limited position exchange algorithm; Aiming at the free macro, constructing a micro-objective function comprising an gravitation item and a displacement item, and adopting a topology-limited projection gradient descent algorithm to finely adjust the position; And seventhly, carrying out direction optimization on the macro module with updated positions, and outputting a final macro module layout result.
2. The method of claim 1, wherein the first step is performed by a process comprising, Step 11, traversing the logic level tree from top to bottom, judging whether the number of examples contained in the current logic module exceeds a preset maximum scale threshold value, if so, carrying out minimum division on the logic module by adopting a multi-level hypergraph division algorithm, decomposing the logic module into a plurality of sub-clusters with the scale meeting the requirement, and lifting the sub-clusters into candidate clusters of the current level; step 12, combining the decomposed candidate clusters from bottom to top to define clusters Is a connected feature vector of (a) , , In the formula, For the number of clusters of the current level, As contiguous indication function, when clustered And (3) with Is the number of interconnected nets of (a) Greater than the connection threshold The value is 1 when the time is taken, otherwise, the value is 0; will have the same connection feature vector And the brother clusters meeting the size constraint are combined to generate a cluster level tree.
3. The method according to claim 1, wherein in the second step, the expansion area is calculated based on the connection density, and the first step is set Coefficient of expansion of pseudo-macros The calculation formula is that, , Wherein, the For the area of the core layout area, Is the first The area of the individual pseudo-macros, For the normalized connection density inside the pseudo-macro, Is a preset adjustment coefficient.
4. The method according to claim 1, wherein the specific process of the third step is, Step 31, constructing a virtual connection weight model between macro blocks, and for any two macro blocks in a cluster And Its virtual connection weight It is defined that the first and second components, , Wherein, the Representing slave macro blocks To the point of Is provided for the set of all active data paths, Representing a path A register stage on the register stage; Step 32, an integer linear programming model is built, defining an objective function as follows, , Wherein, the For a set of intra-cluster macroblocks, Is the center coordinate of the macro block, As a position deviation penalty function, it is defined as: , Wherein, the And The width and height of the pseudo-macro respectively, And Is a coefficient for controlling the importance of the degree of deviation in the horizontal and vertical directions; constraint conditions include boundary constraints and non-overlapping constraints; The boundary constraint is such that, , Wherein, the 、 The width and height of the layout area respectively, 、 Respectively macro modules Is arranged between the width and the height of the steel plate, Is a macro module Is defined by the center coordinates of (a); The non-overlapping constraint is that, 。
5. The method of claim 1, wherein the specific process of the fourth step is, Step 41, arranging the total areas of the macro groups in descending order, the first Personal macro group Is (are) optimized cost function In order to achieve this, the first and second, , Wherein, the For the current macro group And a fixed macro group Is arranged in the overlapping area of the two layers, Is the initial coordinates before the optimization, The penalty coefficient is chosen to be the overlapping penalty coefficient, For the boundary-aware penalty term, defined as a function of the current location and chip edge distance, , Wherein, the The penalty term enables the macro group to be closer to the chip boundary and the total cost to be smaller for the geometric center of the chip layout area; Step 42, defining the optimization variables as macro groups Legal coordinates of (c) The original coordinates are Constructing an objective function, minimizing the weighted displacement distance and corner absorption cost, , Wherein, the As a weight coefficient positively correlated to the macro-group area, Representing the distance from the macro group to the nearest corner of the chip layout; Constraints include non-overlapping constraints and relative order preserving constraints; non-overlapping constraint refers to the fact that for any two macro groups And The method satisfies the following requirements, , Relative order retention constraint means that the relative positional relationship after legalization is ensured to be consistent with that before optimization, i.e. satisfied, , Thereby ensuring that the relative position relationship of the legal macro group is consistent with that before optimization.
6. The method according to claim 1, wherein the specific process of the fifth step is, Step 51, obtaining the time sequence looseness of all macro module pins by using the static time sequence analysis engine, and for any macro module Define its negative relaxation pin set , , Wherein, the Is a macro module Is provided with a set of all input-output pins, Is the looseness of the pins; step 52, tracking negative slack pins The time sequence units directly adjacent on the time sequence path are recorded as a set of the time sequence units Screening macro blocks meeting the following conditions As a key macroblock of the video signal, , In the formula, To have a negative set of pins with a sag, Is a pin A set of sequential cells immediately adjacent on a sequential path, Is a time sequence unit Is not limited to the sag of (1).
7. The method of claim 1, wherein the specific process identified as an array macro or a free macro in step six is, Step 61, extracting the abscissa set of all the macroblock center points in the logical cluster And the ordinate sets And are arranged in ascending order based on alignment tolerance One-dimensional clustering of coordinates will satisfy The arithmetic average value of each subgroup is calculated as the ideal physical grid line coordinate of the cluster, and the first is recorded The coordinates of the ideal vertical network line are Record the first The coordinates of the ideal horizontal network line are ; Step 62, calculate each macroblock center Manhattan distance to nearest ideal grid line as its positional deviation And , , Defining a geometric validation instruction function Only when the macro module Deviation from ideal network line And Remains in the array when the following conditions are met, , Step 63, calculating the median of the intra-cluster macroblock widths and heights And Defining a size deviation indication function Only when the macroblock size is The retention is achieved when the following conditions are met, , Wherein, the Is an allowable dimensional tolerance; Step 64, will simultaneously satisfy And Is identified as an array macro and the remaining as free macros.
8. The method of claim 1, wherein in the sixth step, the location optimization of the array macro is performed by, Step 611, calculating the best position gravity center of each macro block , , In the formula, Is a macro module A set of negative slack sequential cells immediately adjacent on the sequential path, For the connection weight, the negative relaxation pin and the relaxation of the connected negative relaxation time sequence unit are used for determining, , Step 612, constructing a cost matrix Computing macro blocks Is placed in the groove position Is added to the total cost of (1), , Wherein the first term is the distance cost relative to the gravity center of gravity, and the second term is relative to the initial position The stability penalty of (2); in step 613, the matching problem is solved using a linear assignment algorithm, and the location of the array macro is updated.
9. The method of claim 1, wherein in the sixth step, the fine-tuning of the position of the free macro is performed by a specific process, Step 621, for any critical free macros Searching the physical field, if the distance between the macro module and another macro module or chip boundary is smaller than the preset threshold, determining that physical blocking exists, constructing a blocking vector based on the determination result Wherein Respectively indicating whether blocking exists in the left direction, the right direction, the upper direction and the lower direction, wherein a value of 1 indicates that blocking exists, otherwise, the value of 0; Step 622, constructing a two-dimensional directional gradient mask based on the blocking vector For forcing zero-setting of gradients in restricted directions, where At the position of And is also provided with The value is 0 when the time is taken, otherwise, the value is 1; At the position of And is also provided with The value is 0 when the time is taken, otherwise, the value is 1; step 623, establishing a microtopography loss function including the gravitation term and the displacement regularization term Which is defined as follows, , Wherein, the In order to connect the weights, the weight of the connection, For the coordinates of the relevant node(s), Is the initial position; Step 624, calculate the original gradient of the loss function with respect to the coordinates And carrying out Hadamard product operation by using the directional gradient mask to obtain a corrected gradient , , Step 625, performing projection gradient update, calculating the first The position of the next iteration , , In the formula, In order for the rate of learning to be high, For projection operators, for mapping updated coordinates to nearest non-overlapping legal areas And (3) inner part.
10. The method of claim 1, wherein in the seventh step, four directions of N, S, FN, FS of the macro block are enumerated, the bus length is calculated, and the direction with the smallest line length is selected as the final macro block layout result.

Description

Time sequence driven macro module layout method Technical Field The invention belongs to the technical field of integrated circuit physical design automation, relates to a time sequence driving layering macro unit layout scheme for large-scale complex IP modules, and particularly relates to a time sequence driving macro module layout method. Background As integrated circuit fabrication processes continue to evolve, the design scale and complexity of a system on a chip (SoC) grows exponentially. In modern high performance chip designs, in addition to millions of standard cells, hundreds or thousands of memories (SRAMs), analog modules, and various types of IP cores, collectively referred to as macro-modules, are integrated. The macro module layout is used as an initial stage of a physical design flow and is responsible for determining the physical positions of the large modules on the chip layout. Because the macro module occupies most of the area in the chip layout, the layout quality directly defines the physical framework of the chip, not only determines the available layout area and power grid distribution of the subsequent standard units, but also restricts distribution of wiring resources and performance, power consumption and area (PPA) indexes of the final chip from the global topology. In the actual physical design process, the huge physical size of the macro block makes it a non-negligible layout obstacle. Unreasonable macro layout tends to cut off the data flow logic between standard cells, forcing the signal lines to bypass the macro blocks for long distance transmission, creating severe parasitic resistance capacitance delays. However, conventional macro-placement approaches typically target a single optimization with a minimum bus length (HPWL), which is a geometric distance-oriented strategy that is frustrating in the face of timing convergence issues. This is because there is a significant nonlinear relationship between bus length and critical path timing, and the shortest line length does not mean that the timing is optimal. For high frequency designs, small physical distance increases on critical paths may lead to serious setup time violations, while the prior art often lacks an efficient mechanism to differentially model and perceive such timing sensitivity early in macro layout. Furthermore, as design scales expand, modern socs typically employ deep hierarchical netlist structures to manage complex logic functions. In dealing with such large scale designs, physical layout faces the challenge of logical level and physical location mismatch. If the clustering is simply performed according to the logic level, the key time sequence paths crossing the levels are ignored, and if the logic level is scattered excessively during layout, the unit congestion of the local area can be caused, and the wiring difficulty is increased. Particularly in many-core processor or AI accelerator designs that include a large number of memory macro-blocks, these macro-blocks typically need to be arranged in a regular array structure in order to reduce power supply design complexity and promote routability. Such strict array constraints often treat the macro-blocks as rigid bodies, greatly limiting the optimization space inside. When a macro in the array is located on a critical timing path, if fine tuning is forbidden to maintain the shape of the array, the timing cannot be converged, otherwise, if the array structure is destroyed at will, serious layout fragmentation and congestion problems are caused. More importantly, macro layout typically occurs at an early stage of physical design when standard cells have not been placed and the exact net delay and timing margin are not accurately known. Thus, the global macro layout stage often can only rely on virtual connections or rough estimation models for guidance, which results in the generated layout schemes being prone to exposing serious timing bottlenecks in subsequent detailed designs. After the standard cell layout is completed, although an accurate static time sequence analysis result can be obtained, the layout structure is basically solidified at the moment, and moving the macro module is extremely difficult. The prior art lacks a mechanism capable of effectively utilizing the later accurate time sequence information and performing incremental fine repair on the macro module on the premise of not damaging the overall layout frame and the array regularity. Disclosure of Invention The invention aims to provide a time sequence driven macro module layout and an increment optimization method thereof, which can solve the problems that the prior macro module layout stage lacks consideration of time sequence indexes, lacks an increment macro module repair mechanism based on later accurate time sequence feedback and the like, can effectively process the layout problem of a large-scale mixed size circuit, remarkably improve worst negative relaxation (WNS) and total ne