CN-122021769-A - Method and related device for compressing weight channel groups of a surgical navigation large model
Abstract
The application relates to large-model optimization and provides a method and related device for group-wise compression of the weight channels of a surgical navigation large model. It addresses two problems that arise when existing surgical navigation large models are deployed on the end side of a compound surgical robot: quantization precision and inference speed cannot be balanced, and multi-modal surgical navigation scenes are hard to accommodate. After sampled activations are obtained, they are flattened into a two-dimensional matrix and a mutual information matrix between the channels of the sampled activations is calculated. Taking the mutual information matrix as the inter-channel similarity matrix, the channels are clustered into groups by spectral clustering, and each group is assigned a discrete bit width according to its average importance. The weight parameters of the surgical navigation large model are then pruned, and the pruned weight parameters are quantized according to the per-group bit-width assignment. The method effectively balances quantization precision and inference speed and adapts to multi-modal surgical navigation scenes.
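Before the formal record, the snippet below sketches the first step the abstract describes: flattening sampled activations into a two-dimensional matrix whose rows are samples and whose columns are channels. It is a minimal sketch in plain NumPy; the tensor shape and all names are illustrative assumptions, not taken from the patent.

```python
import numpy as np

# Hypothetical activations captured from one layer during calibration:
# (batch, sequence length, channels). The shape is an illustrative assumption.
acts = np.random.randn(4, 128, 768).astype(np.float32)

# Flatten every axis except the channel axis, as the abstract describes:
# rows index samples (batch x positions), columns index channels.
X = acts.reshape(-1, acts.shape[-1])
print(X.shape)  # (512, 768) -> 512 samples, 768 channels
```

The sketches after the claims and at the end of the description start from a matrix like `X`.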
Inventors
- XIONG JING
- TAN MIN
- TAO YUSHUN
Assignees
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-02-06
Claims (10)
- 1. A method for compressing weight channel groups of a surgical navigation large model, characterized by comprising the following steps: performing activation sampling on each layer of the surgical navigation large model to obtain sampled activations, and flattening the sampled activations into a two-dimensional matrix, wherein rows of the two-dimensional matrix represent different samples and columns represent different channels; calculating a mutual information matrix between the channels of the sampled activations; taking the mutual information matrix as the inter-channel similarity matrix and clustering the channels into groups; assigning a discrete bit width to each group according to the average importance of each clustered group; and pruning the weight parameters of the surgical navigation large model and quantizing the pruned weight parameters according to the per-group discrete bit-width assignment.
- 2. The method for compressing weight channel groups of a surgical navigation large model according to claim 1, wherein calculating the mutual information matrix between the channels of the sampled activations comprises the following steps: calculating the mutual information values of the matrix as $$I_{ij} = -\frac{1}{2}\ln\!\left(1-\rho_{ij}^{2}\right),\qquad \rho_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\sqrt{\operatorname{Cov}(X_i, X_i)\,\operatorname{Cov}(X_j, X_j)}},$$ wherein $I_{ij}$ is the mutual information value of the $i$-th and $j$-th channels, $\rho_{ij}$ is the Pearson correlation coefficient of the $i$-th and $j$-th channels, and $\operatorname{Cov}(X_i, X_j)$ is the covariance of the $i$-th and $j$-th channels; and normalizing the calculated mutual information values to obtain a normalized mutual information matrix (claims 2-4 are illustrated in the sketch following the claims).
- 3. The method for compressing weight channel groups of a surgical navigation large model according to claim 1, wherein calculating the mutual information matrix between the channels of the sampled activations comprises the following steps: calculating the mutual information values of the matrix as $$I_{ij} = \sum_{a=1}^{B}\sum_{b=1}^{B} p_{ij}(a,b)\,\log\frac{p_{ij}(a,b)}{p_i(a)\,p_j(b)},$$ wherein $I_{ij}$ is the mutual information value of the $i$-th and $j$-th channels, $B$ is the number of discretization intervals of the activations, $p_{ij}(a,b)$ is the joint probability distribution of the $i$-th and $j$-th channels, and $p_i(a)$ and $p_j(b)$ are the marginal probability distributions of the $i$-th and $j$-th channels, respectively; and normalizing the calculated mutual information values to obtain a normalized mutual information matrix.
- 4. The method for compressing weight channel groups of a surgical navigation large model according to claim 1, wherein clustering the channels into groups comprises the following steps: mapping the mutual information values of the mutual information matrix to similarities to obtain a similarity matrix; combining the similarity matrix with its degree matrix to obtain a Laplacian matrix; performing eigendecomposition of the Laplacian matrix and selecting the eigenvectors corresponding to a preset number of the smallest eigenvalues to form an intermediate matrix; performing K-means clustering on the row vectors of the intermediate matrix to obtain the group ID of each channel; and determining the optimal number of groups by the silhouette coefficient, and computing the channel list, group size and average importance of each group as the clustering result.
- 5. The method for compressing weight channel groups of a surgical navigation large model according to claim 1, wherein assigning a discrete bit width to each group according to the average importance of each clustered group comprises the following steps: sorting the groups in descending order of their average magnitude or variance to obtain a priority ranking; and allocating more discrete bits to higher-priority groups while ensuring that the global average bit width meets the target average bit width.
- 6. The method for compressing weight channel groups of a surgical navigation large model according to claim 1, wherein pruning the weight parameters of the surgical navigation large model and quantizing the pruned weight parameters according to the per-group discrete bit-width assignment comprises the following steps: pruning the weight parameters of the surgical navigation large model based on a Hessian-matrix approximation method; calculating a quantization upper bound from the bit width allocated to each group; and performing linear quantization of the pruned weight parameters using the quantization upper bound.
- 7. The method for compressing weight channel groups of a surgical navigation large model according to claim 6, wherein, after performing the linear quantization of the pruned weight parameters, the method further comprises: calculating a quantization error as $$e = w - \hat{w}$$ and transmitting the quantization error back to the surgical navigation large model, wherein $e$ is the quantization error, $\hat{w}$ is the quantized weight parameter obtained by linearly quantizing $w$ against the quantization upper bound $c$, $w$ is the pruned weight parameter, and $c$ is the quantization upper bound.
- 8. A weight channel group compression system for a surgical navigation large model, characterized by comprising: a sampling module for performing activation sampling on each layer of the surgical navigation large model to obtain sampled activations and flattening the sampled activations into a two-dimensional matrix, wherein rows of the matrix represent different samples and columns represent different channels; a calculation module for calculating a mutual information matrix between the channels of the sampled activations; a grouping module for taking the mutual information matrix as the inter-channel similarity matrix and clustering the channels into groups; an allocation module for assigning a discrete bit width to each group according to the average importance of each clustered group; and a pruning-and-quantization module for pruning the weight parameters of the surgical navigation large model and quantizing the pruned weight parameters according to the per-group discrete bit-width assignment.
- 9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when executing the computer program, the processor implements the steps of the surgical navigation large model weight channel group compression method according to any one of claims 1-7.
- 10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the surgical navigation large model weight channel group compression method according to any one of claims 1-7.
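The sketch below illustrates claims 2-4 under stated assumptions: the closed-form Gaussian mutual information of claim 2, the histogram estimator of claim 3, and spectral clustering with silhouette-based selection of the group count as in claim 4. It uses NumPy and scikit-learn; the bin count, the candidate group numbers, and the use of the unnormalized Laplacian L = D - S are my assumptions, not details confirmed by the patent text.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def mi_gaussian(X, eps=1e-8):
    """Claim-2 style closed form: I_ij = -0.5 * ln(1 - rho_ij^2),
    with rho the Pearson correlation between the channel columns of X."""
    rho = np.corrcoef(X, rowvar=False)
    rho = np.clip(rho, -1 + eps, 1 - eps)
    mi = -0.5 * np.log(1.0 - rho ** 2)
    np.fill_diagonal(mi, 0.0)
    return mi / (mi.max() + eps)           # normalized MI matrix

def mi_histogram(x, y, bins=16):
    """Claim-3 style estimator for one channel pair: discretize the
    activations into `bins` intervals, then sum p(a,b)*log(p(a,b)/(p(a)p(b)))."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    nz = p_xy > 0
    return float((p_xy[nz] * np.log(p_xy[nz] / np.outer(p_x, p_y)[nz])).sum())

def spectral_groups(S, k):
    """Claim-4 style grouping: Laplacian L = D - S, eigenvectors of the
    k smallest eigenvalues, K-means on the rows of that intermediate matrix."""
    L = np.diag(S.sum(axis=1)) - S
    _, eigvecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    U = eigvecs[:, :k]
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)

# Illustrative driver: 512 flattened samples of 64 channels.
X = np.random.randn(512, 64)
S = mi_gaussian(X)
print(round(mi_histogram(X[:, 0], X[:, 1]), 4))   # claim-3 estimate, one pair

# Pick the group count by silhouette coefficient, as claim 4 describes.
best_k, best_score = 2, -1.0
for k in range(2, 9):
    score = silhouette_score(X.T, spectral_groups(S, k))  # channels as points
    if score > best_score:
        best_k, best_score = k, score
labels = spectral_groups(S, best_k)
print(best_k, np.bincount(labels))                # optimal k and group sizes
```

Either estimator yields the normalized matrix that claim 4 consumes as the inter-channel similarity matrix; the Gaussian form is much cheaper, the histogram form captures non-linear dependence.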
Description
Method and related device for compressing weight channel groups of a surgical navigation large model

Technical Field
The application relates to large-model optimization, and in particular to a method and related device for group-wise compression of the weight channels of a surgical navigation large model.

Background
In recent years, the development of multi-modal large models has made multi-modal data processing and navigation in complex surgical scenes feasible. A compound surgical robot, however, requires extremely low latency for real-time decision making and for avoiding safety hazards, and conventional large models struggle to meet its demands on inference speed and computational complexity. For the hardware of an intelligent compound surgical robot terminal, end-side deployment of a multi-modal surgical navigation large model enables efficient local computation; ensures data security, decision accuracy and instant response; and improves the safety and reliability of the navigation system. Small, precise, end-side deployable domain models have therefore become a research hotspot. End-side computation also reduces dependence on network connectivity and suits surgical scenes with poor network conditions. Yet an end-side surgical navigation large model on a compound surgical robot faces two core challenges. First, parameter quantization is difficult: the model parameters are huge, efficient quantization must balance computational accuracy against the degree of parameter compression, and it remains hard to mine parameter correlations so as to achieve efficient coding and error-controllable quantization of the end-side parameters while maximizing the retained information. Second, inference is slow: the model must process large volumes of operations and concurrent data streams, and with limited terminal hardware resources it is hard to respond quickly to the various surgical tasks. The prior art therefore still has clear deficiencies: existing quantization schemes do not sufficiently exploit channel correlation, bit allocation is rigid, pruning and quantization cooperate poorly, multi-modal surgical navigation scenes are hard to accommodate, quantization precision and inference speed cannot be balanced, and the low-latency requirement of real-time navigation for a compound surgical robot is hard to meet. An optimization scheme is needed to solve these problems.

Disclosure of Invention
Aiming at the technical problems that quantization precision and inference speed cannot be balanced and that multi-modal surgical navigation scenes are hard to accommodate when an existing end-side surgical navigation large model of a compound surgical robot is applied, the application provides a method and related device for compressing the weight channel groups of a surgical navigation large model.
In order to achieve the above purpose, the application adopts the following technical scheme. In a first aspect, the application provides a method for compressing the weight channel groups of a surgical navigation large model, comprising: performing activation sampling on each layer of the surgical navigation large model to obtain sampled activations, and flattening the sampled activations into a two-dimensional matrix, wherein rows of the two-dimensional matrix represent different samples and columns represent different channels; calculating a mutual information matrix between the channels of the sampled activations; taking the mutual information matrix as the inter-channel similarity matrix and clustering the channels into groups; assigning a discrete bit width to each group according to the average importance of each clustered group; and pruning the weight parameters of the surgical navigation large model and quantizing the pruned weight parameters according to the per-group bit-width assignment.

Further, calculating the mutual information matrix between the channels of the sampled activations comprises the following steps: calculating the mutual information values of the matrix as $$I_{ij} = -\frac{1}{2}\ln\!\left(1-\rho_{ij}^{2}\right),\qquad \rho_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\sqrt{\operatorname{Cov}(X_i, X_i)\,\operatorname{Cov}(X_j, X_j)}},$$ wherein $I_{ij}$ is the mutual information value of the $i$-th and $j$-th channels, $\rho_{ij}$ is the Pearson correlation coefficient of the $i$-th and $j$-th channels, and $\operatorname{Cov}(X_i, X_j)$ is the covariance of the $i$-th and $j$-th channels; and normalizing the calculated mutual information values to obtain a normalized mutual information matrix.

Further, calculating the mutual information matrix between the channels of the sampled activations comprises the following steps: calculating the mutual information values of the matrix as $$I_{ij} = \sum_{a=1}^{B}\sum_{b=1}^{B} p_{ij}(a,b)\,\log\frac{p_{ij}(a,b)}{p_i(a)\,p_j(b)},$$ wherein $I_{ij}$ is the mutual information value of the $i$-th and $j$-th channels, $B$ is the number of discretization intervals of the activations, $p_{ij}(a,b)$ is the joint probability distribution of the $i$-th and $j$-th channels, and $p_i(a)$ and $p_j(b)$ are the marginal probability distributions of the $i$-th and $j$-th channels, respectively; and normalizing the calculated mutual information values to obtain a normalized mutual information matrix.
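Continuing from the grouping stage, this sketch covers the remaining steps of the first aspect: the importance-ranked discrete bit allocation of claim 5, a crude stand-in for the Hessian-approximation pruning of claim 6 (a plain saliency threshold; real OBS/GPTQ-style schemes also update the surviving weights), and the linear quantization against an upper bound plus the error feedback of claim 7. The bit palette, target average, sparsity level and the choice c = max|w| are all assumptions, not values given in the patent.

```python
import numpy as np

def allocate_bits(importance, target_avg, choices=(2, 3, 4, 8)):
    """Claim-5 style allocation: rank groups by average importance, start
    all groups at the widest width, then demote the least important groups
    until the global mean bit width meets the target average."""
    order = np.argsort(importance)                 # least important first
    bits = np.full(len(importance), max(choices), dtype=float)
    for g in order:
        while bits.mean() > target_avg:
            lower = [c for c in choices if c < bits[g]]
            if not lower:
                break                              # this group is at the floor
            bits[g] = max(lower)
        if bits.mean() <= target_avg:
            break
    return bits.astype(int)

def prune_by_saliency(W, h_diag, sparsity=0.3):
    """Crude stand-in for the Hessian-approximation pruning of claim 6:
    score each weight as w^2 / diag(H) (OBS-style saliency) and zero the
    lowest fraction. Real schemes also update the surviving weights."""
    score = W ** 2 / h_diag[None, :]
    return np.where(score >= np.quantile(score, sparsity), W, 0.0)

def quantize_linear(w, n_bits):
    """Claim-6 style symmetric linear quantization against an upper
    bound c = max|w|; also returns the error that claim 7 feeds back."""
    c = np.abs(w).max() + 1e-12                    # quantization upper bound
    qmax = 2 ** (n_bits - 1) - 1
    step = c / qmax
    w_q = np.clip(np.round(w / step), -qmax, qmax) * step
    return w_q, w - w_q

# Illustrative tail of the pipeline: 64 input channels in 4 groups.
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 64))                     # layer weights (out, in)
labels = np.repeat(np.arange(4), 16)               # e.g. from claim-4 clustering
importance = np.array([np.abs(W[:, labels == g]).mean() for g in range(4)])
bits = allocate_bits(importance, target_avg=4.0)

W_p = prune_by_saliency(W, h_diag=np.ones(64))     # ones -> magnitude pruning
W_q = np.empty_like(W_p)
for g in range(4):
    cols = labels == g
    W_q[:, cols], err = quantize_linear(W_p[:, cols], bits[g])
    # `err` is the per-group quantization error that claim 7 propagates back.
print(bits, float(np.abs(W - W_q).mean()))
```

The greedy demotion keeps the global average at or below the target while the most important groups retain the widest bit widths, which matches the constraint stated in claim 5.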