CN-116702844-B - Operation control method, device and equipment of network model


Abstract

The embodiments of the invention provide a method, a device and equipment for controlling operation of a network model. The method comprises: determining the operation data type of a target network layer among a plurality of network layers included in the network model; in a case where the operation data type is a static data type, acquiring target static data of the target network layer from a first storage space, wherein the static storage space in an operation scene of the network model is divided into the first storage space and a second storage space; and operating the target network layer using at least the target static data. The invention addresses the low utilization of storage resources in the static storage space that supports operation of the network model, thereby improving that utilization.

Inventors

YAO HAIDONG
LIU JUAN
CHANG RUI

Assignees

中兴智能科技南京有限公司

Dates

Publication Date: 20260505
Application Date: 20220223

Claims (14)

1. A method for controlling operation of a network model, comprising: determining an operation data type of a target network layer among a plurality of network layers included in the network model; in a case where the operation data type is a static data type, acquiring target static data of the target network layer from a first storage space, wherein a static storage space in an operation scene of the network model is divided into the first storage space and a second storage space, the first storage space is used for storing static data used by the network model in an operation process, and the second storage space is used for storing dynamic data used and output by the network model in the operation process; and operating the target network layer using at least the target static data; wherein the method further comprises: determining a current allocation direction of allocating storage space for output data of the target network layer in the second storage space, wherein the current allocation direction is opposite to the allocation direction in which storage space was last allocated for data in the second storage space; and allocating a target storage space for the output data of the target network layer in the second storage space according to the current allocation direction, wherein the target storage space is used for storing target output data output by the target network layer; wherein the determining the current allocation direction of allocating storage space for the output data of the target network layer in the second storage space comprises: determining, from a first pointer and a second pointer, a reference pointer that was used the last time storage space was allocated for data in the second storage space, wherein the first pointer is used for indicating a top end starting position for allocating storage space from the top end to the bottom end of the second storage space, and the second pointer is used for indicating a bottom end starting position for allocating storage space from the bottom end to the top end of the second storage space; in a case where the reference pointer is the first pointer, determining the second pointer as a target pointer and determining the allocation direction indicated by the second pointer as the current allocation direction; and in a case where the reference pointer is the second pointer, determining the first pointer as the target pointer and determining the allocation direction indicated by the first pointer as the current allocation direction.
2. The method of claim 1, wherein the acquiring the target static data of the target network layer from the first storage space comprises: determining a target operation type of the target network layer from operation types of a plurality of network layers that use static data; determining a target storage subspace corresponding to the target operation type from a plurality of operation types and a plurality of storage subspaces having a correspondence relation; and reading the target static data of the target network layer from the target storage subspace.
3. The method according to claim 2, wherein the determining the target storage subspace corresponding to the target operation type from the plurality of operation types and the plurality of storage subspaces having the correspondence relation comprises: determining a first subspace of the plurality of storage subspaces as the target storage subspace in a case where the target operation type is a convolution operation; and determining a second subspace of the plurality of storage subspaces as the target storage subspace in a case where the target operation type is an activation operation.
4. The method of claim 3, wherein the reading the target static data of the target network layer from the target storage subspace comprises: reading convolution weight data and bias data of the target network layer from the first subspace as the target static data in a case where the target operation type is a convolution operation; and reading activation coefficient data of the target network layer from the second subspace as the target static data in a case where the target operation type is an activation operation.
5. The method of claim 1, wherein the allocating a target storage space in the second storage space for output data of the target network layer according to the current allocation direction comprises: allocating, in the second storage space, the target storage space to the output data of the target network layer starting from the starting position indicated by the target pointer and proceeding in the current allocation direction; and updating the target pointer according to the ending position of the target storage space.
6. The method of claim 1, wherein after the allocating a target storage space in the second storage space for output data of the target network layer according to the current allocation direction, the method further comprises: recording a remaining number of uses corresponding to the target storage space, wherein the initial value of the remaining number of uses is the number of network layers that use the output data of the target network layer as input data; subtracting 1 from the remaining number of uses each time the output data of the target network layer is used as input data; and releasing the target storage space and updating the target pointer in a case where the remaining number of uses is 0.
7. The method of claim 1, wherein the determining a current allocation direction of allocating storage space for output data of the target network layer in the second storage space comprises: determining a data attribute of the output data of the target network layer; and determining the current allocation direction of allocating storage space for the output data of the target network layer in the second storage space in a case where the data attribute indicates that the network model does not include any network layer in the same group as the target network layer.
8. The method of claim 7, wherein after the determining the data attribute of the output data of the target network layer, the method further comprises: in a case where the data attribute indicates that the network model includes same-group network layers of the target network layer and storage space has already been allocated for the same-group network layers, determining the storage space corresponding to the output data of the target network layer from the storage space allocated for the network layer group in which the same-group network layers are located, wherein the output data of the same-group network layers and the output data of the target network layer are both input to the same network layer; and in a case where the data attribute indicates that the network model includes same-group network layers of the target network layer and no storage space has been allocated for the same-group network layers, determining the target network layer and the same-group network layers as a reference network layer group, wherein the output data of the same-group network layers and the output data of the target network layer are both input to the same network layer; determining a current allocation direction of allocating storage space for the output data of the reference network layer group in the second storage space; allocating storage space for the output data of the reference network layer group in the second storage space according to the current allocation direction; and dividing, within the storage space allocated for the reference network layer group, a storage space for each network layer included in the reference network layer group.
9. An operation control virtual device of a network model, comprising: a first determining module, configured to determine an operation data type of a target network layer among a plurality of network layers included in the network model; an acquisition module, configured to acquire target static data of the target network layer from a first storage space in a case where the operation data type is a static data type, wherein a static storage space in an operation scene of the network model is divided into the first storage space and a second storage space, the first storage space is used for storing static data used by the network model in an operation process, and the second storage space is used for storing dynamic data used and output by the network model in the operation process; and an operation module, configured to operate the target network layer using at least the target static data; wherein the device further comprises: a second determining module, configured to determine a current allocation direction of allocating storage space for output data of the target network layer in the second storage space, wherein the current allocation direction is opposite to the allocation direction in which storage space was last allocated for data in the second storage space; and an allocation module, configured to allocate a target storage space for the output data of the target network layer in the second storage space according to the current allocation direction, wherein the target storage space is used for storing target output data output by the target network layer; wherein the second determining module comprises: a third determining unit, configured to determine, from a first pointer and a second pointer, a reference pointer that was used the last time storage space was allocated for data in the second storage space, wherein the first pointer is used for indicating a top end starting position for allocating storage space from the top end to the bottom end of the second storage space, and the second pointer is used for indicating a bottom end starting position for allocating storage space from the bottom end to the top end of the second storage space; a fourth determining unit, configured to determine, in a case where the reference pointer is the first pointer, the second pointer as a target pointer, and to determine the allocation direction indicated by the second pointer as the current allocation direction; and a fifth determining unit, configured to determine, in a case where the reference pointer is the second pointer, the first pointer as the target pointer, and to determine the allocation direction indicated by the first pointer as the current allocation direction.
10. An operation control device of a network model, characterized by comprising a processor and a static memory, wherein the static memory is divided into a first storage space and a second storage space, the first storage space is used for storing static data of the network model in an operation process, the second storage space is used for storing dynamic data of the network model in the operation process, and the network model comprises a plurality of network layers; the processor is configured to determine an operation data type of a target network layer among the plurality of network layers, to acquire target static data of the target network layer from the first storage space in a case where the operation data type is a static data type, and to operate the target network layer using at least the target static data; the processor comprises a data acquisition port and a data processing unit, wherein the data acquisition port is configured to determine a current allocation direction of allocating storage space for output data of the target network layer in the second storage space, the current allocation direction being opposite to the allocation direction in which storage space was last allocated for data in the second storage space; and the data processing unit is configured to allocate a target storage space for the output data of the target network layer in the second storage space according to the current allocation direction, the target storage space being used for storing target output data output by the target network layer; wherein determining the current allocation direction of allocating storage space for the output data of the target network layer in the second storage space comprises: determining, from a first pointer and a second pointer, a reference pointer that was used the last time storage space was allocated for data in the second storage space, wherein the first pointer is used for indicating a top end starting position for allocating storage space from the top end to the bottom end of the second storage space, and the second pointer is used for indicating a bottom end starting position for allocating storage space from the bottom end to the top end of the second storage space; in a case where the reference pointer is the first pointer, determining the second pointer as a target pointer and determining the allocation direction indicated by the second pointer as the current allocation direction; and in a case where the reference pointer is the second pointer, determining the first pointer as the target pointer and determining the allocation direction indicated by the first pointer as the current allocation direction.
11. The device of claim 10, wherein the first storage space comprises a first subspace and a second subspace, wherein the first subspace is used for storing weight data and bias data of a network layer whose operation type is a convolution operation; and the second subspace is used for storing activation coefficient data of a network layer whose operation type is an activation operation.
12. The device of claim 11, wherein the processor comprises a convolution processing unit and an activation processing unit, wherein the convolution processing unit is configured to acquire, from the first subspace, target weight data and target bias data of the target network layer whose operation type is a convolution operation as the target static data; and the activation processing unit is configured to acquire, from the second subspace, target activation coefficient data of the target network layer whose operation type is an activation operation as the target static data.
13. A computer readable storage medium, characterized in that a computer program is stored in the computer readable storage medium, wherein the computer program, when being executed by a processor, implements the steps of the method according to any of the claims 1 to 8.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 8 when the computer program is executed.
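
By way of illustration only, the alternating two-pointer allocation of claims 1 and 5 and the use counting of claim 6 might be sketched in Python as follows. The sketch is not taken from the patent: the class and method names (`Block`, `TwoEndedAllocator`, `allocate`, `consume`), the mapping of the "top end" to offset 0, the choice of the first pointer for the very first allocation, and the stack-like way a pointer retreats on release are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    start: int            # lowest offset occupied by the block
    size: int
    from_top: bool        # True if the first (top-end) pointer allocated it
    remaining_uses: int   # layers that still have to read this output
    freed: bool = False


class TwoEndedAllocator:
    """Toy model of the second storage space (dynamic data)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.top_ptr = 0             # first pointer, advances toward the bottom end
        self.bottom_ptr = capacity   # second pointer, advances toward the top end
        # Assumption: treat the second pointer as "last used" at start-up so
        # that the very first allocation is made from the top end.
        self.last_from_top = False
        self.top_blocks = []         # blocks allocated via the first pointer
        self.bottom_blocks = []      # blocks allocated via the second pointer

    def allocate(self, size: int, consumers: int) -> Block:
        if size > self.bottom_ptr - self.top_ptr:
            raise MemoryError("second storage space exhausted")
        from_top = not self.last_from_top   # opposite of the last direction
        if from_top:
            start = self.top_ptr
            self.top_ptr += size            # update the target pointer
        else:
            self.bottom_ptr -= size         # update the target pointer
            start = self.bottom_ptr
        block = Block(start, size, from_top, remaining_uses=consumers)
        (self.top_blocks if from_top else self.bottom_blocks).append(block)
        self.last_from_top = from_top
        return block

    def consume(self, block: Block) -> None:
        """Record one read of the block; release it after the last read."""
        block.remaining_uses -= 1
        if block.remaining_uses == 0:
            block.freed = True
            self._retreat(block.from_top)

    def _retreat(self, from_top: bool) -> None:
        # Assumption: a pointer only moves back over freed blocks that are
        # adjacent to its current position (stack-like release).
        blocks = self.top_blocks if from_top else self.bottom_blocks
        while blocks and blocks[-1].freed:
            released = blocks.pop()
            if from_top:
                self.top_ptr -= released.size
            else:
                self.bottom_ptr += released.size
```

In this sketch, three successive calls to `allocate` on a 100-unit space with sizes 10, 20 and 5 place the blocks at offsets 0, 80 and 10 respectively, because each call uses the pointer that the previous call did not use.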

Description

Operation control method, device and equipment of network model

Technical Field

The embodiments of the invention relate to the field of computers, and in particular to a method, a device and equipment for controlling operation of a network model.

Background

In recent years, owing to great improvements in computing power and the ease of acquiring big data, deep learning technology and the application of network models have advanced considerably, and a growing number of problems in image processing, natural language analysis and the like can be solved well through deep learning technology.

Deep learning network models are currently typically run on a general-purpose GPU (Graphics Processing Unit), and deploying the network model in practice is an important part of deep learning applications. There is therefore a strong need for efficient ways to support the operation of network models. A network model has its own operating characteristics and consequently places specific requirements on storage resources. On-chip storage of a GPU is generally implemented with SRAM (Static Random Access Memory). SRAM offers high read-write efficiency but also costs more, so the SRAM resources in a device are often limited and must be fully utilized. The design and use of on-chip storage resources is therefore particularly critical for inference devices that run deep learning network models.

Disclosure of Invention

The embodiments of the invention provide a method, a device and equipment for controlling the operation of a network model, which at least solve the problem in the related art of low storage resource utilization of the static storage space supporting the operation of the network model.

According to one embodiment of the invention, an operation control method of a network model is provided, comprising: determining the operation data type of a target network layer among a plurality of network layers included in the network model; in a case where the operation data type is a static data type, acquiring target static data of the target network layer from a first storage space, wherein the static storage space in an operation scene of the network model is divided into the first storage space and a second storage space, the first storage space is used for storing static data used by the network model in the operation process, and the second storage space is used for storing dynamic data used and output by the network model in the operation process; and operating the target network layer using at least the target static data.

In one exemplary embodiment, the acquiring the target static data of the target network layer from the first storage space includes: determining a target operation type of the target network layer from operation types of a plurality of network layers that use static data; determining a target storage subspace corresponding to the target operation type from a plurality of operation types and a plurality of storage subspaces having a correspondence relation; and reading the target static data of the target network layer from the target storage subspace.

In one exemplary embodiment, the determining the target storage subspace corresponding to the target operation type from the plurality of operation types and the plurality of storage subspaces having the correspondence relation includes: determining a first subspace of the plurality of storage subspaces as the target storage subspace in a case where the target operation type is a convolution operation; and determining a second subspace of the plurality of storage subspaces as the target storage subspace in a case where the target operation type is an activation operation.
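
As a minimal sketch of the correspondence between operation types and storage subspaces described above, the lookup might be expressed in Python as follows. All names here (`OpType`, `Layer`, `SUBSPACE_FOR_OP`, `FirstStorageSpace`, `fetch_static_data`) are hypothetical, the subspaces are modeled as plain dictionaries rather than SRAM regions, and the pooling layer is an assumed example of a layer that uses no static data.

```python
from dataclasses import dataclass
from enum import Enum


class OpType(Enum):
    CONVOLUTION = "convolution"
    ACTIVATION = "activation"
    POOLING = "pooling"   # assumed example of a layer without static data


@dataclass
class Layer:
    name: str
    op_type: OpType


# Correspondence between operation types and storage subspaces.
SUBSPACE_FOR_OP = {
    OpType.CONVOLUTION: "first_subspace",   # weight data and bias data
    OpType.ACTIVATION: "second_subspace",   # activation coefficient data
}


class FirstStorageSpace:
    """Toy model of the first storage space (static data)."""

    def __init__(self) -> None:
        self.subspaces = {name: {} for name in SUBSPACE_FOR_OP.values()}

    def write(self, layer: Layer, data: dict) -> None:
        self.subspaces[SUBSPACE_FOR_OP[layer.op_type]][layer.name] = data

    def read(self, layer: Layer) -> dict:
        # Select the target storage subspace from the layer's operation type,
        # then read that layer's static data from it.
        return self.subspaces[SUBSPACE_FOR_OP[layer.op_type]][layer.name]


def fetch_static_data(layer: Layer, space: FirstStorageSpace):
    """Return the layer's static data, or None if it uses only dynamic data."""
    if layer.op_type not in SUBSPACE_FOR_OP:
        return None
    return space.read(layer)
```

Keeping one subspace per operation type means a reader only needs the layer's operation type to know where to look, which is the property the correspondence relation provides.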

In one exemplary embodiment, the reading the target static data of the target network layer from the target storage subspace includes: reading convolution weight data and bias data of the target network layer from the first subspace as the target static data in a case where the target operation type is a convolution operation; and reading activation coefficient data of the target network layer from the second subspace as the target static data in a case where the target operation type is an activation operation.

In one exemplary embodiment, the method further comprises: determining a current allocation direction of allocating storage space for output data of the target network layer in the second storage space, wherein the current allocation direction is opposite to the allocation direction in which storage space was last allocated for data in the second storage space; and allocating a target storage space for the output data of the target network layer in the second storage space according to the current allocation direction, wherein the target storage space is used for storing target output data output by the target network layer.

In one exemplary embodiment, the determining the current allocation direction of the storage space for the output data of t