
CN-112633460-B - Method and device for constructing neural network, and image processing method and device


Abstract

The application discloses a method and a device for constructing a neural network, and an image processing method and device, in the field of artificial intelligence. The method for constructing the neural network comprises: constructing a search space according to application requirements of a target neural network, wherein the search space comprises M elements, the M elements indicate M network structures, each of the M elements comprises the number of blocks in a stage in the corresponding network structure and the number of channels of the blocks, and M is a positive integer; and selecting a target network structure from the M network structures according to a distribution relation of unevaluated elements in the search space. The method can efficiently construct a neural network that meets performance requirements.

Inventors

  • ZHANG YIKANG
  • ZHONG ZHAO

Assignees

  • Huawei Technologies Co., Ltd. (华为技术有限公司)

Dates

Publication Date
20260512
Application Date
20200624
Priority Date
20190924

Claims (20)

  1. An image processing method, comprising: acquiring an image to be processed; and classifying the image to be processed according to a target neural network to obtain a classification result of the image to be processed; wherein the target neural network is constructed from a target network structure, and the target network structure is determined by the following steps: constructing a search space according to application requirements of the target neural network, wherein the search space comprises M elements, the M elements are used for indicating M network structures, each of the M elements comprises the number of blocks in a stage in the corresponding network structure and the number of channels of the blocks, M is a positive integer, the target neural network is used for image classification, the application requirements of the target neural network comprise the running speed of the target neural network, the number of parameters of the target neural network, or the structure requirements of the target neural network, and the structure requirements comprise the number of blocks in each stage in the structure of the target neural network and the number of channels of each block; and selecting the target network structure from the M network structures according to the distribution relation of the unevaluated elements in the search space, wherein the distribution relation of the unevaluated elements is a clustering result of the unevaluated elements.
  2. The method of claim 1, wherein constructing the search space according to the application requirements of the target neural network comprises: constructing an initial search space according to the application requirements of the target neural network, wherein the initial search space comprises N initial elements, the N initial elements are used for indicating N initial network structures, each of the N initial elements comprises the number of blocks in a stage in the corresponding initial network structure and the number of channels of the blocks, and N is a positive integer greater than or equal to M; and screening the N initial network structures indicated by the N initial elements according to a preset rule to obtain the M elements in the search space, wherein the preset rule comprises: deleting a first initial element of the N initial elements from the initial search space if the number of blocks in each stage of the first initial network structure indicated by the first initial element is not greater than the number of blocks in the corresponding stage of the second initial network structure indicated by a second initial element of the N initial elements, and the number of channels of each block in each stage of the first initial network structure is not greater than the number of channels of each block in the corresponding stage of the second initial network structure.
  3. The method according to claim 1 or 2, wherein selecting the target network structure from the M network structures according to the distribution relation of the unevaluated elements in the search space comprises: determining S elements among the unevaluated elements according to the distribution relation of the unevaluated elements, wherein S is a positive integer smaller than M; and selecting the target network structure from the M network structures according to the S elements.
  4. The method according to claim 3, wherein selecting the target network structure from the M network structures according to the S elements comprises: evaluating the S network structures indicated by the S elements among the unevaluated elements to obtain evaluation results of evaluated elements, wherein the evaluation results of the evaluated elements comprise the evaluation results of the S network structures; and selecting the target network structure from the M network structures according to the evaluation results of the evaluated elements.
  5. The method of claim 4, wherein selecting the target network structure from the M network structures according to the evaluation results of the evaluated elements comprises: modeling a first unevaluated element according to the evaluation results of the evaluated elements to obtain a model of the first unevaluated element, wherein the first unevaluated element comprises the elements in the search space other than the evaluated elements; and selecting the target network structure from the M network structures according to the model of the first unevaluated element.
  6. The method of claim 5, wherein, in a case where the target network structure is not selected from the M network structures according to the distribution relation of the unevaluated elements in the search space, the target network structure is determined by: selecting the target network structure from the M network structures according to the distribution relation of the first unevaluated element and a model of the first unevaluated element constructed based on the evaluated elements.
  7. The method of claim 6, wherein selecting the target network structure from the M network structures according to the distribution relation of the first unevaluated element and the model of the first unevaluated element constructed based on the evaluated elements comprises: determining L elements in the first unevaluated element according to the distribution relation of the first unevaluated element, wherein L is a positive integer smaller than M; and selecting the target network structure from the M network structures according to the L elements and the model of the first unevaluated element constructed based on the evaluated elements.
  8. The method of claim 7, wherein selecting the target network structure from the M network structures according to the L elements and the model of the first unevaluated element constructed based on the evaluated elements comprises: determining Q elements from the L elements according to the model of the first unevaluated element constructed based on the evaluated elements, wherein Q is a positive integer smaller than L; evaluating the Q network structures indicated by the Q elements to obtain evaluation results of a first evaluated element, wherein the evaluation results of the first evaluated element comprise the evaluation results of the S network structures and the evaluation results of the Q network structures; and selecting the target network structure from the M network structures according to the evaluation results of the first evaluated element and the distribution relation of a second unevaluated element, wherein the second unevaluated element comprises the elements in the search space other than the first evaluated element.
  9. The method of claim 7, wherein the distribution relation of the first unevaluated element is a clustering result of the first unevaluated element, and the L elements are elements in L clusters included in the clustering result of the first unevaluated element, respectively.
  10. The method of claim 9, wherein the L elements are the L elements corresponding to the centers of the L clusters, respectively.
  11. The method according to claim 8, wherein, in a case where the target network structure is not selected from the M network structures according to the distribution relation of the first unevaluated element and the model of the first unevaluated element constructed based on the evaluated elements, the target network structure is determined by: re-selecting the target network structure from the M network structures according to the distribution relation of the second unevaluated element and a model of the second unevaluated element constructed based on the first evaluated element.
  12. The method according to any one of claims 6 to 10, wherein the model of the first unevaluated element constructed based on the evaluated elements is obtained according to the evaluation results of the evaluated network structures and one and/or the other of two formulas [formulas and their symbols not reproduced in this text]; wherein the symbols of the formulas represent, in order: an unevaluated network structure in the search space; an accuracy threshold; the conditional probability density function of network structures whose accuracy is greater than the accuracy threshold; the conditional probability density function of network structures whose accuracy is less than or equal to the accuracy threshold; a Gaussian mixture distribution; the Euclidean distance between two network structures; a function composed of that distance; an indexed network structure among the evaluated network structures; the accuracy of that indexed network structure; the weight corresponding to it; a normalization factor; a hyperparameter; a positive integer; and the base of the natural logarithm; and wherein one quantity of the formulas is positively correlated with the performance of the network structure, and another is positively correlated with the uncertainty of the performance of the network structure.
  13. The method according to any one of claims 6 to 10, wherein the model of the first unevaluated element constructed based on the evaluated elements is obtained according to the evaluation results of the evaluated network structures and a formula [formula and its symbols not reproduced in this text]; wherein the symbols of the formula represent, in order: an unevaluated element in the search space; an indexed element among the evaluated elements; the accuracy of that indexed element; an expectation function; and a random variable obeying a Gaussian distribution, whose mean and variance satisfy a relationship with the input [relationship not reproduced in this text]; and wherein the symbols of the relationship represent, in order: the number of evaluated network structures; two vectors and a matrix composed of quantities of the evaluated elements; a hyperparameter; an indexed element among the evaluated elements; an evaluated element that is the same as or different from that indexed element; the base of the natural logarithm; an identity matrix; a further hyperparameter; and a positive integer.
  14. The method according to claim 3, wherein the S elements are elements in S clusters included in the clustering result of the unevaluated elements, respectively.
  15. The method of claim 14, wherein the S elements are the S elements corresponding to the centers of the S clusters, respectively.
  16. An image processing apparatus, comprising: an acquisition unit configured to acquire an image to be processed; and an image processing unit configured to classify the image to be processed according to a target neural network to obtain a classification result of the image to be processed; wherein the target neural network is constructed from a target network structure, and the target network structure is determined by the following steps: constructing a search space according to application requirements of the target neural network, wherein the search space comprises M elements, the M elements are used for indicating M network structures, each of the M elements comprises the number of blocks in a stage in the corresponding network structure and the number of channels of the blocks, M is a positive integer, the target neural network is used for image classification, the application requirements of the target neural network comprise the running speed of the target neural network, the number of parameters of the target neural network, or the structure requirements of the target neural network, and the structure requirements comprise the number of blocks in each stage in the structure of the target neural network and the number of channels of each block; and selecting the target network structure from the M network structures according to the distribution relation of the unevaluated elements in the search space, wherein the distribution relation of the unevaluated elements is a clustering result of the unevaluated elements.
  17. The apparatus of claim 16, wherein constructing the search space according to the application requirements of the target neural network comprises: constructing an initial search space according to the application requirements of the target neural network, wherein the initial search space comprises N initial elements, the N initial elements are used for indicating N initial network structures, each of the N initial elements comprises the number of blocks in a stage in the corresponding initial network structure and the number of channels of the blocks, and N is a positive integer greater than or equal to M; and screening the N initial network structures indicated by the N initial elements according to a preset rule to obtain the M elements in the search space, wherein the preset rule comprises: deleting a first initial element of the N initial elements from the initial search space if the number of blocks in each stage of the first initial network structure indicated by the first initial element is not greater than the number of blocks in the corresponding stage of the second initial network structure indicated by a second initial element of the N initial elements, and the number of channels of each block in each stage of the first initial network structure is not greater than the number of channels of each block in the corresponding stage of the second initial network structure.
  18. The apparatus according to claim 16 or 17, wherein selecting the target network structure from the M network structures according to the distribution relation of the unevaluated elements in the search space comprises: determining S elements among the unevaluated elements according to the distribution relation of the unevaluated elements, wherein S is a positive integer smaller than M; and selecting the target network structure from the M network structures according to the S elements.
  19. The apparatus of claim 18, wherein selecting the target network structure from the M network structures according to the S elements comprises: evaluating the S network structures indicated by the S elements among the unevaluated elements to obtain evaluation results of evaluated elements, wherein the evaluation results of the evaluated elements comprise the evaluation results of the S network structures; and selecting the target network structure from the M network structures according to the evaluation results of the evaluated elements.
  20. The apparatus of claim 19, wherein selecting the target network structure from the M network structures according to the evaluation results of the evaluated elements comprises: modeling a first unevaluated element according to the evaluation results of the evaluated elements to obtain a model of the first unevaluated element, wherein the first unevaluated element comprises the elements in the search space other than the evaluated elements; and selecting the target network structure from the M network structures according to the model of the first unevaluated element.
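
The screening rule recited in claims 2 and 17 can be illustrated with a minimal Python sketch. The `Element` type, the function names, and the sample values are hypothetical and are not taken from the patent. The sketch also assumes that every block within a stage has the same number of channels, so that one channel count per stage is enough, and that all initial elements already satisfy the application requirements.

```python
from typing import List, NamedTuple, Tuple


class Element(NamedTuple):
    """Hypothetical encoding of one search-space element."""
    blocks: Tuple[int, ...]    # number of blocks in each stage
    channels: Tuple[int, ...]  # channel count of the blocks in each stage (assumed uniform per stage)


def is_not_greater(first: Element, second: Element) -> bool:
    """True if `first` has no more blocks and no more channels than `second` in every stage."""
    return (all(a <= b for a, b in zip(first.blocks, second.blocks))
            and all(a <= b for a, b in zip(first.channels, second.channels)))


def screen(initial: List[Element]) -> List[Element]:
    """Delete every element that is 'not greater than' some other, distinct element."""
    unique = list(dict.fromkeys(initial))  # drop exact duplicates, keep order
    return [a for a in unique
            if not any(a != b and is_not_greater(a, b) for b in unique)]


# N = 4 initial elements with three stages each (illustrative values only).
initial_space = [
    Element((1, 2, 2), (16, 32, 64)),
    Element((2, 2, 2), (16, 32, 64)),
    Element((1, 3, 2), (16, 32, 64)),
    Element((2, 2, 2), (16, 32, 32)),
]
search_space = screen(initial_space)  # the M remaining elements
```

In this example the first and fourth elements are deleted because the second element has at least as many blocks and channels in every stage, which leaves M = 2 elements. Exact duplicates are merged first so that two identical elements do not delete each other.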

Description

Method and device for constructing neural network, and image processing method and device

Technical Field

The present application relates to the field of artificial intelligence, and more particularly, to a method and apparatus for constructing a neural network, and an image processing method and apparatus.

Background

Artificial intelligence (AI) is the theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, obtain knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-machine interaction, recommendation and search, basic AI theory, and the like.

With the rapid development of artificial intelligence technology, the performance of neural networks (for example, convolutional neural networks) has continuously improved, and neural networks have achieved great success in processing and analyzing various media signals such as images, videos, and voice. A neural network with excellent performance often has an exquisite network structure, but in practical applications, because training sets, index requirements, and application targets differ, an existing network structure usually cannot be used directly.
At present, one common practice is to adjust an existing network structure for each task; however, it is often difficult to obtain a network structure with excellent performance through such adjustment. Another common practice is to search for the network structure automatically based on automated machine learning (AutoML); however, AutoML designs the network structure from scratch, and the complexity of the task is very high. Therefore, how to efficiently construct a neural network that meets performance requirements is a technical problem to be solved.

Disclosure of Invention

The application provides a method and a device for constructing a neural network, and an image processing method and device, which can efficiently construct a neural network that meets performance requirements.

In a first aspect, a method for constructing a neural network is provided. The method comprises: constructing a search space according to application requirements of a target neural network, wherein the search space comprises M elements, the M elements are used for indicating M network structures, each of the M elements comprises the number of blocks in a stage in the corresponding network structure and the number of channels of the blocks, and M is a positive integer; and selecting a target network structure from the M network structures according to a distribution relation of unevaluated elements in the search space.

In the embodiment of the application, representative elements in the search space can be selected according to the distribution relation of the unevaluated elements. The target network structure can then be efficiently selected from the M network structures according to the representative elements, so that a neural network meeting the performance requirements can be efficiently constructed.
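
As a rough illustration of selecting representative elements from the distribution of unevaluated elements, the following Python sketch clusters the elements and returns the element nearest each cluster center. The text specifies only that the distribution relation is a clustering result; the use of k-means with farthest-point initialization, the flat vector encoding (block counts followed by channel counts), and all names and values below are assumptions made for this sketch.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two encoded elements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans_centers(points: List[Vector], k: int, iters: int = 20) -> List[Vector]:
    """Plain k-means (an assumed clustering method) with deterministic initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Farthest-point initialization: add the point farthest from all chosen centers.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old center if the group is empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers


def representatives(unevaluated: List[Vector], s: int) -> List[Vector]:
    """Return, for each of the s clusters, the unevaluated element closest to the cluster center."""
    reps: List[Vector] = []
    for center in kmeans_centers(unevaluated, s):
        nearest = min(unevaluated, key=lambda p: dist(p, center))
        if nearest not in reps:
            reps.append(nearest)
    return reps


# Each element is encoded as (blocks_stage1, blocks_stage2, channels_stage1, channels_stage2).
unevaluated = [
    (1, 1, 16, 32), (1, 2, 16, 32), (2, 1, 16, 32),
    (4, 4, 64, 128), (4, 5, 64, 128), (5, 4, 64, 128),
]
reps = representatives(unevaluated, 2)  # S = 2 representative elements
```

Only the S representatives would then need to be trained and evaluated, instead of all M elements. In practice the block and channel dimensions would likely need rescaling before clustering, because the channel counts dominate the raw Euclidean distance.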
Optionally, the "corresponding network structure" of each of the M elements is the network structure that the element indicates. Accordingly, each of the M elements includes the number of blocks in a stage in the network structure indicated by that element and the number of channels of the blocks.

With reference to the first aspect, in certain implementations of the first aspect, the application requirements of the target neural network include the running speed of the target neural network, the number of parameters of the target neural network, or the structure requirements of the target neural network, where the structure requirements include the number of blocks in each stage of the structure of the target neural network and the number of channels of each block.

In the embodiment of the application, the search space is constructed according to the running speed of the target neural network, the number of parameters of the target neural network, or the structure requirements of the target neural network, and a plurality of network structures with low performance can be screened out in the process of constructing the search space, so