US-12626038-B2 - System for universal hardware-neural network architecture search (co-design)

US 12626038 B2

Abstract

An architecture search system evaluates a search space of neural network and hardware architectures with a plurality of candidate controllers. Each controller attempts to identify an optimized architecture using a different optimization algorithm. To identify a controller for the search space, the architecture search system samples subspaces of the search space having a portion of the neural network search space and a portion of the hardware search space. For each subspace, candidate controllers are scored with respect to the optimized design determined by the respective candidate controllers. Using the scores for the various candidate controllers across the sampled subspaces, a controller is selected to optimize the overall network architecture search space.
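The controller-selection procedure summarized above can be sketched in code. This is a minimal illustrative sketch, not the patented implementation: the function names, the toy fitness objective, and the two candidate controllers are all assumptions introduced here for clarity.

```python
import random

def fitness(design):
    """Toy objective for a (nn, hw) design pair (hypothetical stand-in)."""
    nn, hw = design
    return nn * hw

def exhaustive_search(subspace):
    """Candidate controller: enumerate the subspace (always finds its optimum)."""
    return max(subspace, key=fitness)

def worst_case_search(subspace):
    """Deliberately poor candidate controller, included for contrast."""
    return min(subspace, key=fitness)

def select_controller(search_space, controllers, num_subspaces=4, subspace_size=8):
    """Score each candidate controller on sampled subspaces; return the best name.

    Each sampled subspace contains a portion of the joint NN/HW search space.
    A controller's score for a subspace compares its optimized design against
    the subspace's enumerated optimum; scores accumulate across subspaces.
    """
    scores = {name: 0.0 for name in controllers}
    for _ in range(num_subspaces):
        subspace = random.sample(search_space, subspace_size)
        optimal_score = fitness(max(subspace, key=fitness))
        for name, controller in controllers.items():
            design = controller(subspace)
            scores[name] += fitness(design) / optimal_score
    return max(scores, key=scores.get)
```

The selected controller would then be applied to the full search space rather than only the sampled subspaces.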

Inventors

  • Yash Akhauri
  • Nilesh Jain
  • Juan Pablo Munoz Chiabrando
  • Adithya M. Niranjan

Assignees

  • INTEL CORPORATION

Dates

Publication Date
2026-05-12
Application Date
2021-12-16
Priority Date
2021-09-29

Claims (20)

  1. A method for optimizing a hardware-network architecture search for automated computer models, comprising: sampling a plurality of search subspaces from a search space of neural network architectures and hardware architectures, each search subspace including a portion of the search space; determining controller-optimized designs for the plurality of search subspaces using a plurality of candidate controllers, wherein the plurality of candidate controllers identify the controller-optimized designs using different optimization algorithms, wherein a controller-optimized design is a hardware-neural network design; scoring the plurality of candidate controllers by determining a plurality of scores for a single candidate controller, wherein determining the plurality of scores for the single candidate controller comprises: determining a first score for the single candidate controller based on one or more controller-optimized designs for a first search subspace and an optimal subspace design for the first search subspace, and determining a second score for the single candidate controller based on one or more controller-optimized designs for a second search subspace and an optimal subspace design for the second search subspace; selecting an optimization controller from the plurality of candidate controllers based on the scoring; determining a neural network architecture and hardware architecture by applying the selected optimization controller on at least part of the search space; generating an automated computer model with the neural network architecture; and executing the automated computer model using a hardware device with the hardware architecture.
  2. The method of claim 1, further comprising applying a search definition for the search space to the selected optimization controller, wherein the search definition specifies an objective of a task to be performed by the automated computer model.
  3. The method of claim 1, wherein for each search subspace, the scoring includes all of the plurality of candidate controllers.
  4. The method of claim 1, wherein, for each search subspace, the plurality of candidate controllers are scored in an exploration order, wherein for a particular candidate controller in the exploration order, a controller-optimized design for the particular candidate controller is evaluated with respect to a quality threshold, such that when the quality threshold is met, a subsequent controller in the exploration order is not evaluated for the search subspace, and when the quality threshold is not met, the next controller in the exploration order is evaluated.
  5. The method of claim 1, wherein the optimal subspace design is determined from a plurality of subspace configurations by enumerating all subspace configurations within the search subspace.
  6. The method of claim 1, wherein one or more of the plurality of candidate controllers include interfacing between a hardware controller and a neural network controller.
  7. The method of claim 6, wherein the hardware controller or neural network controller includes differentiable algorithms, genetic algorithms, or reinforcement algorithms.
  8. The method of claim 1, wherein the neural network architecture and hardware architecture include one or more of discrete designs, probabilistic designs, and a design population.
  9. The method of claim 1, wherein the scoring is based on an objective included in the search definition.
  10. A system comprising: a processor; and a non-transitory computer-readable storage medium containing computer program code for execution by the processor for: sampling a plurality of search subspaces from a search space of neural network architectures and hardware architectures, each search subspace including a portion of the search space, determining controller-optimized designs for the plurality of search subspaces using a plurality of candidate controllers, wherein the plurality of candidate controllers identify the controller-optimized designs using different optimization algorithms, wherein a controller-optimized design is a hardware-neural network design, scoring the plurality of candidate controllers by determining a plurality of scores for a single candidate controller, wherein determining the plurality of scores for the single candidate controller comprises: determining a first score for the single candidate controller based on one or more controller-optimized designs for a first search subspace and an optimal subspace design for the first search subspace, and determining a second score for the single candidate controller based on one or more controller-optimized designs for a second search subspace and an optimal subspace design for the second search subspace, selecting an optimization controller from the plurality of candidate controllers based on the scoring, determining a neural network architecture and hardware architecture by applying the selected optimization controller on at least part of the search space, generating an automated computer model with the neural network architecture, and executing the automated computer model using a hardware device with the hardware architecture.
  11. The system of claim 10, wherein the computer program code is for further execution by the processor for applying a search definition for the search space to the selected optimization controller, wherein the search definition specifies an objective of a task to be performed by the automated computer model.
  12. The system of claim 10, wherein for each search subspace, the scoring includes all of the plurality of candidate controllers.
  13. The system of claim 10, wherein, for each search subspace, the plurality of candidate controllers are scored in an exploration order, wherein for a particular candidate controller in the exploration order, a controller-optimized design for the particular candidate controller is evaluated with respect to a quality threshold, such that when the quality threshold is met, a subsequent controller in the exploration order is not evaluated for the search subspace, and when the quality threshold is not met, the next controller in the exploration order is evaluated.
  14. The system of claim 10, wherein the optimal subspace design is determined from a plurality of subspace configurations by enumerating all subspace configurations within the search subspace.
  15. The system of claim 10, wherein one or more of the plurality of candidate controllers include interfacing between a hardware controller and a neural network controller.
  16. The system of claim 15, wherein the hardware controller or neural network controller includes differentiable algorithms, genetic algorithms, or reinforcement algorithms.
  17. The system of claim 10, wherein the neural network architecture and hardware architecture include one or more of discrete designs, probabilistic designs, and a design population.
  18. The system of claim 10, wherein the scoring is based on an objective included in the search definition.
  19. A non-transitory computer-readable storage medium containing instructions executable by a processor for: sampling a plurality of search subspaces from a search space of neural network architectures and hardware architectures, each search subspace including a portion of the search space; determining controller-optimized designs for the plurality of search subspaces using a plurality of candidate controllers, wherein the plurality of candidate controllers identify the controller-optimized designs using different optimization algorithms, wherein a controller-optimized design is a hardware-neural network design; scoring the plurality of candidate controllers by determining a plurality of scores for a single candidate controller, wherein determining the plurality of scores for the single candidate controller comprises: determining a first score for the single candidate controller based on one or more controller-optimized designs for a first search subspace and an optimal subspace design for the first search subspace, and determining a second score for the single candidate controller based on one or more controller-optimized designs for a second search subspace and an optimal subspace design for the second search subspace; selecting an optimization controller from the plurality of candidate controllers based on the scoring; determining a neural network architecture and hardware architecture by applying the selected optimization controller on at least part of the search space; generating an automated computer model with the neural network architecture; and executing the automated computer model using a hardware device with the hardware architecture.
  20. The non-transitory computer-readable storage medium of claim 19, wherein the instructions are further executable for applying a search definition for the search space to the selected optimization controller, wherein the search definition specifies an objective of a task to be performed by the automated computer model.
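Claims 4 and 13 describe scoring candidate controllers in an exploration order with early stopping. A minimal sketch of that control flow, assuming a simple numeric fitness function and hypothetical controller names (none of which come from the patent):

```python
def score_in_exploration_order(subspace, ordered_controllers, fitness, quality_threshold):
    """Evaluate controllers in a fixed exploration order with early stopping.

    When a controller's optimized design meets the quality threshold, later
    controllers in the order are skipped for this subspace; otherwise the
    next controller in the order is evaluated.
    """
    scores = {}
    for name, controller in ordered_controllers:
        design = controller(subspace)
        scores[name] = fitness(design)
        if scores[name] >= quality_threshold:
            break  # subsequent controllers are not evaluated for this subspace
    return scores
```

With a low threshold only the first controller runs; with a higher threshold the search falls through to the next controller in the order.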

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority to Indian Provisional Patent Application No. 202141044206, filed on Sep. 29, 2021, the contents of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

This disclosure relates generally to optimization of computer model architectures, and particularly to optimization of computer model architectures jointly with the specialized hardware executing the computer model.

BACKGROUND

The industry has been facing challenges in scaling the deployment of AI algorithms (e.g., computer models) on a range of hardware. While hardware-aware model architecture search alleviates the issue of optimizing a neural network for a single hardware platform, there is a shift toward driving the performance of hardware architectures further with custom and configurable accelerators (ASICs, FPGAs) tailored for deep learning. However, research in automatically optimizing both hardware and computer model parameters is still in its early stages, with most current industry approaches being manual. In addition, there are further challenges in making approaches generalizable rather than tailored to specific, narrower problems. As a result, it is difficult to develop an approach that converges effectively for a broad range of hardware accelerators, addresses highly irregular search spaces, generalizes to both computer model and hardware design, allows arbitrary design and evaluation methods, and allows for multi-workload optimization.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will be readily understood by the following detailed description in conjunction with the accompanying drawings. To facilitate this description, like reference numerals designate like structural elements. Embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings.

FIG. 1 is an example system for performing joint computer model and hardware architecture search, according to some embodiments. FIG. 2 provides a data flow overview of applying a controller to a subspace, according to one embodiment. FIG. 3 shows an example of controller interfacing between various neural network controllers and hardware controllers. FIG. 4 provides examples of joint controllers that illustrate example relationships between various neural network and hardware controllers and the generated network representations. FIG. 5 shows an example flow for a particular controller, according to one embodiment. FIG. 6 shows an example flow for evaluating candidate controllers for a search space, according to one embodiment. FIG. 7 shows an additional example of search space exploration over a plurality of candidate controllers using an exploration order, according to one embodiment. FIG. 8 shows example computer model inference and computer model training. FIG. 9 illustrates an example neural network architecture. FIG. 10 is a block diagram of an example computing device that may include one or more components used for training, analyzing, or implementing a computer model in accordance with any of the embodiments disclosed herein.

DETAILED DESCRIPTION

Overview

Described herein is a system that supports automatic optimization of any given input hardware (“HW”) and neural network (“NN”) search spaces for arbitrary tasks. Although the term “neural network” is used herein, embodiments of the present invention apply more generally to additional types of computer models. This innovation can extend optimization both to the neural network and hardware jointly and to an individual dimension (neural network or hardware) alone, depending on the user's needs. Such a system is fully configurable in the interaction between its components, which are also easily extendable to support new components.
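One way to read the controller interfacing between a neural network controller and a hardware controller (claims 6 and 15) is as alternation: each controller optimizes its own dimension while the other dimension is held fixed. This is an illustrative sketch of one such scheme among the many the disclosure contemplates; the function names and signatures are hypothetical.

```python
def joint_search(nn_space, hw_space, nn_controller, hw_controller, rounds=3):
    """Alternate between an NN controller and a HW controller.

    Each controller optimizes its own dimension while the other dimension is
    held fixed; this coordinate-descent-style interfacing is one possible
    arrangement, not the only one the disclosure covers.
    """
    nn, hw = nn_space[0], hw_space[0]  # arbitrary starting design
    for _ in range(rounds):
        nn = nn_controller(nn_space, fixed_hw=hw)
        hw = hw_controller(hw_space, fixed_nn=nn)
    return nn, hw
```

In practice, each controller here could itself be a differentiable, genetic, or reinforcement-based algorithm, as the claims recite.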
This contrasts with present frameworks, which rely on expert users to tailor the components of the framework to their requirements. Described herein are search space analyzers that automatically determine optimizations without requiring manual experts to design such frameworks, as well as interfacing units that automate the flow based on the search space analyzers' decisions. This innovation enables the end user to make optimal decisions without hardware or computer model (e.g., neural network) architectural expertise or manual tuning, tasks which require significant engineering cost and domain knowledge in both neural network and hardware design. An end-to-end automated general Hardware-Neural Architecture Search solution requires a combination of efficient methods for manipulating and representing neural network and hardware architectures, and methods to recognize the most efficient algorithm to apply for exploring the chosen search space. Described herein is a system that integrates a wide array of optimization algorithms (e.g., optimization controllers), interfacing these methods along with search space analyzers to rec