CN-121981176-A - Modifying a computational graph

Abstract

The present invention relates to modifying a computational graph. Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are provided for modifying a computational graph to include a sending node and a receiving node. By inserting a sending node and a receiving node into each subgraph, communication between the distinct devices that perform the operations of different subgraphs of the computational graph can be handled efficiently. When executed, the operations represented by these sending and receiving nodes may enable pairs of distinct devices to communicate with each other in a self-sufficient manner. This removes the burden of coordinating communication from the backend, which gives a system that processes such computational graph representations the opportunity to perform one or more other processes while the devices are executing the subgraphs.

Inventors

  • Vijay Vasudevan
  • Jeffrey Adgate Dean
  • Sanjay Ghemawat

Assignees

  • Google LLC

Dates

Publication Date
2026-05-05
Application Date
2016-10-28
Priority Date
2015-10-28

Claims (18)

  1. A computer-implemented method, comprising: receiving, by a first component from a plurality of components in a system, a request to perform an operation represented by two or more nodes in a first subgraph of a plurality of subgraphs of a computational graph, wherein: the plurality of subgraphs of the computational graph are distributed among different ones of the plurality of components of the system, such that at least two of the different components are each configured to process subgraphs that are different from one another; the computational graph includes a plurality of nodes and a plurality of directed edges, each node from the plurality of nodes representing a respective operation, each directed edge of the plurality of directed edges connecting a respective first node to a respective second node representing an operation that receives as input an output of the operation represented by the respective first node; the plurality of directed edges includes one or more cross-component directed edges in the computational graph, each of the one or more cross-component directed edges being a directed edge connecting a respective first node to a respective second node assigned to a component different from the component to which the respective first node is assigned, the one or more cross-component directed edges including a first cross-component directed edge connecting a first particular node in the first subgraph assigned to the first component with a second particular node in a second subgraph of the plurality of subgraphs, the second subgraph being assigned to a second, different component from the plurality of components in the system; for each of the one or more cross-component directed edges, the computational graph includes a) a sending node between the respective first node and the respective second node in the computational graph, and b) a receiving node between the sending node and the respective second node in the computational graph; the second subgraph assigned to the second, different component is modified to include a first sending node on the first cross-component directed edge between the first particular node and the second particular node; the first subgraph for the first component is modified to include a first receiving node on the first cross-component directed edge between the first particular node and the first sending node; and the first receiving node represents an operation to receive, from the first sending node, data output by the second particular node assigned to the second, different component; receiving, from the second, different component and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different component and sent using the operation represented by the first sending node; and providing, using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  2. The method of claim 1, wherein: the first component includes a first device from a plurality of devices in the system; the second, different component includes a second, different device from the plurality of devices in the system; receiving the data includes receiving, by the first device and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different device and sent from the second, different device using the operation represented by the first sending node; and providing the data includes providing, by the first device and using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  3. The method of claim 1, wherein receiving data output by the second particular node comprises receiving, from the second, different component, a tensor output by the second particular node assigned to the second, different component.
  4. The method of claim 1, comprising: transmitting, to the second, different component and using the operation represented by the first receiving node, a request for the output of the operation represented by the first sending node, wherein receiving the data output by the second particular node from the second, different component using the operation represented by the first sending node is in response to transmitting the request for the output of the operation represented by the first sending node.
  5. The method of claim 1, wherein: the first component includes a first hardware resource included in one or more first machines; the second, different component includes a second hardware resource included in one or more second machines; and receiving data output by the second particular node includes receiving, by the first component and from the second, different component, the data output by the second particular node assigned to the second, different component according to a protocol corresponding to each of the first hardware resource and the second hardware resource.
  6. The method of claim 1, wherein receiving data output by the second particular node comprises receiving, from the second, different component, the data output by the second particular node using the operation represented by the first receiving node and the operation represented by the first sending node, independently of operations represented by other sending nodes and other receiving nodes for cross-component directed edges other than the first cross-component directed edge.
  7. A system comprising a plurality of components at one or more locations, the plurality of components being implemented on one or more computers having instructions stored thereon, the instructions being operable to cause the system to perform acts comprising: receiving, by a first component from the plurality of components in the system, a request to perform an operation represented by two or more nodes in a first subgraph of a plurality of subgraphs of a computational graph, wherein: the plurality of subgraphs of the computational graph are distributed among different ones of the plurality of components of the system, such that at least two of the different components are each configured to process subgraphs that are different from one another; the computational graph includes a plurality of nodes and a plurality of directed edges, each node from the plurality of nodes representing a respective operation, each directed edge of the plurality of directed edges connecting a respective first node to a respective second node representing an operation that receives as input an output of the operation represented by the respective first node; the plurality of directed edges includes one or more cross-component directed edges in the computational graph, each of the one or more cross-component directed edges being a directed edge connecting a respective first node to a respective second node assigned to a component different from the component to which the respective first node is assigned, the one or more cross-component directed edges including a first cross-component directed edge connecting a first particular node in the first subgraph assigned to the first component with a second particular node in a second subgraph of the plurality of subgraphs, the second subgraph being assigned to a second, different component from the plurality of components in the system; for each of the one or more cross-component directed edges, the computational graph includes a) a sending node between the respective first node and the respective second node in the computational graph, and b) a receiving node between the sending node and the respective second node in the computational graph; the second subgraph assigned to the second, different component is modified to include a first sending node on the first cross-component directed edge between the first particular node and the second particular node; the first subgraph for the first component is modified to include a first receiving node on the first cross-component directed edge between the first particular node and the first sending node; and the first receiving node represents an operation to receive, from the first sending node, data output by the second particular node assigned to the second, different component; receiving, from the second, different component and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different component and sent using the operation represented by the first sending node; and providing, using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  8. The system of claim 7, wherein: the first component includes a first device from a plurality of devices in the system; the second, different component includes a second, different device from the plurality of devices in the system; receiving the data includes receiving, by the first device and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different device and sent from the second, different device using the operation represented by the first sending node; and providing the data includes providing, by the first device and using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  9. The system of claim 7, wherein receiving data output by the second particular node comprises receiving, from the second, different component, a tensor output by the second particular node assigned to the second, different component.
  10. The system of claim 7, the acts comprising: transmitting, to the second, different component and using the operation represented by the first receiving node, a request for the output of the operation represented by the first sending node, wherein receiving the data output by the second particular node from the second, different component using the operation represented by the first sending node is in response to transmitting the request for the output of the operation represented by the first sending node.
  11. The system of claim 7, wherein: the first component includes a first hardware resource included in one or more first machines; the second, different component includes a second hardware resource included in one or more second machines; and receiving data output by the second particular node includes receiving, by the first component and from the second, different component, the data output by the second particular node assigned to the second, different component according to a protocol corresponding to each of the first hardware resource and the second hardware resource.
  12. The system of claim 7, wherein receiving data output by the second particular node comprises receiving, from the second, different component, the data output by the second particular node using the operation represented by the first receiving node and the operation represented by the first sending node, independently of operations represented by other sending nodes and other receiving nodes for cross-component directed edges other than the first cross-component directed edge.
  13. A non-transitory computer-readable storage device having instructions stored thereon that, when executed, cause performance of acts comprising: receiving, by a first component from a plurality of components in a system, a request to perform an operation represented by two or more nodes in a first subgraph of a plurality of subgraphs of a computational graph, wherein: the plurality of subgraphs of the computational graph are distributed among different ones of the plurality of components of the system, such that at least two of the different components are each configured to process subgraphs that are different from one another; the computational graph includes a plurality of nodes and a plurality of directed edges, each node from the plurality of nodes representing a respective operation, each directed edge of the plurality of directed edges connecting a respective first node to a respective second node representing an operation that receives as input an output of the operation represented by the respective first node; the plurality of directed edges includes one or more cross-component directed edges in the computational graph, each of the one or more cross-component directed edges being a directed edge connecting a respective first node to a respective second node assigned to a component different from the component to which the respective first node is assigned, the one or more cross-component directed edges including a first cross-component directed edge connecting a first particular node in the first subgraph assigned to the first component with a second particular node in a second subgraph of the plurality of subgraphs, the second subgraph being assigned to a second, different component from the plurality of components in the system; for each of the one or more cross-component directed edges, the computational graph includes a) a sending node between the respective first node and the respective second node in the computational graph, and b) a receiving node between the sending node and the respective second node in the computational graph; the second subgraph assigned to the second, different component is modified to include a first sending node on the first cross-component directed edge between the first particular node and the second particular node; the first subgraph for the first component is modified to include a first receiving node on the first cross-component directed edge between the first particular node and the first sending node; and the first receiving node represents an operation to receive, from the first sending node, data output by the second particular node assigned to the second, different component; receiving, from the second, different component and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different component and sent using the operation represented by the first sending node; and providing, using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  14. The computer-readable storage device of claim 13, wherein: the first component includes a first device from a plurality of devices in the system; the second, different component includes a second, different device from the plurality of devices in the system; receiving the data includes receiving, by the first device and using the operation represented by the first receiving node, the data that is output by the second particular node assigned to the second, different device and sent from the second, different device using the operation represented by the first sending node; and providing the data includes providing, by the first device and using the operation represented by the first receiving node, the data output by the second particular node as input to the operation represented by the first particular node.
  15. The computer-readable storage device of claim 13, wherein receiving data output by the second particular node comprises receiving, from the second, different component, a tensor output by the second particular node assigned to the second, different component.
  16. The computer-readable storage device of claim 13, the acts comprising: transmitting, to the second, different component and using the operation represented by the first receiving node, a request for the output of the operation represented by the first sending node, wherein receiving the data output by the second particular node from the second, different component using the operation represented by the first sending node is in response to transmitting the request for the output of the operation represented by the first sending node.
  17. The computer-readable storage device of claim 13, wherein: the first component includes a first hardware resource included in one or more first machines; the second, different component includes a second hardware resource included in one or more second machines; and receiving data output by the second particular node includes receiving, by the first component and from the second, different component, the data output by the second particular node assigned to the second, different component according to a protocol corresponding to each of the first hardware resource and the second hardware resource.
  18. The computer-readable storage device of claim 13, wherein receiving data output by the second particular node comprises receiving, from the second, different component, the data output by the second particular node using the operation represented by the first receiving node and the operation represented by the first sending node, independently of operations represented by other sending nodes and other receiving nodes for cross-component directed edges other than the first cross-component directed edge.
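
For illustration only, the runtime behavior recited in the claims can be sketched in Python: a receiving node requests the output of its paired sending node, the sending node transmits a tensor in response, and the received tensor is provided as input to the consuming operation. The `Channel` class, the two `run_*` functions, and the use of threads with in-memory queues are all hypothetical stand-ins chosen for this sketch; the claims do not prescribe any of them, and real components would typically use a transport matched to their hardware.

```python
import queue
import threading


class Channel:
    """Hypothetical link pairing one sending node with one receiving node.

    Each cross-component edge gets its own Channel, so one pair can
    exchange data independently of every other pair.
    """

    def __init__(self):
        self._requests = queue.Queue()
        self._data = queue.Queue()

    def send(self, produce):
        # Sending-node operation: wait for a request, then transmit the output.
        self._requests.get()
        self._data.put(produce())

    def recv(self):
        # Receiving-node operation: request the output, then block until it arrives.
        self._requests.put("request")
        return self._data.get()


def run_second_component(channel):
    # The second particular node outputs a tensor (a plain list in this sketch),
    # which the sending node transmits once it has been requested.
    channel.send(lambda: [1.0, 2.0, 3.0])


def run_first_component(channel):
    # The receiving node obtains the tensor and provides it as input to the
    # first particular node, modeled here as a sum.
    tensor = channel.recv()
    return sum(tensor)


channel = Channel()
producer = threading.Thread(target=run_second_component, args=(channel,))
producer.start()
result = run_first_component(channel)
producer.join()
print(result)
```

Neither component needs a central coordinator to complete the exchange, which is the property the abstract describes as communicating in a self-sufficient manner.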

Description

Modifying a computational graph

Divisional application statement

This application is a divisional of Chinese patent application No. 201680063358.6, filed on October 28, 2016.

Background

This specification relates to modifying a computational graph representing a neural network and/or to using the modified computational graph to process model inputs.

Neural networks are machine learning models that employ one or more layers of models to generate an output, e.g., one or more classifications, for a received input. Some neural networks include one or more hidden layers in addition to the output layer. The output of each hidden layer is used as an input to the next layer in the network (i.e., the next hidden layer or the output layer of the network). Each layer in the network generates an output from the received input based on the current values of the layer's respective parameter set.

The layers of a neural network may be processed by a single device. The device may have a processor that performs operations, such as generating an output of a layer from an input, and stores the outputs of those operations in a memory. Because the operations typically required to generate an output in a neural network are numerous and large, a single device may take a significant amount of time to process the layers of the neural network.

Disclosure of Invention

In general, this specification describes systems for modifying computational graphs representing neural networks and other machine learning models.

Particular embodiments of the subject matter described in this specification can be implemented to realize one or more of the following advantages. The operations of a neural network (e.g., the operations of generating inferences from inputs or of training the neural network) may be represented as a computational graph of nodes and directed edges. The system processes such computational graph representations to efficiently perform the operations of the neural network. For example, the subgraphs of the computational graph may be assigned to distinct devices, each of which performs the operations in its respective subgraph, to reduce the overall time required to perform the neural network operations. By inserting a sending node and a receiving node into each subgraph, communication between the distinct devices that perform the operations of different subgraphs of the computational graph can be handled efficiently. When executed, the operations represented by these sending and receiving nodes may enable pairs of distinct devices to communicate with each other in a self-sufficient manner. This removes the burden of coordinating communication from the backend, which gives a system that processes such computational graph representations the opportunity to perform one or more other processes while the devices are executing the subgraphs. The sending and receiving nodes also serve to separate the subgraphs in a way that allows the neural network, or the part of the neural network represented by these subgraphs, to be trained on one device and later assigned to another device. For at least these reasons, modifying the computational graph to include pairs of sending and receiving nodes may help reduce the time cost and network traffic required to process the computational graph in a distributed manner.
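
The graph modification described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Graph` container, the function names `insert_send_recv` and `subgraph_edges`, the `send_<i>`/`recv_<i>` naming scheme, and the device labels are all hypothetical. The sketch assumes that each sending node is placed on the producer's device and each receiving node on the consumer's device, so that the only edge crossing a device boundary is the one between a sending node and its paired receiving node.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    device: dict  # node name -> device name (the assignment)
    edges: list   # directed edges as (producer, consumer) pairs


def insert_send_recv(graph):
    """Return a copy of the graph with a send/recv pair on every cross-device edge."""
    device = dict(graph.device)
    edges = []
    for i, (src, dst) in enumerate(graph.edges):
        if device[src] == device[dst]:
            edges.append((src, dst))  # same device: edge is left unchanged
            continue
        send, recv = f"send_{i}", f"recv_{i}"
        device[send] = device[src]  # assumption: sending node lives with the producer
        device[recv] = device[dst]  # assumption: receiving node lives with the consumer
        edges += [(src, send), (send, recv), (recv, dst)]
    return Graph(device, edges)


def subgraph_edges(graph, dev):
    """Edges whose endpoints are both assigned to the given device."""
    return [(a, b) for a, b in graph.edges
            if graph.device[a] == dev and graph.device[b] == dev]


g = Graph(device={"a": "dev0", "b": "dev0", "c": "dev1"},
          edges=[("a", "b"), ("b", "c")])
modified = insert_send_recv(g)
print(modified.edges)
print(subgraph_edges(modified, "dev0"), subgraph_edges(modified, "dev1"))
```

After the modification, each device's subgraph is self-contained: `dev0` ends at a sending node and `dev1` begins at a receiving node, so each device can be handed its subgraph without knowing the rest of the graph.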

In one aspect, the subject matter described in this specification can be embodied in methods that include the acts of: obtaining data representing a computational graph that includes a plurality of nodes and directed edges, wherein each node represents a respective operation, and wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives as input an output of the operation represented by the respective first node; obtaining data identifying an assignment of the computational graph across a plurality of devices, the assignment assigning each node in the computational graph to a respective device of the plurality of devices; identifying one or more cross-device directed edges in the computational graph, each being a directed edge that connects a respective first node to a respective second node that is assigned in the assignment to a device different from that of the respective first node; and, for each cross-device directed edge, inserting a sending node between the respective first node and the respective second node in the computational graph, inserting a receiving node between the sending node and the respective second node in the computational graph, and modifying the assignment to assign the sending node to the same device as the respective first node and the receiving node to the same device as the respective second node. These methods may be computer-implemented methods. In one embodiment, the method further includes receiving model inputs and processing the model inputs according to the operations represented by the modified computational graph. In anoth