US-20260127481-A1 - PARALLEL DECODING FOR QUANTUM ERROR CORRECTING CODES
Abstract
Methods, systems, and apparatus for parallel decoding for quantum error correction codes. In one aspect, a classical computer system is configured to implement a decoding process on measurement data received from a quantum computing system to determine errors in a quantum computation. The classical computing system implements a main thread, multiple worker threads, and a data structure common to each worker thread. The data structure stores data of a dynamic system of disjoint clusters of nodes of a detector graph for the decoding process, where the data includes compressed logical flip information of child nodes in each cluster of nodes. During execution of the decoding process, the multiple worker threads are configured to, in parallel: obtain clusters of nodes and modify the clusters of nodes, where, for each modification, the worker thread updates data in the data structure that corresponds to the cluster under an atomicity primitive.
Inventors
- Noah John Shutty
Assignees
- GOOGLE LLC
Dates
- Publication Date
- 20260507
- Application Date
- 20241119
Claims (20)
- 1 . A system comprising: a classical computer system configured to implement a worker thread of a decoding process on measurement data received from a quantum computing system to determine errors in a quantum computation implemented by the quantum computing system, wherein the worker thread is one of a plurality of worker threads; a data structure on the classical computer system, wherein each other worker thread also has a data structure common to the data structure on the classical computer system, wherein the data structure stores data of a dynamic system of disjoint clusters of nodes of a detector graph for the decoding process, wherein: each cluster of nodes comprises a root node that has no ascendant node and one or more descending child nodes, wherein the child nodes include leaf nodes with no descending nodes or one or more descending child nodes; and the data comprises compressed logical flip information of child nodes in each cluster of nodes; wherein during execution of the decoding process, the worker thread is configured to, in parallel with each of the other worker threads: obtain, from a main thread running on a data processing apparatus, one or more clusters of nodes in the detector graph, wherein each cluster of nodes comprises one or more detection events in the measurement data; and execute the decoding process to modify the one or more clusters of nodes, wherein, for each modification of a cluster, the worker thread updates data in the data structure that corresponds to the cluster.
- 2 . The system of claim 1 , wherein the data structure stores, at a root node of each cluster of nodes, data specifying one or more of: a node type specifier, a parity of detection events in the cluster of nodes, a total size of the cluster of nodes, boundary nodes of the cluster of nodes, a minimum time coordinate, and a maximum time coordinate.
- 3 . The system of claim 1 , wherein the classical computing system further comprises one or more memory regions external to the data structure, wherein each memory region stores a boundary map for a respective cluster of nodes.
- 4 . The system of claim 3 , wherein data objects in the memory region are stored under a conservation law, wherein under the conservation law, data in the data objects are popped off a child node memory before push updates add the data to objects associated with a root node.
- 5 . The system of claim 1 , wherein the compressed logical flip information of a child node included in a respective cluster of nodes is stored at the child node.
- 6 . The system of claim 1 , wherein compressed logical flip information of a child node comprises a parity of logical flips along a decoding graph path from the child node to its parent node, and wherein the compressed logical flip information is used with cluster parity information to determine a parity flip applied to a logical observable by a fusion of two clusters.
- 7 . The system of claim 1 , wherein the classical computing system is further configured to recover a decoding output of the decoding process, comprising using data in the data structure to: compute, for each node in the detector graph that is associated with a detection event, a net logical flip on a path to a respective root node of the node; and compute a total parity of the net logical flips.
- 8 . The system of claim 1 , wherein the decoding process comprises a union-find or minimum weight perfect matching decoding process.
- 9 . The system of claim 1 , wherein the data structure is lock-free.
- 10 . The system of claim 1 , wherein, for each modification of a cluster, the worker thread updates data in the data structure that corresponds to the cluster under an atomicity primitive.
- 11 . The system of claim 10 , wherein the atomicity primitive comprises a compare-and-swap atomicity primitive or an atomic pool allocator with reference counting.
- 12 . The system of claim 1 , wherein updates of data in the data structure that correspond to a respective modification of a cluster are packed into a single word.
- 13 . The system of claim 12 , wherein the single word comprises a pointer to external atomic data.
- 14 . A computer implemented method comprising: obtaining, by a worker thread of multiple other worker threads and from a main thread, a cluster of nodes that comprises one or more detection events in measurement data received from a quantum computing system, wherein the cluster of nodes comprises a root node that has no ascendant node and one or more descending child nodes, the child nodes including leaf nodes with no descending nodes or one or more descending child nodes; executing a decoding process on a detector graph to modify the obtained cluster of nodes, comprising, for each modification, updating, in a data structure, data that corresponds to the cluster, wherein the data structure stores data of a dynamic system of disjoint clusters of nodes of the detector graph and the data structure is common to the worker thread and the other worker threads, and the data comprises compressed logical flip information of child nodes in each cluster of nodes; and recovering a decoding output of the decoding process using the updated data in the data structure.
- 15 . The method of claim 14 , wherein recovering the decoding output of the decoding process using the updated data in the data structure comprises: computing, for each node in the detector graph that is associated with a detection event, a net logical flip on a path to a respective root node of the node; and computing a total parity of the net logical flips.
- 16 . The method of claim 15 , wherein the data structure stores, at a root node of each cluster of nodes, data specifying one or more of: a node type specifier, a parity of detection events in the cluster of nodes, a total size of the cluster of nodes, boundary nodes of the cluster of nodes, a minimum time coordinate, and a maximum time coordinate.
- 17 . The method of claim 14 , wherein compressed logical flip information of a child node comprises a parity flip applied to a logical observable by an operator with a 0-boundary that comprises the child node and a parent node in a corresponding graph tree.
- 18 . The method of claim 14 , wherein the updating of the data that corresponds to the cluster is performed under an atomicity primitive.
- 19 . The method of claim 18 , wherein the atomicity primitive comprises a compare-and-swap atomicity primitive or an atomic pool allocator with reference counting.
- 20 . The method of claim 14 , wherein updating the data that corresponds to the cluster comprises packing the updates into a single word, wherein the single word comprises a pointer to external atomic data.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS This application is a continuation application and claims priority under 35 U.S.C. § 120 to U.S. patent application Ser. No. 18/329,377, filed on Jun. 5, 2023. The disclosure of the foregoing application is incorporated herein by reference in its entirety for all purposes. BACKGROUND This specification relates to quantum computing. Quantum computing provides a means to solve certain problems that cannot be solved in a reasonable period of time using conventional classical computers. These problems include factoring very large numbers into their primes and searching large, unstructured data sets. A number of physical systems are being explored for their use in quantum computing, including ions, spins in semiconductors, and superconducting circuits. However, these systems do not perform sufficiently well to serve directly as computational qubits. For example, single two-state physical systems, which can be used as physical qubits, cannot reliably encode and retain information for long enough to be useful, e.g., due to noise. Quantum error correction is a technology that can enable a quantum computer to reliably execute a quantum algorithm despite noise afflicting its qubits. A decoder is a key component of quantum error correction schemes whose role is to identify errors faster than they accumulate in the quantum computer. The decoder takes as an input a syndrome, which is measurement data extracted from quantum parity check measurements, and returns as output an estimation of error. Given this estimation, the effect of the error can be reversed. Decoders should be implemented with minimum hardware resources in order to scale to the regime of practical applications of quantum computing. SUMMARY This specification describes methods, systems and apparatus for parallel decoding for quantum error correcting codes. One innovative aspect of the subject matter described in this specification can be implemented in a classical computer system configured to implement a decoding process on measurement data received from a quantum computing system to determine errors in a quantum computation implemented by the quantum computing system, wherein the classical computing system implements: a main thread; multiple worker threads; and a data structure common to each worker thread of the multiple worker threads, wherein the data structure stores data of a dynamic system of disjoint clusters of nodes of a detector graph for the decoding process, wherein: each cluster of nodes comprises a root node that has no ascendant node and one or more descending child nodes, wherein the child nodes include leaf nodes with no descending nodes or one or more descending child nodes; and the data comprises compressed logical flip information of child nodes in each cluster of nodes; wherein during execution of the decoding process, each of the multiple worker threads is configured to, in parallel with each of the other worker threads: obtain, from the main thread, one or more clusters of nodes in the detector graph, wherein each cluster of nodes comprises one or more detection events in the measurement data; and execute the decoding process to modify the one or more clusters of nodes, wherein, for each modification of a cluster, the worker thread updates data in the data structure that corresponds to the cluster under an atomicity primitive. The classical computer system can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination thereof installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. The foregoing and other implementations can each optionally include one or more of the following features, alone or in combination. In some implementations the data structure stores, at a root node of each cluster of nodes, data specifying one or more of: a node type specifier, a parity of detection events in the cluster of nodes, a total size of the cluster of nodes, boundary nodes of the cluster of nodes, a minimum time coordinate, and a maximum time coordinate. In some implementations the classical computing system further comprises one or more memory regions external to the data structure, wherein each memory region stores a boundary map for a respective cluster of nodes. In some implementations data objects in the memory region are stored under a conservation law, wherein under the conservation law data in the data objects are popped off a child node memory before push updates add the data to objects associated with a root node. In some implementations the compressed logical flip information of a child node included in a respective cluster of nodes is stored at the child node. In some implem