
KR-20260068039-A - Heterogeneous Hybrid Computing System featuring Quantum Advantage Prediction-based Dynamic Task Classification, Local Closed-loop Error Correction, and GPU Digital Twins


Abstract

The present invention relates to a heterogeneous hybrid computing system (100) that executes a quantum-classical hybrid algorithm. A host CPU (110) classifies computational tasks into classical and quantum tasks through a quantum advantage prediction module (111) and a dynamic classifier (112), and schedules them to a GPU (120) and a QPU (130), respectively. A local controller (131) within the QPU (130) performs an error correction feedback loop independently, without intervention from the host CPU (110) or the GPU (120), to minimize round-trip delay. A semiconductor spin qubit array (132) and an analog computing-in-memory (CIM) based NPU (133) are 3D-stacked via a through-silicon via (TSV) (134) so that analog measurement signals are processed directly without analog-to-digital conversion (ADC), and a reservoir computing layer (142) predicts decoherence and applies preemptive correction. The GPU (120) maintains a digital twin noise model of the QPU (130) to pre-optimize quantum circuits, and a low-latency direct connection channel (123) between the GPU (120) and the local controller (131) of the QPU (130) transmits variational parameters directly, bypassing the host CPU (110). Extended embodiments include hardware authentication based on a quantum noise fingerprint (150), qubit topology rerouting, a qRAM-integrated memory hierarchy, federated quantum learning, and a dual-mode qubit sense-computation direct connection pipeline (147).
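
The reason for closing the error correction loop inside the QPU (130) can be illustrated with simple arithmetic: a feedback cycle is only useful if it completes within the qubit coherence time. The sketch below is a minimal illustration, not part of the disclosed system; the function name `correction_cycles` and all three latency figures are hypothetical values chosen only to show the comparison.

```python
def correction_cycles(coherence_time_us: float, loop_latency_us: float) -> int:
    """Return how many complete feedback cycles fit inside one coherence window."""
    if loop_latency_us <= 0:
        raise ValueError("loop latency must be positive")
    return int(coherence_time_us // loop_latency_us)


# Hypothetical figures, chosen only for illustration (not taken from the patent).
COHERENCE_US = 100.0   # assumed qubit coherence time
HOST_LOOP_US = 250.0   # assumed round trip QPU -> host CPU -> QPU
LOCAL_LOOP_US = 2.0    # assumed loop closed by the local controller

host_cycles = correction_cycles(COHERENCE_US, HOST_LOOP_US)
local_cycles = correction_cycles(COHERENCE_US, LOCAL_LOOP_US)

print(f"host-mediated loop: {host_cycles} cycles per coherence window")
print(f"local loop:         {local_cycles} cycles per coherence window")
```

Under these assumed numbers the host-mediated loop cannot complete even one correction before the qubit decoheres, while the local loop completes many.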

Inventors

  • 안범주

Assignees

  • 안범주

Dates

Publication Date
20260513
Application Date
20260426

Claims (15)

  1. A heterogeneous hybrid computing system that executes a quantum-classical hybrid algorithm based on a user command, the system comprising: a host central processing unit (CPU) that receives and analyzes the user command, divides the entire computation task into a classical computation task and a quantum computation task, and schedules them; a graphics processing unit (GPU) that receives the classical computation task from the host CPU and performs large-scale data-parallel processing; and a quantum processing unit (QPU) that receives the quantum computation task from the host CPU, performs exponential computation, and has an internal local controller.
  2. The heterogeneous hybrid computing system of claim 1, wherein the host CPU controls global data exchange and execution flow between the GPU and the QPU, and wherein the QPU minimizes computational latency by performing, through the local controller and without intervention of the host CPU or the GPU, an error correction feedback loop for qubit state changes occurring during quantum computation.
  3. The heterogeneous hybrid computing system of claim 2, wherein the QPU is configured as a three-dimensional (3D) stacked hardware module comprising: a semiconductor spin qubit array; and an analog computing-in-memory (CIM) based neural processing unit (NPU) that functions as the local controller and directly receives an analog signal measured from the qubit array, without analog-to-digital conversion (ADC), to calculate an error correction value; and wherein the analog NPU independently performs a local closed-loop operation that corrects the gate voltage of the qubit array in real time through a through-silicon via (TSV) penetrating the 3D stacked structure.
  4. The heterogeneous hybrid computing system of claim 2, wherein the GPU comprises a low-latency direct connection channel that transmits variational parameters, calculated by a classical optimization routine during execution of a variational quantum algorithm, directly to the local controller of the QPU without passing through the host CPU, and wherein the local controller, immediately upon receiving the variational parameters, updates the quantum circuit gate parameters of the corresponding iteration and initiates the next quantum operation, thereby reducing the round-trip latency of the variational iteration loop compared to a path through the host CPU.
  5. The heterogeneous hybrid computing system of claim 1, wherein the host CPU comprises: a quantum advantage prediction module that calculates estimated values of quantum circuit depth, required qubit count, and entanglement entropy for each computation subroutine included in the received user command; and a dynamic classifier that classifies subroutines whose calculated estimate exceeds a preset quantum advantage threshold as quantum computation tasks, and subroutines whose estimate is below the threshold as classical computation tasks; and wherein the dynamic classifier receives the current qubit availability and error rate of the QPU in real time from the local controller and adaptively adjusts the threshold.
  6. The heterogeneous hybrid computing system of claim 3, wherein the analog CIM-based NPU comprises a reservoir computing layer in which a fixed weight matrix learned from the past measurement history of the qubit array is stored in a low-temperature non-volatile device, and wherein the reservoir computing layer receives the current analog measurement signal of the qubit array as input, infers a decoherence pattern predicted to occur in the next operation cycle, and, based on the inference result, preemptively applies a gate voltage correction value to the qubit array through the TSV, thereby stabilizing the qubit state before an error occurs.
  7. The heterogeneous hybrid computing system of claim 2, wherein the local controller comprises: a speculative prefetch function that receives in advance, from the host CPU, a subsequent quantum circuit expected to be executed after the currently executing quantum computation task, and prepares qubit initialization and gate sequences; and a rollback function that immediately executes the prepared circuit when an execution confirmation signal is received from the host CPU, and discards the prepared state and re-initializes the qubits when a cancellation signal is received; and wherein the speculative prefetch substantially hides the qubit initialization waiting time (latency hiding), thereby reducing the execution time of the entire quantum-classical hybrid algorithm.
  8. The heterogeneous hybrid computing system of claim 2, comprising a plurality of QPUs, wherein the local controller of each QPU comprises an inter-QPU peer-to-peer (P2P) interface that communicates directly with the local controller of an adjacent QPU without intervention of the host CPU, and wherein synchronization signals and Bell-state measurement results required for entanglement generation between qubits are exchanged through the P2P interface, so that distributed quantum operations across the plurality of QPUs are performed independently of the global scheduling layer managed by the host CPU.
  9. The heterogeneous hybrid computing system of claim 3, wherein the QPU is placed in a cryogenic operating region and the host CPU and the GPU are placed in a room-temperature operating region, the system further comprising: a cryogenic optical interposer located at the boundary between the cryogenic region and the room-temperature region, which transmits signals between the two regions with optical isolation; and a power domain separation circuit that supplies operating power to the analog NPU within the cryogenic region from an independent power rail physically separated from the room-temperature region, thereby limiting the thermal load so that the qubit array remains within its coherence maintenance temperature.
  10. The heterogeneous hybrid computing system of claim 4, wherein the GPU comprises: a twin management module that maintains and updates, in real time, a digital twin noise model simulating the current noise characteristics of the QPU, based on per-qubit error rate, gate fidelity, and coherence time data received from the local controller; and a circuit preprocessing module that, before transmitting the original quantum circuit received from the host CPU to the QPU, uses the digital twin noise model to perform gate reordering and noise-aware decomposition so as to minimize fidelity loss in the predicted noise environment; and wherein the digital twin noise model is automatically updated with correction data received from the local controller when its fidelity deviation from the actual measurement results of the QPU exceeds a preset threshold, thereby forming a noise model synchronization loop between the GPU and the QPU.
  11. The heterogeneous hybrid computing system of claim 2, wherein the local controller comprises: a fingerprint generation module that applies a no-operation cycle to a plurality of reference qubits and measures the phase error and amplitude damping patterns that spontaneously occur in each qubit during the cycle, thereby generating a quantum noise fingerprint uniquely determined by the manufacturing process variation and physical environment of the QPU; and a hardware authentication module that, upon receiving a hardware authentication request from the host CPU, verifies whether the QPU is the same physical QPU by comparing a currently measured quantum noise fingerprint with an initially registered fingerprint; wherein the quantum noise fingerprint functions as a physical unclonable function (PUF) that cannot be forged by manipulating temperature, the electromagnetic environment, or software, and wherein, when the QPU is moved to an unauthorized environment or replaced, the host CPU detects this and blocks execution of operations.
  12. The heterogeneous hybrid computing system of claim 3, wherein the local controller comprises: a dynamic qubit scheduler that continuously monitors the error rate of each qubit in the qubit array and, if the error rate of a specific qubit exceeds a preset isolation threshold, excludes that qubit from the computation qubit pool in real time; and a topology rerouting engine that bypasses the excluded qubit by remapping the physical location of a logical qubit to an adjacent available qubit via the TSV, and resynthesizes the subsequent gate sequence in real time to match the topology of the remapped qubits; and wherein the dynamic qubit scheduler and the topology rerouting engine operate autonomously at the local controller without requesting recompilation from the host CPU or the GPU, so that execution of the quantum algorithm continues without interruption even when a cluster of qubit errors occurs.
  13. The heterogeneous hybrid computing system of claim 1, further comprising a quantum random access memory (qRAM) module, wherein the qRAM module is connected through dual ports to the high-speed memory bus of the GPU and to the local controller of the QPU, respectively; wherein the GPU converts large-scale classical data into a quantum amplitude encoding format and writes it to the qRAM, and the QPU loads data from the qRAM in batches in a superposition state and uses it for quantum search or quantum principal component analysis (quantum PCA) computations; and wherein the host CPU provides a unified memory abstraction layer that manages the address space of the qRAM as an extension of the classical virtual memory address space, thereby enabling classical applications to use the qRAM with standard memory access commands without a separate quantum programming interface.
  14. The heterogeneous hybrid computing system of claim 8, wherein each of the plurality of QPUs comprises a local gradient reporting module that extracts a quantum gradient calculated by executing a variational quantum circuit on the local data subset assigned to that QPU, adds quantum differential privacy noise to the quantum gradient, and transmits it to a designated aggregation QPU via the P2P interface; wherein the local controller of the aggregation QPU aggregates the received noise-added gradient values by a superposition averaging method using quantum interference to calculate a global gradient; and wherein the global gradient is transmitted via the host CPU to the classical optimization routine of the GPU to update the variational parameters, thereby forming a federated quantum learning system that jointly trains a quantum model while preserving privacy, without concentrating the data distributed across the plurality of QPUs on a central server.
  15. The heterogeneous hybrid computing system of claim 2, wherein the qubit array within the QPU is composed of dual-mode qubits capable of dynamically switching between a computing mode and a sensing mode, and wherein the local controller comprises: a quantum sensing management module that, in response to a sensing command from the host CPU, switches a specified subset of qubits to the sensing mode and operates those qubits as quantum sensors coupled to an external physical quantity to acquire a measurement value; and a sense-computation direct connection pipeline that transmits the measurement result obtained by the quantum sensing directly to the analog NPU in analog form without ADC conversion, the analog NPU immediately initiating a quantum computation task in which the measurement result is set as an initial condition; and wherein, through the dual-mode switching and the sense-computation direct connection pipeline, collection of external environment information and execution of a quantum algorithm based on it are performed continuously and without delay within a single QPU.
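
The dynamic classification recited in claim 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the claim does not say how the three estimates are combined or how the threshold is adjusted, so the scoring rule (entanglement entropy alone), the linear error-rate penalty, and every name below (`SubroutineEstimate`, `QpuStatus`, `adjusted_threshold`, `classify`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubroutineEstimate:
    """Estimates produced by a (hypothetical) quantum advantage prediction step."""
    name: str
    circuit_depth: int
    qubits_required: int
    entanglement_entropy: float


@dataclass
class QpuStatus:
    """Real-time status reported by the local controller (assumed fields)."""
    available_qubits: int
    error_rate: float  # average gate error in [0, 1]


def adjusted_threshold(base: float, status: QpuStatus, penalty: float = 100.0) -> float:
    # Assumed rule: a noisier QPU raises the bar for sending work to it.
    return base * (1.0 + penalty * status.error_rate)


def classify(est: SubroutineEstimate, status: QpuStatus, base_threshold: float = 4.0) -> str:
    # A subroutine that does not fit on the available qubits stays classical.
    if est.qubits_required > status.available_qubits:
        return "classical"
    # Assumed score: entanglement entropy stands in for the advantage estimate.
    score = est.entanglement_entropy
    return "quantum" if score > adjusted_threshold(base_threshold, status) else "classical"


vqe = SubroutineEstimate("vqe_energy", circuit_depth=40, qubits_required=12,
                         entanglement_entropy=9.0)
print(classify(vqe, QpuStatus(available_qubits=20, error_rate=0.001)))
print(classify(vqe, QpuStatus(available_qubits=20, error_rate=0.02)))
```

With these assumed values the same subroutine is routed to the QPU when the reported error rate is low and falls back to the GPU path when the error rate rises, which is the adaptive behavior the claim describes.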

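The qubit isolation and remapping recited in claim 12 can be sketched in the same spirit. The function `isolate_and_remap`, its data layout (dictionaries keyed by physical qubit index), and the first-fit choice of a spare neighbor are assumptions made for illustration; the claim does not specify a selection policy, and gate resynthesis is omitted.

```python
def isolate_and_remap(error_rates, mapping, neighbors, threshold):
    """Exclude noisy physical qubits and remap affected logical qubits.

    error_rates: physical qubit -> measured error rate
    mapping:     logical qubit  -> physical qubit currently hosting it
    neighbors:   physical qubit -> list of adjacent physical qubits
    threshold:   isolation threshold on the error rate
    Returns (new_mapping, excluded_physical_qubits).
    """
    excluded = {q for q, rate in error_rates.items() if rate > threshold}
    in_use = set(mapping.values()) - excluded
    new_mapping = {}
    for logical, physical in mapping.items():
        if physical not in excluded:
            new_mapping[logical] = physical
            continue
        # First-fit policy (assumed): take the first healthy, unused neighbor.
        spare = next(
            (n for n in neighbors.get(physical, [])
             if n not in excluded and n not in in_use),
            None,
        )
        if spare is None:
            raise RuntimeError(f"no spare neighbor for logical qubit {logical!r}")
        new_mapping[logical] = spare
        in_use.add(spare)
    return new_mapping, excluded


rates = {0: 0.001, 1: 0.05, 2: 0.002, 3: 0.001}
new_map, bad = isolate_and_remap(
    rates, {"a": 0, "b": 1}, {1: [0, 2, 3]}, threshold=0.01
)
print(new_map, bad)
```

Here physical qubit 1 exceeds the threshold, so logical qubit "b" moves to neighbor 2 (neighbor 0 is already occupied by "a").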
Description

Heterogeneous Hybrid Computing System featuring Quantum Advantage Prediction-based Dynamic Task Classification, Local Closed-loop Error Correction, and GPU Digital Twins

The present invention relates to a heterogeneous computing system that executes a quantum-classical hybrid algorithm. More specifically, it relates to a heterogeneous hybrid computing system in which a host central processing unit (CPU) dynamically classifies computational tasks into classical and quantum computational tasks and schedules them to a graphics processing unit (GPU) and a quantum processing unit (QPU), respectively; a local controller provided within the QPU independently performs an error correction feedback loop without intervention from the CPU or the GPU; and the GPU maintains and updates a digital twin noise model of the QPU in real time to preprocess quantum circuits.

Modern high-performance computing (HPC) environments have evolved to handle diverse computational loads by combining heterogeneous processors such as CPUs, GPUs, and FPGAs. However, problems that require exploring exponentially large state spaces, such as molecular simulation, combinatorial optimization, and cryptography, face fundamental limitations when relying solely on classical computing resources. QPUs are attracting attention as next-generation accelerators because, by exploiting quantum superposition and entanglement, they can provide an exponential computational advantage over classical computing for specific problem types. In conventional quantum-classical hybrid systems, however, the CPU merely delegates computational tasks to the QPU; such systems lack an intelligent scheduling structure capable of actively classifying and partitioning an entire algorithmic workload between the classical and quantum computation paradigms.
Consequently, inefficiencies frequently occur: tasks that could be processed more efficiently on classical processors are unnecessarily passed to the QPU, or, conversely, computations with a clear quantum advantage are processed classically.

In addition, conventional systems correct qubit errors occurring within the QPU by communicating with a host CPU or GPU. Such a structure inevitably incurs round-trip latency: measurement results are transmitted to the host, the error correction operation is performed on the host, and the result is fed back to the QPU. Because the coherence time of a qubit is only on the order of tens of microseconds to a few milliseconds, error correction becomes meaningless and the reliability of the operation drops significantly if this round-trip latency exceeds the coherence time.

Furthermore, conventional GPUs have been limited to the role of classical computational accelerators and have not taken on the active role of tracking the real-time noise characteristics of the QPU and using them to pre-optimize quantum circuits. As a result, quantum circuits delivered to the QPU are executed without reflecting the noise environment of the actual physical qubits, which causes a loss of fidelity and requires repeated additional error mitigation operations to compensate.

Furthermore, conventional systems do not provide a systematic structure for QPU hardware authentication, dynamic reconfiguration of qubit topology, management of an integrated quantum-classical memory hierarchy, privacy-preserving distributed learning among multiple QPUs, or dual-mode operation of qubits. An integrated architecture is therefore urgently required for the practical application of next-generation quantum-classical hybrid computing.

FIG. 1 is a block diagram of the entire configuration of a heterogeneous hybrid computing system (100).
FIG. 2 is a flowchart of the task reception, analysis, division, and scheduling operations of the host CPU (110).
FIG. 3 is a hierarchy diagram of global data exchange and local control loops between CPU (110), GPU (120), and QPU (130).
FIG. 4 is a timing diagram of the error correction feedback loop operation of the local controller (131) in the QPU (130).
FIG. 5 is a 3D stacked cross-sectional view of a semiconductor spin qubit array (132) and an analog NPU (133).
FIG. 6 is a real-time gate voltage correction signal path diagram through the TSV (134).
FIG. 7 is a VQE iterative loop diagram through a GPU (120)-QPU (130) low-latency direct connection channel (123).
FIG. 8 is a conceptual diagram of the dynamic classification criteria and adaptive threshold adjustment of the CPU (110) quantum advantage prediction module (111).
FIG. 9 is a decoherence prediction and preemptive correction signal flow diagram of the analog NPU (133) reservoir computing layer (142).
FIG. 10 is a state transition diagram of a local controller (131) and a speculative prefetch module (139).
FIG. 11 is a diagram of an entanglement coordination protocol t