
US-12627487-B1 - End-to-end hardware acceleration for ZKP from witness generation to proof generation


Abstract

A local buffer is integrated with a witness generator and a proof generator on a cryptographic processor and is separate from host memory accessed by a host processor operating with the cryptographic processor in a proving computing system. The witness generator: receives, from a software program running on the host processor, compiled code of a zero-knowledge-proof (ZKP) program and specific input to the ZKP program; executes the ZKP program by way of executing the compiled code; records specific output generated from the ZKP program with the specific input, intermediate variable values, and the specific input, as a specific witness of executing the ZKP program; and stores the specific witness in the local buffer. The proof generator: receives, from the software program running on the host processor, a proving key; accesses the specific witness in the local buffer; and generates a specific zero-knowledge proof for executing the ZKP program with the specific input.
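
The witness-generation step summarized above can be illustrated with a minimal software sketch. It is not the patented hardware: the class `LocalBuffer`, the functions `generate_witness` and `check_witness`, and the toy program `out = x**3 + x + 5` are all hypothetical names and values chosen for illustration. The sketch only shows what "recording the input, intermediate variable values, and output as a witness" can look like when a program is flattened into single-operation steps.

```python
from dataclasses import dataclass, field


@dataclass
class LocalBuffer:
    """Hypothetical stand-in for the local buffer that holds the witness."""
    witness: dict = field(default_factory=dict)


def generate_witness(x: int, buffer: LocalBuffer) -> int:
    """Run a toy program (out = x**3 + x + 5) and record every value it touches."""
    t1 = x * x        # first intermediate value
    t2 = t1 * x       # second intermediate value
    t3 = t2 + x       # third intermediate value
    out = t3 + 5      # program output
    # The witness groups the input, the intermediate values, and the output.
    buffer.witness = {
        "input": {"x": x},
        "intermediate": {"t1": t1, "t2": t2, "t3": t3},
        "output": {"out": out},
    }
    return out


def check_witness(buffer: LocalBuffer) -> bool:
    """Check that the stored witness is consistent with each step of the toy program."""
    w = buffer.witness
    x = w["input"]["x"]
    t = w["intermediate"]
    out = w["output"]["out"]
    return (
        t["t1"] == x * x
        and t["t2"] == t["t1"] * x
        and t["t3"] == t["t2"] + x
        and out == t["t3"] + 5
    )


if __name__ == "__main__":
    buf = LocalBuffer()
    print(generate_witness(3, buf), check_witness(buf))
```

In this sketch the consistency check plays the role of the constraints that a proof generator would later prove knowledge of; no actual zero-knowledge proof is produced here.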

Inventors

  • Patrick Xu
  • Minglei Wang
  • Sidong Li
  • De Vu
  • Saptadeep Pal
  • Lei Chang

Assignees

  • Auradine, Inc.

Dates

Publication Date
2026-05-12
Application Date
2024-09-03

Claims (20)

  1. A computer-implemented method comprising: receiving, by a verification system, a verification key that was generated along with a proving key; receiving, by the verification system, a specific proof generated by a proving system from an execution of a cryptographic program; wherein the proving system comprises a host processor and a cryptographic engine on a semiconductor device, the cryptographic engine includes a witness generator, a proof generator and a local witness buffer; wherein the cryptographic engine is separate from the host processor; wherein the local witness buffer in the cryptographic engine is separate from host memory accessed by the host processor; wherein the witness generator in the cryptographic engine (a) receives, from the host processor, compiled code of the cryptographic program and specific input to the cryptographic program, (b) executes the cryptographic program by way of executing the compiled code, (c) records specific output generated from the cryptographic program, intermediate variable values, and the specific input, as a specific witness of executing the cryptographic program, and (d) stores the specific witness in the local witness buffer in the cryptographic engine; wherein the specific witness has a data size exceeding a maximum supported memory access bandwidth of the host memory for a single clock cycle; wherein the proof generator in the cryptographic engine (e) receives, from the host processor, the proving key that was generated along with the verification key, (f) accesses the specific witness in the local witness buffer in the cryptographic engine, wherein the specific witness is accessed in its entirety by the proof generator over first clock cycles fewer than second clock cycles over which the host memory stored data of the same data size is accessed from the host memory and (g) generates the specific proof for executing the cryptographic program with a combination of the proving key, the specific witness and the specific input; executing, by the verification system, a verification algorithm to verify whether the specific proof received is valid.
  2. The method of claim 1, wherein the verification key is generated concurrently with the proving key in a trusted setup.
  3. The method of claim 1, wherein the cryptographic program accesses private data not shared by the proving system with other computing systems including the verification system that validates the specific proof.
  4. The method of claim 1, wherein the compiled code defines a circuit formed by a plurality of gates derived from the compiled code.
  5. The method of claim 1, wherein the cryptographic engine comprises a plurality of computation logics for performing parallel computations relating to one or more of: fast Fourier transformation, number theoretic transformation, or multi-scalar multiplication.
  6. The method of claim 1, wherein the cryptographic engine is implemented as one of: a single semiconductor device, a single integrated circuit, or a single semiconductor chip that includes one or more chiplets.
  7. The method of claim 1, wherein the cryptographic engine is a zero-knowledge-proof (ZKP) program.
  8. A verification system comprising: a processor; and a memory, wherein the memory stored thereon executable instructions that when executed by the processor, cause the processor to: receive a verification key that was generated along with a proving key; receive a specific proof generated by a proving system from an execution of a cryptographic program; wherein the proving system comprises a host processor and a cryptographic engine on a semiconductor device, the cryptographic engine includes a witness generator, a proof generator and a local witness buffer; wherein the cryptographic engine is separate from the host processor; wherein the local witness buffer in the cryptographic engine is separate from host memory accessed by the host processor; wherein the witness generator in the cryptographic engine (a) receives, from the host processor, compiled code of the cryptographic program and specific input to the cryptographic program, (b) executes the cryptographic program by way of executing the compiled code, (c) records specific output generated from the cryptographic program, intermediate variable values, and the specific input, as a specific witness of executing the cryptographic program, and (d) stores the specific witness in the local witness buffer in the cryptographic engine; wherein the specific witness has a data size exceeding a maximum supported memory access bandwidth of the host memory for a single clock cycle; wherein the proof generator in the cryptographic engine (e) receives, from the host processor, the proving key that was generated along with the verification key, (f) accesses the specific witness in the local witness buffer in the cryptographic engine, wherein the specific witness is accessed in its entirety by the proof generator over first clock cycles fewer than second clock cycles over which the host memory stored data of the same data size is accessed from the host memory and (g) generates the specific proof for executing the cryptographic program with a combination of the proving key, the specific witness and the specific input; execute a verification algorithm to verify whether the specific proof received is valid.
  9. The apparatus of claim 8, wherein the verification key is generated concurrently with the proving key in a trusted setup.
  10. The apparatus of claim 8, wherein the cryptographic program accesses private data not shared by the proving system with other computing systems including the verification system that validates the specific proof.
  11. The apparatus of claim 8, wherein the compiled code defines a circuit formed by a plurality of gates derived from the compiled code.
  12. The apparatus of claim 8, wherein the cryptographic engine comprises a plurality of computation logics for performing parallel computations relating to one or more of: fast Fourier transformation, number theoretic transformation, or multi-scalar multiplication.
  13. The apparatus of claim 8, wherein the cryptographic engine is implemented as one of: a single semiconductor device, a single integrated circuit, or a single semiconductor chip that includes one or more chiplets.
  14. The apparatus of claim 8, wherein the cryptographic engine is a zero-knowledge-proof (ZKP) program.
  15. A non-transitory computer readable medium, storing software instructions, which when executed by one or more processors cause the one or more processors to perform operations comprising: receive, by a verification system, a verification key that was generated along with a proving key; receive, by the verification system, a specific proof generated by a proving system from an execution of a cryptographic program; wherein the proving system comprises a host processor and a cryptographic engine on a semiconductor device, the cryptographic engine includes a witness generator, a proof generator and a local witness buffer; wherein the cryptographic engine is separate from the host processor; wherein the local witness buffer in the cryptographic engine is separate from host memory accessed by the host processor; wherein the witness generator in the cryptographic engine (a) receives, from the host processor, compiled code of the cryptographic program and specific input to the cryptographic program, (b) executes the cryptographic program by way of executing the compiled code, (c) records specific output generated from the cryptographic program, intermediate variable values, and the specific input, as a specific witness of executing the cryptographic program, and (d) stores the specific witness in the local witness buffer in the cryptographic engine; wherein the specific witness has a data size exceeding a maximum supported memory access bandwidth of the host memory for a single clock cycle; wherein the proof generator in the cryptographic engine (e) receives, from the host processor, the proving key that was generated along with the verification key, (f) accesses the specific witness in the local witness buffer in the cryptographic engine, wherein the specific witness is accessed in its entirety by the proof generator over first clock cycles fewer than second clock cycles over which the host memory stored data of the same data size is accessed from the host memory and (g) generates the specific proof for executing the cryptographic program with a combination of the proving key, the specific witness and the specific input; executing, by the verification system, a verification algorithm to verify whether the specific proof received is valid.
  16. The non-transitory computer readable medium of claim 15, wherein the verification key is generated concurrently with the proving key in a trusted setup.
  17. The non-transitory computer readable medium of claim 15, wherein the cryptographic program accesses private data not shared by the proving system with other computing systems including the verification system that validates the specific proof.
  18. The non-transitory computer readable medium of claim 15, wherein the compiled code defines a circuit formed by a plurality of gates derived from the compiled code.
  19. The non-transitory computer readable medium of claim 15, wherein the cryptographic engine comprises a plurality of computation logics for performing parallel computations relating to one or more of: fast Fourier transformation, number theoretic transformation, or multi-scalar multiplication.
  20. The non-transitory computer readable medium of claim 15, wherein the cryptographic engine is implemented as one of: a single semiconductor device, a single integrated circuit, or a single semiconductor chip that includes one or more chiplets.
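
Several of the claims above name the number theoretic transformation (NTT) as one of the computations the cryptographic engine may parallelize. As a rough illustration only, the following is a naive quadratic-time NTT and its inverse over a small prime field. The modulus 17, the root of unity 4, and the function names `ntt` and `intt` are assumptions picked for readability; the patent text shown here does not specify a field, a transform size, or an algorithm, and a hardware unit would typically use a fast butterfly-based variant instead.

```python
def ntt(values, root, modulus):
    """Naive O(n^2) number theoretic transform of `values`.

    `root` is assumed to be a primitive n-th root of unity modulo `modulus`,
    where n = len(values).
    """
    n = len(values)
    return [
        sum(values[j] * pow(root, j * k, modulus) for j in range(n)) % modulus
        for k in range(n)
    ]


def intt(values, root, modulus):
    """Inverse transform: run the forward transform with root^-1, then scale by n^-1."""
    n = len(values)
    inv_root = pow(root, -1, modulus)  # modular inverse (Python 3.8+)
    inv_n = pow(n, -1, modulus)
    return [(inv_n * v) % modulus for v in ntt(values, inv_root, modulus)]


if __name__ == "__main__":
    # Illustrative parameters: 4 is a primitive 4th root of unity modulo 17,
    # since 4**2 % 17 == 16 and 4**4 % 17 == 1.
    P, OMEGA = 17, 4
    data = [1, 2, 3, 4]
    transformed = ntt(data, OMEGA, P)
    print(transformed, intt(transformed, OMEGA, P))
```

Each output element of the forward transform depends only on the input vector and its own index, which is one reason such transforms lend themselves to parallel computation units.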

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit as a continuation of application Ser. No. 18/525,604, filed Nov. 30, 2023, which is a continuation of application Ser. No. 18/214,462, filed Jun. 26, 2023, now issued as U.S. Pat. No. 11,882,216, issued Jan. 23, 2024, which is a continuation of application Ser. No. 18/207,620, filed Jun. 8, 2023, the entire contents of the aforementioned are hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. § 120. The applicant hereby rescinds any disclaimer of claim scope in the parent application or the prosecution history thereof and advises the USPTO that the claims in this application may be broader than any claim in the parent application.

TECHNICAL FIELD

Embodiments relate generally to computing systems, and, more specifically, to end-to-end hardware acceleration for ZKP from witness generation to proof generation in computing systems.

BACKGROUND OF THE INVENTION

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

Zero-knowledge proofs can be generated and used by computing systems in real-world applications to prove and validate a statement without the proving (or prover) computing system revealing the statement itself to the verifier computing system. For example, the proving computing system may prove the statement by way of proving access to or knowledge of a witness to hidden information or program execution relating to the statement. A witness used by the proving computing system to generate a zero-knowledge proof typically includes numerous internal variable values and input or output parameters.
Accessing the witness along with a proving key as input, the proving computing system can generate the zero-knowledge proof having a relatively small data size as compared with that of the witness. The zero-knowledge proof can be readily validated or verified with little computational overhead and time latency by a verifier computing system having a verification key. In comparison, the proving computing system that generates the witness and zero-knowledge proof may incur relatively high computational costs, large memory usage and long latencies. To generate the witness and zero-knowledge proof, the proving computing system may need to execute numerous parallelized or serialized computations, tasks or routines of different types, including but not limited to generating, accessing, moving, and processing large amounts of computer-generated data across multiple sub-systems and host processing logic.

BRIEF DESCRIPTION OF DRAWINGS

The present inventive subject matter is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 illustrates an example proving computing system;
FIG. 2 illustrates an example cryptographic processor;
FIG. 3 illustrates example computation unit sub-circuits in a cryptographic processor;
FIG. 4 illustrates an example process flow; and
FIG. 5 is a block diagram of a computer system upon which embodiments of the inventive subject matter may be implemented.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present inventive subject matter. It will be apparent, however, that the present inventive subject matter may be practiced without these specific details.
In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present inventive subject matter.

Embodiments are described herein according to the following outline:

1.0. General Overview
2.0. Structural Overview
  2.1. Host Computing Processor
  2.2. Cryptographic Processor
  2.3. Memory Controller
  2.4. Host Memory
  2.5. Communication Interface(s)
  2.6. Witness Generator
  2.7. Proof Generator
  2.8. Local Buffer
  2.9. Miscellaneous
3.0. Functional Overview
  3.1. Witness Generation
  3.2. Proof Generation
  3.3. Example Process Flows
4.0. Implementation Mechanism—Hardware Overview
5.0. Extensions and Alternatives

1.0. General Overview

In a proving computing system as described herein, a hardware accelerator, also referred to as acceleration hardware or cryptographic processor/engine, can be implemented as a single semiconductor chip or integrated circuit (IC). The hardware accelerator includes a relatively large number of computational units (e.g., fast Fourier transformation or FFT, number theoretic transformation or NTT, multi-scalar multiplication or MSM, etc.) for generating witnesses and corresponding proofs. In a