US-20260128047-A1 - AUDIO CODING WITH DEPTH AND BANDWIDTH SCALABILITY
Abstract
Techniques are directed to an audio codec configured to process audio in such a way that enables the codec to decompress an encoded audio signal at an increased bandwidth and/or bit depth. In some implementations, the audio codec is configured to operate on audio data expressed in the Opus format. In such implementations, the Opus format enables such decompression at increased bandwidth while preserving backward compatibility with standard decompression in the Opus format. The decompression at increased bandwidth and/or bit depth is enabled via a set of extension bits in addition to a base set of bits that represent a set of compressed audio frames. In the case of the Opus format, the additional bandwidth and/or bit depth may be specified in a header. In these cases, decoders that do not enable such decompression at the increased bandwidth may ignore the extension bits to preserve backward compatibility.
Inventors
- Jean-Marc Valin
- Arvindh KRISHNASWAMY
- Jeffrey Ryan Peil
Assignees
- GOOGLE LLC
Dates
- Publication Date: 2026-05-07
- Application Date: 2025-10-31
Claims (20)
- 1. A method, comprising: receiving, by a decoder of an audio codec using a vector quantizer having a codebook that includes a first number of entries, a bitstream including a set of base bits representing a compressed frame of an audio signal; and in response to the decoder receiving a set of extension bits that extends the codebook, the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder to an extended resolution greater than an initial resolution by an extension of the codebook from the first number of entries to a second number of entries, decoding the compressed frame of the audio signal at the extended resolution using the extension of the codebook having the second number of entries.
- 2. The method as in claim 1, wherein in response to the decoder not receiving the set of extension bits, decoding the compressed frame of the audio signal at the initial resolution using the codebook having the first number of entries.
- 3. The method as in claim 1, wherein the vector quantizer is a pyramid vector quantizer.
- 4. The method as in claim 3, wherein the second number of entries of the extension of the codebook of the pyramid vector quantizer is an odd multiple of the first number of entries of the codebook of the pyramid vector quantizer.
- 5. The method as in claim 4, wherein the odd multiple is one less than a power of two, and wherein decoding the compressed frame at the extended resolution achieves an additional bit depth based on the power of two.
- 6. The method as in claim 1, wherein the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder increases a bandwidth used by the decoder.
- 7. The method as in claim 6, wherein decoding the compressed frame includes: determining that the codebook is larger than a threshold size; and in response to the determining, performing decoding using one of a pyramid vector quantizer or a cubic quantizer.
- 8. The method as in claim 1, wherein the audio codec is an Opus codec, and the set of extension bits are stored in a padding layer of a data packet that includes multiple frames.
- 9. The method as in claim 1, wherein the vector quantizer is configured to output coefficients for an inverse modified discrete cosine transform.
- 10. A computer program product comprising a nontransitory storage medium, the computer program product including code that, when executed by processing circuitry, causes the processing circuitry to perform a method, the method comprising: receiving, by a decoder of an audio codec using a vector quantizer having a codebook that includes a first number of entries, a bitstream including a set of base bits representing a compressed frame of an audio signal; in response to the decoder receiving a set of extension bits that extends the codebook, the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder to an extended resolution greater than an initial resolution by an extension of the codebook from the first number of entries to a second number of entries, decoding the compressed frame of the audio signal at the extended resolution using the extension of the codebook having the second number of entries; and in response to the decoder not receiving the set of extension bits, decoding the compressed frame of the audio signal at the initial resolution using the codebook having the first number of entries.
- 11. The computer program product as in claim 10, wherein the vector quantizer is a pyramid vector quantizer.
- 12. The computer program product as in claim 11, wherein the second number of entries of the extension of the codebook of the pyramid vector quantizer is an odd multiple of the first number of entries of the codebook of the pyramid vector quantizer.
- 13. The computer program product as in claim 12, wherein the odd multiple is one less than a power of two, and wherein decoding the compressed frame at the extended resolution achieves an additional bit depth based on the power of two.
- 14. The computer program product as in claim 11, wherein decoding the compressed frame includes: determining that the codebook is larger than a threshold size; and in response to the determining, performing decoding using one of a pyramid vector quantizer or a cubic quantizer.
- 15. The computer program product as in claim 10, wherein the audio codec is an Opus codec, and the set of extension bits are stored in a padding layer of a data packet that includes multiple frames.
- 16. An electronic apparatus, the electronic apparatus comprising: memory; and a processor coupled to the memory, the processor being configured to: receive, by a decoder of an audio codec using a vector quantizer having a codebook that includes a first number of entries, a bitstream including a set of base bits, the set of base bits representing a compressed frame of an audio signal; in response to the decoder receiving a set of extension bits that extends the codebook, the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder to an extended resolution greater than an initial resolution by an extension of the codebook from the first number of entries to a second number of entries, decode the compressed frame of the audio signal at the extended resolution using the extension of the codebook having the second number of entries; and in response to the decoder not receiving the set of extension bits, decode the compressed frame of the audio signal at the initial resolution using the codebook having the first number of entries.
- 17. The electronic apparatus as in claim 16, wherein the vector quantizer is a pyramid vector quantizer.
- 18. The electronic apparatus as in claim 17, wherein the second number of entries of the extension of the codebook of the pyramid vector quantizer is an odd multiple of the first number of entries of the codebook of the pyramid vector quantizer.
- 19. The electronic apparatus as in claim 18, wherein the odd multiple is one less than a power of two, and wherein decoding the compressed frame at the extended resolution achieves an additional bit depth based on the power of two.
- 20. The electronic apparatus as in claim 16, wherein the processor configured to decode the compressed frame is further configured to: determine that the codebook is larger than a threshold size; and in response to the determining, perform decoding using one of a pyramid vector quantizer or a cubic quantizer.
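As an illustration of the codebook sizes referred to in the claims on pyramid vector quantizers, the following minimal Python sketch counts the entries of a standard pyramid vector quantizer codebook, that is, the integer vectors of dimension n whose absolute values sum to k pulses. The recurrence is the well-known one for such codebooks; the function names and the `extra_bits` parameter are hypothetical and do not come from this document. The final helper only shows the arithmetic of an odd multiple that is one less than a power of two; how the claimed extension maps that multiple onto actual codebook entries is not specified here and is not modeled.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def pvq_codebook_size(n: int, k: int) -> int:
    """Count integer vectors of dimension n whose absolute values sum to k."""
    if k == 0:
        return 1  # only the all-zero vector
    if n == 0:
        return 0  # no dimensions left to place the remaining pulses
    return (pvq_codebook_size(n - 1, k)
            + pvq_codebook_size(n, k - 1)
            + pvq_codebook_size(n - 1, k - 1))


def brute_force_size(n: int, k: int) -> int:
    """Enumerate every candidate vector; only practical for tiny n and k."""
    return sum(1 for v in product(range(-k, k + 1), repeat=n)
               if sum(abs(x) for x in v) == k)


def extended_entries(first_entries: int, extra_bits: int) -> int:
    """Hypothetical helper: scale a codebook size by the odd multiple 2**extra_bits - 1."""
    return first_entries * (2 ** extra_bits - 1)


if __name__ == "__main__":
    base = pvq_codebook_size(3, 2)
    print("base entries:", base)
    print("matches brute force:", base == brute_force_size(3, 2))
    print("entries after an odd-multiple extension:", extended_entries(base, 3))
```

The brute-force counter is included only as a cross-check of the recurrence on small inputs.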
Description
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 63/715,141, filed on Nov. 1, 2024, the disclosure of which is hereby incorporated by reference in its entirety.
BACKGROUND
An audio codec is software or a hardware device capable of encoding or decoding a digital data stream representing an audio signal. In software, an audio codec can take the form of a computer program implementing an algorithm that compresses and decompresses digital audio data according to a given audio file or streaming media audio coding format. An objective of the algorithm is to represent a high-fidelity audio signal with a minimum number of bits while retaining quality. This can effectively reduce the storage space and the bandwidth required for transmission of the stored audio file. Some audio compression and decompression algorithms are based on a modified discrete cosine transform (MDCT) and linear predictive coding (LPC).
An example of an audio coding format is the Opus format. Opus combines the speech-oriented, LPC-based SILK algorithm and the lower-latency, MDCT-based CELT algorithm, switching between or combining them as needed. Bitrate, audio bandwidth, complexity, and algorithm choice can be adjusted for each individual frame. Opus has a low algorithmic delay, making it suitable for use as part of a real-time communication link, networked music performances, and live lip sync.
SUMMARY
Implementations described herein relate to an audio codec configured to process audio in such a way that enables the codec to decompress an encoded audio signal at an increased bandwidth (beyond 20 kHz) and/or an increased bit depth. In some implementations, the audio codec is configured to operate on audio data expressed in the Opus format. In such implementations, the Opus format enables such decompression at increased bandwidth and/or bit depth while preserving backward compatibility with standard decompression in the Opus format.
The decompression at increased bandwidth and/or bit depth is enabled via a set of extension bits in addition to a base set of bits that represent a set of compressed audio frames. For example, in the Opus format, the set of extension bits may be stored in a padding layer within an audio data packet. In the case of the Opus format, the additional resolution and/or bandwidth may be specified in a data packet header. In these cases, decoders that do not enable such decompression at the increased resolution and/or bandwidth may simply not receive the extension bits, or may ignore them. This preserves backward compatibility and allows Opus formats to use the extension bits whether the encoders are configured for increased resolution and/or bandwidth or not. Such extension bits enable high resolution audio for devices such as earbuds, and the framework enabling the extension bits can be released via open source and may be configured for a broad industry standard.
In one general aspect, a method can include receiving, by a decoder of an audio codec using a vector quantizer having a codebook that includes a first number of entries, a bitstream including a set of base bits, the set of base bits representing a compressed frame of an audio signal. The method can also include, in response to the decoder receiving a set of extension bits that extends the codebook, the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder to an extended resolution greater than an initial resolution by an extension of the codebook from the first number of entries to a second number of entries, decoding the compressed frame of the audio signal at the extended resolution. The method can further include, in response to the decoder not receiving the set of extension bits, decoding the compressed frame of the audio signal at the initial resolution using the codebook having the first number of entries.
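To make the role of the Opus padding layer in this summary more concrete, the following minimal Python sketch separates the trailing padding bytes from an Opus "code 3" packet, following the packet framing defined in RFC 6716. It is an illustration under stated assumptions, not the implementation described here: the function name `split_opus_padding` is hypothetical, the layout of extension bits inside the padding is not given in this document and is treated as opaque bytes, and frame-length parsing is omitted.

```python
from typing import Tuple


def split_opus_padding(packet: bytes) -> Tuple[bytes, bytes]:
    """Return (packet without trailing padding, padding bytes).

    Only code 3 packets (two low bits of the TOC byte equal to 3) can
    carry Opus padding; any other packet is returned unchanged.
    """
    if len(packet) < 1:
        raise ValueError("empty packet")
    if packet[0] & 0x03 != 3:
        return packet, b""
    if len(packet) < 2:
        raise ValueError("code 3 packet needs a frame count byte")
    if not packet[1] & 0x40:  # padding flag 'p' in the frame count byte
        return packet, b""

    pos = 2
    pad_len = 0
    while True:
        if pos >= len(packet):
            raise ValueError("truncated padding length")
        value = packet[pos]
        pos += 1
        if value == 255:
            pad_len += 254  # 255 means 254 bytes plus another length byte
        else:
            pad_len += value
            break
    if pad_len > len(packet) - pos:
        raise ValueError("padding longer than packet")

    split = len(packet) - pad_len
    return packet[:split], packet[split:]


if __name__ == "__main__":
    # TOC with code 3, frame count byte with p=1 and one frame,
    # a padding length of 3, two bytes of frame data, then three
    # placeholder padding bytes (not a real extension layout).
    pkt = bytes([0x03, 0x41, 0x03, 0xAA, 0xBB, 0x01, 0x02, 0x03])
    body, padding = split_opus_padding(pkt)
    print(body.hex(), padding.hex())
```

In this sketch, a decoder without extension support would use only the first element of the returned pair and discard the padding, while an extension-aware decoder could additionally inspect the padding bytes.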
In another general aspect, a computer program product comprises a nontransitory storage medium and includes code that, when executed by a processor, causes the processor to perform a method. The method can include receiving, by a decoder of an audio codec using a vector quantizer having a codebook that includes a first number of entries, a bitstream including a set of base bits, the set of base bits representing a compressed frame of an audio signal. The method can also include, in response to the decoder receiving a set of extension bits that extends the codebook, the set of extension bits representing data enabling the decoder to increase a resolution used by the decoder to an extended resolution greater than an initial resolution by an extension of the codebook from the first number of entries to a second number of entries, decoding the compressed frame of the audio signal at the extended resolution. The method can further include, in response to the decoder not receiving the set of extension bits, decoding the compressed frame of the audio si