EP-4736159-A1 - APPARATUS, METHODS AND COMPUTER PROGRAM FOR ENCODING SPATIAL AUDIO CONTENT
Abstract
Examples of the disclosure relate to encoding spatial audio content using the immersive voice and audio services (IVAS) codec. In examples, an apparatus is configured to obtain a selected input format and an encoding option for encoding spatial audio content, wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content. The apparatus is also configured to receive an indication of an output format configured to be used by a playback device. The apparatus is also configured to upmix the selected input format to an alternative input format, wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
Inventors
- PIHLAJAKUJA, Tapani
- LAAKSONEN, Lasse Juhani
- PAJUNEN, Lauros
- MATE, Sujeet Shyamsundar
Assignees
- Nokia Technologies Oy
Dates
- Publication Date: 2026-05-06
- Application Date: 2024-06-11
Claims (20)
- 1. An apparatus comprising means for: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
- 2. An apparatus as claimed in claim 1, wherein the means are for at least one of: switching the encoding option to the alternative encoding option and wherein the switching is based, at least in part, on the output format used by a playback device and the alternative input format; and encoding the spatial audio content using the alternative encoding option.
- 3. An apparatus as claimed in any preceding claim, wherein the alternative encoding option enables a higher bit rate to be supported.
- 4. An apparatus as claimed in any preceding claim, wherein the means are for at least one of: receiving a list of supported modes for encoding the spatial audio content; sorting the list into a preferred order; and adding one or more alternative modes to the list where the alternative modes comprise multi-channel formats that support a higher bit rate.
- 5. An apparatus as claimed in claim 4, wherein a mode for encoding the spatial audio content comprises an input/output format and an encoding option.
- 6. An apparatus as claimed in any preceding claim, wherein the indication of the output format used by a playback device comprises a value of a parameter.
- 7. An apparatus as claimed in any preceding claim, wherein the means are for mixing the spatial audio content such that additional channels of the alternative input format do not contain any content.
- 8. An apparatus as claimed in any of claims 1 to 6, wherein the means are for mixing the spatial audio content such that additional channels of the alternative input format do contain some content.
- 9. An apparatus as claimed in any of claims 1 to 6, wherein the means are for mixing the spatial audio content such that no change is made to content of channels of the selected input format when upmixing to the alternative input format.
- 10. An apparatus as claimed in any of claims 1 to 7, wherein the means are for mixing the spatial audio content such that at least some changes are made to content of the channels of the selected input format when upmixing to the alternative input format.
- 11. An apparatus as claimed in any preceding claim, wherein the alternative input format comprises a format that provides at least two additional channels.
- 12. An apparatus as claimed in any preceding claim, wherein the output format used by a playback device is a binaural format and the updated encoding option comprises multichannel coding using metadata-assisted spatial audio.
- 13. An apparatus as claimed in any preceding claim, wherein the means are for enabling switching to the alternative encoding option when it is determined that the alternative encoding option supports a higher bit rate.
- 14. An apparatus as claimed in any preceding claim, wherein the means are for determining the alternative input format and the encoding option to be used for spatial audio encoding through negotiation with a playback device.
- 15. An apparatus as claimed in any of claims 1 to 13, wherein the means are for determining the alternative input format and the encoding option to be used for spatial audio encoding based on the indication of the output format used by a playback device.
- 16. A method comprising: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
- 17. A computer program comprising instructions which, when executed by an apparatus, cause the apparatus to perform: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
- 18. An apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: obtain a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receive an indication of an output format configured to be used by a playback device; and upmix the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
- 19. An apparatus as claimed in claim 18, wherein the apparatus is caused to: switch the encoding option to the alternative encoding option, wherein the switching is based, at least in part, on the output format used by the playback device and the alternative input format; and encode spatial audio content using the alternative encoding option.
- 20. An apparatus as claimed in claim 18 or 19, wherein the alternative encoding option enables a higher bit rate to be supported.
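The selection recited in the independent claims, together with the switching conditions of claims 2 and 13, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CHANNELS` table, the `EncodingOption` type, the `choose_alternative` helper, the option labels and the bit rate figures are hypothetical and are not taken from the claims or from the IVAS specification. The sketch only shows one way the three conditions (more channels, suited to the output format, higher supported bit rate) might be combined.

```python
from dataclasses import dataclass

# Illustrative channel counts only; the format names are examples,
# not a list taken from the source.
CHANNELS = {"5.1": 6, "7.1": 8, "7.1.4": 12}


@dataclass(frozen=True)
class EncodingOption:
    name: str              # hypothetical label
    max_bitrate_kbps: int  # placeholder figure, not a real codec limit


def choose_alternative(selected_format, current_option, output_format, candidates):
    """Return (input_format, encoding_option, upmix_needed).

    candidates: iterable of (format, option, suited_outputs), where
    suited_outputs is an assumed set of output formats for which the
    candidate is considered appropriate.
    """
    best = (selected_format, current_option, False)
    for fmt, option, suited_outputs in candidates:
        more_channels = CHANNELS[fmt] > CHANNELS[selected_format]
        suits_output = output_format in suited_outputs
        higher_rate = option.max_bitrate_kbps > best[1].max_bitrate_kbps
        # Switch only when all three conditions hold.
        if more_channels and suits_output and higher_rate:
            best = (fmt, option, True)
    return best


current = EncodingOption("option-a", 100)
candidates = [
    ("7.1", EncodingOption("option-b", 200), {"binaural"}),
    ("7.1.4", EncodingOption("option-c", 300), {"loudspeaker"}),
]
print(choose_alternative("5.1", current, "binaural", candidates))
```

If no candidate satisfies all three conditions, the sketch keeps the selected input format and the original encoding option, so no upmix is requested.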
Description
TITLE
Apparatus, Methods and Computer Program for Encoding Spatial Audio Content
TECHNOLOGICAL FIELD
Examples of the disclosure relate to apparatus, methods and computer programs for encoding spatial audio content. Some relate to apparatus, methods and computer programs for encoding spatial audio content using immersive voice and audio services (IVAS) codec.
BACKGROUND
Spatial audio content can be used for immersive voice and audio services such as immersive voice and audio for virtual reality or mediated reality. The methods used to encode the audio content can affect the audio quality perceived by a listener.
BRIEF SUMMARY
According to various, but not necessarily all, examples of the disclosure there is provided an apparatus comprising means for: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format.
The means may be for switching the encoding option to an alternative encoding option and wherein the switching is based, at least in part, on the output format used by the playback device and the alternative input format; and encoding spatial audio content using the alternative encoding option.
The alternative encoding option may enable a higher bit rate to be supported.
The means may be for receiving a list of supported modes for encoding spatial audio content and sorting the list into a preferred order and adding one or more alternative modes to the list where the alternative modes comprise multi-channel formats that support a higher bit rate.
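The handling of the list of supported modes (receive, sort into a preferred order, add higher bit rate alternatives) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Mode` type, the `build_mode_list` helper, the preference ranking, the option labels and all bit rate figures are hypothetical and do not come from the disclosure, which does not specify how the preferred order is determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    io_format: str         # input/output format, e.g. "5.1" (illustrative)
    encoding_option: str   # hypothetical label for the encoding option
    max_bitrate_kbps: int  # placeholder figure, not a real codec limit


def build_mode_list(supported, preference, alternatives):
    """Sort supported modes by an assumed preference, then append alternatives.

    preference: list of format names, most preferred first (an assumption).
    alternatives: multi-channel modes considered for addition; only those
    supporting a higher bit rate than every supported mode are appended.
    """
    rank = {fmt: i for i, fmt in enumerate(preference)}
    # Formats missing from the preference list sort last; sorted() is stable.
    ordered = sorted(supported, key=lambda m: rank.get(m.io_format, len(rank)))
    ceiling = max((m.max_bitrate_kbps for m in ordered), default=0)
    extra = [m for m in alternatives
             if m.max_bitrate_kbps > ceiling and m not in ordered]
    return ordered + extra


supported = [Mode("stereo", "opt-a", 100), Mode("5.1", "opt-b", 200)]
alternatives = [Mode("7.1", "opt-c", 300), Mode("5.1", "opt-d", 150)]
modes = build_mode_list(supported, ["5.1", "stereo"], alternatives)
print([m.io_format for m in modes])
```

Appending the alternatives after the sorted entries is a design choice of the sketch; an implementation could equally re-sort the extended list.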
A mode for encoding spatial audio content may comprise an input/output format and an encoding option.
The indication of an output format used by the playback device may comprise a value of a parameter.
The means may be for mixing the spatial audio content such that additional channels of the alternative input format do not contain any content.
The means may be for mixing the spatial audio content such that additional channels of the alternative input format do contain some content.
The means may be for mixing the spatial audio content such that no change is made to content of channels of the selected input format when upmixing to the alternative input format.
The means may be for mixing the spatial audio content such that at least some changes are made to content of the channels of the selected input format when upmixing to the alternative input format.
The alternative input format may comprise a format that provides at least two additional channels.
The output format used by the playback device may be a binaural format and the updated encoding option may comprise multichannel coding using metadata-assisted spatial audio.
The means may be for enabling switching to the alternative encoding option if it is determined that the alternative encoding option supports a higher bit rate.
The means may be for determining the alternative input format and encoding option to be used for spatial audio encoding through negotiation with a playback device.
The means may be for determining the alternative input format and encoding option to be used for spatial audio encoding based on the indication of the output format used by the playback device.
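One of the mixing variants summarised above, in which the additional channels contain no content and the channels of the selected input format are left unchanged, can be sketched in Python as below. The `upmix_with_silence` helper and the 6-to-8 channel example are assumptions made for illustration. In particular, appending the silent channels at the end is a simplification: real multi-channel formats define their own channel ordering, which the sketch ignores.

```python
def upmix_with_silence(channels, target_count):
    """Upmix by appending silent channels; existing channels are copied as-is.

    channels: list of per-channel sample lists, all of equal length.
    target_count: channel count of the alternative input format.
    """
    if target_count <= len(channels):
        raise ValueError("alternative format must have more channels")
    n = len(channels[0]) if channels else 0
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    silent = [[0.0] * n for _ in range(target_count - len(channels))]
    return [list(ch) for ch in channels] + silent


# Six channels of four samples each, upmixed to eight channels.
six = [[float(c)] * 4 for c in range(1, 7)]
eight = upmix_with_silence(six, 8)
print(len(eight), eight[6], eight[7])
```

The variants in which the additional channels do carry content, or in which the original channels are modified, would replace the zero-filled lists with some mixing rule; the summary does not specify one, so none is shown.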
According to various, but not necessarily all, examples of the disclosure there is provided a method comprising: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input format, and the alternative input format is selected based, at least in part, on the output format. According to various, but not necessarily all, examples of the disclosure there is provided a computer program comprising instructions which, when executed by an apparatus, cause the apparatus to perform: obtaining a selected input format and an encoding option for encoding spatial audio content wherein the encoding option is configured to be switched to an alternative encoding option for encoding the spatial audio content; receiving an indication of an output format configured to be used by a playback device; and upmixing the selected input format to an alternative input format wherein the alternative input format comprises more channels than the selected input forma