JP-2026076253-A - Converting audio signals captured in different formats to fewer formats to simplify encoding and decoding operations.


Abstract

[Problem] To convert audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec. [Solution] In the system 200, the simplification unit 230 receives an audio signal captured by one or more capture units 210 coupled to the system. The simplification unit determines whether the audio signal is in a format that is supported or not supported by the encoding unit 240. Based on that determination, the simplification unit converts the audio signal to a format supported by the encoding unit. In one embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit converts the audio signal to a spatial "mezzanine" format supported by the encoding unit. [Selection Diagram] Figure 2
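
The decision logic summarized in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the format labels, the `SUPPORTED_BY_ENCODER` and `SPATIAL_FORMATS` sets, the `AudioSignal` type, and the `simplify` function are all hypothetical, and the actual signal conversion is replaced by a simple relabeling.

```python
from dataclasses import dataclass

# Hypothetical format labels; the abstract does not enumerate concrete formats.
SPATIAL_FORMATS = {"ambisonics_b_format", "5.1_surround"}
SUPPORTED_BY_ENCODER = {"mono", "stereo", "spatial_mezzanine"}
MEZZANINE = "spatial_mezzanine"


@dataclass(frozen=True)
class AudioSignal:
    fmt: str        # format label reported by the capture unit
    samples: tuple  # placeholder payload


def is_supported(signal, supported=SUPPORTED_BY_ENCODER):
    """Determine whether the encoding unit supports the signal's format."""
    return signal.fmt in supported


def simplify(signal, supported=SUPPORTED_BY_ENCODER):
    """Return a signal whose format the encoding unit supports."""
    if is_supported(signal, supported):
        return signal  # already encodable, pass through unchanged
    if signal.fmt in SPATIAL_FORMATS:
        # A real system would transform the samples here; this sketch
        # only relabels the format to keep the example short.
        return AudioSignal(MEZZANINE, signal.samples)
    raise ValueError(f"no conversion known for format {signal.fmt!r}")
```

In this sketch a stereo capture passes through untouched, while a capture labeled `ambisonics_b_format` comes back labeled `spatial_mezzanine`; any format outside both sets is rejected rather than guessed at.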

Inventors

Bruhn, Stefan
Eckert, Michael
Torres, Juan Felix
Brown, Stefanie
McGrath, David S.

Assignees

Dolby Laboratories Licensing Corporation
Dolby International AB

Dates

Publication Date: 20260511
Application Date: 20260122
Priority Date: 20181008

Claims (1)

A method comprising: receiving, by a simplification stage, audio signals in multiple formats and metadata for those audio signals from a sound preprocessing stage, wherein the audio signals represent audio captured by at least one microphone; receiving, by the simplification stage, attributes of a device from the device, wherein the attributes include one or more audio formats supported by the device, and the one or more audio formats include a spatial format; converting, by the simplification stage, the audio signals into a spatial mezzanine format compatible with the one or more audio formats; and providing, by the simplification stage, the converted audio signals to an encoding stage, the output of which is for downstream processing in the device.
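
The sequence of steps recited in the claim can be pictured with a short sketch. Everything named here is an assumption made for illustration: the `MEZZANINE_SERVES` table, the format labels, the dictionary layout of signals and device attributes, and the `pick_mezzanine` and `simplification_stage` functions do not come from the source, and the conversion itself is left as a placeholder.

```python
from typing import Callable, Dict, List, Set

# Hypothetical table: which device-side formats each mezzanine format can serve.
MEZZANINE_SERVES: Dict[str, Set[str]] = {
    "mezzanine_first_order": {"stereo", "binaural", "ambisonics_foa"},
    "mezzanine_higher_order": {"ambisonics_hoa", "7.1.4"},
}


def pick_mezzanine(device_formats: Set[str]) -> str:
    """Return the first mezzanine format compatible with any device format."""
    for mezzanine, served in MEZZANINE_SERVES.items():
        if device_formats & served:
            return mezzanine
    raise LookupError("no mezzanine format matches the device attributes")


def simplification_stage(signals: List[dict], device_attributes: dict,
                         encode: Callable[[dict], object]) -> List[object]:
    """Convert each captured signal to one mezzanine format, then encode it."""
    # Device attributes arrive from the device and list its supported formats.
    target = pick_mezzanine(set(device_attributes["supported_formats"]))
    outputs = []
    for signal in signals:
        converted = {
            "format": target,
            "samples": signal["samples"],    # placeholder: real conversion omitted
            "metadata": signal["metadata"],  # metadata travels with the signal
        }
        outputs.append(encode(converted))    # hand over to the encoding stage
    return outputs
```

Passing the encoder in as a callable keeps the sketch independent of any particular codec; in practice the encoding stage would be the codec front end rather than an arbitrary function.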

Description

Cross-reference to Related Applications: This application claims priority from U.S. Provisional Patent Application No. 62/742,729, filed on 8 October 2018. The contents of that application are incorporated herein by reference in their entirety.

Embodiments of this disclosure relate generally to audio signal processing and, more specifically, to the distribution of captured audio signals.

The development of audio and video encoder/decoder ("codec") standards has recently focused on codecs for Immersive Voice and Audio Services (IVAS). IVAS is expected to support a range of service capabilities, from mono to stereo operation up to fully immersive audio encoding, decoding, and rendering. A suitable IVAS codec also provides high error robustness against packet loss and delay jitter under different transmission conditions. IVAS is intended to be supported by a wide range of devices, endpoints, and network nodes, including but not limited to mobile phones and smartphones, electronic tablets, personal computers, conference phones, conference rooms, virtual and augmented reality devices, home theater systems, and other suitable devices. Because these devices, endpoints, and network nodes may have various acoustic interfaces for sound capture and rendering, it may not be practical for an IVAS codec to accommodate all the different ways in which audio signals are captured and rendered.

In the drawings, specific arrangements or orderings of schematic elements, such as those representing devices, units, instruction blocks, and data elements, are shown for ease of description. However, those skilled in the art should understand that the specific ordering or arrangement of schematic elements in the drawings is not intended to imply that a particular order or sequence of operations, or a separation of processes, is required.
Furthermore, the inclusion of a schematic element in a drawing is not intended to imply that the element is required in all embodiments, or that the features represented by the element cannot be included in or combined with other elements in some embodiments. Likewise, where connecting elements such as solid or dashed lines or arrows are used in the drawings to illustrate a connection, relationship, or association between two or more other schematic elements, the absence of such a connecting element is not intended to imply that no connection, relationship, or association can exist. In other words, some connections, relationships, or associations between elements are not shown in the drawings so as not to obscure the disclosure. Moreover, for ease of illustration, a single connecting element is used to represent multiple connections, relationships, or associations between elements. For example, where a connecting element represents the communication of signals, data, or instructions, a person skilled in the art should understand that such an element represents one or more signal paths, as needed, to effect the communication.

Figure 1 illustrates various devices that can be supported by the IVAS system, according to several embodiments of this disclosure.
Figure 2A is a block diagram of a system for converting a captured audio signal into a format ready for encoding, according to some embodiments of the present disclosure.
Figure 2B is a block diagram of a system for converting captured audio back into a preferred playback format, according to some embodiments of the present disclosure.
Figure 3 is a flowchart illustrating exemplary actions for converting an audio signal to a format supported by an encoding unit, according to some embodiments of the present disclosure.
Figure 4 is a flowchart illustrating exemplary actions for determining whether an audio signal is in a format supported by an encoding unit, according to some embodiments of the present disclosure.
Figure 5 is a flowchart illustrating exemplary actions for converting an audio signal into a usable playback format, according to some embodiments of the present disclosure.
Figure 6 is another flowchart illustrating exemplary actions for converting an audio signal into a usable playback format, according to some embodiments of the present disclosure.
Figure 7 shows block diagrams of hardware architectures for implementing the features described with reference to Figures 1-6, according to some embodiments of the present disclosure.

The following description includes numerous specific details for explanatory purposes, in order to provide a thorough understanding of this disclosure. However, it will be apparent that this disclosure can be practiced without these specific details.

Reference will now be made in detail to embodiments illustrated in the accompanying drawings. The following detailed description includes numerous specific details to provide a thorough understanding of the various described embodiments. However, it will be apparent to those skilled in the art that the various described embodiments can be practiced without these specific details. On the othe