US-12621626-B2 - Method and apparatus for generating audio signal, and method and apparatus for reproducing audio signal


Abstract

A method and device for generating an audio signal and a method and device for reproducing an audio signal are provided. The method of reproducing an audio signal includes obtaining a type of a stereophonic sound signal determined according to characteristics of the stereophonic sound signal and determining a rendering mode to reproduce the stereophonic sound signal, based on the type of the stereophonic sound signal and a reproduction environment of the stereophonic sound signal.

Inventors

Dae Young Jang
Kyeongok Kang
Jae-Hyoun Yoo
Yong Ju Lee

Assignees

ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Dates

Publication Date: 20260505
Application Date: 20240110
Priority Date: 20230111

Claims (15)

1. A method of generating an audio signal, the method comprising: determining a type of a stereophonic sound signal based on characteristics of the stereophonic sound signal; and generating metadata of a sound source for generating the stereophonic sound signal, based on the determined type of the stereophonic sound signal, wherein the determining of the type of the stereophonic sound signal comprises: when the format of the sound source is an object-based sound source, determining the stereophonic sound signal as foreground sound.
2. The method of claim 1, wherein the characteristics of the stereophonic sound signal comprise the format of the sound source and a user reachable region corresponding to a region where the stereophonic sound signal may be experienced.
3. The method of claim 2, wherein the determining of the type of the stereophonic sound signal further comprises: when the format of the sound source is a channel-based sound source, determining the stereophonic sound signal as background sound.
4. A method of reproducing an audio signal, the method comprising: obtaining a type of a stereophonic sound signal determined according to characteristics of the stereophonic sound signal; and determining a rendering mode to reproduce the stereophonic sound signal, based on the type of the stereophonic sound signal and a reproduction environment of the stereophonic sound signal, wherein the type of the stereophonic sound signal comprises foreground sound and background sound.
5. The method of claim 4, wherein the reproduction environment of the stereophonic sound signal comprises a position of a speaker to reproduce the stereophonic sound signal and a distance between a sound source for generating the stereophonic sound signal and a listener.
6. The method of claim 5, wherein the rendering mode comprises a multi-channel rendering mode and a binaural rendering mode.
7. The method of claim 6, wherein the determining of the rendering mode comprises: determining an initial value of the rendering mode based on the type of the stereophonic sound signal; and determining a final rendering mode to reproduce the stereophonic sound signal, based on the initial value of the rendering mode and the reproduction environment of the stereophonic sound signal.
8. The method of claim 7, wherein the determining the initial value of the rendering mode comprises: when the type of stereophonic sound signal is foreground sound, determining the binaural rendering mode to be an initial value; and when the type of stereophonic sound signal is background sound, determining the multi-channel rendering mode to be an initial value.
9. The method of claim 8, wherein the determining of the final rendering mode comprises determining whether to change the initial value of the rendering mode based on the distance between the sound source and the listener.
10. An electronic device for reproducing an audio signal, the electronic device comprising: a processor; and a memory configured to store instructions, wherein the instructions, when executed by the processor, cause the electronic device to: obtain a type of a stereophonic sound signal determined according to characteristics of the stereophonic sound signal; and determine a rendering mode to reproduce the stereophonic sound signal, based on the type of the stereophonic sound signal and a reproduction environment of the stereophonic sound signal, wherein the type of the stereophonic sound signal comprises foreground sound and background sound.
11. The electronic device of claim 10, wherein the reproduction environment of the stereophonic sound signal comprises a position of a speaker to reproduce the stereophonic sound signal and a distance between a sound source for generating the stereophonic sound signal and a listener.
12. The electronic device of claim 11, wherein the rendering mode comprises a multi-channel rendering mode and a binaural rendering mode.
13. The electronic device of claim 12, wherein the instructions, when executed by the processor, cause the electronic device to: determine an initial value of the rendering mode based on the type of the stereophonic sound signal; and determine a final rendering mode to reproduce the stereophonic sound signal, based on the initial value of the rendering mode and the reproduction environment of the stereophonic sound signal.
14. The electronic device of claim 13, wherein the instructions, when executed by the processor, cause the electronic device to: when the type of stereophonic sound signal is foreground sound, determine the binaural rendering mode to be an initial value; and when the type of stereophonic sound signal is background sound, determine the multi-channel rendering mode to be an initial value.
15. The electronic device of claim 14, wherein the instructions, when executed by the processor, cause the electronic device to: determine whether to change the initial value of the rendering mode based on the distance between the sound source and the listener.
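
For illustration only, the two-step rendering-mode decision recited in claims 4 to 9 (and mirrored in claims 10 to 15) might be sketched as follows. This is a minimal Python sketch and not the claimed implementation. The names `SoundType`, `RenderingMode`, `ReproductionEnvironment`, `initial_rendering_mode`, and `final_rendering_mode` are hypothetical. The claims state only that the distance between the sound source and the listener decides whether the initial value is changed; the threshold `switch_distance_m` and the direction of the change used here are assumptions made purely so the example runs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SoundType(Enum):
    FOREGROUND = "foreground"
    BACKGROUND = "background"


class RenderingMode(Enum):
    BINAURAL = "binaural"
    MULTI_CHANNEL = "multi-channel"


@dataclass(frozen=True)
class ReproductionEnvironment:
    # Speaker positions as (x, y, z) tuples. Claim 5 lists them as part of the
    # reproduction environment; this simplified sketch carries them unused.
    speaker_positions: Tuple[Tuple[float, float, float], ...]
    # Distance between the sound source and the listener, in metres.
    source_listener_distance_m: float


def initial_rendering_mode(sound_type: SoundType) -> RenderingMode:
    """Claim 8: foreground -> binaural, background -> multi-channel."""
    if sound_type is SoundType.FOREGROUND:
        return RenderingMode.BINAURAL
    return RenderingMode.MULTI_CHANNEL


def final_rendering_mode(
    sound_type: SoundType,
    env: ReproductionEnvironment,
    switch_distance_m: float = 1.0,
) -> RenderingMode:
    """Claims 7 and 9: start from the initial value, then decide whether to
    change it based on the source-listener distance."""
    mode = initial_rendering_mode(sound_type)
    # ASSUMPTION (not in the source): a foreground source farther away than
    # switch_distance_m is handed over to multi-channel rendering. Both the
    # threshold value and the direction of the switch are hypothetical.
    if (
        mode is RenderingMode.BINAURAL
        and env.source_listener_distance_m > switch_distance_m
    ):
        return RenderingMode.MULTI_CHANNEL
    return mode


if __name__ == "__main__":
    near = ReproductionEnvironment(((1.0, 0.0, 0.0),), 0.5)
    print(final_rendering_mode(SoundType.FOREGROUND, near).value)
```

Keeping the initial value and the distance-based override in separate functions follows the structure of claim 7, so a different override policy could be substituted without touching the type-to-mode mapping of claim 8.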

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2023-0004235, filed on Jan. 11, 2023, and Korean Patent Application No. 10-2024-0004241, filed on Jan. 10, 2024, in the Korean Intellectual Property Office, the entire disclosures of which are incorporated herein by reference for all purposes.

TECHNICAL FIELD

One or more embodiments relate to a method and device for generating an audio signal and a method and device for reproducing an audio signal.

TECHNOLOGY BEHIND THE INVENTION

Recently, attempts to provide more immersive stereophonic sound have been increasing, especially in digital cinema, ultra-high-definition television (UHDTV), and virtual reality (VR) games and attractions. In digital cinema, Barco's AURO-3D provided an opportunity to express stereophonic sound not only on a horizontal plane but also on a vertical plane by attempting to provide hemispherical stereophonic sound through four ceiling-mounted channels added to the existing 5.1 channels. Dolby subsequently recognized the limitations of a multi-channel-based audio format and commercialized its Atmos technology, which adapts to various audio reproduction environments by introducing a hybrid audio format that includes an object-based audio format. Digital Theater Systems (DTS) has also entered the movie and home theater market with DTS-X technology, which is similar to Atmos, and competes with Dolby in the field of realistic media such as VR.

In addition, standardization organizations are standardizing the audio technology of such hybrid formats. The Audio Definition Model (ADM) of the International Telecommunication Union (ITU) specifies metadata for expressing information in various audio formats, including an object-based audio format.
Advanced Television Systems Committee (ATSC) 3.0, a next-generation broadcasting standard in the United States, includes the audio technology of such hybrid formats and specifies that either Dolby's AC4 technology or Moving Picture Experts Group (MPEG)-H three-dimensional (3D) audio technology may be selected and used. Although standards and technology have been developed to provide the audio technology of hybrid formats, each of these technologies depends on one of the existing rendering modes and thus may not reproduce immersive stereophonic sound.

The above description has been possessed or acquired by the inventor(s) in the course of conceiving the present disclosure and is not necessarily art that was publicly known before the present application was filed.

CONTENTS OF THE INVENTION

Tasks to be Solved

Embodiments provide technology for determining a rendering mode to reproduce a stereophonic sound signal, based on the type of the stereophonic sound signal and a reproduction environment of the stereophonic sound signal, and for reproducing different stereophonic sound signals through a plurality of rendering modes. However, the technical aspects are not limited to the aforementioned aspects, and other technical aspects may be present.

Means of Solving the Tasks

According to an aspect, there is provided a method of generating an audio signal, the method including determining a type of a stereophonic sound signal based on characteristics of the stereophonic sound signal and generating metadata of a sound source for generating the stereophonic sound signal, based on the determined type of the stereophonic sound signal. The characteristics of the stereophonic sound signal may include a format of the sound source and a user reachable region corresponding to a region where the stereophonic sound signal may be experienced.
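
As a rough illustration, the generation aspect just described (determine a type, then generate metadata from it) might look like the Python sketch below. It applies the object-based and channel-based rule recited in claims 1 and 3. The names `determine_sound_type`, `SourceMetadata`, and `generate_metadata`, the format strings `"object"` and `"channel"`, and every metadata field are hypothetical; the source does not define a metadata layout, and it does not say how the user reachable region influences the type, so the region is simply carried along here.

```python
from dataclasses import dataclass
from typing import Tuple

FOREGROUND = "foreground"
BACKGROUND = "background"


def determine_sound_type(source_format: str) -> str:
    """Object-based sources are foreground sound and channel-based sources
    are background sound (claims 1 and 3)."""
    if source_format == "object":
        return FOREGROUND
    if source_format == "channel":
        return BACKGROUND
    # The source names only these two formats; anything else is rejected
    # rather than guessed at.
    raise ValueError(f"unsupported sound source format: {source_format!r}")


@dataclass(frozen=True)
class SourceMetadata:
    # All field names are hypothetical; the source defines no metadata layout.
    source_id: str
    source_format: str
    sound_type: str
    # Hypothetical encoding of the user reachable region: an axis-aligned box
    # given as (min_x, min_y, max_x, max_y) in metres.
    user_reachable_region: Tuple[float, float, float, float]


def generate_metadata(
    source_id: str,
    source_format: str,
    user_reachable_region: Tuple[float, float, float, float],
) -> SourceMetadata:
    """Determine the type first, then build metadata based on that type."""
    sound_type = determine_sound_type(source_format)
    return SourceMetadata(
        source_id, source_format, sound_type, user_reachable_region
    )


if __name__ == "__main__":
    print(generate_metadata("voice-01", "object", (0.0, 0.0, 4.0, 3.0)))
```

A reproducing device could then read `sound_type` from the metadata instead of re-deriving it, which matches the reproduction aspect's step of obtaining an already determined type.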
The determining of the type of the stereophonic sound signal may include, when the format of the sound source is an object-based sound source, determining the stereophonic sound signal as foreground sound and, when the format of the sound source is a channel-based sound source, determining the stereophonic sound signal as background sound.

According to another aspect, there is provided a method of reproducing an audio signal, the method including obtaining a type of a stereophonic sound signal determined according to characteristics of the stereophonic sound signal and determining a rendering mode to reproduce the stereophonic sound signal, based on the type of the stereophonic sound signal and a reproduction environment of the stereophonic sound signal. The type of the stereophonic sound signal may include foreground sound and background sound. The reproduction environment of the stereophonic sound signal may include a position of a speaker to reproduce the stereophonic sound signal and a distance between a sound source for generating the stereophonic sound signal and a listener. The rendering mode may include a multi-channel rendering mode and a binaural rendering mode. The determining of the rendering mode may include determining an in