EP-4738892-A1 - AUDIO SYSTEM AND METHOD

Abstract

An audio system comprises an analysis unit, configured to receive and analyze a first number of audio signals, wherein analyzing the first number of audio signals comprises obtaining semantic information of the first number of audio signals, an ambient sound unit, configured to determine one or more ambient sounds related to the semantic information of the first number of audio signals obtained by the analysis unit, a sound placement unit, configured to determine a placement of the one or more ambient sounds determined by the ambient sound unit in one or more audio signals of the first number of audio signals, or in one or more processed audio signals of a second number of processed audio signals, and a combiner, configured to, based on the placement determined by the sound placement unit, combine the one or more ambient sounds determined by the ambient sound unit with one or more audio signals of the first number of audio signals or with one or more processed audio signals of the second number of processed audio signals, in order to generate a second number of audio output signals.

Inventors

OATES, Christopher
GOPAKUMAR, Sreejith
VON TÜRCKHEIM, Friedrich

Assignees

Harman Becker Automotive Systems GmbH

Dates

Publication Date: 2026-05-06
Application Date: 2024-10-30

Claims (13)

1. An audio system (100) comprises an analysis unit (110), configured to receive and analyze a first number of audio signals (IN_N), wherein analyzing the first number of audio signals (IN_N) comprises obtaining semantic information of the first number of audio signals (IN_N), an ambient sound unit (112), configured to determine one or more ambient sounds related to the semantic information of the first number of audio signals (IN_N) obtained by the analysis unit (110), a sound placement unit (114), configured to determine a placement of the one or more ambient sounds determined by the ambient sound unit (112) in one or more audio signals (IN_N) of the first number of audio signals (IN_N), or in one or more processed audio signals (IN_M*) of a second number of processed audio signals (IN_M*), and a combiner (116), configured to, based on the placement determined by the sound placement unit (114), combine the one or more ambient sounds determined by the ambient sound unit (112) with one or more audio signals (IN_N) of the first number of audio signals (IN_N) or with one or more processed audio signals (IN_M*) of the second number of processed audio signals (IN_M*), in order to generate a second number of audio output signals (OUT_M).
2. The audio system (100) of claim 1, wherein obtaining semantic information of the first number of audio signals (IN_N) comprises determining at least one of a genre of, a rhythm of, a melody of, a structure of, lyrics of, a tempo of, a level of noise in, an energy in, and a cultural background of the first number of audio signals (IN_N).
3. The audio system (100) of claim 1 or 2, wherein the ambient sound unit (112) is further configured to generate the one or more ambient sounds related to the semantic information of the first number of audio signals (IN_N) obtained by the analysis unit (110).
4. The audio system (100) of claim 3, wherein the ambient sound unit (112) is configured to generate the one or more ambient sounds by means of a generative AI model.
5. The audio system (100) of claim 1 or 2, wherein the ambient sound unit (112) is further configured to retrieve the one or more ambient sounds related to the semantic information of the first number of audio signals (IN_N) obtained by the analysis unit (110) from a database of ambient sounds.
6. The audio system (100) of any of the preceding claims, wherein the one or more ambient sounds comprise at least one of murmurs, loud conversation, whispered conversation, clinking of glasses, shuffling of chairs, footsteps, foot tapping, clapping, applause, coughing, whispering, page turning sounds, near-field voices, far-field voices, cheering, chanting, whistling, rhythmic clapping, and screams.
7. The audio system (100) of any of the preceding claims, further comprising a processing unit (210), wherein the processing unit (210) is configured to process the first number of audio signals (IN_N) and output the second number of processed audio signals (IN_M*).
8. The audio system (100) of claim 7, wherein the number of audio signals (IN_N) included in the first number of audio signals (IN_N) equals the number of processed audio signals (IN_M*) included in the second number of processed audio signals (IN_M*).
9. The audio system (100) of claim 7, wherein the number of audio signals included in the first number of audio signals (IN_N) is less than the number of audio signals included in the second number of processed audio signals (IN_M*).
10. The audio system (100) of claim 9, wherein the first number of audio signals (IN_N) consists of two channels (L, R) of a stereo audio signal, and wherein the second number of processed audio signals (IN_M*) consists of five channels (FL, FR, C, LS, RS) of an upmixed 5.1 surround signal.
11. The audio system (100) of claim 9 or 10, wherein the processing unit (210) is further configured to add reverberation to at least one processed audio signal (IN_M*) of the second number of processed audio signals (IN_M*).
12. The audio system (100) of claim 11, wherein the audio system (100) is configured to output the second number of audio output signals (OUT_M) to an audio reproduction unit arranged in a listening environment, and wherein the processing unit (210) is configured to add reverberation to at least one processed audio signal (IN_M*) of the second number of processed audio signals (IN_M*) based on a microphone signal (MIC), wherein the microphone signal (MIC) is obtained by a microphone arranged in the listening environment.
13. A method comprising: receiving and analyzing, at an analysis unit (110) of an audio system (100), a first number of audio signals (IN_N), wherein analyzing the first number of audio signals (IN_N) comprises obtaining semantic information of the first number of audio signals (IN_N), determining, at an ambient sound unit (112) of the audio system (100), one or more ambient sounds related to the semantic information of the first number of audio signals (IN_N) obtained by the analysis unit (110), determining, at a sound placement unit (114) of the audio system (100), a placement of the one or more ambient sounds determined by the ambient sound unit (112) in one or more audio signals (IN_N) of the first number of audio signals (IN_N), or in one or more processed audio signals (IN_M*) of a second number of processed audio signals (IN_M*), and at a combiner (116) of the audio system, based on the placement determined by the sound placement unit (114), combining the one or more ambient sounds determined by the ambient sound unit (112) with one or more audio signals (IN_N) of the first number of audio signals (IN_N) or with one or more processed audio signals (IN_M*) of the second number of processed audio signals (IN_M*), in order to generate a second number of audio output signals (OUT_M).
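
Illustrative note (not part of the claims): the claims state that a processing unit may turn two stereo channels (L, R) into five channels (FL, FR, C, LS, RS), but they do not say how. The Python sketch below shows one possible way, assuming a simple passive matrix. The function name `upmix_stereo_to_five` and the matrix coefficients are hypothetical choices made for illustration only and are not taken from the publication.

```python
def upmix_stereo_to_five(left, right):
    """Derive five channels (FL, FR, C, LS, RS) from stereo (L, R).

    Assumption for illustration: a passive matrix in which the front
    channels pass through unchanged, the centre carries the mid signal,
    and the two surrounds carry the side signal with opposite polarity.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    centre = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return {
        "FL": list(left),
        "FR": list(right),
        "C": centre,
        "LS": side,
        "RS": [-s for s in side],
    }


# Two input channels in, five processed channels out.
channels = upmix_stereo_to_five([1.0, 0.0], [0.0, 1.0])
print(sorted(channels))
```

A real upmixer would typically be frequency dependent and might add reverberation to some channels; this sketch only makes the two-in, five-out channel relationship concrete.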

Description

TECHNICAL FIELD

The disclosure relates to an audio system and related method, in particular an audio system and method for adding 3D information to an audio signal.

BACKGROUND

There is an increasing demand for augmented reality (AR) features in audio content. Adding AR features such as ambient sounds to an audio signal simulates a certain listening environment and can significantly improve the listening experience of the user to whom the audio signal is presented. Simple stereo content can be enhanced to realistic 3D audio content. Acoustically simulating a specific kind of listening space by suitably adding and reproducing matching ambient sounds, however, can be challenging.

There is a need for an audio system and related method that add AR features to an audio signal to simulate a listening environment by extending an audio signal with 3D information, resulting in a highly satisfying listening experience for a listener, while requiring comparatively little computational load.

SUMMARY

An audio system includes an analysis unit, configured to receive and analyze a first number of audio signals, wherein analyzing the first number of audio signals includes obtaining semantic information of the first number of audio signals, an ambient sound unit, configured to determine one or more ambient sounds related to the semantic information of the first number of audio signals obtained by the analysis unit, a sound placement unit, configured to determine a placement of the one or more ambient sounds determined by the ambient sound unit in one or more audio signals of the first number of audio signals, or in one or more processed audio signals of a second number of processed audio signals, and a combiner, configured to, based on the placement determined by the sound placement unit, combine the one or more ambient sounds determined by the ambient sound unit with one or more audio signals of the first number of audio signals or with one or more processed audio signals of the second number of processed audio signals, in order to generate a second number of audio output signals.

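
The data flow through the four units named above (analysis unit, ambient sound unit, sound placement unit, combiner) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the genre is supplied as a hint rather than detected, the genre-to-sound table `AMBIENT_DATABASE` is invented for the example, placement is a plain round-robin over channels, and all function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lookup table: genre label -> ambient sound names.
# The pairings are illustrative assumptions, not taken from the disclosure.
AMBIENT_DATABASE = {
    "jazz": ["murmurs", "clinking of glasses"],
    "rock": ["cheering", "whistling"],
    "classical": ["coughing", "page turning sounds"],
}


@dataclass
class SemanticInfo:
    genre: str
    energy: float  # mean squared amplitude over all channels


def analyze(signals, genre_hint):
    """Analysis unit: obtain semantic information from the input channels.

    Genre detection is out of scope here, so the genre is passed in as a
    hint (an assumption); only the energy is actually computed.
    """
    samples = [s for channel in signals for s in channel]
    if not samples:
        raise ValueError("at least one sample is required")
    energy = sum(s * s for s in samples) / len(samples)
    return SemanticInfo(genre=genre_hint, energy=energy)


def determine_ambient_sounds(info):
    """Ambient sound unit: pick sounds related to the semantic information."""
    return AMBIENT_DATABASE.get(info.genre, [])


def determine_placement(sound_names, num_channels):
    """Sound placement unit: assign each sound to one channel, round-robin."""
    return {name: i % num_channels for i, name in enumerate(sound_names)}


def combine(signals, rendered_sounds, placement, gain=0.1):
    """Combiner: add each ambient sound, scaled by gain, to its channel."""
    out = [list(channel) for channel in signals]  # leave the input untouched
    for name, index in placement.items():
        sound = rendered_sounds[name][: len(out[index])]
        for n, sample in enumerate(sound):
            out[index][n] += gain * sample
    return out


# Usage: two input channels, placeholder ambient waveforms of constant 1.0.
stereo = [[0.5, -0.5, 0.5, -0.5], [0.25, 0.25, 0.25, 0.25]]
info = analyze(stereo, genre_hint="jazz")
names = determine_ambient_sounds(info)
placement = determine_placement(names, num_channels=len(stereo))
rendered = {name: [1.0] * 4 for name in names}
output = combine(stereo, rendered, placement)
print(placement)
```

Keeping placement as a separate step from combining mirrors the split between the sound placement unit and the combiner: the same placement result could be applied either to the original signals or to processed signals.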
A method includes receiving and analyzing, at an analysis unit of an audio system, a first number of audio signals, wherein analyzing the first number of audio signals includes obtaining semantic information of the first number of audio signals, determining, at an ambient sound unit of the audio system, one or more ambient sounds related to the semantic information of the first number of audio signals obtained by the analysis unit, determining, at a sound placement unit of the audio system, a placement of the one or more ambient sounds determined by the ambient sound unit in one or more audio signals of the first number of audio signals, or in one or more processed audio signals of a second number of processed audio signals, and, at a combiner of the audio system, based on the placement determined by the sound placement unit, combining the one or more ambient sounds determined by the ambient sound unit with one or more audio signals of the first number of audio signals or with one or more processed audio signals of the second number of processed audio signals, in order to generate a second number of audio output signals.

Other systems, methods, features and advantages will be or will become apparent to one with skill in the art upon examination of the following detailed description and figures. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention and be protected by the following claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The arrangements may be better understood with reference to the following description and drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. Moreover, in the figures, like reference numerals designate corresponding parts throughout the different views.

Figure 1 schematically illustrates an audio system according to embodiments of the disclosure.
Figure 2 schematically illustrates an audio system according to further embodiments of the disclosure.
Figure 3, in a flow chart, schematically illustrates a method according to embodiments of the disclosure.

DETAILED DESCRIPTION

As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely examples of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention. It is recognized that directional terms that may be noted herein (