EP-4742533-A2 - SYSTEM AND METHOD FOR NON-DESTRUCTIVELY NORMALIZING LOUDNESS OF AUDIO SIGNALS WITHIN PORTABLE DEVICES

Abstract

Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
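
The distinction between absolute and differential profile values can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `resolve_profile`, the `mode` and `values` keys, and the parameter names `threshold_db` and `max_boost_db` are all hypothetical, chosen only to show how a decoder might rebuild a profile from deltas against a profile it already knows.

```python
from typing import Dict


def resolve_profile(metadata: Dict, known_profile: Dict[str, float]) -> Dict[str, float]:
    """Return the DRC parameters described by `metadata`.

    Hypothetical layout (an assumption, not a real bitstream syntax):
      metadata["mode"]   -- "absolute" or "differential"
      metadata["values"] -- parameter name -> value (absolute) or delta (differential)
    """
    values = metadata["values"]
    if metadata["mode"] == "absolute":
        # The metadata carries the complete profile.
        return dict(values)
    if metadata["mode"] == "differential":
        # Start from the known profile and add the transmitted deltas.
        resolved = dict(known_profile)
        for name, delta in values.items():
            resolved[name] = known_profile[name] + delta
        return resolved
    raise ValueError(f"unknown mode: {metadata['mode']!r}")


# Example: the known profile plus a delta that raises only the maximum boost.
known = {"threshold_db": -30.0, "max_boost_db": 6.0}
portable = resolve_profile(
    {"mode": "differential", "values": {"max_boost_db": 6.0}}, known
)
```

In this sketch the differential form only needs to carry the parameters that differ from the known profile, which is one plausible reason to prefer it when metadata space is limited.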

Inventors

  • RIEDMILLER, JEFFREY
  • MUNDT, HARALD
  • SCHUG, MICHAEL
  • WOLTERS, MARTIN

Assignees

  • Dolby International AB
  • Dolby Laboratories Licensing Corporation

Dates

Publication Date
20260513
Application Date
20110203

Claims (6)

  1. A method comprising: receiving, by a decoding device, encoded audio information and metadata associated with an audio signal, the metadata including one or more decoding-control parameters, a measure of a loudness of the encoded audio information, wherein the measure of loudness is a level of dialogue in the audio signal, and one or more first parameter values specifying dynamic range compression (DRC) according to a first profile associated with a first reference reproduction level, and one or more second parameter values specifying DRC according to a second profile associated with a second reference reproduction level higher than the first reference reproduction level and within a range of reference reproduction levels; specifying, for the decoding device, a reference reproduction level; applying, by the decoding device, a decoding process to the encoded audio information to obtain subband signals representing spectral content of the audio signal; modifying, by the decoding device, the subband signals using the one or more second DRC parameter values specifying DRC according to the second profile to obtain modified subband signals with changed dynamic range characteristics, in response to specifying the reference reproduction level, for the decoding device, to the second reference reproduction level; applying, by the decoding device, a synthesis filter bank to the modified subband signals to obtain a time-domain audio signal; and using, by the decoding device, the loudness measure to adjust amplitudes of the time-domain audio signal to achieve the reference reproduction level for the decoding device.
  2. The method of claim 1, wherein the first reference reproduction level is -31 dBFS or -20 dBFS.
  3. The method of claim 1 or claim 2, wherein the range of reference reproduction levels is between -14 dBFS and -8 dBFS.
  4. The method of claim 1 or claim 2, wherein the second reference reproduction level is -11 dBFS.
  5. An apparatus comprising: a processor; a memory coupled to the processor and configured to store instructions, which when executed by the processor, cause the processor to perform the method of any of the previous claims.
  6. A computer program product including a data carrier storing instructions for performing the method of any one of claims 1 to 4.
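
The decoder-side steps recited in claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the names `DrcProfile`, `select_profile` and `render` are hypothetical, each profile is reduced to one static gain per subband, the -14 to -8 dBFS range is borrowed from claim 3, and the synthesis filter bank is replaced by a plain sample-wise sum of the subbands.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class DrcProfile:
    reference_level_dbfs: float         # level the profile is associated with
    subband_gains_db: Sequence[float]   # hypothetical: one DRC gain per subband


def db_to_linear(db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def select_profile(target_dbfs: float, first: DrcProfile, second: DrcProfile,
                   second_range=(-14.0, -8.0)) -> DrcProfile:
    """Use the second profile when the target level lies in its range (assumed rule)."""
    low, high = second_range
    return second if low <= target_dbfs <= high else first


def render(subbands: Sequence[Sequence[float]], profile: DrcProfile,
           target_dbfs: float, dialog_level_dbfs: float) -> List[float]:
    # 1. Modify the subband signals with the profile's DRC gains.
    modified = [[s * db_to_linear(g) for s in band]
                for band, g in zip(subbands, profile.subband_gains_db)]
    # 2. Placeholder for the synthesis filter bank: sum the subbands per sample.
    time_signal = [sum(samples) for samples in zip(*modified)]
    # 3. Use the loudness measure (dialogue level) to reach the target level.
    gain = db_to_linear(target_dbfs - dialog_level_dbfs)
    return [x * gain for x in time_signal]


first = DrcProfile(-31.0, [0.0, 0.0])
second = DrcProfile(-11.0, [-3.0, -6.0])
chosen = select_profile(-11.0, first, second)          # picks `second`
audio = render([[0.5, 0.25], [0.1, 0.0]], chosen, -11.0, -27.0)
```

With a dialogue level of -27 dBFS and a target of -11 dBFS, step 3 applies a 16 dB boost. A practical decoder would also need limiting to avoid clipping after such a boost, as the abstract notes.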

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a European divisional application of European patent application EP 25163661.9 (reference: D10006EP08), for which EPO Form 1001 was filed on 13 March 2025.

TECHNICAL FIELD

The present invention pertains generally to encoding and decoding audio signals and pertains more specifically to techniques that may be used to encode and decode audio signals for a wider range of playback devices and listening environments.

BACKGROUND ART

The increasing popularity of handheld and other types of portable devices has created new opportunities and challenges for the creators and distributors of media content for playback on those devices, as well as for the designers and manufacturers of the devices. Many portable devices are capable of playing back a broad range of media content types and formats including those often associated with high-quality, wide bandwidth and wide dynamic range audio content for HDTV, Blu-ray or DVD. Portable devices may be used to play back this type of audio content either on their own internal acoustic transducers or on external transducers such as headphones; however, they generally cannot reproduce this content with consistent loudness and intelligibility across varying media format and content types.

DISCLOSURE OF INVENTION

The present invention is directed toward providing improved methods for encoding and decoding audio signals for playback on a variety of devices including handheld and other types of portable devices. Various aspects of the present invention are set forth in the independent claims shown below. The various features of the present invention and its preferred embodiments may be better understood by referring to the following discussion and the accompanying drawings in which like reference numerals refer to like elements in the several figures.
The contents of the following discussion and the drawings are set forth as examples only and should not be understood to represent limitations upon the scope of the present invention.

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1 is a schematic block diagram of a playback device.
Fig. 2 is a schematic block diagram of an encoding device.
Figs. 3 to 5 are schematic block diagrams of transcoding devices.
Fig. 6 is a schematic block diagram of a device that may be used to implement various aspects of the present invention.

MODES FOR CARRYING OUT THE INVENTION

A. Introduction

The present invention is directed toward the encoding and decoding of audio information for playback in challenging listening environments such as those encountered by users of handheld and other types of portable devices. A few examples of audio encoding and decoding are described by published standards such as those described in the "Digital Audio Compression Standard (AC-3, E-AC-3)," Revision B, Document A/52B, 14 June 2005 published by the Advanced Television Systems Committee, Inc. (referred to herein as the "ATSC Standard"), and in ISO/IEC 13818-7, Advanced Audio Coding (AAC) (referred to herein as the "MPEG-2 AAC Standard") and ISO/IEC 14496-3, subpart 4 (referred to herein as the "MPEG-4 Audio Standard") published by the International Standards Organization (ISO). The encoding and decoding processes that conform to these standards are mentioned only as examples. Principles of the present invention may be used with coding systems that conform to other standards as well.

The inventors discovered that the available features of devices that conform to some coding standards are often not sufficient for applications and listening environments that are typical of handheld and other types of portable devices.
When these types of devices are used to decode the audio content of encoded input signals that conform to these standards, the decoded audio content is often reproduced at loudness levels that are significantly lower than the loudness levels for audio content obtained by decoding encoded input signals that were specially prepared for playback on these devices. Encoded input signals that conform to the ATSC Standard (referred to herein as "ATSC-compliant encoded signals"), for example, contain encoded audio information and metadata that describe how this information can be decoded. Some of the metadata parameters identify a dynamic range compression profile that specifies how the dynamic range of the audio information may be compressed when the encoded audio information is decoded. The full dynamic range of the decoded signal can be retained or it can be compressed by varying degrees at the time of decoding to satisfy the demands of different applications and listening environments. Other metadata identify some measure of loudness of the encoded audio information such as an average program level or level of dialog in the encoded signal. This metadata may be used by a decoder to adjust amplitudes of the decoded signal to achieve a specified loudness or reference reproduction level during playback. In some applications, one or