JP-2026076303-A - Audio signal downmixing method, audio signal downmixing device, program


Abstract

[Problem] To provide an audio signal downmixing method and an audio signal downmixing device that estimate the phase difference spectrum of two channel signals with less computation than conventional methods and with processing suited to fixed-point arithmetic. [Solution] In the audio signal downmixing device, let u(k) and v(k) be the real part and the imaginary part, respectively, of the product Y(k) of the frequency spectrum X1(k) and the complex conjugate of the frequency spectrum X2(k). Based on the combination of the sign of u(k) and the sign of v(k), the phase difference spectrum estimation unit selects one of a plurality of representative values of the phase difference spectrum and obtains it as the phase difference spectrum φ(k). The representative values lie on the unit circle in the complex plane and have mutually different arguments. [Selection Diagram] Figure 1
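The reason the two signs are enough can be seen from a short derivation. It is a sketch that assumes Y(k) is X1(k) multiplied by the complex conjugate of X2(k), and it introduces the polar angles θ1(k) and θ2(k), which are illustrative symbols that do not appear in the source.

```latex
\begin{align}
X_1(k) &= |X_1(k)|\,e^{j\theta_1(k)}, \qquad
X_2(k) = |X_2(k)|\,e^{j\theta_2(k)} \\
Y(k) &= X_1(k)\,\overline{X_2(k)}
      = |X_1(k)|\,|X_2(k)|\,e^{j(\theta_1(k)-\theta_2(k))}
      = u(k) + j\,v(k) \\
u(k) &= |X_1(k)|\,|X_2(k)|\cos\bigl(\theta_1(k)-\theta_2(k)\bigr), \qquad
v(k) = |X_1(k)|\,|X_2(k)|\sin\bigl(\theta_1(k)-\theta_2(k)\bigr)
\end{align}
```

Because the factor |X1(k)||X2(k)| is never negative, the signs of u(k) and v(k) are the signs of the cosine and sine of the phase difference, so together they identify the quadrant in which the phase difference lies. Computing the exact unit-magnitude value Y(k)/|Y(k)| would instead need a square root and a division per frequency.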

Inventors

守谷健弘
鎌本優
杉浦亮介

Assignees

ＮＴＴ株式会社

Dates

Publication Date: 20260511
Application Date: 20260210

Claims (4)

An audio signal downmixing method comprising: a phase difference spectrum estimation step of estimating, for each frequency k, a phase difference spectrum φ(k) between the frequency spectrum X1(k) of a first-channel input sound signal and the frequency spectrum X2(k) of a second-channel input sound signal; a channel relationship information acquisition step of using the estimated phase difference spectrum φ(k) of each frequency k to obtain a value representing the correlation between the first-channel input sound signal and the second-channel input sound signal, and leading channel information indicating which of the first channel and the second channel is leading; and a downmix step of using the value representing the correlation and the leading channel information to obtain a downmix signal from the first-channel input sound signal and the second-channel input sound signal, wherein, with u(k) and v(k) being the real part and the imaginary part, respectively, of Y(k), the product of the frequency spectrum X1(k) and the complex conjugate of the frequency spectrum X2(k), the phase difference spectrum estimation step selects, as the phase difference spectrum φ(k), one of a plurality of representative values of the phase difference spectrum, which lie on the unit circle in the complex plane and have mutually different arguments, based on the combination of the sign of u(k) and the sign of v(k).
The audio signal downmixing method according to claim 1, wherein the phase difference spectrum estimation step obtains, as the phase difference spectrum φ(k): the representative value of the phase difference spectrum in the first quadrant when the sign of u(k) and the sign of v(k) are both positive; the representative value of the phase difference spectrum in the second quadrant when the sign of u(k) is negative and the sign of v(k) is positive; the representative value of the phase difference spectrum in the third quadrant when the sign of u(k) and the sign of v(k) are both negative; and the representative value of the phase difference spectrum in the fourth quadrant when the sign of u(k) is positive and the sign of v(k) is negative.
An audio signal downmixing device comprising: a phase difference spectrum estimation unit that estimates, for each frequency k, a phase difference spectrum φ(k) between the frequency spectrum X1(k) of a first-channel input sound signal and the frequency spectrum X2(k) of a second-channel input sound signal; a channel relationship information acquisition unit that uses the estimated phase difference spectrum φ(k) of each frequency k to obtain a value representing the correlation between the first-channel input sound signal and the second-channel input sound signal, and leading channel information indicating which of the first channel and the second channel is leading; and a downmixing unit that uses the value representing the correlation and the leading channel information to obtain a downmix signal from the first-channel input sound signal and the second-channel input sound signal, wherein, with u(k) and v(k) being the real part and the imaginary part, respectively, of Y(k), the product of the frequency spectrum X1(k) and the complex conjugate of the frequency spectrum X2(k), the phase difference spectrum estimation unit selects, as the phase difference spectrum φ(k), one of a plurality of representative values of the phase difference spectrum, which lie on the unit circle in the complex plane and have mutually different arguments, based on the combination of the sign of u(k) and the sign of v(k).
A program for causing a computer to execute the audio signal downmixing method described in claim 1.
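
The sign-based selection recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It rests on several assumptions that the claims do not state: the representative values are taken to be the quadrant midpoints (arguments of 45, 135, 225 and 315 degrees), a value of zero is treated as positive, and Y(k) is taken to be X1(k) times the complex conjugate of X2(k). The names `REPRESENTATIVES` and `estimate_phase_difference` are hypothetical.

```python
import math

# Assumed representative values: one point per quadrant on the unit circle,
# placed at the quadrant midpoint. The claims only require points on the unit
# circle with mutually different arguments.
_H = math.sqrt(0.5)
REPRESENTATIVES = {
    (True, True): complex(_H, _H),      # u >= 0, v >= 0: first quadrant
    (False, True): complex(-_H, _H),    # u <  0, v >= 0: second quadrant
    (False, False): complex(-_H, -_H),  # u <  0, v <  0: third quadrant
    (True, False): complex(_H, -_H),    # u >= 0, v <  0: fourth quadrant
}


def estimate_phase_difference(x1, x2):
    """Return a coarse unit-magnitude phase difference for each frequency bin.

    x1 and x2 are sequences of complex spectrum values X1(k) and X2(k).
    """
    phi = []
    for a, b in zip(x1, x2):
        # Real and imaginary parts of Y(k) = X1(k) * conj(X2(k)), written out
        # so that only multiplications, additions and subtractions are needed.
        u = a.real * b.real + a.imag * b.imag
        v = a.imag * b.real - a.real * b.imag
        # Only the signs are inspected. Zero counts as positive here, which is
        # an assumption of this sketch.
        phi.append(REPRESENTATIVES[(u >= 0, v >= 0)])
    return phi
```

In this sketch no square root, division or arctangent is evaluated per frequency; the lookup depends only on two sign tests, which is the kind of operation that maps directly onto fixed-point arithmetic.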

Description

This invention relates to a technique for obtaining the phase difference spectrum of two channel signals so that the two channel signals can be mixed, encoded, or processed using the relationship between them.
Patent Document 1 describes one technique for obtaining the phase difference spectrum of two-channel audio signals. It is mainly concerned with mixing audio signals of multiple channels into a single audio signal. Specifically, it first obtains a value representing the magnitude of the correlation between the two-channel input audio signals and determines which of the two channels is leading. It then obtains a downmix signal by weighting and adding the two input audio signals so that the larger the correlation value, the larger the share of the leading channel's input audio signal in the downmix.
To determine which of the two channels is leading, Patent Document 1 obtains the time difference between the two-channel audio signals. As one example, it obtains the phase difference spectrum of the two-channel audio signals in the frequency domain, obtains a phase difference signal for each candidate time difference by applying an inverse Fourier transform to the phase difference spectrum, and takes the candidate time difference with the largest phase difference signal as the time difference between the two channels. Because this technique uses the phase difference spectrum at each frequency, it obtains the time difference while suppressing the influence of the harmonic structure and pitch components of the audio signals.
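The conventional time-difference estimation outlined above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact formulation of Patent Document 1: the phase difference spectrum is taken to be X1(k) times the complex conjugate of X2(k), normalized to unit magnitude; the transform is a direct DFT over one short frame; and delays are circular. The names `dft`, `estimate_time_difference` and `max_shift` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) discrete Fourier transform, adequate for a short frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def estimate_time_difference(x1, x2, max_shift):
    """Return the candidate shift whose phase difference signal is largest.

    Sign convention of this sketch: a negative result means channel 2 is a
    delayed copy of channel 1, that is, channel 1 is leading.
    """
    n = len(x1)
    spec1, spec2 = dft(x1), dft(x2)

    # Phase difference spectrum: unit-magnitude version of X1(k) * conj(X2(k)).
    # The normalization needs a magnitude (square root) and a division per bin.
    phi = []
    for a, b in zip(spec1, spec2):
        y = a * b.conjugate()
        mag = abs(y)
        phi.append(y / mag if mag > 1e-12 else 0j)

    # Evaluate the inverse transform only at the candidate time differences
    # and keep the candidate with the largest magnitude.
    best_tau, best_val = 0, -1.0
    for tau in range(-max_shift, max_shift + 1):
        psi = sum(
            p * cmath.exp(2j * math.pi * k * tau / n) for k, p in enumerate(phi)
        ) / n
        if abs(psi) > best_val:
            best_tau, best_val = tau, abs(psi)
    return best_tau
```

Discarding the magnitude of each bin is what keeps strong harmonic or pitch components from dominating the result. It is also where the per-frequency square root and division come from in this sketch.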
In other words, the technique of Patent Document 1 for obtaining the phase difference spectrum of two audio signals is useful for obtaining the time difference between the two audio signals, for determining which of the two audio signals is leading, and for mixing, encoding, or otherwise processing signals using any of these relationships between the two audio signals.
Patent Document 1: International Publication No. 2021/181974
This is a block diagram showing the audio signal downmixing device 100 according to the first and second embodiments.
This is a flowchart showing the processing of the audio signal downmixing device 100 according to the first and second embodiments.
This figure illustrates the representative values in the first example of the phase difference spectrum estimation unit 122.
This figure illustrates the representative values of the first quadrant in the second example of the phase difference spectrum estimation unit 122.
This is a block diagram showing the inter-channel relationship information estimation device 120 according to the third embodiment.
This is a flowchart showing the processing of the inter-channel relationship information estimation device 120 according to the third embodiment.
This is a block diagram showing the phase difference spectrum estimation device 200 according to the fourth embodiment.
This is a flowchart showing the processing of the phase difference spectrum estimation device 200 according to the fourth embodiment.
This is a block diagram showing the signal encoding device 300 according to the fifth embodiment.
This is a flowchart showing the processing of the signal encoding device 300 according to the fifth embodiment.
This is a block diagram showing the signal processing device 400 according to the sixth embodiment.
This is a flowchart showing the processing of the signal processing device 400 according to the sixth embodiment.
This figure shows an example of the functional configuration of a computer that implements each device in the embodiments of the present invention.
<First Embodiment> The first embodiment describes an example in which the phase difference spectrum estimation process of the present invention is applied to an audio signal downmixing device. The device downmixes a first-channel input audio signal and a second-channel input audio signal while taking the relationship between them into account, in order to obtain a monaural signal useful for signal processing such as encoding. Two-channel audio signals subject to signal processing such as encoding are often digital audio signals obtained by A/D conversion of sounds picked up by a left-channel microphone and a right-channel microphone placed in a certain space. In this case, the input to a device that performs signal processing such as encoding consists of a first-channel input audio signal, which is the digital audio signal obtained by A/D conversion of the sound picked up by the left-channel microphone placed in the space, and a second-channel input audio signal, which is the digital audio signal obtained by A/D conversion of the sound picked up by the right-channel microphone placed in the space. These firs