US-20260129395-A1 - EFFICIENT MODELING OF FILTERS
Abstract
A method for modelling of a set of filters is provided. The method comprises acquiring a set of feature values each of which is associated with an index within an index range of the filters and dividing the index range into multiple segments using the acquired set of feature values. The method also comprises determining a filter model for at least one segment of the multiple segments and outputting the determined filter model.
Inventors
- Mengqiu ZHANG
- Erlendur Karlsson
Assignees
- Telefonaktiebolaget LM Ericsson (publ)
Dates
- Publication Date
- 20260507
- Application Date
- 20251229
Claims (20)
- 1. A method for modelling of a set of head-related (HR) filters to be used for rendering a spatial audio signal, the method comprising: obtaining an HR filter dataset, Θ, and data analysis specification, X; acquiring a set of feature values specified in the data analysis specification, X, each of the feature values being associated with an index of a filter tap within an index range of the filter taps of the HR filters in Θ; clustering the feature values into a plurality of clusters, and dividing the index range into multiple segments using the plurality of clusters; determining a filter model for at least one segment of the multiple segments; and outputting the determined model of the filter set.
- 2. The method of claim 1, wherein the acquiring of the set of feature values comprises calculating a feature value associated with each index included in the index range.
- 3. The method of claim 2, wherein the feature value associated with each index included in the index range is calculated using a mathematical value associated with filter values obtained at a plurality of sample angles of head-related transfer functions.
- 4. The method of claim 3, wherein the mathematical value is: a mean value of the filter values obtained at a plurality of sample angles, a maximum value among the filter values obtained at a plurality of sample angles, a minimum value among the filter values obtained at a plurality of sample angles, or a variance value of the filter values obtained at a plurality of sample angles.
- 5. The method of claim 1, wherein clustering the feature values into a plurality of clusters, and dividing the index range into the multiple segments comprises: comparing each feature value included in the set of feature values to a threshold value; and dividing the index range into the multiple segments based on the comparison of each feature value to the threshold value.
- 6. The method of claim 5, wherein the method further comprises: analyzing a distribution of the feature values along the index range; obtaining a feature amount value indicating a particular number of feature values to be included in a particular segment of the index range; and setting the threshold value such that the number of feature values that are greater than or equal to the threshold value is greater than or equal to the feature amount value.
- 7. The method of claim 1, wherein dividing the index range into the multiple segments comprises dividing the index range into a first segment and a second segment, and determining the filter model for said at least one segment comprises determining a first filter model for the first segment and a second filter model for the second segment.
- 8. The method of claim 7, wherein the first filter model and/or the second filter model is a function of basis functions, and the number of basis functions for the first filter model is different from the number of basis functions for the second filter model.
- 9. The method of claim 7, wherein the first filter model and/or the second filter model is a function of basis functions, and the order of the basis functions for the first filter model is different from the order of the basis functions for the second filter model.
- 10. The method of claim 7, wherein the first filter model and/or the second filter model is a function of basis functions, and the order of the basis functions for the first filter model and the order of the basis functions for the second filter model are the same.
- 11. The method of claim 7, wherein the method further comprises: calculating a first variability level for the first segment; and calculating a second variability level for the second segment, the first filter model is determined for the first segment based on the first variability level, and the second filter model is determined for the second segment based on the second variability level.
- 12. The method of claim 11, wherein the first variability level is determined based on one or more feature values associated with the first segment, and the second variability level is determined based on one or more feature values associated with the second segment.
- 13. The method of claim 1, wherein the method further comprises obtaining a set of segmented datasets including a first segmented dataset and a second segmented dataset, the first segmented dataset comprises a first set of segmented filter parameters associated with a first segment of the multiple segments, the second segmented dataset comprises a second set of segmented filter parameters associated with a second segment of the multiple segments, and the first segment and the second segment do not overlap each other.
- 14. An apparatus, the apparatus comprising: memory; and processing circuitry, wherein the apparatus is configured to perform a method for modelling of a set of head-related (HR) filters to be used for rendering a spatial audio signal, the method comprising: obtaining an HR filter dataset, Θ, and data analysis specification, X; acquiring a set of feature values specified in the data analysis specification, X, each of the feature values being associated with an index of a filter tap within an index range of the filter taps of the HR filters in Θ; clustering the feature values into a plurality of clusters, and dividing the index range into multiple segments using the plurality of clusters; determining a filter model for at least one segment of the multiple segments; and outputting the determined model of the filter set.
- 15. The apparatus of claim 14, wherein the acquiring of the set of feature values comprises calculating a feature value associated with each index included in the index range.
- 16. The apparatus of claim 15, wherein the feature value associated with each index included in the index range is calculated using a mathematical value associated with filter values obtained at a plurality of sample angles of head-related transfer functions.
- 17. The apparatus of claim 16, wherein the mathematical value is: a mean value of the filter values obtained at a plurality of sample angles, a maximum value among the filter values obtained at a plurality of sample angles, a minimum value among the filter values obtained at a plurality of sample angles, or a variance value of the filter values obtained at a plurality of sample angles.
- 18. The apparatus of claim 14, wherein clustering the feature values into a plurality of clusters, and dividing the index range into the multiple segments comprises: comparing each feature value included in the set of feature values to a threshold value; and dividing the index range into the multiple segments based on the comparison of each feature value to the threshold value.
- 19. The apparatus of claim 18, wherein the method further comprises: analyzing a distribution of the feature values along the index range; obtaining a feature amount value indicating a particular number of feature values to be included in a particular segment of the index range; and setting the threshold value such that the number of feature values that are greater than or equal to the threshold value is greater than or equal to the feature amount value.
- 20. The apparatus of claim 14, wherein dividing the index range into the multiple segments comprises dividing the index range into a first segment and a second segment, and determining the filter model for said at least one segment comprises determining a first filter model for the first segment and a second filter model for the second segment.
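The claimed method can be illustrated with a minimal numerical sketch. Everything below is an illustration, not the patented implementation: the random dataset, the variance feature (one of the alternatives listed in claim 4), the median threshold, the two-way segmentation, and the polynomial basis with differing orders per segment (claims 7-9) are all assumed choices made for the example.

```python
import numpy as np

# Hypothetical toy dataset standing in for the HR filter dataset: FIR filter
# taps for one ear, shape (num_angles, num_taps). A decaying envelope mimics
# the fact that later taps of an HRIR carry less energy.
rng = np.random.default_rng(0)
num_angles, num_taps = 100, 64
hr_filters = rng.standard_normal((num_angles, num_taps)) * np.exp(-np.arange(num_taps) / 8.0)

# Claims 2-4: one feature value per tap index; here the variance of the filter
# values over the sampled angles (mean/max/min are the stated alternatives).
features = hr_filters.var(axis=0)

# Claims 5-6: split the index range into two clusters by comparing each
# feature value to a threshold derived from the feature distribution.
threshold = np.median(features)
high_var = features >= threshold  # boolean mask over the tap index range

# Claims 7-9: a separate filter model per segment. Here each model is a
# least-squares fit of the angle-averaged filter on a polynomial basis, with
# more basis functions spent on the high-variability segment.
def fit_segment(mask, order):
    """Fit the mean filter over angles on a polynomial basis of given order."""
    idx = np.arange(num_taps)[mask]
    target = hr_filters[:, mask].mean(axis=0)
    return np.polyfit(idx, target, deg=order)

model_high = fit_segment(high_var, order=8)  # richer model where variability is high
model_low = fit_segment(~high_var, order=3)  # coarser model elsewhere

print(len(model_high), len(model_low))  # → 9 4
```

The point of the segmentation is visible in the last two lines: model order (and hence storage and evaluation cost) is allocated per segment instead of uniformly across the whole index range.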
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 18/690,503, having a section 371 date of 2024 Mar. 8, which is the section 371 National Stage of International Patent Application No. PCT/EP2022/074787, filed 2022 Sep. 7, which claims priority to U.S. Provisional Application No. 63/242,223, filed 2021 Sep. 9. The above identified applications are incorporated by this reference.

TECHNICAL FIELD

This disclosure relates to methods and apparatus for efficient modeling of filters.

BACKGROUND

We are equipped with two ears that capture sound waves propagating towards us. FIG. 1 illustrates a sound wave propagating towards a listener from a direction of arrival (DOA) specified by a pair of elevation and azimuth angles in the spherical coordinate system. On the propagation path towards us each sound wave interacts with our upper torso, head, outer ears, and the surrounding matter before reaching our left and right ear drums. This interaction results in temporal and spectral changes of the waveforms reaching the left and right eardrums, some of which are DOA dependent. Our auditory system has learned to interpret these changes to infer various spatial characteristics of the sound wave itself as well as the acoustic environment in which the listener finds himself/herself. This capability is called spatial hearing, which concerns how we evaluate spatial cues embedded in the binaural signal (i.e., the sound signals in the right and the left ear canals) to infer the location of an auditory event elicited by a sound event (a physical sound source) and acoustic characteristics caused by the physical environment (e.g., small room, tiled bathroom, auditorium, cave) we are in. This human capability, spatial hearing, can in turn be exploited to create a spatial audio scene by reintroducing the spatial cues in the binaural signal that would lead to a spatial perception of a sound.
The main spatial cues include 1) angular-related cues: binaural cues, i.e., the interaural level difference (ILD) and the interaural time difference (ITD), and monaural (or spectral) cues; 2) distance-related cues: intensity and direct-to-reverberant (D/R) energy ratio. FIG. 2 illustrates an example of ITD and spectral cues of a sound wave propagating towards a listener. The two plots illustrate the magnitude responses of a pair of HR filters obtained at an elevation of 0 degrees and an azimuth of 40 degrees. (The data is from the CIPIC database, subject ID 28. The database is publicly available and can be accessed at www.ece.ucdavis.edu/cipic/spatial-sound/hrtf-data/.)

A mathematical representation of the short-time, DOA-dependent temporal and spectral changes (1-5 msec) of the waveform is given by the so-called HR filters. The frequency domain (FD) representations of those filters are the so-called head-related transfer functions (HRTFs) and the time domain (TD) representations are the head-related impulse responses (HRIRs). An HR filter based binaural rendering approach has been gradually established, where a spatial audio scene is generated by directly filtering audio source signals with a pair of HR filters of desired locations. This approach is particularly attractive for many emerging applications, e.g., virtual reality (VR), augmented reality (AR), mixed reality (MR), or extended reality (XR), and mobile communication systems, where headsets are commonly used. HR filters are often estimated from measurements as the impulse response of a linear dynamic system that transforms the original sound signal (input signal) into the left and right ear signals (output signals) that can be measured inside the ear canals of a listening subject at a predefined set of elevation and azimuth angles on a spherical surface of constant radius from a listening subject (e.g., an artificial head, a manikin/mannequin or human subjects).
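The HR filter based rendering described above reduces, for a single source and direction, to convolving the source signal with the left and right filters of the pair. A minimal sketch, with an entirely synthetic source and HRIR pair (the signal, filter length, and the crude 8-sample interaural delay are illustrative assumptions, not measured data):

```python
import numpy as np

# Hypothetical mono source: a 100 ms, 440 Hz tone at 48 kHz.
fs = 48_000
t = np.arange(fs // 10) / fs
source = np.sin(2 * np.pi * 440.0 * t)

# Stand-in HR impulse response pair for one direction. The right ear gets a
# crude ITD-like delay of 8 samples relative to the left (illustrative only).
rng = np.random.default_rng(1)
taps = 128
hrir_left = rng.standard_normal(taps) * np.exp(-np.arange(taps) / 16.0)
hrir_right = np.roll(hrir_left, 8)

# Binaural rendering for this direction: filter the source with each HRIR.
left = np.convolve(source, hrir_left)
right = np.convolve(source, hrir_right)
binaural = np.stack([left, right])  # 2 x N binaural signal

print(binaural.shape)
```

In practice the filters would come from a measured set such as the CIPIC database, and a full renderer would select (or interpolate) the pair matching the desired elevation and azimuth before filtering.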
The estimated HR filters are often provided as finite impulse response (FIR) filters and can be used directly in that format. To achieve an efficient binaural rendering, a pair of HRTFs may be converted to an Interaural Transfer Function (ITF) or a modified ITF to prevent abrupt spectral peaks. Alternatively, HRTFs may be described by a parametric representation. Such parameterized HRTFs are easy to integrate with parametric multichannel audio coders, e.g., MPEG surround and Spatial Audio Object Coding (SAOC). Rendering a spatial audio signal to provide a convincing spatial perception of a sound at an arbitrary location in space requires a pair of HR filters at the corresponding location, and therefore, a set of HR filters at finely sampled locations on a two dimensional (2D) sphere is needed. The minimum audible angle (MAA) characterizes the sensitivity of our auditory system to an angular displacement of a sound event. Regarding localization in azimuth, the MAA was reported to be the smallest in the front and back (about 1 degree), and much greater for lateral sound sources (about 1