US-12621629-B2 - HRTF determination using a headset and in-ear devices
Abstract
Techniques for determining personalized head-related transfer functions (HRTFs) using a head-mounted device and in-ear devices include: receiving, from a sensor array of the head-mounted device, a first sound signal associated with a sound from a sound source in a local environment of a user of the head-mounted device; determining that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria based on the first sound signal; determining that the sound source is stationary within a time period; determining a relative location of the sound source with respect to the user; receiving, from an in-ear device in an ear of the user, a second sound signal associated with the sound from the sound source; and determining, based on at least the second sound signal, an HRTF or one or more parameters of the HRTF associated with the relative location of the sound source for the user.
Inventors
- Andrew Francl
- Tobias Daniel Kabzinski
- Hao Lu
- Antje Ihlefeld
- William Owen Brimijoin, II
Assignees
- Meta Platforms Technologies, LLC
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-03-06
Claims (20)
- 1. A method comprising: receiving, from a sensor array of a head-mounted device, a first sound signal associated with a sound from a sound source in a local environment of a user of the head-mounted device; determining, based on the first sound signal, that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria; determining that the sound source is stationary within a time period; determining a relative location of the sound source with respect to the user; receiving, from an in-ear device in an ear of the user, a second sound signal associated with the sound from the sound source; and determining, based on at least the second sound signal, a head-related transfer function (HRTF) or one or more parameters of the HRTF associated with the relative location of the sound source for the user, the determining of the HRTF including, at least in part, determining a reference sound signal based on the first sound signal and the determined relative location of the sound source.
- 2. The method of claim 1, wherein determining the relative location of the sound source with respect to the user includes determining an azimuth angle of the sound source, an elevation angle of the sound source, or a combination thereof with respect to the user.
- 3. The method of claim 1, wherein determining the relative location of the sound source with respect to the user includes: determining a direction of arrival of the sound based on the first sound signal from the sensor array and locations of two or more sensors in the sensor array; determining the relative location of the sound source with respect to the user based on images captured by one or more cameras on the head-mounted device; or a combination thereof.
- 4. The method of claim 3, wherein determining the relative location of the sound source with respect to the user includes determining a confidence level of the determined relative location of the sound source with respect to the user.
- 5. The method of claim 1, wherein determining the HRTF or the one or more parameters of the HRTF associated with the relative location of the sound source for the user further includes: determining the HRTF or the one or more parameters of the HRTF based on a spectrum of the reference sound signal and a spectrum of the second sound signal.
- 6. The method of claim 5, wherein determining the reference sound signal includes beamforming in a direction of the relative location of the sound source based on the first sound signal.
- 7. The method of claim 1, further comprising determining, based on data from one or more position sensors of the head-mounted device, a relative position of the torso of the user with respect to the head of the user.
- 8. The method of claim 1, further comprising saving the HRTF or the one or more parameters of the HRTF and the relative location of the sound source to a data store that stores a plurality of HRTFs for the user.
- 9. The method of claim 1, wherein the reverberation characteristics and spectral characteristics of the sound include a signal-to-noise ratio, a frequency range, a reverberation level, a reverberation time, or a combination thereof.
- 10. The method of claim 1, further comprising generating a model or a look-up table for mapping the relative location of the sound source to the one or more parameters of the HRTF.
- 11. The method of claim 1, wherein the one or more parameters of the HRTF include parameters of one or more filters or frequency scaling factors for implementing the HRTF.
- 12. The method of claim 1, further comprising performing operations of the method of claim 1 iteratively to determine HRTFs or parameters of the HRTFs associated with a plurality of sound source directions with respect to the user.
- 13. The method of claim 1, wherein the time period is greater than 10 milliseconds.
- 14. A system comprising: an in-ear device configured to generate a first sound signal associated with a sound from a sound source in a local environment of a user; and a head-mounted device comprising: a sensor array configured to generate a second sound signal associated with the sound; and an audio controller configured to: determine, based on the second sound signal, that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria; determine that the sound source is stationary within a time period; determine a relative location of the sound source with respect to the user; and determine, based on at least the first sound signal, a head-related transfer function (HRTF) or one or more parameters of the HRTF associated with the relative location of the sound source for the user, the determining of the HRTF including, at least in part, determining a reference sound signal based on the second sound signal and the determined relative location of the sound source.
- 15. The system of claim 14, wherein the audio controller is configured to determine an azimuth angle of the sound source, an elevation angle of the sound source, or a combination thereof with respect to the user.
- 16. The system of claim 14, wherein the audio controller is configured to determine the relative location of the sound source with respect to the user by performing operations including: determining a direction of arrival of the sound based on the second sound signal from the sensor array and locations of two or more sensors in the sensor array; determining the relative location of the sound source with respect to the user based on images captured by one or more cameras on the head-mounted device; or a combination thereof.
- 17. The system of claim 14, wherein the audio controller is configured to determine the HRTF or the one or more parameters of the HRTF associated with the relative location of the sound source for the user by performing operations including: determining the HRTF or the one or more parameters of the HRTF based on a spectrum of the reference sound signal and a spectrum of the first sound signal.
- 18. The system of claim 17, wherein the reference sound signal is a sound signal at a center of the head of the user determined by beamforming in a direction of the relative location of the sound source based on the second sound signal.
- 19. The system of claim 14, wherein the one or more parameters of the HRTF include parameters of one or more filters or frequency scaling factors for implementing the HRTF.
- 20. A system comprising: one or more processors; and one or more processor-readable media storing instructions which, when executed by the one or more processors, cause the one or more processors to: receive, from a sensor array of a head-mounted device, a first sound signal associated with a sound from a sound source in a local environment of a user of the head-mounted device; determine, based on the first sound signal, that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria; determine that the sound source is stationary within a time period; determine a relative location of the sound source with respect to the user; receive, from an in-ear device in an ear of the user, a second sound signal associated with the sound from the sound source; and determine, based on at least the second sound signal, a head-related transfer function (HRTF) or one or more parameters of the HRTF associated with the relative location of the sound source for the user, the determining of the HRTF including, at least in part, determining a reference sound signal based on the first sound signal and the determined relative location of the sound source.
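Claims 5, 6, 17, and 18 describe determining the HRTF from a spectrum of a reference sound signal (obtained by beamforming the sensor-array signal toward the sound source) and a spectrum of the signal captured at the ear. A minimal sketch of that spectral-ratio step is shown below; the function names, framing parameters, and Welch-style spectrum averaging are illustrative assumptions, not details from the patent.

```python
import numpy as np

def _avg_power_spectrum(x, n_fft):
    """Average windowed power spectrum over non-overlapping frames."""
    n_frames = len(x) // n_fft
    frames = x[: n_frames * n_fft].reshape(n_frames, n_fft)
    window = np.hanning(n_fft)
    spectra = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    return spectra.mean(axis=0)

def estimate_hrtf_magnitude(reference, in_ear, fs, n_fft=1024, eps=1e-12):
    """Estimate an HRTF magnitude response as the ratio of the in-ear
    spectrum to the reference spectrum (hypothetical helper).

    reference: reference sound signal, e.g. derived from the headset
               sensor array by beamforming toward the sound source.
    in_ear:    signal captured by the in-ear device's microphone.
    Returns (freqs_hz, magnitude), one value per rFFT bin.
    """
    ref_psd = _avg_power_spectrum(np.asarray(reference, float), n_fft)
    ear_psd = _avg_power_spectrum(np.asarray(in_ear, float), n_fft)
    freqs_hz = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # eps regularizes bins where the reference carries little energy.
    magnitude = np.sqrt(ear_psd / (ref_psd + eps))
    return freqs_hz, magnitude
```

Dividing averaged power spectra rather than instantaneous spectra reduces the variance of the estimate for noise-like everyday sources; in practice a frequency-dependent regularization floor (here just `eps`) keeps the division stable in bands where the reference has little energy.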
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of and priority to U.S. Provisional Application No. 63/488,895, filed Mar. 7, 2023, entitled “HRTF DETERMINATION USING A HEADSET AND IN-EAR DEVICES,” which is hereby incorporated by reference in its entirety.

BACKGROUND

An artificial reality system, such as a head-mounted display (HMD) or heads-up display (HUD) system, generally includes a near-eye display system in the form of a headset or a pair of glasses configured to present content to a user via an electronic or optic display positioned, for example, about 10-20 mm in front of the user's eyes. The near-eye display system may display virtual objects or combine images of real objects with virtual objects, as in virtual reality (VR), augmented reality (AR), or mixed reality (MR) applications. A near-eye display generally includes an optical system configured to form an image of a computer-generated image displayed by an image source (e.g., a display panel). For example, the optical system may relay the image generated by the image source to create a virtual image that appears to be farther away than just a few centimeters from the user's eyes. In addition to displaying virtual images at target image planes, AR/VR systems may also need spatial or three-dimensional (3D) sound rendering, such that the user perceives the sound of a virtual object as originating from the virtual object's target location, to enhance the immersive user experience. Personalized transfer functions describing the way sound interacts with the user's head and torso before reaching the user's ear canals may be used to render high-fidelity spatial sound.

SUMMARY

This disclosure relates generally to determining head-related transfer functions (HRTFs), and more specifically, to determining HRTFs or HRTF parameters using a head-mounted device (e.g., a headset) and in-ear devices.
Various inventive embodiments are described herein, including devices, systems, methods, structures, processes, and the like.

According to certain embodiments disclosed herein, a method may include: receiving, from a sensor array of a head-mounted device, a first sound signal associated with a sound from a sound source in a local environment of a user of the head-mounted device; determining, based on the first sound signal, that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria; determining that the sound source is stationary within a time period; determining a relative location of the sound source with respect to the user; receiving, from an in-ear device in an ear of the user, a second sound signal associated with the sound from the sound source; and determining, based on at least the second sound signal, a head-related transfer function (HRTF) or one or more parameters of the HRTF associated with the relative location of the sound source for the user.

According to certain embodiments disclosed herein, a system for HRTF measurement may include an in-ear device and a head-mounted device. The in-ear device may be configured to generate a first sound signal associated with a sound from a sound source in a local environment of a user. The head-mounted device may include a sensor array configured to generate a second sound signal associated with the sound, and an audio controller. The audio controller may be configured to: determine, based on the second sound signal, that reverberation characteristics and spectral characteristics of the sound meet predetermined criteria; determine that the sound source is stationary within a time period; determine a relative location of the sound source with respect to the user; and determine, based on at least the first sound signal, a head-related transfer function (HRTF) or one or more parameters of the HRTF associated with the relative location of the sound source for the user.
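The relative-location step summarized above can be sketched for a two-microphone sub-array using generalized cross-correlation with phase transform (GCC-PHAT) on the sensor-array signal. The patent only requires a sensor array and known sensor locations, so the pair geometry, function name, and parameters below are illustrative assumptions rather than the disclosed implementation.

```python
import numpy as np

def estimate_azimuth_deg(mic_a, mic_b, fs, mic_distance_m, c=343.0):
    """Estimate source azimuth from the time difference of arrival (TDOA)
    between two microphones, via GCC-PHAT (hypothetical helper).

    Returns the azimuth in degrees relative to broadside (0 degrees means
    the source is equidistant from both microphones); the sign convention
    depends on which channel is passed as mic_a.
    """
    n = len(mic_a) + len(mic_b)
    # Cross-power spectrum with PHAT weighting: keep phase, drop magnitude.
    cross = np.fft.rfft(mic_a, n) * np.conj(np.fft.rfft(mic_b, n))
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n)
    # Only lags within +/- (mic spacing / speed of sound) are physical.
    max_lag = max(1, int(fs * mic_distance_m / c))
    cc = np.concatenate((cc[-max_lag:], cc[: max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    tdoa = lag / fs
    sin_theta = np.clip(tdoa * c / mic_distance_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))
```

A real headset array would combine TDOAs from several microphone pairs (and, per claim 3, optionally camera images) to resolve both azimuth and elevation and to attach a confidence level to the estimate.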
This summary is neither intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to determine the scope of the claimed subject matter. The subject matter should be understood by reference to appropriate portions of the entire specification of this disclosure, any or all drawings, and each claim. The foregoing, together with other features and examples, will be described in more detail below in the following specification, claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Illustrative embodiments are described in detail below with reference to the following figures.

FIG. 1 is a perspective view of an example of a near-eye display in the form of a pair of glasses for implementing some of the examples disclosed herein.

FIG. 2 is a perspective view of an example of a near-eye display in the form of a head-mounted display (HMD) device for implementing some of the examples disclosed herein.

FIG. 3 is a block diagram