US-12627937-B1 - Open ear system using artificial intelligence (AI) driven audio signal processing
Abstract
A system and associated processes include a left and right eyewear stem, each including a microphone array comprising a plurality of microphones, and a digital hearing aid that receives content from the microphone array and extracts a desired signal from the content, the digital hearing aid applying frequency-dependent gain to the desired signal to compensate for a user's hearing loss profile. The system further includes processes that receive the desired signal modified with the frequency-dependent gains to acoustically render the desired modified signal proximate to the ear of the user without anything physical being placed within an entrance to an ear canal. A rechargeable battery may be included, along with a front face of the eyewear configured to hold a pair of eyewear lenses, where the front face lacks an electrical conductor connecting the right and left eyewear stem.
Inventors
- Mehul Trivedi
- Anand Parthasarathi
Assignees
- Legato Audio, Inc.
Dates
- Publication Date
- 20260512
- Application Date
- 20250409
Claims (20)
- 1 . An apparatus, comprising: a left and right eyewear stem, each comprising: a microphone array comprising a plurality of microphones; one or more processors containing an inference model, wherein at least one of the processors receives content from the microphone array and processes the content through the inference model to extract a desired signal from the content, the one or more processors further applies frequency-dependent gain to the desired signal to compensate for a user's hearing loss profile, and the one or more processors expands the desired signal in one or more frequency bands if the desired signal meets an expansion criteria and compresses the desired signal in one or more frequency bands if the desired signal meets a compression criteria; circuitry that receives the desired signal modified with the frequency-dependent gains to acoustically render the desired modified signal proximate to an ear of a user without anything physical being placed within an entrance to an ear canal; a battery; and a front face capable of holding a pair of eyewear lenses, wherein the front face lacks an electrical conductor connecting the right and left eyewear stem.
- 2 . The apparatus of claim 1 , wherein the at least one processor comprises a neural network loaded with the inference model and configured to process content through the inference model at a power consumption of less than or equal to 5 mW.
- 3 . The apparatus of claim 1 , wherein the one or more processors in each of the left and right eyewear stems expands the desired signal based at least in part on a probability provided by the one or more processors that the desired signal is speech.
- 4 . The apparatus of claim 1 , wherein the one or more processors in each of the left and right eyewear stems expands the desired signal in a high frequency band that is centered at a frequency equal to or greater than 1000 Hz based at least in part on a probability provided by the processor that content in the desired signal in the high frequency band is speech.
- 5 . The apparatus of claim 4 , wherein the one or more processors in each of the left and right eyewear stems compresses the desired signal based at least in part on the level of the desired signal being above a predetermined threshold in one or more frequency bands.
- 6 . The apparatus of claim 5 , wherein the one or more processors in each of the left and right eyewear stems compresses the desired signal in at least one low frequency band centered at less than 1000 Hz responsive to the desired signal being above the predetermined threshold sound pressure level in that band.
- 7 . The apparatus of claim 1 , wherein the microphone arrays in each of the left and right eyewear stem are configured to form a first beamforming pattern comprising: a narrow beamforming pattern that is at an angle that is less than or equal to 180 degrees normal to the front face; and an omnidirectional beamforming pattern that is 360 degrees normal to the front face.
- 8 . The apparatus of claim 7 , wherein the apparatus automatically switches to: the omnidirectional beamforming pattern in a low noise environment below a specified noise level; and the narrow beamforming pattern in a high noise environment above a specified noise level.
- 9 . The apparatus of claim 1 , wherein the front face is interchangeable with a replacement front face that is capable of being connected to the left and right eyewear stems.
- 10 . The apparatus of claim 1 , wherein the one or more processors in each of the left and right eyewear stems compresses based at least in part on the level of the desired signal being above a predetermined threshold.
- 11 . The apparatus of claim 1 , further comprising: an environmental classifier, the environmental classifier configured to detect at least an ambient sound level of a user's environment, wherein the one or more processors of the left and right eyewear stems are configured to automatically adjust an operation of the processors in response to a change in the ambient sound level.
- 12 . The apparatus of claim 11 , wherein the processors of each of the eyewear stem adjusts a volume in response to the change in the ambient sound level.
- 13 . The apparatus of claim 11 , wherein the processors of each of the eyewear stem further shifts an expansion setting from a first frequency band to a second frequency band when the ambient sound level exceeds a predetermined amount, wherein the second frequency band is centered at a frequency higher than a center frequency of the first frequency band.
- 14 . An apparatus, comprising: a left and right eyewear stem, each comprising: a microphone array comprising a plurality of microphones; one or more processors containing an inference model, wherein at least one of the processors receives content from the microphone array and processes the content through the inference model to extract a desired signal from the content, the one or more processors further applies frequency-dependent gain to the desired signal to compensate for a user's hearing loss profile; circuitry that receives the desired signal modified with the frequency-dependent gains to acoustically render the desired modified signal proximate to an ear of a user without anything physical being placed within an entrance to an ear canal; a battery; a self-voice sensor for detecting a voice signal from the user, wherein detection of the voice signal from the user causes the circuitry to suppress acoustic output of the user's voice; and a front face capable of holding a pair of eyewear lenses, wherein the front face lacks an electrical conductor connecting the right and left eyewear stem.
- 15 . The apparatus of claim 14 , wherein suppression of the acoustic output of the user's voice comprises the circuitry reducing or preventing all the acoustic output while the voice signal from the user is detected.
- 16 . The apparatus of claim 14 , wherein the one or more processors in each of the left and right eyewear stems is further configured to expand the desired signal in a frequency band if the desired signal meets an expansion criteria.
- 17 . The apparatus of claim 16 , wherein the one or more processors in each of the left and right eyewear stems expands the desired signal based at least in part on a probability provided by the one or more processors that the desired signal is speech.
- 18 . The apparatus of claim 16 , wherein the one or more processors in each of the left and right eyewear stems expands the desired signal in a high frequency band that is centered at a frequency equal to or greater than 1000 Hz based at least in part on a probability provided by the processor that content in the desired signal in the high frequency band is speech.
- 19 . An apparatus, comprising: a left and right eyewear stem, each comprising: a microphone array comprising a plurality of microphones; one or more processors containing an inference model, wherein at least one of the processors receives content from the microphone array and processes the content through the inference model to extract a desired signal from the content, the one or more processors applying frequency-dependent gain to the desired signal to compensate for a user's hearing loss profile; circuitry that receives the desired signal modified with the frequency-dependent gains to acoustically render the desired modified signal proximate to an ear of a user without anything physical being placed within an entrance to an ear canal; a battery; an environmental classifier, the environmental classifier configured to detect at least an ambient sound level of a user's environment, wherein the processors of the left and right eyewear stems are configured to automatically adjust an operation of the processors in response to a change in the ambient sound level; and a front face capable of holding a pair of eyewear lenses, wherein the front face lacks an electrical conductor connecting the right and left eyewear stem.
- 20 . The apparatus of claim 19 , wherein the one or more processors of each of the eyewear stem adjusts a volume in response to a change in the ambient sound level.
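The per-band dynamics and beam-pattern selection recited in claims 1-8 can be sketched in code. This is an illustrative sketch, not an implementation from the patent: the function names, the speech-probability criterion (0.7), the compression threshold (75 dB SPL), the fixed ±6 dB gains, and the 65 dB noise threshold are all hypothetical values chosen only to make the claimed logic concrete.

```python
# Hypothetical sketch of the claimed per-band logic: expansion in high
# bands gated by a speech probability (claim 4), compression in loud low
# bands gated by level (claim 6), and beam-pattern switching by ambient
# noise (claim 8). All thresholds and names are illustrative.

from dataclasses import dataclass


@dataclass
class Band:
    center_hz: float    # band center frequency
    level_db: float     # measured level in the band (dB SPL)
    speech_prob: float  # inference-model probability that content is speech


def band_gain_db(band: Band,
                 speech_prob_min: float = 0.7,      # assumed expansion criterion
                 compress_thresh_db: float = 75.0,  # assumed compression threshold
                 expand_db: float = 6.0,
                 compress_db: float = -6.0) -> float:
    """Return an additional gain for one band following the claimed logic."""
    if band.center_hz >= 1000.0 and band.speech_prob >= speech_prob_min:
        return expand_db    # expand a high band when content is likely speech
    if band.center_hz < 1000.0 and band.level_db > compress_thresh_db:
        return compress_db  # compress a low band above the threshold level
    return 0.0


def select_beam_pattern(ambient_db: float, noise_thresh_db: float = 65.0) -> str:
    """Omnidirectional in a quiet environment, narrow beam in a noisy one."""
    return "narrow" if ambient_db > noise_thresh_db else "omnidirectional"


bands = [Band(500.0, 80.0, 0.2), Band(2000.0, 60.0, 0.9)]
print([band_gain_db(b) for b in bands])  # [-6.0, 6.0]
print(select_beam_pattern(70.0))         # narrow
```

In a real device the gains would be applied per band by a filter bank and smoothed over time; the sketch only shows the decision logic the claims describe.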
Description
CROSS REFERENCE TO RELATED APPLICATIONS
The present application claims priority to U.S. Provisional Patent Application No. 63/749,579, filed Jan. 25, 2025, titled “AMBIENT AWARE AUTOMATED AMPLIFICATION LEVELING FOR COGNITIVE LOAD REDUCTION”, U.S. Provisional Patent Application No. 63/749,574, filed Jan. 25, 2025, titled “SYSTEM FOR IMPROVING INTELLIGIBILITY FOR OPEN-EAR HEARING ASSISTANCE DEVICES USING HIGH FREQUENCY BASED INDIRECT PATH AMPLIFICATION,” and U.S. Provisional Patent Application No. 63/750,331, filed Jan. 28, 2025, titled “MULTIBAND DYNAMIC RANGE EXPANSION FRAMEWORK.” The entire disclosures of the applications listed above are hereby incorporated by reference, in their entirety, for all that they teach and for all purposes.
BACKGROUND
Conventional hearing aids require the reproduction of an audio environment alongside amplification of the desired speech content because they at least partially occlude the user's direct-path access to sound. More particularly, traditional hearing assistance relies on passive occlusion that seals the ear canal and reinserts a full-spectrum signal with enhanced voice-band frequencies. Low-frequency content is amplified in an attempt to maintain the naturalness of speech content. Consequently, hearing aids create a constant cognitive load for the user due to their imperfect reproduction of the environment. Moreover, conventional systems fail to adequately address higher-order auditory object formation due to poor spectro-temporal coding and impaired selective spatial attention. Supra-threshold hearing deficits arise from an inability to discriminate speech in noise, even when some audibility is restored. To help address these challenges, users must acclimate to their hearing aids for weeks to allow for neuroadaptation. However, compliance can be difficult, leading new users to return their devices or stop using them.
Even after neuroadaptation, hearing aids still struggle with listener intelligibility issues that hinder speech perception.
BRIEF DESCRIPTION OF THE DRAWINGS
The Detailed Description is described with reference to the accompanying figures. FIG. 1 is a first perspective view of an example of a pair of glasses comprising an open-ear system having internal electronics packaging configured to modulate an audio signal. FIG. 2 is a block diagram of an example of an open-ear, AI-assisted audio processing system that is trained to extract desired signals from ambient audio based on modeled frequencies, probabilities, and empirical data. FIGS. 3A-3B are block diagrams of another example of an open-ear, AI-assisted audio processing system. FIG. 4 is a block diagram of an open-ear audio system that includes right and left audio circuit paths. FIG. 5 is a block diagram illustrating direct and augmented path transfer functions. FIG. 6 is a plot of the average speech levels of selected TIMIT sentences in six octave bands. FIG. 7 is a graph scoring degradation of clean speech processed through the audiograms. FIG. 8 is a chart depicting compression and expansion parameters. FIG. 9 is a graph that plots HASPI scores of the unprocessed signal (s(t)+n(t)) and the processed signal (x′(t)) for various SNR levels. FIG. 10 is a graph depicting a change in HASPI scores. FIG. 11 illustrates an example machine learning model consistent with some implementations of the present concepts. FIG. 12 illustrates an example computer model consistent with some implementations of the present concepts.
DETAILED DESCRIPTION
The present concepts are directed to an open-ear audio system integrated into eyewear that leverages sparse artificial intelligence (AI) to augment or, in some cases, fully replace digital signal processing (DSP), along with natural, ambient sound, to render audio to the ears of a user.
Traditional (or dense) AI neural networks are designed to optimize for performance and often use tens or hundreds of billions of parameters. While highly powerful, these traditional dense AI networks also consume considerable memory and power. A sparse AI network may instead be designed to optimize hearing aid performance under an operational constraint such as power consumption. In one example, a sparse AI system is created to optimize for performance (e.g., intelligibility as measured by Hearing-Aid Speech Perception Index (HASPI) or Hearing-Aid Speech Quality Index (HASQI) scores, or speech extraction as measured by the probability that the resulting signal is speech) while maintaining a power draw of 5 milliwatts (mW) or less. For example, a sparse AI network can be optimized to operate at 5 mW of power while delivering a 0.8 HASPI score in a 0 dB SNR environment for most common hearing loss profiles. A sparse AI network can leverage sparse weights (i.e., only storing and computing on weights that are necessary to reliably extract speech from noise) and can effectively extract speech from noise (as well as perform other DSP functions) using 100× fewer operations and 100× less power than a non-sparse/dense AI network.
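The operation savings from sparse weights can be illustrated with a minimal sketch, not drawn from the patent: a layer's matrix-vector product only stores and multiplies the nonzero weights, so a layer that is 99% zeros needs roughly 100× fewer multiply-accumulates. The matrix sizes, values, and function names below are hypothetical.

```python
# Illustrative comparison of dense vs. sparse matrix-vector products.
# The sparse form stores only nonzero weights as (input index, weight)
# pairs per output row, so the operation count scales with the number
# of nonzeros rather than the full layer size.

def dense_matvec(w, x):
    """Dense layer: every weight is stored and multiplied (9 multiplies here)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sparse_matvec(rows, x, n_out):
    """Sparse layer: rows maps output index -> [(input index, weight), ...];
    only the stored nonzero weights are computed (3 multiplies here)."""
    y = [0.0] * n_out
    for i, entries in rows.items():
        y[i] = sum(w * x[j] for j, w in entries)
    return y


# A toy 3x3 layer with 3 of 9 weights nonzero (hypothetical values).
w = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 0.0],
     [1.0, 0.0, 3.0]]
sparse = {0: [(1, 2.0)], 2: [(0, 1.0), (2, 3.0)]}
x = [1.0, 2.0, 3.0]

assert dense_matvec(w, x) == sparse_matvec(sparse, x, 3)  # same output, 3 vs. 9 multiplies
```

Production sparse inference engines use packed formats (e.g., compressed sparse rows) and skip zero activations as well, but the principle is the same: work scales with the nonzero weight count, which is what makes a milliwatt-scale power budget plausible.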