US-20260128028-A1 - BINAURAL DATA SHARING IN EAR-WORN DEVICES USING NEURAL NETWORKS

Abstract

Described herein is binaural data sharing technology for ear-worn devices to improve audio processing performance. Different embodiments may include sharing of various data types, such as processed microphone signals, beamformed signals, neural network products (e.g., masks), and environmental metrics. For beamforming, devices may combine signals from both ears for improved directional selectivity or process separate beamformed signals independently. Devices may be configured to generate identical masks or average mask magnitude portions while preserving device-specific phase components. Neural networks may be trained to handle mixed-latency data, processing current local data with “stale” data from the other device. Environmental metrics like signal-to-noise ratios may be shared for coordinated responses to acoustic conditions. The technology may also apply to integrated devices like eyeglasses.
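As a minimal sketch of the magnitude-averaging option mentioned above, assume each mask is a list of complex per-frequency-bin gains. The function name `combine_masks` and the example values are hypothetical and are not taken from the disclosure. Under those assumptions, each device might average the two magnitude portions while keeping the phase portion of its own mask:

```python
import cmath


def combine_masks(local_mask, remote_mask):
    """Average the magnitudes bin by bin and keep the local mask's phase.

    Both masks are assumed (for illustration only) to be equal-length
    lists of complex gains, one per frequency bin.
    """
    if len(local_mask) != len(remote_mask):
        raise ValueError("masks must cover the same frequency bins")
    combined = []
    for local_bin, remote_bin in zip(local_mask, remote_mask):
        magnitude = (abs(local_bin) + abs(remote_bin)) / 2.0
        phase = cmath.phase(local_bin)  # device-specific phase is preserved
        combined.append(cmath.rect(magnitude, phase))
    return combined


# Hypothetical three-bin masks for the left and right devices.
left_mask = [0.8 + 0.0j, 0.0 + 0.4j, -0.2 + 0.0j]
right_mask = [0.4 + 0.0j, 0.6 + 0.0j, 0.0 - 0.6j]

left_combined = combine_masks(left_mask, right_mask)
right_combined = combine_masks(right_mask, left_mask)
```

In this sketch the two combined masks share the same magnitude in every bin, while each retains the phase of the mask generated on its own device.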

Inventors

Igor LOVCHINSKY
Nathan Agmon
Philip Meyers IV
Israel Malkin
Nicholas Morris
Mark Berry

Assignees

Fortell Research Inc.

Dates

Publication Date: 20260507
Application Date: 20251104

Claims (20)

1. A system, comprising: a first ear-worn device comprising: first neural network circuitry, and first communication circuitry; and a second ear-worn device comprising: second neural network circuitry, and second communication circuitry; wherein: the first communication circuitry and the second communication circuitry are configured to communicate over a wireless communication link; the first neural network circuitry is configured to: receive one or more first audio signals generated by the first ear-worn device, and implement one or more first neural network layers, wherein the first neural network circuitry is configured to use the one or more first neural network layers to generate a first neural network product based on the one or more first audio signals; the second neural network circuitry is configured to: receive one or more second audio signals generated by the second ear-worn device, and implement one or more second neural network layers, wherein the second neural network circuitry is configured to use the one or more second neural network layers to generate a second neural network product based on the one or more second audio signals; the first communication circuitry is configured to: transmit, to the second communication circuitry over the wireless communication link, first data comprising or originating from the first neural network product, and receive, from the second communication circuitry over the wireless communication link, second data comprising or originating from the second neural network product; the first data comprises a first mask and the second data comprises a second mask, or the first data comprises a processed version of the first mask and the second data comprises a processed version of the second mask; the first ear-worn device is configured to combine the first mask with the second mask, thereby generating a first combined mask; the first ear-worn device is configured, when combining the first mask with the second mask, to combine a magnitude portion of the first mask with a magnitude portion of the second mask; and the first combined mask comprises: a magnitude portion based on combining the magnitude portion of the first mask with the magnitude portion of the second mask, and a phase portion based on a phase portion of the first mask.
2-6. (canceled)
7. The system of claim 1, wherein the first ear-worn device is configured, when combining the magnitude portion of the first mask with the magnitude portion of the second mask, to average the magnitude portion of the first mask with the magnitude portion of the second mask.
8. The system of claim 1, wherein the second ear-worn device is configured to combine the first mask with the second mask, thereby generating a second combined mask.
9. The system of claim 8, wherein the first combined mask and the second combined mask are the same.
10. The system of claim 8, wherein magnitude portions of the first combined mask and the second combined mask are the same.
11. The system of claim 8, wherein: the first ear-worn device is configured to apply the first combined mask to one of the one or more first audio signals; the second ear-worn device is configured to apply the second combined mask to one of the one or more second audio signals; the one of the one or more first audio signals comprises a beamformed audio signal; and the one of the one or more second audio signals comprises a beamformed audio signal.
12. The system of claim 8, wherein: the first ear-worn device is configured to apply the first combined mask to one of the one or more first audio signals; the second ear-worn device is configured to apply the second combined mask to one of the one or more second audio signals; and the one of the one or more first audio signals and the one of the one or more second audio signals are different.
13. The system of claim 1, wherein the first ear-worn device is configured to apply the first combined mask to an audio signal received by the first ear-worn device subsequently to when the one or more first audio signals are received.
14. A system, comprising: a first ear-worn device comprising: first neural network circuitry, and first communication circuitry; and a second ear-worn device comprising: second neural network circuitry, and second communication circuitry; wherein: the first communication circuitry and the second communication circuitry are configured to communicate over a wireless communication link; the first neural network circuitry is configured to: receive one or more first audio signals generated by the first ear-worn device, and implement one or more first neural network layers, wherein the first neural network circuitry is configured to use the one or more first neural network layers to generate a first neural network product based on the one or more first audio signals; the second neural network circuitry is configured to: receive one or more second audio signals generated by the second ear-worn device, and implement one or more second neural network layers, wherein the second neural network circuitry is configured to use the one or more second neural network layers to generate a second neural network product based on the one or more second audio signals; the first communication circuitry is configured to: transmit, to the second communication circuitry over the wireless communication link, first data comprising or originating from the first neural network product, and receive, from the second communication circuitry over the wireless communication link, second data comprising or originating from the second neural network product; the first data comprises a first mask and the second data comprises a second mask, or the first data comprises a processed version of the first mask and the second data comprises a processed version of the second mask; the first ear-worn device is configured to compare the first mask with the second mask; the first ear-worn device further comprises mixing circuitry configured to perform mixing of at least two audio signals, thereby generating an output audio signal; and based on the comparison, the mixing circuitry is further configured to modulate weighting of the at least two audio signals in the mixing.
15. The system of claim 1, wherein: the second data comprises the processed version of the second mask; and the first ear-worn device is configured to generate the second mask from the second data using decoding or interpolation.
16. A system, comprising: a first ear-worn device comprising: first neural network circuitry, and first communication circuitry; and a second ear-worn device comprising: second neural network circuitry, and second communication circuitry; wherein: the first communication circuitry and the second communication circuitry are configured to communicate over a wireless communication link; the first neural network circuitry is configured to: receive one or more first audio signals generated by the first ear-worn device, and implement one or more first neural network layers, wherein the first neural network circuitry is configured to use the one or more first neural network layers to generate a first neural network product based on the one or more first audio signals; the second neural network circuitry is configured to: receive one or more second audio signals generated by the second ear-worn device, and implement one or more second neural network layers, wherein the second neural network circuitry is configured to use the one or more second neural network layers to generate a second neural network product based on the one or more second audio signals; the first communication circuitry is configured to: transmit, to the second communication circuitry over the wireless communication link, first data comprising or originating from the first neural network product, and receive, from the second communication circuitry over the wireless communication link, second data comprising or originating from the second neural network product; and the first neural network circuitry is configured to input the second data or a processed version thereof to at least one of the one or more first neural network layers.
17. The system of claim 16, wherein the first neural network circuitry is configured to input the second data or the processed version thereof to the at least one of the one or more first neural network layers when processing audio signals received subsequent to the one or more first audio signals.
18. The system of claim 16, wherein the first neural network circuitry is configured to use the one or more first neural network layers to decode the second data.
19. A system, comprising: a first ear-worn device comprising: first neural network circuitry, and first communication circuitry; and a second ear-worn device comprising: second neural network circuitry, and second communication circuitry; wherein: the first communication circuitry and the second communication circuitry are configured to communicate over a wireless communication link; the first neural network circuitry is configured to: receive one or more first audio signals generated by the first ear-worn device, and implement one or more first neural network layers, wherein the first neural network circuitry is configured to use the one or more first neural network layers to generate a first neural network product based on the one or more first audio signals; the second neural network circuitry is configured to: receive one or more second audio signals generated by the second ear-worn device, and implement one or more second neural network layers, wherein the second neural network circuitry is configured to use the one or more second neural network layers to generate a second neural network product based on the one or more second audio signals; the first communication circuitry is configured to: transmit, to the second communication circuitry over the wireless communication link, first data comprising or originating from the first neural network product, and receive, from the second communication circuitry over the wireless communication link, second data comprising or originating from the second neural network product; and the second data comprises some but not all frequencies of the second neural network product.
20. The system of claim 1, wherein the second data comprises an encoded version of the second neural network product.
21. The system of claim 14, wherein: the second data comprises the processed version of the second mask; and the first ear-worn device is configured to generate the second mask from the second data using decoding or interpolation.
22. The system of claim 14, wherein the second data comprises an encoded version of the second neural network product.
23. The system of claim 16, wherein the second data comprises an encoded version of the second neural network product.
24. The system of claim 19, wherein the second data comprises an encoded version of the second neural network product.
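
The following is a hedged illustration of two ideas recited above: sending some but not all frequencies of a neural network product, and regenerating a mask by interpolation. It assumes a real-valued magnitude mask, uniform subsampling of frequency bins, and linear interpolation at the receiver. None of these choices, nor the helper names `subsample_mask` and `interpolate_mask`, come from the claims; they are one possible sketch only.

```python
def subsample_mask(mask, step):
    """Keep every `step`-th frequency bin, always including the last bin."""
    indices = list(range(0, len(mask), step))
    if indices[-1] != len(mask) - 1:
        indices.append(len(mask) - 1)
    return indices, [mask[i] for i in indices]


def interpolate_mask(indices, values, num_bins):
    """Rebuild a full-resolution mask by linear interpolation.

    Assumes `indices` is increasing, starts at 0, and ends at num_bins - 1,
    which is what `subsample_mask` produces.
    """
    full = [0.0] * num_bins
    if len(indices) == 1:
        full[indices[0]] = values[0]
        return full
    points = list(zip(indices, values))
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for k in range(i0, i1 + 1):
            t = (k - i0) / (i1 - i0)
            full[k] = v0 + t * (v1 - v0)
    return full


# Hypothetical nine-bin magnitude mask generated on the second device.
second_mask = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

# The second device transmits only three of the nine bins.
sent_indices, sent_values = subsample_mask(second_mask, step=4)

# The first device regenerates a full-resolution mask from the received data.
regenerated = interpolate_mask(sent_indices, sent_values, len(second_mask))
```

Because the example mask varies linearly with frequency, interpolation recovers it essentially exactly; a real mask would generally be recovered only approximately, trading accuracy for a smaller amount of transmitted data.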

Description

BACKGROUND

Field

The present disclosure relates to ear-worn devices. Some aspects relate to binaural data sharing in ear-worn devices using neural networks.

Related Art

Ear-worn devices, such as hearing aids, may be used to help those who have trouble hearing to hear better. Typically, ear-worn devices amplify received sound. Some ear-worn devices may attempt to reduce noise in received sound.

SUMMARY

The inventors have recognized that for systems including two ear-worn devices, one worn on each ear, sharing data between the ear-worn devices may improve the performance of each of the ear-worn devices. For example, by sharing data between the two ear-worn devices, each device may leverage information from both ears to make better decisions about audio processing, noise reduction, and/or spatial focusing. This binaural approach may result in improved speech clarity, better noise suppression, and/or enhanced directional hearing compared to each device operating independently with only its own microphone data. The shared information may enable neural network processing that can take advantage of the spatial separation between the two ears, allowing for better localization of sound sources and more effective separation of desired speech from background noise. Additionally, the binaural data sharing may help reduce inconsistencies between the two ears that might otherwise create unnatural or distracting auditory experiences for the user. The data shared may include, for example, processed microphone signals, beamformed microphone signals, masks, neural network products, and/or values for certain metrics.

One important implementation challenge with binaural sharing is latency, as there may be a delay due to wireless transmission of data from ear-worn device to ear-worn device, in addition to audio processing delay.
Latency that becomes too high may result in an intolerable experience for the wearer, for example because the delay between the direct path of sound and the amplified path of sound results in echoes, and/or because of lag between movement of lips and perception of sound.

As a first matter, the wireless communication protocol used may depend on latency considerations. For example, a lower latency protocol like near-field magnetic induction (NFMI) may be preferable to a higher latency protocol like Bluetooth. Furthermore, data transfer considerations may affect what kind of data may be shared. Wireless communication protocols may feature a data budget that must be satisfied in order to realize a tolerable latency. Audio signals may exceed the data budget, but neural network products such as masks may not. Furthermore, neural network products such as masks may be more resilient for use as “stale” features (i.e., used for processing later audio frames). On the other hand, shared audio signals may contain more useful data than neural network products, may allow for forming sophisticated beam patterns, and may be more natural inputs to neural networks. Accordingly, the inventors have developed technology enabling transmission of different types of data.

For scenarios in which latency constraints make transmitting audio signals impractical, the inventors have developed technology for enabling sharing of neural network products such as masks. One potential drawback of sharing masks rather than audio signals is that the neural network running on each ear-worn device might not receive the benefit of input data generated by the other ear-worn device. Accordingly, the inventors have developed technology enabling input of a shared mask to a neural network, thus providing the neural network with input data from the other ear-worn device.
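
The use of a shared mask as a “stale” input can be illustrated with a short sketch. It assumes a fixed wireless delay measured in whole frames, a neutral all-ones mask before any remote data has arrived, and simple concatenation of local features with the newest received mask. The function name `build_inputs`, the delay model, and all values are hypothetical and are not taken from the disclosure.

```python
from collections import deque


def build_inputs(local_frames, remote_masks, link_delay_frames, num_bins):
    """Pair each local frame with the newest remote mask that has arrived.

    remote_masks[t] is the mask the other device generated for frame t; it
    is assumed to become available locally at frame t + link_delay_frames.
    """
    in_flight = deque()  # (arrival_frame, mask) pairs still on the link
    latest_remote = [1.0] * num_bins  # assumed neutral mask before arrival
    inputs = []
    for t, local in enumerate(local_frames):
        in_flight.append((t + link_delay_frames, remote_masks[t]))
        while in_flight and in_flight[0][0] <= t:
            latest_remote = in_flight.popleft()[1]
        # Current local features alongside the (possibly stale) remote mask.
        inputs.append(list(local) + list(latest_remote))
    return inputs


# Hypothetical two-bin features and masks over four frames.
local_frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
remote_masks = [[0.9, 0.9], [0.5, 0.5], [0.2, 0.2], [0.1, 0.1]]

network_inputs = build_inputs(local_frames, remote_masks,
                              link_delay_frames=2, num_bins=2)
```

With a two-frame delay, the input for frame 2 pairs the current local features with the remote mask generated for frame 0, so the network sees mixed-latency data rather than waiting for the link.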
The inventors have recognized that in some scenarios, even sharing neural network products such as masks may be impractical due to latency constraints. Accordingly, the inventors have developed technology enabling “stale” neural network products (e.g., generated by the other ear-worn device from a previous frame of audio) from one ear-worn device to be input into the neural network of another ear-worn device.

As described above, a neural network may be able to provide higher quality output when it receives, as input, data from both ear-worn devices. Therefore, for this consideration, sharing data upstream of the neural network may be helpful. However, another consideration is binaural consistency. As described above, inconsistencies between the sound output from the device on each ear may create unnatural or distracting auditory experiences for the wearer. Sharing data upstream of the neural networks might not necessarily result in the same outputs, and thus might not ensure binaural consistency. While sharing and combining downstream data such as masks may be one method for ensuring binaural consistency (as described in more detail in the description below), sharing data both upstream and downstream of the neural network may be prohibitive in terms of latency. A