CN-121600946-B - Audio echo cancellation method, module and related equipment

Abstract

The embodiments of the present application provide an audio echo cancellation method, an audio echo cancellation module and related equipment. The method comprises: identifying a target application scene; determining a corresponding target interface according to the target application scene; and performing a target echo cancellation operation on an input original audio signal according to the target interface so as to output a target audio signal. In this way, the audio echo cancellation algorithm can adapt to the application scene, and the interface strategy can be adjusted flexibly according to different device types, communication protocols and/or acoustic environments. This improves the compatibility of the audio echo cancellation technology across application scenes and improves the output quality of audio signals in a targeted manner for specific application scenes.

Inventors

PAN YIHONG

Assignees

深圳时空壶技术有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-22

Claims (9)

1. An audio echo cancellation method, comprising: identifying a target application scene; determining a corresponding target interface according to the target application scene; and performing, according to the target interface, a target echo cancellation operation on an input original audio signal to output a target audio signal, including: performing, according to a target root mean square filtering parameter, a target filtering operation on an audio signal block corresponding to the original audio signal, so that the convergence degree corresponding to a first target frequency band is greater than or equal to a preset convergence degree and/or the stability corresponding to a second target frequency band is greater than or equal to a preset stability; wherein a dynamically adjusted adaptive step size of the target root mean square filtering parameter is determined according to the root mean square corresponding to the original audio signal, and the target filtering operation is performed on the audio signal block according to the dynamically adjusted adaptive step size; and wherein the energy value corresponding to the first target frequency band is greater than the energy value corresponding to the second target frequency band.
2. The method of claim 1, wherein determining the corresponding target interface according to the target application scene comprises: determining that the target interface corresponds to a first target interface when the target application scene corresponds to a target non-WebRTC application scene; and determining that the target interface corresponds to a second target interface when the target application scene corresponds to a target WebRTC application scene.
3. The method of claim 2, wherein the first target interface is adapted for a first set of target interface functions, comprising: a target instance creation function, a target audio parameter initialization function, a target input output function, and/or a target instance destruction function.
4. The method of claim 2, wherein the second target interface is adapted for a second set of target interface functions, comprising: a target interface translation call function and/or a target instance return function.
5. The method as recited in claim 1, further comprising: performing a target dynamic delay matching operation on a plurality of target intervals of the audio signal block to dynamically update a target delay compensation amount so that a waveform difference between the original audio signal and a target reference signal is smaller than a preset difference; the target intervals comprise an acquisition interval and/or a rendering interval.
6. The method as recited in claim 5, further comprising: when a target abnormality exists, correcting the target dynamic delay matching operation according to a historical delay sequence smoothing result and/or a confidence evaluation result; wherein the target abnormality comprises an abrupt delay change, data packet loss and/or a signal energy abnormality.
7. An audio echo cancellation module for implementing audio echo cancellation according to the method of any one of claims 1-6, comprising: an identification unit, configured to identify a target application scene; a determining unit, configured to determine a corresponding target interface according to the target application scene; and an execution unit, configured to perform, according to the target interface, a target echo cancellation operation on an input original audio signal to output a target audio signal, including: performing, according to a target root mean square filtering parameter, a target filtering operation on an audio signal block corresponding to the original audio signal, so that the convergence degree corresponding to a first target frequency band is greater than or equal to a preset convergence degree and/or the stability corresponding to a second target frequency band is greater than or equal to a preset stability; wherein a dynamically adjusted adaptive step size of the target root mean square filtering parameter is determined according to the root mean square corresponding to the original audio signal, and the target filtering operation is performed on the audio signal block according to the dynamically adjusted adaptive step size; and wherein the energy value corresponding to the first target frequency band is greater than the energy value corresponding to the second target frequency band.
8. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, characterized in that the processor, when executing the computer program, implements the method according to any of claims 1-6.
9. A computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the method according to any one of claims 1-6.
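The dynamic delay matching of claims 5 and 6 can be illustrated with a minimal Python sketch. The claims do not say how the delay is estimated or corrected, so everything here is an assumption: normalised cross-correlation as the estimator, the correlation peak as the confidence score, and a median over recent estimates as the historical smoothing. The names `estimate_delay` and `DelayTracker` and all thresholds are hypothetical.

```python
import math
from collections import deque
from statistics import median_low
from typing import Deque, Sequence, Tuple


def estimate_delay(mic: Sequence[float], ref: Sequence[float],
                   max_lag: int) -> Tuple[int, float]:
    """Return (lag, confidence) for the lag with the highest normalised
    cross-correlation between the captured block and the reference block."""
    best_lag, best_score = 0, 0.0
    for lag in range(max_lag + 1):
        n = len(mic) - lag
        if n <= 0:
            break
        a, b = mic[lag:], ref[:n]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


class DelayTracker:
    """Keeps a smoothed delay estimate (hypothetical helper, not from the source)."""

    def __init__(self, history: int = 5, min_conf: float = 0.6) -> None:
        self.history: Deque[int] = deque(maxlen=history)
        self.min_conf = min_conf  # assumed confidence threshold
        self.delay = 0            # current delay compensation, in samples

    def update(self, mic: Sequence[float], ref: Sequence[float],
               max_lag: int = 32) -> int:
        lag, conf = estimate_delay(mic, ref, max_lag)
        # Low-confidence blocks (e.g. silence after packet loss) are ignored.
        if conf >= self.min_conf:
            self.history.append(lag)
            # The median of recent estimates damps isolated abrupt changes.
            self.delay = median_low(self.history)
        return self.delay
```

In this sketch a single outlier block does not move the compensation, while a sustained change shifts the median after a few blocks. Whether the patented method behaves this way is not stated in the source.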

Description

Audio echo cancellation method, module and related equipment

Technical Field

Embodiments of the present application relate to the field of audio signal processing, and in particular to an audio echo cancellation method, module and related equipment.

Background

Acoustic echo cancellation (AEC) is mainly applied in real-time audio communication scenes such as video conferencing, internet telephony and online game voice chat. At present, the related art suffers from problems such as poor compatibility across application scenes.

Disclosure of Invention

The embodiments of the present application provide an audio echo cancellation method, module and related equipment. A target application scene is identified, a corresponding target interface is determined, and a target echo cancellation operation is performed on an input original audio signal according to the target interface so as to output a target audio signal. In this way, the audio echo cancellation algorithm can adapt to the application scene, and the interface strategy can be adjusted flexibly according to different device types, communication protocols and/or acoustic environments. This improves the compatibility of the audio echo cancellation technology across application scenes and improves the output quality of the audio signal in a targeted manner for specific application scenes.

In a first aspect of the present application, there is provided an audio echo cancellation method comprising: identifying a target application scene; determining a corresponding target interface according to the target application scene; and performing, according to the target interface, a target echo cancellation operation on an input original audio signal to output a target audio signal.
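The three steps of the first aspect (identify the scene, pick the interface, run cancellation through it) can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the WebRTC / non-WebRTC split from claim 2, the `context` dictionary and all function names are hypothetical, and `toy_cancel` is a placeholder that subtracts the reference rather than a real echo canceller.

```python
from enum import Enum
from typing import Callable, Dict, List

Audio = List[float]


class Scene(Enum):
    NON_WEBRTC = "non_webrtc"
    WEBRTC = "webrtc"


def identify_scene(context: Dict[str, str]) -> Scene:
    """Hypothetical rule: the caller reports which protocol stack is in use."""
    if context.get("protocol") == "webrtc":
        return Scene.WEBRTC
    return Scene.NON_WEBRTC


def toy_cancel(mic: Audio, ref: Audio) -> Audio:
    """Placeholder canceller: subtracts the reference sample by sample."""
    return [m - r for m, r in zip(mic, ref)]


def first_interface(mic: Audio, ref: Audio) -> Audio:
    """Stand-in for the interface selected in non-WebRTC scenes."""
    return toy_cancel(mic, ref)


def second_interface(mic: Audio, ref: Audio) -> Audio:
    """Stand-in for the interface selected in WebRTC scenes."""
    return toy_cancel(mic, ref)


# Scene-to-interface table: the "determining" step is a dictionary lookup.
INTERFACES: Dict[Scene, Callable[[Audio, Audio], Audio]] = {
    Scene.NON_WEBRTC: first_interface,
    Scene.WEBRTC: second_interface,
}


def cancel_echo(context: Dict[str, str], mic: Audio, ref: Audio) -> Audio:
    scene = identify_scene(context)   # step 1: identify the scene
    interface = INTERFACES[scene]     # step 2: determine the interface
    return interface(mic, ref)        # step 3: run cancellation through it
```

A lookup table keeps scene detection separate from the processing path, so a further scene could be supported by adding one entry. The source does not state how its interfaces are actually organised.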
In some possible embodiments, determining the corresponding target interface according to the target application scene includes: determining that the target interface corresponds to a first target interface when the target application scene corresponds to a target non-WebRTC application scene; and determining that the target interface corresponds to a second target interface when the target application scene corresponds to a target WebRTC application scene.

In some possible embodiments, the first target interface is applicable to a first target interface function set, including: a target instance creation function, a target audio parameter initialization function, a target input output function, and/or a target instance destruction function.

In some possible embodiments, the second target interface is applicable to a second target interface function set, including: a target interface translation call function and/or a target instance return function.

In some possible embodiments, performing, according to the target interface, the target echo cancellation operation on the input original audio signal to output the target audio signal includes: determining a target root mean square filtering parameter according to the original audio signal; and performing, according to the target root mean square filtering parameter, a target filtering operation on an audio signal block corresponding to the original audio signal, so that the convergence degree corresponding to a first target frequency band is greater than or equal to a preset convergence degree and/or the stability corresponding to a second target frequency band is greater than or equal to a preset stability. The energy value corresponding to the first target frequency band is greater than the energy value corresponding to the second target frequency band.
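One way to picture a filtering step whose step size follows the root mean square of the input is the Python sketch below. It is a full-band, time-domain normalised LMS (NLMS) filter, swapped in for illustration only: the source names neither the filter type nor the RMS-to-step mapping, and its per-band convergence and stability conditions are not modelled. The names `step_from_rms` and `rms_nlms_cancel` and the constants `mu_min`, `mu_max` and `ref_level` are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def rms(block: Sequence[float]) -> float:
    """Root mean square of one audio block (0.0 for an empty block)."""
    return math.sqrt(sum(x * x for x in block) / len(block)) if block else 0.0


def step_from_rms(level: float, mu_min: float = 0.05, mu_max: float = 0.5,
                  ref_level: float = 0.1) -> float:
    """Assumed mapping: scale the step with block RMS, clamped to [mu_min, mu_max]."""
    return max(mu_min, min(mu_max, mu_max * level / ref_level))


def rms_nlms_cancel(mic: Sequence[float], far: Sequence[float], taps: int = 4,
                    block: int = 64, eps: float = 1e-8
                    ) -> Tuple[List[float], List[float]]:
    """Block-wise NLMS: the step size is recomputed per block from the
    RMS of the microphone (original) signal. Returns (residual, weights)."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual: List[float] = []
    for start in range(0, len(mic), block):
        mic_blk = mic[start:start + block]
        far_blk = far[start:start + block]
        mu = step_from_rms(rms(mic_blk))
        for d, x in zip(mic_blk, far_blk):
            history = [x] + history[:-1]
            echo_hat = sum(w * h for w, h in zip(weights, history))
            err = d - echo_hat  # echo-cancelled output sample
            norm = sum(h * h for h in history) + eps
            weights = [w + mu * err * h / norm
                       for w, h in zip(weights, history)]
            residual.append(err)
    return residual, weights
```

Making louder blocks adapt faster is a design choice of this sketch, not something the source specifies. A per-band variant would apply the same idea to each frequency band separately.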
In some possible embodiments, the method further comprises: performing a target dynamic delay matching operation on a plurality of target intervals of the audio signal block to dynamically update a target delay compensation amount, so that a waveform difference between the original audio signal and a target reference signal is smaller than a preset difference. The plurality of target intervals comprise an acquisition interval and/or a rendering interval.

In some possible embodiments, the method further comprises: when a target abnormality is judged to exist, correcting the target dynamic delay matching operation according to a historical delay sequence smoothing result and/or a confidence evaluation result. The target abnormality comprises an abrupt delay change, data packet loss and/or a signal energy abnormality.

In a second aspect of the present application, there is provided an audio echo cancellation module comprising: an identification unit, configured to identify a target application scene; a determining unit, configured to determine a corresponding target interface according to the target application scene; and an execution unit, configured to perform a target echo cancellation operation on the input original audio signal according to the t