
CN-121999799-A - Sound activity detection method and device and electronic equipment


Abstract

The embodiment of the invention provides a sound activity detection method, a sound activity detection device and electronic equipment. The method comprises: performing windowing processing on received audio data; calculating a first value and a second value of the current window, wherein the first value of the current window is the root mean square value of the audio data in the current window, and the second value of the current window is the difference between the first value of the current window and the first value of the previous window, plus the second value of the previous window; and obtaining a wake-up result of the current window according to a result of comparing the second value of the current window with a wake-up threshold. The embodiment of the invention can improve the robustness of sound activity detection, avoid noise interference and interference from abrupt signal changes, and thereby improve the accuracy of sound activity detection.
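
The recurrence described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `window_rms` and `wake_results`, the non-overlapping windows, the zero starting value of the accumulator, the zero difference for the first window, and the fixed threshold with a "greater than" comparison are all assumptions made for the example.

```python
import math


def window_rms(samples):
    """Root mean square of one window of samples (the "first value")."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def wake_results(audio, window_size, threshold):
    """Yield (first_value, second_value, wake) for each full window.

    second_value = (first_value - previous first_value) + previous second_value
    """
    prev_first = None
    prev_second = 0.0  # assumption: the accumulated value starts at zero
    # assumption: non-overlapping windows; a trailing partial window is dropped
    for start in range(0, len(audio) - window_size + 1, window_size):
        first = window_rms(audio[start:start + window_size])
        # assumption: the first window has no predecessor, so its difference is zero
        delta = 0.0 if prev_first is None else first - prev_first
        second = delta + prev_second
        # assumption: "wake-up" means the second value exceeds the threshold
        yield first, second, second > threshold
        prev_first, prev_second = first, second
```

For example, `list(wake_results([0.0] * 4 + [1.0] * 8, 4, 0.5))` yields wake flags `False, True, True`: in this sketch the second value stays at 1.0 in the third window even though the window-to-window RMS difference there is zero.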

Inventors

  • YIN XUXIAN
  • LING MING
  • AI GUO
  • YANG ZUOXING

Assignees

  • 深圳比特微电子科技有限公司

Dates

Publication Date: 2026-05-08
Application Date: 2024-11-06

Claims (10)

  1. A sound activity detection method, applied to an electronic device, the method comprising: performing windowing processing on received audio data; calculating a first value of a current window and a second value of the current window, wherein the first value of the current window is the root mean square value of the audio data in the current window, and the second value of the current window is the difference between the first value of the current window and the first value of the previous window, plus the second value of the previous window; and obtaining a wake-up result of the current window according to a result of comparing the second value of the current window with a wake-up threshold.
  2. The method according to claim 1, wherein the method further comprises: determining a target wake-up threshold corresponding to the current window according to the first value of the current window; and wherein obtaining the wake-up result of the current window according to the result of comparing the second value of the current window with the wake-up threshold comprises: if the second value of the current window is greater than the target wake-up threshold corresponding to the current window, determining that the wake-up result of the current window is wake-up; otherwise, determining that the wake-up result of the current window is no wake-up.
  3. The method according to claim 2, wherein determining the target wake-up threshold corresponding to the current window according to the first value of the current window comprises: substituting the first value of the current window into a target linear function for calculating the wake-up threshold, to obtain the target wake-up threshold corresponding to the current window.
  4. The method according to claim 1, wherein the method further comprises: constructing a plurality of first sample data and a plurality of second sample data, and splicing the first sample data and the second sample data according to a preset rule to obtain training data, wherein the first sample data is historical audio data containing sound activity events, and the second sample data is historical audio data containing no sound activity events; performing windowing processing on the training data and calculating a target result, wherein the target result comprises the first value and the second value of the window in which each first sample data is located, and the first value and the second value of the window in which each second sample data is located; and performing linear regression on the target result to obtain a target linear function for calculating the wake-up threshold.
  5. The method according to claim 4, wherein constructing the plurality of first sample data and the plurality of second sample data, and splicing the first sample data and the second sample data according to the preset rule to obtain the training data, comprises: collecting a plurality of historical audio data containing sound activity events as the first sample data; inserting mute data before each constructed first sample data, wherein the mute data serves as the second sample data; and splicing the first sample data, each preceded by its inserted mute data, in sequence, playing the spliced audio through an audio playback device, and capturing it with an audio acquisition device to obtain the training data.
  6. The method according to claim 1, wherein the method further comprises: if the wake-up results of N consecutive windows are all wake-up, outputting a detection result that a sound activity event exists; otherwise, outputting a detection result that no sound activity event exists.
  7. The method according to claim 1, wherein the method further comprises: if the output detection result is that a sound activity event exists, waking up the electronic device to execute a target task, wherein the target task comprises any one of video recording, target detection in video images, and abnormal sound event alarm.
  8. A sound activity detection apparatus, applied to an electronic device, the apparatus comprising: a windowing processing module, configured to perform windowing processing on received audio data; a numerical value calculation module, configured to calculate a first value of a current window and a second value of the current window, wherein the first value of the current window is the root mean square value of the audio data in the current window, and the second value of the current window is the difference between the first value of the current window and the first value of the previous window, plus the second value of the previous window; and a numerical comparison module, configured to obtain a wake-up result of the current window according to a result of comparing the second value of the current window with a wake-up threshold.
  9. The apparatus according to claim 8, wherein the apparatus further comprises: a result output module, configured to output a detection result that a sound activity event exists when the wake-up results of N consecutive windows are all wake-up, and otherwise to output a detection result that no sound activity event exists.
  10. An electronic device comprising a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other via the communication bus, and wherein the memory is configured to store at least one executable instruction that causes the processor to perform the steps of the sound activity detection method according to any one of claims 1 to 7.
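
The per-window threshold of claims 2 and 3 and the N-consecutive-window rule of claim 6 can be sketched together as follows. This is an illustrative sketch only. The names `target_threshold` and `detect_events` and the parameters `slope` and `intercept` are hypothetical; in the claims the linear function is obtained by linear regression on training data (claim 4), which is not reproduced here. Evaluating the rule over a sliding run of the most recent N windows is likewise an assumption of the example.

```python
from collections import deque


def target_threshold(first_value, slope, intercept):
    """Hypothetical linear function mapping a window's RMS to its wake-up threshold."""
    return slope * first_value + intercept


def detect_events(values, slope, intercept, n):
    """For each (first_value, second_value) pair, yield True when the
    wake-up results of the most recent n windows are all wake-up."""
    recent = deque(maxlen=n)  # keeps only the last n wake-up results
    for first, second in values:
        # claim 2: wake-up if the second value exceeds the per-window threshold
        recent.append(second > target_threshold(first, slope, intercept))
        # claim 6: an event is reported only after n consecutive wake-ups
        yield len(recent) == n and all(recent)
```

Requiring several consecutive wake-up windows means that a single window exceeding its threshold does not by itself produce a detection in this sketch.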

Description

Sound activity detection method and device and electronic equipment

Technical Field

The present invention relates to the field of audio processing technologies, and in particular, to a method and an apparatus for detecting sound activity, and an electronic device.

Background

Acoustic Activity Detection (AAD), also called sound activity detection, is used to detect whether a sound activity event, i.e. a sustained change in sound, is present in the current ambient sound. AAD may be used as a wake-up source for low-power devices: when AAD detects that a sound activity event is present, the device can be woken up to perform a corresponding operation.

Generally, AAD uses whether an abrupt change occurs between the current frame of sound and the previous frame of sound as the wake-up source. For example, if AAD detects an abrupt change between the current frame and the previous frame, the device is woken up; otherwise, the device is not woken up and remains in a low-power mode.

However, when the wake-up source is whether an abrupt change occurs between the current frame and the previous frame, it is difficult to identify transient noise and interference from abrupt signal changes. Detection accuracy is low and the detection is sensitive to noise, which causes the device to be woken up by mistake and increases its power consumption.

Disclosure of Invention

The embodiment of the invention provides a sound activity detection method, a sound activity detection device and electronic equipment, which can improve the robustness of sound activity detection, avoid noise interference and interference from abrupt signal changes, and thereby improve the accuracy of sound activity detection.
In a first aspect, an embodiment of the present invention discloses a sound activity detection method applied to an electronic device, the method including: performing windowing processing on received audio data; calculating a first value of a current window and a second value of the current window, wherein the first value of the current window is the root mean square value of the audio data in the current window, and the second value of the current window is the difference between the first value of the current window and the first value of the previous window, plus the second value of the previous window; and obtaining a wake-up result of the current window according to a result of comparing the second value of the current window with a wake-up threshold.

In a second aspect, an embodiment of the present invention discloses a sound activity detection apparatus applied to an electronic device, the apparatus including: a windowing processing module, configured to perform windowing processing on the received audio data; a numerical value calculation module, configured to calculate a first value of a current window and a second value of the current window, wherein the first value of the current window is the root mean square value of the audio data in the current window, and the second value of the current window is the difference between the first value of the current window and the first value of the previous window, plus the second value of the previous window; and a numerical comparison module, configured to obtain a wake-up result of the current window according to a result of comparing the second value of the current window with a wake-up threshold.
In a third aspect, an embodiment of the present invention discloses an electronic device, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with each other through the communication bus, and the memory is configured to store at least one executable instruction that causes the processor to execute the steps of the sound activity detection method described above.

The embodiment of the invention has the following advantages:

The embodiment of the invention performs windowing processing on the received audio data and performs sound activity detection based on the root mean square value (the first value) of the current window and the cumulative residual root mean square value (the second value) of the current window. Specifically, the second value of the current window is compared with the wake-up threshold; the second value of the current window is obtained by adding Δx_rms to the second value of the previous window, where Δx_rms is the difference (increment) between the root mean square value of the current window and the root mean square value of the previous window