CN-121617408-B - Narrow-band satellite voice communication noise reduction system and method

Abstract

The invention relates to the technical field of satellite communication and discloses a narrow-band satellite voice communication noise reduction system and method. The noise reduction system comprises a preprocessing module, a separation module, an enhancement module and a post-processing module. The preprocessing module acquires an original voice signal at the satellite receiving end and preprocesses it to obtain a preprocessed voice signal. The separation module establishes a voice separation model and uses it to separate an effective voice signal and a noise signal from the preprocessed voice signal. The enhancement module establishes an AI complement model and uses it to perform abnormal filling on the effective voice signal, obtaining an enhanced voice signal. The post-processing module post-processes the enhanced voice signal to obtain a final voice signal. Under the constraint of narrow-band transmission, the invention can effectively suppress noise and accurately restore voice tone, improving satellite voice communication quality.

Inventors

ZHANG FA
SHI FENG

Assignees

柒星通信科技（北京）有限公司

Dates

Publication Date: 20260512
Application Date: 20251217

Claims (7)

1. A narrowband satellite voice communication noise reduction system, comprising: a preprocessing module, used for acquiring an original voice signal of a satellite receiving end and preprocessing the original voice signal to obtain a preprocessed voice signal; a separation module, used for establishing a voice separation model and separating an effective voice signal and a noise signal from the preprocessed voice signal according to the voice separation model; an enhancement module, used for establishing an AI complement model and performing abnormal filling on the effective voice signal according to the AI complement model to obtain an enhanced voice signal; and a post-processing module, used for post-processing the enhanced voice signal to obtain a final voice signal; wherein the enhancement module performing abnormal filling on the effective voice signal according to the AI complement model to obtain the enhanced voice signal comprises: identifying tone characteristics of the effective voice signal, and vectorizing the tone characteristics to obtain tone characteristic vectors; collecting the tone characteristic vector variation among all audio frames of the effective voice signal, and identifying voice abnormal sections according to the tone characteristic vector variation; inputting the voice normal sections adjacent to each voice abnormal section into the AI complement model to obtain the filling voice of that voice abnormal section; and performing voice filling on all voice abnormal sections to obtain the enhanced voice signal; wherein identifying the voice abnormal sections according to the tone characteristic vector variation comprises: calculating the vector variation between each audio frame and its adjacent frames in the effective voice signal to form a vector variation sequence; establishing a time window in the vector variation sequence centered on each vector variation, calculating the average vector variation within the time window, and determining the abnormal threshold corresponding to that vector variation according to the average vector variation within the time window; and collecting the abnormal thresholds of all vector variations, screening out the audio frames whose vector variation is larger than the corresponding abnormal threshold to obtain abnormal audio frames, and determining the voice abnormal sections according to all abnormal audio frames; wherein determining the voice abnormal sections according to all abnormal audio frames comprises: acquiring the time intervals between the abnormal audio frames, and merging abnormal audio frames whose time interval is smaller than a preset time threshold to obtain a plurality of initial abnormal segments; and obtaining the vector variation average of each initial abnormal segment, clustering the initial abnormal segments according to the vector variation averages, and obtaining a plurality of voice abnormal sections according to the clustering results.
2. The narrowband satellite voice communication noise reduction system of claim 1, wherein the preprocessing module preprocessing the original voice signal comprises: acquiring a preset standard amplitude interval, and adjusting the amplitude of the original voice signal into the preset standard amplitude interval based on automatic gain control; performing high-pass filtering on the amplitude-adjusted original voice signal to remove its direct current component; and performing band-pass filtering on the high-pass-filtered original voice signal to remove high-frequency noise and low-frequency noise.
3. The narrowband satellite voice communication noise reduction system of claim 1, wherein the separation module establishing a voice separation model and separating the effective voice signal and the noise signal from the preprocessed voice signal according to the voice separation model comprises: collecting voice sample data and noise sample data, and mixing voice audio in the voice sample data with noise audio in the noise sample data to obtain mixed audio; establishing a training sample set according to the mixed audio and the voice label corresponding to the mixed audio, and establishing and training a voice separation model according to the training sample set to obtain a trained voice separation model; and inputting the preprocessed voice signal into the trained voice separation model to obtain the effective voice signal and the noise signal.
4. The narrowband satellite voice communication noise reduction system of claim 3, wherein establishing and training the voice separation model according to the training sample set comprises: training the voice separation model with a scale-invariant signal-to-noise ratio loss function.
5. The narrowband satellite voice communication noise reduction system of claim 1, wherein the post-processing module post-processing the enhanced voice signal to obtain the final voice signal comprises: performing moving average filtering on the enhanced voice signal to obtain a moving-average-filtered enhanced voice signal; and performing adaptive threshold filtering on the moving-average-filtered enhanced voice signal to obtain the final voice signal.
6. The narrowband satellite voice communication noise reduction system of claim 5, wherein the post-processing module post-processing the enhanced voice signal to obtain the final voice signal further comprises: normalizing the amplitude of the voice signal after the adaptive threshold filtering to obtain the final voice signal.
7. A narrowband satellite voice communication noise reduction method, comprising: acquiring an original voice signal of a satellite receiving end, and preprocessing the original voice signal to obtain a preprocessed voice signal; establishing a voice separation model, and separating an effective voice signal and a noise signal from the preprocessed voice signal according to the voice separation model; establishing an AI complement model, and performing abnormal filling on the effective voice signal according to the AI complement model to obtain an enhanced voice signal; and post-processing the enhanced voice signal to obtain a final voice signal; wherein performing abnormal filling on the effective voice signal according to the AI complement model to obtain the enhanced voice signal comprises: identifying tone characteristics of the effective voice signal, and vectorizing the tone characteristics to obtain tone characteristic vectors; collecting the tone characteristic vector variation among all audio frames of the effective voice signal, and identifying voice abnormal sections according to the tone characteristic vector variation; inputting the voice normal sections adjacent to each voice abnormal section into the AI complement model to obtain the filling voice of that voice abnormal section; and performing voice filling on all voice abnormal sections to obtain the enhanced voice signal; wherein identifying the voice abnormal sections according to the tone characteristic vector variation comprises: calculating the vector variation between each audio frame and its adjacent frames in the effective voice signal to form a vector variation sequence; establishing a time window in the vector variation sequence centered on each vector variation, calculating the average vector variation within the time window, and determining the abnormal threshold corresponding to that vector variation according to the average vector variation within the time window; and collecting the abnormal thresholds of all vector variations, screening out the audio frames whose vector variation is larger than the corresponding abnormal threshold to obtain abnormal audio frames, and determining the voice abnormal sections according to all abnormal audio frames; wherein determining the voice abnormal sections according to all abnormal audio frames comprises: acquiring the time intervals between the abnormal audio frames, and merging abnormal audio frames whose time interval is smaller than a preset time threshold to obtain a plurality of initial abnormal segments; and obtaining the vector variation average of each initial abnormal segment, clustering the initial abnormal segments according to the vector variation averages, and obtaining a plurality of voice abnormal sections according to the clustering results.
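
The abnormal-section identification recited in claims 1 and 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the vector variation is taken as the Euclidean distance to the previous frame, the abnormal threshold as a fixed multiple of the windowed mean, and the clustering as a one-dimensional two-cluster k-means. The function names, the parameters `half_window`, `factor`, `frame_ms` and `max_gap_ms`, and all default values are hypothetical; the source does not specify them. The source also does not say how clustering results map to the final voice abnormal sections, so the sketch stops at cluster labels.

```python
import math

def vector_variations(frames):
    """Variation of each frame's tone characteristic vector relative to the
    previous frame (assumption: Euclidean distance; the first frame gets 0.0)."""
    if not frames:
        return []
    out = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        out.append(math.dist(prev, cur))
    return out

def abnormal_frames(variations, half_window=2, factor=2.0):
    """Indices whose variation exceeds factor * mean variation inside a
    window centred on that index (the window shrinks at the edges)."""
    flagged = []
    for i, v in enumerate(variations):
        lo = max(0, i - half_window)
        hi = min(len(variations), i + half_window + 1)
        threshold = factor * sum(variations[lo:hi]) / (hi - lo)
        if v > threshold:
            flagged.append(i)
    return flagged

def merge_frames(flagged, frame_ms=20, max_gap_ms=60):
    """Merge abnormal frames closer than max_gap_ms into (start, end) segments."""
    segments = []
    for idx in flagged:
        if segments and (idx - segments[-1][1]) * frame_ms < max_gap_ms:
            segments[-1][1] = idx          # extend the current segment
        else:
            segments.append([idx, idx])    # start a new segment
    return [tuple(s) for s in segments]

def segment_means(variations, segments):
    """Average vector variation of each initial abnormal segment."""
    return [sum(variations[s:e + 1]) / (e - s + 1) for s, e in segments]

def two_means(values, iters=20):
    """1-D k-means with k=2; label 0 = lower centre, 1 = higher centre."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, l in zip(values, labels) if l == 0]
        high = [v for v, l in zip(values, labels) if l == 1]
        if low:
            lo = sum(low) / len(low)
        if high:
            hi = sum(high) / len(high)
    return labels

# Toy input: twelve 2-D feature vectors with one outlier frame at index 5.
frames = [[0.0, 0.0]] * 5 + [[3.0, 4.0]] + [[0.0, 0.0]] * 6
variations = vector_variations(frames)
flagged = abnormal_frames(variations)
segments = merge_frames(flagged)
means = segment_means(variations, segments)
labels = two_means(means)
```

A per-frame threshold derived from the local window mean lets the detector tolerate passages where the tone characteristic vector naturally changes quickly, while still flagging isolated jumps.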

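The post-processing recited in claims 5 and 6 (moving average filtering, adaptive threshold filtering, amplitude normalization) might look like the Python sketch below. The source does not define the adaptive threshold, so the gate used here, which zeroes samples below a fraction of the mean absolute amplitude, is purely an assumption, as are the function names, the window width, the `ratio` value and the peak target.

```python
def moving_average(x, width=3):
    """Centred moving average; the window shrinks at the signal edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def adaptive_gate(x, ratio=0.5):
    """Zero samples whose magnitude is below ratio * mean(|x|).
    Assumption: the threshold adapts to the overall signal level."""
    if not x:
        return []
    threshold = ratio * sum(abs(v) for v in x) / len(x)
    return [v if abs(v) >= threshold else 0.0 for v in x]

def normalize_peak(x, peak=1.0):
    """Scale the signal so that its largest magnitude equals `peak`."""
    top = max((abs(v) for v in x), default=0.0)
    if top == 0.0:
        return list(x)
    return [v * peak / top for v in x]

def post_process(x):
    """Claim order: moving average, then adaptive threshold, then normalization."""
    return normalize_peak(adaptive_gate(moving_average(x)))

smoothed = moving_average([0.0, 3.0, 0.0])
gated = adaptive_gate([0.1, 2.0, -2.0, 0.1])
final = normalize_peak(gated)
```
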
Description

Narrow-band satellite voice communication noise reduction system and method

Technical Field

The application relates to the technical field of satellite communication, in particular to a narrow-band satellite voice communication noise reduction system and method.

Background

In a satellite voice communication scenario, the core communication channel is bandwidth-limited (1.2k, 2.4k, 4.8k), which naturally constrains the transmission bandwidth of the voice signal. Meanwhile, satellite communication links are susceptible to factors such as spatial electromagnetic interference, signal attenuation and multipath effects, so the signal-to-noise ratio (SNR) drops noticeably during communication.

Disclosure of Invention

The invention provides a narrow-band satellite voice communication noise reduction system and method, which are used for solving the problem of poor satellite voice communication quality under narrow-band transmission constraints in the prior art. The system comprises: a preprocessing module, used for acquiring an original voice signal of a satellite receiving end and preprocessing the original voice signal to obtain a preprocessed voice signal; a separation module, used for establishing a voice separation model and separating an effective voice signal and a noise signal from the preprocessed voice signal according to the voice separation model; an enhancement module, used for establishing an AI complement model and performing abnormal filling on the effective voice signal according to the AI complement model to obtain an enhanced voice signal; and a post-processing module, used for post-processing the enhanced voice signal to obtain a final voice signal.
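As an illustration of the preprocessing module listed above, which claim 2 describes as automatic gain control followed by high-pass and band-pass filtering, the Python sketch below chains the three steps. It rests on assumptions that are not in the source: 8 kHz sampling, a 0.3 to 0.9 standard amplitude interval, a 20 Hz DC-blocking cutoff, a 300 to 3400 Hz pass band, first-order filters, and a single whole-signal gain standing in for a time-varying automatic gain control. All function names are hypothetical.

```python
import math

def agc(x, low=0.3, high=0.9):
    """Scale the signal so its peak lies inside [low, high].
    Simplification: one gain for the whole signal, not a running AGC."""
    peak = max((abs(v) for v in x), default=0.0)
    if peak == 0.0 or low <= peak <= high:
        return list(x)
    target = high if peak > high else low
    return [v * target / peak for v in x]

def high_pass(x, cutoff_hz, fs):
    """First-order RC high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / fs)
    out, prev_y = [], 0.0
    prev_x = x[0] if x else 0.0   # start at the first sample to avoid a step
    for v in x:
        prev_y = a * (prev_y + v - prev_x)
        prev_x = v
        out.append(prev_y)
    return out

def low_pass(x, cutoff_hz, fs):
    """First-order RC low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = (1.0 / fs) / (rc + 1.0 / fs)
    out, prev_y = [], 0.0
    for v in x:
        prev_y += a * (v - prev_y)
        out.append(prev_y)
    return out

def band_pass(x, low_hz, high_hz, fs):
    """Band-pass built as a high-pass followed by a low-pass."""
    return low_pass(high_pass(x, low_hz, fs), high_hz, fs)

def preprocess(x, fs=8000):
    x = agc(x)                               # amplitude into the standard interval
    x = high_pass(x, 20.0, fs)               # remove the DC component
    return band_pass(x, 300.0, 3400.0, fs)   # keep the assumed voice band

# A pure DC offset should be removed; a 1 kHz tone should pass.
dc_out = preprocess([0.5] * 100)
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
tone_out = preprocess(tone)
scaled = agc([0.0, 2.0, -1.0])
```

In practice a higher-order filter design would give much sharper band edges than these first-order sections; they are used here only to keep the sketch dependency-free.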
Further, the preprocessing module preprocessing the original voice signal includes: acquiring a preset standard amplitude interval, and adjusting the amplitude of the original voice signal into the preset standard amplitude interval based on automatic gain control; performing high-pass filtering on the amplitude-adjusted original voice signal to remove its direct current component; and performing band-pass filtering on the high-pass-filtered original voice signal to remove high-frequency noise and low-frequency noise.
Further, the separation module establishing a voice separation model and separating an effective voice signal and a noise signal from the preprocessed voice signal according to the voice separation model includes: collecting voice sample data and noise sample data, and mixing voice audio in the voice sample data with noise audio in the noise sample data to obtain mixed audio; establishing a training sample set according to the mixed audio and the voice label corresponding to the mixed audio, and establishing and training a voice separation model according to the training sample set to obtain a trained voice separation model; and inputting the preprocessed voice signal into the trained voice separation model to obtain the effective voice signal and the noise signal.
Further, establishing and training the voice separation model according to the training sample set includes: training the voice separation model with a scale-invariant signal-to-noise ratio loss function.
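The two training ingredients described above, mixing voice with noise and scoring with a scale-invariant signal-to-noise ratio (SI-SNR), can be sketched in Python as follows. The SI-SNR computation follows the commonly used definition (project the zero-mean estimate onto the zero-mean target, then compare the projected energy with the residual energy). The mixing at a chosen SNR, the function names and the `eps` stabilizer are assumptions for illustration; the source does not say how the mixing ratio is chosen or which model is trained.

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db, then add."""
    p_s = sum(v * v for v in speech) / len(speech)
    p_n = sum(v * v for v in noise) / len(noise)
    if p_n == 0.0:
        raise ValueError("noise clip is silent")
    gain = math.sqrt(p_s / (p_n * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

def si_snr_db(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and its clean target."""
    e_mean = sum(estimate) / len(estimate)
    t_mean = sum(target) / len(target)
    e = [v - e_mean for v in estimate]
    t = [v - t_mean for v in target]
    # Projection of the estimate onto the target direction.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    s_target = [scale * b for b in t]
    e_noise = [a - s for a, s in zip(e, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(n * n for n in e_noise) + eps
    return 10 * math.log10(num / den + eps)

def si_snr_loss(estimate, target):
    """Training loss: negative SI-SNR, so lower is better."""
    return -si_snr_db(estimate, target)

clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixed = mix_at_snr(clean, noise, snr_db=0.0)   # equal speech and noise power

estimate = [1.1, -0.9, 0.9, -1.1]
score = si_snr_db(estimate, clean)
score_scaled = si_snr_db([2.0 * v for v in estimate], clean)
```

Rescaling the estimate leaves the score essentially unchanged, which is the property that makes the loss insensitive to the output gain of the separation model.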
Further, the enhancement module performing abnormal filling on the effective voice signal according to the AI complement model to obtain an enhanced voice signal includes: identifying tone characteristics of the effective voice signal, and vectorizing the tone characteristics to obtain tone characteristic vectors; collecting the tone characteristic vector variation among all audio frames of the effective voice signal, and identifying voice abnormal sections according to the tone characteristic vector variation; inputting the voice normal sections adjacent to each voice abnormal section into the AI complement model to obtain the filling voice of that voice abnormal section; and performing voice filling on all voice abnormal sections to obtain the enhanced voice signal.
Further, identifying the voice abnormal sections according to the tone characteristic vector variation includes: calculating the vector variation between each audio frame and its adjacent frames in the effective voice signal to form a vector variation sequence; establishing a time window in the vector variation sequence centered on each vector variation, calculating the average vector variation within the time window, and determining the abnormal threshold corresponding to that vector variation according to the average vector variation within the time window; and collecting the abnormal thresholds of all vector variations, screening out the audio frames with vector variation larger than the correspon