CN-121999802-A - Method and device for compliance processing of audio and electronic equipment

Abstract

The invention provides a method and a device for compliance processing of audio, and electronic equipment. The method comprises: obtaining original audio data to be processed; detecting the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data; processing the offending content segments of the original audio data through an audio compliance enhancement model, based on the content information associated with the offending content segments, to obtain compliant audio data; and outputting the compliant audio data. The invention achieves real-time, accurate, and intelligent compliance processing of audio at the content production stage, and addresses the difficulty of identifying audio violations, the interruption of user experience caused by hard processing modes, and the low efficiency of manual review.

Inventors

WU LI
ZHANG HAO
WENG WENMIN

Assignees

瑞芯微电子股份有限公司

Dates

Publication Date: 20260508
Application Date: 20260123

Claims (13)

1. A method of compliance processing audio, comprising: acquiring original audio data to be processed; detecting the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data; processing the offending content segments of the original audio data through an audio compliance enhancement model based on the content information associated with the offending content segments to obtain compliant audio data; and outputting the compliant audio data.
2. The method of claim 1, wherein detecting the original audio data through an audio compliance detection model comprises: performing content understanding of the original audio data, based on audio features and text semantics, through the audio compliance detection model, wherein the audio compliance detection model is a speech-text bimodal model that combines audio feature extraction and text semantic analysis capabilities.
3. The method of claim 1, wherein detecting the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data comprises: converting the original audio data into an audio-text aligned token sequence by an audio encoder; and inputting the token sequence into a large language model to perform compliance analysis on the original audio data through the large language model, and generating the content information associated with the offending content segments of the original audio data.
4. The method of claim 3, wherein converting the original audio data into an audio-text aligned token sequence by an audio encoder comprises: associating the audio features of the original audio data with text semantics through the audio encoder to generate the token sequence corresponding to the audio features and the text semantics.
5. The method of claim 1, wherein the content information associated with the offending content segments of the original audio data includes a timestamp of the offending content, an audio segment location, and a violation type.
6. The method of claim 1, wherein processing the offending content segments of the original audio data through an audio compliance enhancement model based on content information associated with the offending content segments to obtain compliant audio data comprises: performing voice separation processing on the original audio data to obtain an independent human voice channel and a background sound channel; performing processing operations on the offending content segments in the human voice channel through the audio compliance enhancement model based on the audio segment locations of the offending content segments; and merging the processed human voice channel with the background sound channel to obtain the compliant audio data.
7. The method of claim 6, wherein performing processing operations on the offending content segments in the human voice channel through the audio compliance enhancement model based on the audio segment locations of the offending content segments comprises: performing at least one of a fade-out process and a compliant content enhancement implantation process on the offending content segments in the human voice channel through the audio compliance enhancement model, wherein the fade-out process includes reducing the clarity and intelligibility of the offending content while preserving the basic characteristics of the human voice using a deep-learning audio enhancement model, and wherein the compliant content enhancement implantation process comprises implanting preset compliant audio content at a specific audio segment location.
8. The method of claim 7, wherein reducing the clarity and intelligibility of the offending content while preserving the basic characteristics of the human voice using a deep-learning audio enhancement model comprises: performing selective frequency attenuation and phase perturbation on the audio signals corresponding to the offending content segments in the human voice channel, wherein the selective frequency attenuation attenuates the high-frequency and transient frequency components that carry semantic clarity in the audio signals.
9. The method of claim 7, wherein implanting the preset compliant audio content at the specific audio segment location comprises: performing acoustic environment matching processing on the preset compliant audio content; and dynamically adjusting the loudness of the human voice channel in audio segments where the preset compliant audio content and the human voice coexist.
10. The method of claim 1, wherein outputting the compliant audio data comprises: audio encoding and stream packaging the compliant audio data to generate a compliant live stream; and pushing the compliant live stream to a live streaming server.
11. The method of claim 2, further comprising: obtaining a newly added offending audio sample library and corresponding violation text labels; and performing incremental training on the pre-trained audio compliance detection model based on the newly added offending audio sample library.
12. An apparatus for compliance processing audio, comprising: an audio acquisition module configured to acquire original audio data to be processed; an audio compliance detection module configured to detect the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data; an audio compliance enhancement module configured to process the offending content segments of the original audio data through an audio compliance enhancement model based on the content information associated with the offending content segments to obtain compliant audio data; and an audio output module configured to output the compliant audio data.
13. An electronic device, comprising: a memory configured to store an executable program; and a processor configured to execute the program to perform the method according to any one of claims 1 to 11.
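
As an illustration of the processing recited in claims 6 to 8, the following is a minimal sketch only, not the patented implementation. It assumes the human voice channel and the background sound channel have already been separated (no separation model is shown), it substitutes a naive discrete Fourier transform for the deep-learning audio enhancement model named in claim 7, and it applies the phase perturbation to the same frequency bins it attenuates, a detail the claims do not specify. The names `blur_segment`, `process_voice_and_merge`, `cutoff_hz`, `high_gain`, and `max_phase` are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo segments."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT that returns only the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def blur_segment(samples, sample_rate, cutoff_hz, high_gain=0.1, max_phase=0.5, seed=0):
    """Selective frequency attenuation plus phase perturbation (assumed scheme).

    Bins above cutoff_hz are scaled by high_gain and rotated by a random phase
    in [-max_phase, max_phase] radians. Bins at or below cutoff_hz are left
    untouched so the low-frequency character of the voice is preserved.
    """
    rng = random.Random(seed)
    n = len(samples)
    spec = dft(samples)
    for k in range(1, n // 2 + 1):
        if k * sample_rate / n <= cutoff_hz:
            continue
        is_nyquist = (n % 2 == 0 and k == n // 2)
        # The Nyquist bin must stay real, so it gets no phase rotation.
        phase = 0.0 if is_nyquist else rng.uniform(-max_phase, max_phase)
        spec[k] = spec[k] * high_gain * cmath.exp(1j * phase)
        if not is_nyquist:
            # Mirror bin keeps conjugate symmetry so the output stays real.
            spec[n - k] = spec[k].conjugate()
    return idft_real(spec)


def process_voice_and_merge(voice, background, start, end, sample_rate, cutoff_hz):
    """Process only voice[start:end], then mix the voice back with the background."""
    processed = list(voice)
    processed[start:end] = blur_segment(voice[start:end], sample_rate, cutoff_hz)
    return [v + b for v, b in zip(processed, background)]
```

Only the samples inside the offending segment of the voice channel are altered; samples outside the segment and the whole background channel pass through unchanged, which is the point of separating the channels before processing.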

Description

Method and device for compliance processing of audio, and electronic equipment

Technical Field

The present invention relates to the field of audio processing technologies, and in particular to a method and an apparatus for compliance processing of audio, and an electronic device.

Background

With the rapid development of online live-streaming technology, the influencer live-streaming industry has risen rapidly and is widely applied in scenarios such as social media, entertainment platforms, e-commerce live streaming, educational live streaming, live commerce, news media live streaming, and product launch events, and applications of online live streaming are becoming increasingly rich and deep. However, the governance of audio and video content still faces challenges. For example, problems such as the spread of illegal and harmful information, false advertising, leakage of sensitive company information, exposure of personal privacy, weak self-discipline among users, and technical and regulatory measures that lag behind are increasingly prominent. Audio content in particular conceals information well and is difficult to recognize, which poses a new test for content compliance.

Disclosure of Invention

The invention provides a method and a device for compliance processing of audio, and electronic equipment, which can realize compliance detection of audio data and strengthen the supervision of audio and video content.

In one aspect of the invention, a method of compliance processing audio is provided.
The method comprises: obtaining original audio data to be processed; detecting the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data; processing the offending content segments of the original audio data through an audio compliance enhancement model, based on the content information associated with the offending content segments, to obtain compliant audio data; and outputting the compliant audio data.

In another aspect of the invention, an apparatus for compliance processing audio is provided. The apparatus comprises: an audio acquisition module configured to acquire original audio data to be processed; an audio compliance detection module configured to detect the original audio data through an audio compliance detection model to generate content information associated with offending content segments of the original audio data; an audio compliance enhancement module configured to process the offending content segments of the original audio data through an audio compliance enhancement model, based on the content information associated with the offending content segments, to obtain compliant audio data; and an audio output module configured to output the compliant audio data.

In yet another aspect of the invention, an electronic device is provided. The electronic device comprises a memory configured to store an executable program and a processor configured to execute the program to perform the method of compliance processing audio described above.
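
The four stages described above (acquisition, detection, enhancement, output) can be sketched as a small pipeline. This is a hedged illustration under stated assumptions, not the invention's implementation: the detection and enhancement models are replaced by trivial stand-ins, and every name here (`ViolationSegment`, `run_pipeline`, `fixed_window_detector`, `attenuate`) is hypothetical. The record fields mirror the content information listed in claim 5 (timestamp, audio segment location, violation type).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass(frozen=True)
class ViolationSegment:
    """Content information for one offending segment (fields assumed from claim 5)."""
    start_s: float        # timestamp where the offending content begins, in seconds
    end_s: float          # timestamp where it ends, in seconds
    start_sample: int     # audio segment location as a half-open sample range
    end_sample: int
    violation_type: str


Detector = Callable[[Sequence[float], int], List[ViolationSegment]]
Enhancer = Callable[[Sequence[float], List[ViolationSegment]], List[float]]


def run_pipeline(source: Iterable[float], sample_rate: int, detect: Detector,
                 enhance: Enhancer, output: Callable[[List[float]], None]) -> List[float]:
    raw = list(source)                    # 1. acquire original audio data
    segments = detect(raw, sample_rate)   # 2. detection model -> content information
    compliant = enhance(raw, segments)    # 3. enhancement model -> compliant audio
    output(compliant)                     # 4. output the compliant audio data
    return compliant


def fixed_window_detector(start_s: float, end_s: float, violation_type: str) -> Detector:
    """Stand-in for the detection model: always flags one fixed time window."""
    def detect(samples: Sequence[float], sample_rate: int) -> List[ViolationSegment]:
        a = max(0, int(round(start_s * sample_rate)))
        b = min(len(samples), int(round(end_s * sample_rate)))
        return [ViolationSegment(start_s, end_s, a, b, violation_type)] if b > a else []
    return detect


def attenuate(samples: Sequence[float], segments: List[ViolationSegment],
              gain: float = 0.1) -> List[float]:
    """Stand-in for the enhancement model: scales flagged samples by a fixed gain."""
    out = list(samples)
    for seg in segments:
        for i in range(seg.start_sample, seg.end_sample):
            out[i] *= gain
    return out
```

Passing the detector and enhancer in as callables keeps the pipeline order fixed while leaving the models replaceable, so a real detection model could be swapped in without touching `run_pipeline`.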
According to the technical scheme of the invention, the original audio data is acquired and the audio compliance detection model performs real-time, automatic compliance analysis on it, so that offending content segments that traditional manual review struggles to cover can be accurately identified based on semantic understanding, supervision blind spots caused by limits on manpower and time are eliminated, and comprehensive compliance coverage is ensured. Unlike traditional one-size-fits-all blocking, the audio compliance enhancement model is invoked to perform precise processing based on the recognition result. This point-to-point processing flow ensures that conversion to compliant audio is completed before the audio is output, so that content risk is controlled at the source. Complex compliance rules are solidified into an automatically executable model, so that stable and objective compliance management is achieved without real-time manual intervention, operating costs and the risk of human error are reduced, and an efficient, reliable, automated solution that balances experience and safety is provided.

Drawings

FIG. 1 is a flow chart of the steps of a method of compliance processing audio in accordance with an embodiment of the present invention;
FIG. 2 is an overall environment architecture diagram of a method of compliance processing audio in accordance with an embodiment of the present invention;
FIG. 3 is a flowchart of a method by which an electronic device with intelligent audio compliance processing performs compliance processing of audio in accordance with an embodiment of the present invention;
FIG. 4 is a flow chart of an intelligent compliance audio