CN-117334205-B - Audio format conversion method and system based on multithreading
Abstract
The invention discloses a multithreading-based audio format conversion method and system. An audio file is imported and a conversion reference body is selected from it; the audio file is classified based on the initial audio format and the target audio format of the conversion reference body, and a neural network learning model is selected. Segmented format conversion is performed on the conversion reference body according to fixed output parameters, a quality assessment result is obtained for each segment, and the neural network learning model is adjusted based on the quality assessment results until they meet the mathematical expectation. Audio format conversion of the initial audio file is then started: the audio file is segmented, the current audio segment is converted according to the neural network learning real-time model, and the output parameters are determined and adjusted in advance according to the audio specification parameters of the next segment.
Inventors
- HUANG ZEJIE
Assignees
- 江下信息科技(惠州)有限公司
Dates
- Publication Date: 20260508
- Application Date: 20230919
Claims (8)
- 1. A multithreading-based audio format conversion method, comprising the following steps: step 100, importing audio files, identifying the initial audio format of each audio file, and classifying the audio files based on the initial audio format and the target audio format of each audio file; step 200, selecting a neural network learning model based on the initial audio format and the target audio format of a conversion reference body, selectively intercepting the imported audio file to form the conversion reference body, generating output parameters by the neural network learning model according to general audio specification parameters, and sequentially converting the audio format of the conversion reference body according to the output parameters; wherein, in step 200, selectively intercepting the imported audio file to form the conversion reference body comprises: analyzing the imported audio file, sequentially intercepting the audio file to form a plurality of audio segments, and acquiring the audio specification parameters of each audio segment; and selecting audio segments whose audio specification parameters are in different combination forms to form the conversion reference body, wherein the conversion reference body is formed by combining audio segments of different durations in the audio file; step 300, performing quality authentication scoring on the conversion reference body converted into the target audio format, adjusting the neural network learning model based on the scoring result and the audio specification parameters carried by the conversion reference body, establishing an association relationship between the audio specification parameters and the neural network learning model, and generating neural network learning real-time models in one-to-one correspondence with the different groups of audio specification parameters, so as to adjust the output parameters in real time until the scoring result meets expectations; step 400, starting audio format conversion of the initial audio files, obtaining the audio specification parameters of each maximum audio segment, and having the neural network learning real-time model adjust the output parameters in advance according to the audio specification parameters of the next audio segment, until each initial audio file is converted into its corresponding target format.
- 2. The multithreading-based audio format conversion method according to claim 1, wherein in step 200 the neural network learning model is pre-saved and is initially selected based on the conversion relationship between the initial audio format and the target audio format, and the neural network learning model uniformly adjusts the output parameters based on the general audio specification parameters corresponding to the initial audio format.
- 3. The multithreading-based audio format conversion method according to claim 1, wherein the audio specification parameters comprise audio, sampling frequency, sampling bit number, channel number and bit rate; the output parameters comprise the number of output channels and the coding format; and the audio formats include MP3, WAV, AAC, FLAC and OGG.
- 4. The multithreading-based audio format conversion method according to claim 1, wherein in step 300 the quality authentication scoring of the conversion reference body converted into the target format comprises: disassembling the conversion reference body again into a plurality of audio segments according to the combination forms of the audio specification parameters; having the neural network learning model adjust the output parameters for each audio segment in sequence while obtaining the audio specification parameters of each audio segment; performing quality authentication scoring on each audio segment converted into the target audio format to obtain scoring results for the plurality of audio segments; first selecting the audio specification parameters of the audio segments with large differences in scoring results to form a first data set, and importing the first data set into the neural network learning model for multiple rounds of training to adjust the output parameters, until the scoring results of the low-scoring audio segments, after conversion into the target audio format, meet the mathematical expectation; and then selecting the audio specification parameters of the audio segments with small differences in scoring results to form a second data set, importing the second data set into the neural network learning model for testing, and verifying the stability of the adjusted output parameters using the scoring results obtained after the audio segments are reconverted into the target audio format.
- 5. The multithreading-based audio format conversion method according to claim 4, wherein an association relationship is established between the audio specification parameters of the audio segments and the neural network learning real-time model to form a learning rule, and the output parameters are adjusted based on the learning rule and the audio specification parameters of the audio segments until the scoring result of each audio segment in the conversion reference body meets the mathematical expectation; the input of the neural network learning real-time model is the audio specification parameters, and its output is the output parameters.
- 6. The multithreading-based audio format conversion method according to claim 5, wherein an auditory model simulating the human ear is established using PESQ to predict listeners' subjective scores of the audio quality of the conversion reference body converted into the target audio format; the score ranges from -0.5 to 4.5, a higher score indicates higher audio quality, and the audio quality after audio format conversion is evaluated accordingly.
- 7. The multithreading-based audio format conversion method according to claim 5, wherein in steps 100 to 300 audio format conversion of the initial audio file is suspended and an audio specification conversion test is performed on the combined conversion reference body; and in step 400 audio format conversion of the initial audio file is started based on the learning rule, which adjusts the output parameters correspondingly for the different audio specification parameters acquired.
- 8. The multithreading-based audio format conversion method according to claim 7, wherein starting audio format conversion of the initial audio file comprises: acquiring, at the grabbing frequency, the audio specification parameters of audio of a set duration, and dividing the audio of the set duration into different audio fragments according to the different neural network learning real-time models that it matches; using the neural network learning real-time model to adjust the output parameters based on the audio specification parameters of the current audio fragment; and, based on the audio specification parameters of the next audio fragment and the learning rule, determining in advance the adjustment target of the neural network learning real-time model, so that the adjustment mode of the output parameters for the next audio fragment is determined in advance and the next audio fragment is converted without delay.
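The data set selection described in claims 4 and 6 can be sketched as follows. This is only an illustrative sketch: the claims do not say how a "large" or "small" scoring difference is measured, so the code assumes it means the deviation of a segment's score from the mean score of all segments, and the names `ScoredSegment`, `split_by_score_deviation` and `threshold` are hypothetical. The scores are taken as given; no PESQ computation is performed here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

# Score range stated in claim 6 for the PESQ-based assessment.
PESQ_MIN, PESQ_MAX = -0.5, 4.5


@dataclass(frozen=True)
class ScoredSegment:
    """One converted audio segment with its quality score (hypothetical type)."""
    segment_id: int
    spec: Tuple[int, int, int]  # (sample_rate_hz, bits_per_sample, channels)
    score: float


def split_by_score_deviation(
    segments: List[ScoredSegment], threshold: float = 1.0
) -> Tuple[List[ScoredSegment], List[ScoredSegment]]:
    """Split segments into a first (training) and second (testing) data set.

    Assumption: "difference" is the absolute deviation from the mean score.
    Segments deviating by more than `threshold` go to the first data set,
    all others to the second.
    """
    if not segments:
        return [], []
    for seg in segments:
        if not PESQ_MIN <= seg.score <= PESQ_MAX:
            raise ValueError(f"score {seg.score} outside [{PESQ_MIN}, {PESQ_MAX}]")
    average = mean(seg.score for seg in segments)
    first = [s for s in segments if abs(s.score - average) > threshold]
    second = [s for s in segments if abs(s.score - average) <= threshold]
    return first, second
```

With scores 4.25, 4.0, 1.75 and 4.0, for example, only the 1.75 segment deviates from the mean (3.5) by more than 1.0, so it alone would form the first data set under this assumption.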
Description
Audio format conversion method and system based on multithreading
Technical Field
The invention relates to the technical field of audio format conversion, in particular to an audio format conversion method and system based on multithreading.
Background
Audio format conversion refers to the process of converting one audio format into another, typically so that different devices or software can play, edit or process audio files. In principle, an audio file is decoded from its source format into an intermediate format and then re-encoded into the target format. Common audio formats differ in characteristics such as compression ratio, sound quality and file size, so it is important to select a suitable audio format according to the requirements at hand.
Most existing audio format conversion methods select fixed output parameters for the whole audio file. Because the audio specification parameters within an audio file are not constant but change across time periods, the quality of the audio file is degraded after conversion.
Disclosure of Invention
The invention aims to provide a multithreading-based audio format conversion method and system, to solve the technical problem in the prior art that selecting fixed output parameters for an audio file degrades its quality after conversion.
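To make the audio specification parameters mentioned in the Background concrete, the following minimal sketch reads sampling frequency, sampling bit number and channel number from an uncompressed WAV file using Python's standard `wave` module and derives the bit rate as their product. It is an illustration only and not part of the claimed method; the names `AudioSpec`, `read_wav_spec` and `make_silent_wav` are hypothetical, and the bit-rate formula applies to uncompressed PCM only.

```python
import io
import wave
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSpec:
    """Specification parameters of a PCM stream (hypothetical helper type)."""
    sample_rate_hz: int
    bits_per_sample: int
    channels: int

    @property
    def bit_rate_bps(self) -> int:
        # Uncompressed PCM: bits per second = rate x bit depth x channels.
        return self.sample_rate_hz * self.bits_per_sample * self.channels


def read_wav_spec(data: bytes) -> AudioSpec:
    """Read the specification parameters from in-memory WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return AudioSpec(
            sample_rate_hz=wav.getframerate(),
            bits_per_sample=wav.getsampwidth() * 8,
            channels=wav.getnchannels(),
        )


def make_silent_wav(spec: AudioSpec, seconds: float) -> bytes:
    """Build a short silent WAV with the given parameters (test fixture)."""
    buffer = io.BytesIO()
    bytes_per_frame = spec.channels * spec.bits_per_sample // 8
    frame_count = int(spec.sample_rate_hz * seconds)
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(spec.channels)
        wav.setsampwidth(spec.bits_per_sample // 8)
        wav.setframerate(spec.sample_rate_hz)
        wav.writeframes(b"\x00" * (frame_count * bytes_per_frame))
    return buffer.getvalue()
```

For a 44.1 kHz, 16-bit, two-channel stream the derived bit rate is 44100 x 16 x 2 = 1,411,200 bits per second.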
To solve the above technical problem, the invention provides the following technical scheme: a multithreading-based audio format conversion method comprising the following steps:
Step 100, importing audio files, identifying the initial audio format of each audio file, and classifying the audio files based on the initial audio format and the target audio format of each audio file;
Step 200, selecting a neural network learning model based on the initial audio format and the target audio format of a conversion reference body, selectively intercepting the imported audio file to form the conversion reference body, generating output parameters by the neural network learning model according to general audio specification parameters, and sequentially performing audio format conversion on the conversion reference body according to the output parameters;
Step 300, performing quality authentication scoring on the conversion reference body converted into the target audio format, adjusting the neural network learning model based on the scoring result and the audio specification parameters carried by the conversion reference body, establishing an association relationship between the audio specification parameters and the neural network learning model, and generating neural network learning real-time models in one-to-one correspondence with the different groups of audio specification parameters, so as to adjust the output parameters in real time until the scoring result meets expectations;
Step 400, starting audio format conversion of the initial audio file, obtaining the audio specification parameters of each maximum audio fragment, and having the neural network learning real-time model adjust the output parameters in advance according to the audio specification parameters of the next audio fragment, until each audio file is converted into its corresponding target format.
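The look-ahead behaviour of step 400 can be sketched as follows: while the current fragment is being converted, the output parameters for the next fragment are already being determined on a worker thread. This is a sketch under stated assumptions rather than the patented implementation. The types `Spec`, `OutputParams` and `Fragment` are hypothetical, and `predict_params` and `convert` are caller-supplied stand-ins for the neural network learning real-time model and the actual encoder, neither of which is specified in enough detail in the text to reproduce.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, NamedTuple, Sequence, TypeVar

T = TypeVar("T")


class Spec(NamedTuple):
    """Audio specification parameters of one fragment (hypothetical)."""
    sample_rate_hz: int
    bits_per_sample: int
    channels: int


class OutputParams(NamedTuple):
    """Output parameters: number of output channels and coding format."""
    channels: int
    codec: str


class Fragment(NamedTuple):
    spec: Spec
    payload: bytes


def convert_with_lookahead(
    fragments: Sequence[Fragment],
    predict_params: Callable[[Spec], OutputParams],
    convert: Callable[[Fragment, OutputParams], T],
) -> List[T]:
    """Convert fragments in order, preparing the next fragment's
    output parameters while the current fragment is converted."""
    results: List[T] = []
    if not fragments:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Parameters for the first fragment are requested up front.
        pending = pool.submit(predict_params, fragments[0].spec)
        for index, fragment in enumerate(fragments):
            params = pending.result()
            if index + 1 < len(fragments):
                # Start determining the next fragment's parameters
                # before converting the current one.
                pending = pool.submit(predict_params, fragments[index + 1].spec)
            results.append(convert(fragment, params))
    return results
```

A single worker thread is enough here because only one prediction is ever outstanding; conversion order, and therefore output order, stays sequential.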
In a preferred embodiment of the present invention, in step 200, the neural network learning model is pre-saved and is initially selected based on the conversion relationship between the initial audio format and the target audio format, and the neural network learning model uniformly adjusts the output parameters based on the general audio specification parameters corresponding to the initial audio format.
In a preferred embodiment of the present invention, in step 200, selectively intercepting the imported audio file to form the conversion reference body comprises: analyzing the imported audio file, sequentially intercepting the audio file to form a plurality of audio segments, and acquiring the audio specification parameters of each audio segment; and selecting audio segments whose audio specification parameters are in different combination forms to form the conversion reference body, wherein the conversion reference body is formed by combining audio segments of different durations in the audio file.
As a preferred aspect of the present invention, the audio specification parameters include audio, sampling frequency, sampling bit number, channel number and bit rate; the output parameters comprise the number of output channels and the coding format; and the audio formats include MP3, WAV, AAC, FLAC and OGG.
As a preferred embodiment of the present invention, in step 300, the specific implementation steps of perfor